If I understand correctly what you're trying to do, I think you really
just want one pool, but you want to change the mode *within* the pool to
be FAIR as well:

https://spark.apache.org/docs/latest/job-scheduling.html#configuring-pool-properties

You'd still need to change the conf file to set up that pool, but that
should be fairly straightforward?
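For reference, a minimal sketch of what that conf file could look like,
just following the fairscheduler.xml format from the docs above and
redefining the built-in "default" pool (the weight/minShare values are
placeholders; only the schedulingMode matters here):

  <?xml version="1.0"?>
  <allocations>
    <!-- override the built-in "default" pool so that jobs submitted
         without an explicit pool are scheduled FAIR within it -->
    <pool name="default">
      <schedulingMode>FAIR</schedulingMode>
      <weight>1</weight>
      <minShare>0</minShare>
    </pool>
  </allocations>

You'd then point Spark at it via spark.scheduler.allocation.file (or put
a file named fairscheduler.xml on the classpath), together with
spark.scheduler.mode=FAIR for the scheduling across pools.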
Another approach to what you're asking might be to expose the scheduler
configuration as command-line confs as well, which seems reasonable and
simple.

On Sat, Apr 7, 2018 at 5:55 PM, Matthias Boehm <mboe...@gmail.com> wrote:
> well, the point was "in a programmatic way without the need for
> additional configuration files which is a hassle for a library" -
> anyway, I appreciate your comments.
>
> Regards,
> Matthias
>
> On Sat, Apr 7, 2018 at 3:43 PM, Mark Hamstra <m...@clearstorydata.com> wrote:
> >> Providing a way to set the mode of the default scheduler would be
> >> awesome.
> >
> > That's trivial: Just use the pool configuration XML file and define a
> > pool named "default" with the characteristics that you want (including
> > schedulingMode FAIR).
> >
> > You only get the default construction of the pool named "default" if
> > you don't define your own "default".
> >
> > On Sat, Apr 7, 2018 at 2:32 PM, Matthias Boehm <mboe...@gmail.com> wrote:
> >>
> >> No, these pools are not created per job but per parfor worker and are
> >> thus used to execute many jobs. For all scripts with a single
> >> top-level parfor this is equivalent to static initialization.
> >> However, yes, we create these pools dynamically on demand to avoid
> >> unnecessary initialization and to handle scenarios of nested parfor.
> >>
> >> At the end of the day, we just want to configure fair scheduling in a
> >> programmatic way without the need for additional configuration files,
> >> which is a hassle for a library that is meant to work out of the box.
> >> Simply setting 'spark.scheduler.mode' to FAIR does not do the trick
> >> because we end up with a single default fair scheduler pool in FIFO
> >> mode, which is equivalent to plain FIFO. Providing a way to set the
> >> mode of the default scheduler would be awesome.
> >>
> >> Regarding why fair scheduling showed generally better performance for
> >> out-of-core datasets, I don't have a good answer. My guess was
> >> isolated job scheduling and better locality of in-memory partitions.
> >>
> >> Regards,
> >> Matthias
> >>
> >> On Sat, Apr 7, 2018 at 8:50 AM, Mark Hamstra <m...@clearstorydata.com> wrote:
> >> > Sorry, but I'm still not understanding this use case. Are you
> >> > somehow creating additional scheduling pools dynamically as Jobs
> >> > execute? If so, that is a very unusual thing to do. Scheduling pools
> >> > are intended to be statically configured -- initialized, living and
> >> > dying with the Application.
> >> >
> >> > On Sat, Apr 7, 2018 at 12:33 AM, Matthias Boehm <mboe...@gmail.com> wrote:
> >> >>
> >> >> Thanks for the clarification, Imran - that helped. I was mistakenly
> >> >> assuming that these pools are removed via weak references, as the
> >> >> ContextCleaner does for RDDs, broadcasts, accumulators, etc. For
> >> >> the time being, we'll just work around it, but I'll file a
> >> >> nice-to-have improvement JIRA. Also, you're right, we do indeed see
> >> >> these warnings, but they're usually hidden when running with ERROR
> >> >> or INFO (due to overwhelming output) log levels.
> >> >>
> >> >> Just to give the context: We use these scheduler pools in SystemML's
> >> >> parallel for loop construct (parfor), which allows combining data-
> >> >> and task-parallel computation.
> >> >> If the data fits into the remote memory budget, the optimizer may
> >> >> decide to execute the entire loop as a single Spark job (with
> >> >> groups of iterations mapped to Spark tasks). If the data is too
> >> >> large and non-partitionable, the parfor loop is executed as a
> >> >> multi-threaded operator in the driver, and each worker might spawn
> >> >> several data-parallel Spark jobs in the context of the worker's
> >> >> scheduler pool, for operations that don't fit into the driver.
> >> >>
> >> >> We decided to use these fair scheduler pools (w/ fair scheduling
> >> >> across pools, FIFO per pool) instead of the default FIFO scheduler
> >> >> because they gave us better and more robust performance back in the
> >> >> Spark 1.x line. This was especially true for concurrent jobs over
> >> >> shared input data (e.g., for hyperparameter tuning) and when the
> >> >> data size exceeded aggregate memory. The only downside was that we
> >> >> had to guard against scenarios where concurrent jobs would lazily
> >> >> pull a shared RDD into cache, because that led to thread contention
> >> >> on the executors' block managers and spurious replicated in-memory
> >> >> partitions.
> >> >>
> >> >> Regards,
> >> >> Matthias
> >> >>
> >> >> On Fri, Apr 6, 2018 at 8:08 AM, Imran Rashid <iras...@cloudera.com> wrote:
> >> >> > Hi Matthias,
> >> >> >
> >> >> > This doesn't look possible now. It may be worth filing an
> >> >> > improvement JIRA for it.
> >> >> >
> >> >> > But I'm trying to understand what you're trying to do a little
> >> >> > better. So you intentionally have each thread create a new unique
> >> >> > pool when it submits a job? So that pool will just get the
> >> >> > default pool configuration, and you will see lots of these
> >> >> > messages in your logs?
> >> >> >
> >> >> > https://github.com/apache/spark/blob/6ade5cbb498f6c6ea38779b97f2325d5cf5013f2/core/src/main/scala/org/apache/spark/scheduler/SchedulableBuilder.scala#L196-L200
> >> >> >
> >> >> > What is the use case for creating pools this way?
> >> >> >
> >> >> > Also, if I understand correctly, it doesn't even matter if the
> >> >> > thread dies -- that pool will still stay around, as the rootPool
> >> >> > will retain a reference to it (the pools aren't actually tied to
> >> >> > specific threads).
> >> >> >
> >> >> > Imran
> >> >> >
> >> >> > On Thu, Apr 5, 2018 at 9:46 PM, Matthias Boehm <mboe...@gmail.com> wrote:
> >> >> >>
> >> >> >> Hi all,
> >> >> >>
> >> >> >> For concurrent Spark jobs spawned from the driver, we use
> >> >> >> Spark's fair scheduler pools, which are set and unset in a
> >> >> >> thread-local manner by each worker thread (roughly as in the
> >> >> >> sketch below). Typically (for rather long jobs), this works very
> >> >> >> well. Unfortunately, in an application with lots of very short
> >> >> >> parallel sections, we see 1000s of these pools remaining in the
> >> >> >> Spark UI, which indicates some kind of leak. Each worker cleans
> >> >> >> up its local property by setting it to null, but not all pools
> >> >> >> are properly removed. I've checked and reproduced this behavior
> >> >> >> with Spark 2.1-2.3.
> >> >> >>
> >> >> >> Now my question: Is there a way to explicitly remove these
> >> >> >> pools, either globally, or locally while the thread is still
> >> >> >> alive?
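> >> >> >>
> >> >> >> For reference, the per-thread usage is roughly the following
> >> >> >> (simplified sketch; sc is the SparkContext, and the pool name is
> >> >> >> generated per parfor worker, so workerID here is just
> >> >> >> illustrative):
> >> >> >>
> >> >> >>   // route this thread's jobs to a dedicated scheduler pool
> >> >> >>   sc.setLocalProperty("spark.scheduler.pool", "parforPool" + workerID)
> >> >> >>   try {
> >> >> >>     // ... submit one or more Spark jobs from this thread ...
> >> >> >>   } finally {
> >> >> >>     // clean up the thread-local property again
> >> >> >>     sc.setLocalProperty("spark.scheduler.pool", null)
> >> >> >>   }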
> >> >> >>
> >> >> >> Regards,
> >> >> >> Matthias