You could also check out Cook from Two Sigma. It's open source on GitHub,
and offers true preemptive multi-tenancy with Spark on Mesos by
intermediating the Spark drivers to optimize the cluster as a whole.
On Wed, Jul 13, 2016 at 3:41 PM Rahul Palamuttam <rahulpala...@gmail.com>
wrote:

> Thank you Joseph.
>
> We'll try to explore coarse-grained mode with dynamic allocation.
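>
> A rough sketch of what we're planning to try in spark-defaults.conf
> (assuming Spark 1.6 property names; as I understand it, the Mesos external
> shuffle service also has to be running on each agent, e.g. via
> sbin/start-mesos-shuffle-service.sh, for dynamic allocation to work):
>
> spark.mesos.coarse                           true
> spark.dynamicAllocation.enabled              true
> spark.shuffle.service.enabled                true
> spark.dynamicAllocation.executorIdleTimeout  60s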
>
> On Wed, Jul 13, 2016 at 12:28 PM, Joseph Wu <jos...@mesosphere.io> wrote:
>
>> Looks like you're running Spark in "fine-grained" mode (deprecated).
>>
>> (The Spark website appears to be down right now, so here's the doc on
>> Github:)
>>
>> https://github.com/apache/spark/blob/master/docs/running-on-mesos.md#fine-grained-deprecated
>>
>> Note that while Spark tasks in fine-grained will relinquish cores as they
>>> terminate, they will not relinquish memory, as the JVM does not give memory
>>> back to the Operating System. Neither will executors terminate when they're
>>> idle.
>>
>>
>> You can follow some of the recommendations Spark has in that document for
>> sharing resources when using Mesos.
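>>
>> For example (a rough sketch only; exact property names are in that doc,
>> and the numbers depend on your cluster), capping each application's share
>> in spark-defaults.conf keeps one shell from claiming the whole cluster:
>>
>> # per-application caps so several frameworks can run side by side
>> spark.mesos.coarse      true
>> spark.cores.max         8
>> spark.executor.memory   8g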
>>
>> On Wed, Jul 13, 2016 at 12:12 PM, Rahul Palamuttam <
>> rahulpala...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Our team has been tackling multi-tenancy related issues with Mesos for
>>> quite some time.
>>>
>>> The problem is that resources aren't being shared fairly when multiple
>>> applications try to launch jobs. If we launch application A, and soon
>>> after application B, application B waits essentially until application A
>>> completes before its tasks are even staged in Mesos. Right now these
>>> applications are the spark-shell or the Zeppelin interpreter.
>>>
>>> Even a simple sc.parallelize(1 to 10000000).reduce(_ + _) launched in two
>>> different spark-shells reproduces the issue. One of the jobs waits (in
>>> fact we don't even see its tasks being staged in Mesos) until the other
>>> one finishes. This is the biggest issue we have been experiencing, and any
>>> help or advice would be greatly appreciated. We want to be able to launch
>>> multiple jobs concurrently on our cluster and share resources
>>> appropriately.
>>>
>>> Another issue we see is that the Java heap space of the Mesos executor
>>> backend process is not cleaned up once a job has finished in the
>>> spark-shell.
>>> I've attached a PNG of the jvisualvm output showing that the heap space
>>> is still allocated on a worker node. If I force a GC from jvisualvm,
>>> nearly all of that memory gets cleaned up. This may be because the
>>> spark-shell is still active - but if we've waited long enough, why doesn't
>>> GC clean up the space on its own? However, even after forcing GC, the
>>> Mesos UI shows that these resources are still in use.
>>> There should be a way to bring down the memory utilization of the
>>> executors once a task is finished. They shouldn't keep that memory
>>> allocated, even if a spark-shell is still active on the driver.
>>>
>>> We have Spark on Mesos configured in fine-grained mode.
>>> The following parameters are set in our spark-defaults.conf file.
>>>
>>>
>>> spark.eventLog.enabled           true
>>> spark.eventLog.dir               hdfs://frontend-system:8090/directory
>>> spark.local.dir                  /data/cluster-local/SPARK_TMP
>>>
>>> spark.executor.memory            50g
>>>
>>> spark.externalBlockStore.baseDir /data/cluster-local/SPARK_TMP
>>> spark.executor.extraJavaOptions  -XX:MaxTenuringThreshold=0
>>> spark.executor.uri               hdfs://frontend-system:8090/spark/spark-1.6.0-bin-hadoop2.4.tgz
>>> spark.mesos.coarse               false
>>>
>>> Please let me know if there are any questions about our configuration.
>>> Any advice or experience the Mesos community can share pertaining to
>>> issues with fine-grained mode would be greatly appreciated!
>>>
>>> I would also like to sincerely apologize for my previous test message on
>>> the mailing list.
>>> It was ill-conceived; we are in a bit of a time crunch and I needed to get
>>> this message posted. I had forgotten that I needed to reply to the
>>> user-subscribe confirmation email to be added to the list, which resulted
>>> in the message-not-sent bounces. I will not do that again.
>>>
>>> Thanks,
>>>
>>> Rahul Palamuttam
>>>
>>
>>
>
