Hi,

Our team has been tackling multi-tenancy-related issues with Spark on Mesos
for quite some time.

The problem is that tasks aren't being allocated properly when multiple
applications try to launch jobs. If we launch application A and, soon after,
application B, then application B waits until application A is nearly
complete before its tasks are even staged in Mesos. Right now these
applications are the spark-shell or the Zeppelin interpreter.

Even a simple sc.parallelize(1 to 10000000).reduce(_ + _) launched in two
different spark-shells reproduces the issue: one of the jobs waits (in fact
we don't even see its tasks being staged in Mesos) until the other one
finishes. This is the biggest issue we have been experiencing, and any help
or advice would be greatly appreciated. We want to be able to launch multiple
jobs concurrently on our cluster and share resources appropriately.
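
For reference, the reproduction is roughly the following, run in two
spark-shells started shortly after one another against the same Mesos master
(the master URL below is a placeholder for our frontend host; 5050 is the
default Mesos master port):

// Run in each of two spark-shells, e.g.
//   spark-shell --master mesos://frontend-system:5050
// The second shell's job shows no tasks even in STAGING on the Mesos UI
// until the first shell's job has finished.
val sum = sc.parallelize(1 to 10000000).reduce(_ + _)
println(sum)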

Another issue we see is that the Java heap space of the Mesos executor
backend process is not being cleaned up once a job has finished in the
spark-shell.
I've attached a PNG of the jvisualvm output showing that the heap is still
allocated on a worker node. If I force a GC from jvisualvm, nearly all of
that memory is reclaimed. This may be because the spark-shell is still
active, but if we've waited long enough, why doesn't GC just clean up the
space? However, even after forcing GC, the Mesos UI shows these resources as
still in use.
There should be a way to bring down the memory utilization of the executors
once a task is finished. An executor shouldn't keep that memory allocated,
even if a spark-shell is still active on the driver.
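
For what it's worth, a similar check can be made without jvisualvm by running
something like the sketch below from the still-open spark-shell. It uses only
standard JVM APIs (Runtime and System.gc); the used-heap figures before and
after the forced collection are what we otherwise read off the jvisualvm
graphs:

// Runs a handful of small tasks; each one forces a full GC in whichever
// executor JVM it lands on and returns used heap before and after, in MB.
sc.parallelize(1 to sc.defaultParallelism, sc.defaultParallelism).map { _ =>
  val rt = Runtime.getRuntime
  val before = (rt.totalMemory - rt.freeMemory) / (1024 * 1024)
  System.gc()
  val after = (rt.totalMemory - rt.freeMemory) / (1024 * 1024)
  (java.net.InetAddress.getLocalHost.getHostName, before, after)
}.collect().foreach(println)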

We have Mesos configured to use fine-grained mode.
The following parameters are set in our spark-defaults.conf file:


spark.eventLog.enabled           true
spark.eventLog.dir               hdfs://frontend-system:8090/directory
spark.local.dir                    /data/cluster-local/SPARK_TMP

spark.executor.memory            50g

spark.externalBlockStore.baseDir /data/cluster-local/SPARK_TMP
spark.executor.extraJavaOptions  -XX:MaxTenuringThreshold=0
spark.executor.uri               hdfs://frontend-system:8090/spark/spark-1.6.0-bin-hadoop2.4.tgz
spark.mesos.coarse      false
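
As a quick sanity check (using the standard SparkConf accessors; the expected
values are simply what we set above), we confirm from the shell that these
settings are actually in effect:

// Should print "false" and "50g" given the spark-defaults.conf above.
println(sc.getConf.get("spark.mesos.coarse"))
println(sc.getConf.get("spark.executor.memory"))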

Please let me know if there are any questions about our configuration.
Any advice or experience the Mesos community can share pertaining to issues
with fine-grained mode would be greatly appreciated!

I would also like to sincerely apologize for my previous test message on the
mailing list.
It was ill-conceived; we are in a bit of a time crunch and I needed to get
this message posted. I had forgotten that I needed to reply to the
user-subscribe confirmation email before I could post to the list, which
resulted in the "message not sent" emails. I will not do that again.

Thanks,

Rahul Palamuttam
