Thank you, Joseph.

We'll try exploring coarse-grained mode with dynamic allocation.
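
For anyone who finds this thread later, here is a rough sketch of the
spark-defaults.conf changes we are planning to test. To be clear, this is
untested on our side: the shuffle-service requirement comes from the
running-on-mesos doc Joseph linked, and the idle timeout is just a first
guess for our cluster.

spark.mesos.coarse                           true
spark.dynamicAllocation.enabled              true
spark.shuffle.service.enabled                true
spark.dynamicAllocation.minExecutors         1
spark.dynamicAllocation.executorIdleTimeout  60s

Per that doc, dynamic allocation on Mesos also requires the external
shuffle service to be running on every agent node (it can be started with
sbin/start-mesos-shuffle-service.sh from the Spark distribution).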

On Wed, Jul 13, 2016 at 12:28 PM, Joseph Wu <[email protected]> wrote:

> Looks like you're running Spark in "fine-grained" mode (deprecated).
>
> (The Spark website appears to be down right now, so here's the doc on
> GitHub:)
>
> https://github.com/apache/spark/blob/master/docs/running-on-mesos.md#fine-grained-deprecated
>
> Note that while Spark tasks in fine-grained will relinquish cores as they
>> terminate, they will not relinquish memory, as the JVM does not give memory
>> back to the Operating System. Neither will executors terminate when they're
>> idle.
>
>
> You can follow some of the recommendations Spark has in that document for
> sharing resources when using Mesos.
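>
> For example (a sketch, not a tested config; the numbers are placeholders
> you would tune per cluster), capping each Spark framework so it leaves
> offers for the others:
>
> spark.cores.max        24
> spark.executor.memory  8g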
>
> On Wed, Jul 13, 2016 at 12:12 PM, Rahul Palamuttam <[email protected]>
> wrote:
>
>> Hi,
>>
>> Our team has been tackling multi-tenancy related issues with Mesos for
>> quite some time.
>>
>> The problem is that tasks aren't being scheduled fairly when multiple
>> applications try to launch jobs. If we launch application A, and soon
>> after application B, application B waits essentially until application A
>> completes before its tasks are even staged in Mesos. Right now these
>> applications are the spark-shell or the Zeppelin interpreter.
>>
>> Even a simple sc.parallelize(1 to 10000000).reduce(_ + _) launched in two
>> different spark-shells reproduces the issue: one of the jobs waits (in
>> fact we don't even see its tasks being staged in Mesos) until the other
>> finishes. This is the biggest issue we have been experiencing, and any
>> help or advice would be greatly appreciated. We want to be able to launch
>> multiple jobs concurrently on our cluster and share resources
>> appropriately.
>>
>> Another issue we see is that the Java heap space of the Mesos executor
>> backend process is not freed once a job has finished in the spark-shell.
>> I've attached a png file of the jvisualvm output showing that the heap
>> space is still allocated on a worker node. If I force a GC from
>> jvisualvm, nearly all of that memory gets cleaned up. This may be because
>> the spark-shell is still active - but if we've waited long enough, why
>> doesn't GC clean up the space on its own? Moreover, even after forcing a
>> GC, the Mesos UI shows that these resources are still in use.
>> There should be a way to bring down the memory utilization of the
>> executors once a task is finished. They shouldn't keep that memory
>> allocated, even if a spark-shell is still active on the driver.
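>>
>> (For what it's worth, forcing a collection from the command line has the
>> same effect as the jvisualvm button; <executor-pid> below is a
>> placeholder for the PID of the executor JVM on the worker:
>>
>> jcmd <executor-pid> GC.run
>>
>> so the memory is clearly collectable - it just never gets collected or
>> returned on its own.)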
>>
>> We have Spark on Mesos configured to use fine-grained mode.
>> The following are the parameters we have set in our spark-defaults.conf
>> file.
>>
>>
>> spark.eventLog.enabled           true
>> spark.eventLog.dir               hdfs://frontend-system:8090/directory
>> spark.local.dir                  /data/cluster-local/SPARK_TMP
>>
>> spark.executor.memory            50g
>>
>> spark.externalBlockStore.baseDir /data/cluster-local/SPARK_TMP
>> spark.executor.extraJavaOptions  -XX:MaxTenuringThreshold=0
>> spark.executor.uri               hdfs://frontend-system:8090/spark/spark-1.6.0-bin-hadoop2.4.tgz
>> spark.mesos.coarse               false
>>
>> Please let me know if there are any questions about our configuration.
>> Any advice or experience the Mesos community can share pertaining to
>> issues with fine-grained mode would be greatly appreciated!
>>
>> I would also like to sincerely apologize for my earlier test message on
>> the mailing list.
>> It was ill-conceived; we are in a bit of a time crunch, I needed to get
>> this message posted, and I had forgotten that I first needed to reply to
>> the user-subscribe confirmation email to be listed, which resulted in the
>> "message not sent" bounces. I will not do that again.
>>
>> Thanks,
>>
>> Rahul Palamuttam
>>
>
>
