[jira] [Resolved] (SPARK-26082) Misnaming of spark.mesos.fetch(er)Cache.enable in MesosClusterScheduler

Dongjoon Hyun (JIRA) Thu, 07 Feb 2019 01:27:54 -0800


     [ 
https://issues.apache.org/jira/browse/SPARK-26082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Dongjoon Hyun resolved SPARK-26082.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: 3.0.0
                   2.4.1
                   2.3.4

This is resolved via https://github.com/apache/spark/pull/23734

> Misnaming of spark.mesos.fetch(er)Cache.enable in MesosClusterScheduler
> -----------------------------------------------------------------------
>
>                 Key: SPARK-26082
>                 URL: https://issues.apache.org/jira/browse/SPARK-26082
>             Project: Spark
>          Issue Type: Bug
>          Components: Mesos
>    Affects Versions: 2.1.0, 2.1.1, 2.1.2, 2.1.3, 2.2.0, 2.2.1, 2.2.2, 2.3.0, 
> 2.3.1, 2.3.2
>            Reporter: Martin Loncaric
>            Priority: Major
>             Fix For: 2.3.4, 2.4.1, 3.0.0
>
>
> Currently in 
> [docs|https://spark.apache.org/docs/latest/running-on-mesos.html]:
> {quote}spark.mesos.fetcherCache.enable / false / If set to `true`, all URIs 
> (example: `spark.executor.uri`, `spark.mesos.uris`) will be cached by the 
> Mesos Fetcher Cache
> {quote}
> Currently in {{MesosClusterScheduler.scala}} (which passes the parameter to 
> the driver):
> {{private val useFetchCache = 
> conf.getBoolean("spark.mesos.fetchCache.enable", false)}}
> Currently in {{MesosCoarseGrainedSchedulerBackend.scala}} (which passes the 
> Mesos caching parameter to executors):
> {{private val useFetcherCache = 
> conf.getBoolean("spark.mesos.fetcherCache.enable", false)}}
> This naming discrepancy dates back to version 2.0.0 
> ([jira|http://mail-archives.apache.org/mod_mbox/spark-issues/201606.mbox/%[email protected]%3E]).
> This means that when {{spark.mesos.fetcherCache.enable=true}} is specified, 
> the Mesos cache will be used only for executors, and not for drivers.
> IMPACT:
> Not caching these driver files (typically at least the Spark binaries, a 
> custom jar, and additional dependencies) adds considerable network traffic 
> and startup time when Spark applications are run frequently on a Mesos 
> cluster. In addition, with the cache off, files that are extracted, such as 
> {{spark-x.x.x-bin-*.tgz}}, are copied and left in the sandbox (rather than 
> extracted directly without an extra copy), which can considerably increase 
> disk usage. Users can currently work around this by specifying the 
> {{spark.mesos.fetchCache.enable}} option, but this should at least be 
> mentioned in the documentation.
> SUGGESTED FIX:
> Add {{spark.mesos.fetchCache.enable}} to the documentation for versions 2 - 
> 2.4, and update {{MesosClusterScheduler.scala}} to use 
> {{spark.mesos.fetcherCache.enable}} going forward (literally a one-line 
> change).
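
The suggested fix can be illustrated with a minimal sketch that reads the documented key and falls back to the misnamed one, so that configurations relying on the workaround keep working. This is an illustration under stated assumptions, not necessarily what the linked pull request does: the object {{FetcherCacheConf}} and the method {{useFetcherCache}} are hypothetical names, and a plain {{Map}} stands in for {{SparkConf}} so the example runs without Spark on the classpath.

```scala
// Hypothetical helper, not part of Spark: a plain Map stands in for SparkConf.
object FetcherCacheConf {
  // Documented key, already read on the executor side.
  val DocumentedKey = "spark.mesos.fetcherCache.enable"
  // Misnamed key that MesosClusterScheduler read for drivers.
  val LegacyKey = "spark.mesos.fetchCache.enable"

  // Prefer the documented key; fall back to the misnamed key only when the
  // documented one is absent. Defaults to false when neither key is set.
  def useFetcherCache(conf: Map[String, String]): Boolean =
    conf.get(DocumentedKey)
      .orElse(conf.get(LegacyKey))
      .exists(_.trim.toBoolean)
}
```

With this lookup, a user who sets only {{spark.mesos.fetcherCache.enable=true}} gets caching for drivers as well, and an explicit value for the documented key takes precedence over the legacy one.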



