[jira] [Commented] (SPARK-26082) Misnaming of spark.mesos.fetch(er)Cache.enable in MesosClusterScheduler
[ https://issues.apache.org/jira/browse/SPARK-26082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16762552#comment-16762552 ]

Dongjoon Hyun commented on SPARK-26082:
---

Although this is not a blocker, cc [~maropu].

> Misnaming of spark.mesos.fetch(er)Cache.enable in MesosClusterScheduler
> ---
>
> Key: SPARK-26082
> URL: https://issues.apache.org/jira/browse/SPARK-26082
> Project: Spark
> Issue Type: Bug
> Components: Mesos
> Affects Versions: 2.1.0, 2.1.1, 2.1.2, 2.1.3, 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.3.1, 2.3.2
> Reporter: Martin Loncaric
> Assignee: Martin Loncaric
> Priority: Major
> Fix For: 2.3.4, 2.4.1, 3.0.0
>
> Currently in [docs|https://spark.apache.org/docs/latest/running-on-mesos.html]:
> {quote}spark.mesos.fetcherCache.enable / false / If set to `true`, all URIs (example: `spark.executor.uri`, `spark.mesos.uris`) will be cached by the Mesos Fetcher Cache
> {quote}
> Currently in {{MesosClusterScheduler.scala}} (which passes the parameter to the driver):
> {{private val useFetchCache = conf.getBoolean("spark.mesos.fetchCache.enable", false)}}
> Currently in {{MesosCoarseGrainedSchedulerBackend.scala}} (which passes the Mesos caching parameter to executors):
> {{private val useFetcherCache = conf.getBoolean("spark.mesos.fetcherCache.enable", false)}}
> This naming discrepancy dates back to version 2.0.0 ([jira|http://mail-archives.apache.org/mod_mbox/spark-issues/201606.mbox/%3cjira.12979909.1466099309000.9921.1466101026...@atlassian.jira%3E]).
> This means that when {{spark.mesos.fetcherCache.enable=true}} is specified, the Mesos cache will be used only for executors, and not for drivers.
>
> IMPACT:
> Not caching these driver files (typically including at least the Spark binaries, a custom jar, and additional dependencies) adds considerable network traffic and startup time when Spark applications are run frequently on a Mesos cluster. In addition, with the cache off, archives like {{spark-x.x.x-bin-*.tgz}} are copied into the sandbox and left there (rather than extracted directly without an extra copy), which can considerably increase disk usage. Users can currently work around this by specifying the {{spark.mesos.fetchCache.enable}} option, but this should at least be specified in the documentation.
>
> SUGGESTED FIX:
> Add {{spark.mesos.fetchCache.enable}} to the documentation for versions 2 - 2.4, and update {{MesosClusterScheduler.scala}} to use {{spark.mesos.fetcherCache.enable}} going forward (literally a one-line change).

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
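
The mismatch and the workaround described in the issue can be sketched without a Spark dependency. This is only an illustration, assuming that each component reads its key exactly as quoted above; `getBoolean` here is a hypothetical stand-in for `SparkConf.getBoolean`, and the object and value names are invented for the example.

```scala
// Dependency-free model of the two lookups quoted in the issue.
object FetcherCacheKeys {
  // Hypothetical stand-in for SparkConf.getBoolean(key, default).
  def getBoolean(conf: Map[String, String], key: String, default: Boolean): Boolean =
    conf.get(key).map(_.toBoolean).getOrElse(default)

  // Key read by MesosClusterScheduler (driver side), per the issue.
  val driverKey = "spark.mesos.fetchCache.enable"
  // Documented key, read by the coarse-grained backend (executor side).
  val executorKey = "spark.mesos.fetcherCache.enable"

  // Only the documented key is set: executors use the cache, drivers do not.
  val documentedOnly: Map[String, String] = Map(executorKey -> "true")
  val driverCachesDocumentedOnly: Boolean = getBoolean(documentedOnly, driverKey, false)
  val executorCachesDocumentedOnly: Boolean = getBoolean(documentedOnly, executorKey, false)

  // Workaround from the issue: set the undocumented key as well.
  val workaround: Map[String, String] = documentedOnly + (driverKey -> "true")
  val driverCachesWorkaround: Boolean = getBoolean(workaround, driverKey, false)
  val executorCachesWorkaround: Boolean = getBoolean(workaround, executorKey, false)
}
```

On affected versions, the equivalent real-world workaround would be to set both keys to `true` in the application's configuration, as the issue suggests.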
[jira] [Commented] (SPARK-26082) Misnaming of spark.mesos.fetch(er)Cache.enable in MesosClusterScheduler
[ https://issues.apache.org/jira/browse/SPARK-26082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16762511#comment-16762511 ]

Dongjoon Hyun commented on SPARK-26082:
---

Thank you, [~mwlon]. You have been added to the Apache Spark contributor group.
[jira] [Commented] (SPARK-26082) Misnaming of spark.mesos.fetch(er)Cache.enable in MesosClusterScheduler
[ https://issues.apache.org/jira/browse/SPARK-26082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16762499#comment-16762499 ]

Dongjoon Hyun commented on SPARK-26082:
---

Since this bug was introduced by SPARK-15994, which was added in Spark 2.1.0, I removed 2.0.x from the affected versions.
[jira] [Commented] (SPARK-26082) Misnaming of spark.mesos.fetch(er)Cache.enable in MesosClusterScheduler
[ https://issues.apache.org/jira/browse/SPARK-26082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16762496#comment-16762496 ]

Apache Spark commented on SPARK-26082:
---

User 'mwlon' has created a pull request for this issue:
https://github.com/apache/spark/pull/23734