[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-67507324 Looks good. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3409 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-67365938 @tgravescs @andrewor14 Can you help to commit the change to upstream, if you don't have any other concerns? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-67220945 retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-67221307 [Test build #24504 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24504/consoleFull) for PR 3409 at commit [`daec3d0`](https://github.com/apache/spark/commit/daec3d01c937d80961b0f9eec4e0ad96539bd421). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-67234008 [Test build #24504 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24504/consoleFull) for PR 3409 at commit [`daec3d0`](https://github.com/apache/spark/commit/daec3d01c937d80961b0f9eec4e0ad96539bd421). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-67035255 retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-67035890 [Test build #24463 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24463/consoleFull) for PR 3409 at commit [`daec3d0`](https://github.com/apache/spark/commit/daec3d01c937d80961b0f9eec4e0ad96539bd421). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-67046289 [Test build #24463 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24463/consoleFull) for PR 3409 at commit [`daec3d0`](https://github.com/apache/spark/commit/daec3d01c937d80961b0f9eec4e0ad96539bd421). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-67046304 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24463/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/3409#discussion_r21750834 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala --- @@ -358,6 +358,21 @@ private[spark] trait ClientBase extends Logging { if (libraryPaths.nonEmpty) { prefixEnv = Some(Utils.libraryPathEnvPrefix(libraryPaths)) } +} else { + // Validate and include yarn am specific java options in yarn-client mode. + val amOptsKey = spark.yarn.am.extraJavaOptions + val amOpts = sparkConf.getOption(amOptsKey) + amOpts.foreach { opts = +if (opts.contains(-Dspark)) { + val msg = s$amOptsKey is not allowed to set Spark options (was '$opts'). + throw new SparkException(msg) +} +if (opts.contains(-Xmx) || opts.contains(-Xms)) { + val msg = s$amOptsKey is not allowed to alter memory settings (was '$opts'). + throw new SparkException(msg) +} +javaOpts += opts --- End diff -- @andrewor14 perhaps we should file a jira to fix up the other places extraJavaOptions are used but don't do it this way. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66813628 LGTM any more comments @tgravescs? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/3409#discussion_r21763050 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala --- @@ -358,6 +358,21 @@ private[spark] trait ClientBase extends Logging { if (libraryPaths.nonEmpty) { prefixEnv = Some(Utils.libraryPathEnvPrefix(libraryPaths)) } +} else { + // Validate and include yarn am specific java options in yarn-client mode. + val amOptsKey = spark.yarn.am.extraJavaOptions + val amOpts = sparkConf.getOption(amOptsKey) + amOpts.foreach { opts = +if (opts.contains(-Dspark)) { + val msg = s$amOptsKey is not allowed to set Spark options (was '$opts'). + throw new SparkException(msg) +} +if (opts.contains(-Xmx) || opts.contains(-Xms)) { + val msg = s$amOptsKey is not allowed to alter memory settings (was '$opts'). + throw new SparkException(msg) +} +javaOpts += opts --- End diff -- actually I think everywhere else `extraJavaOptions` is already treated properly with `splitCommandString`. What I'm saying is that here we introduce a similar option, and we should handle the same way. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/3409#discussion_r21763504 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala --- @@ -352,12 +352,28 @@ private[spark] trait ClientBase extends Logging { if (isLaunchingDriver) { sparkConf.getOption(spark.driver.extraJavaOptions) .orElse(sys.env.get(SPARK_JAVA_OPTS)) +.map(Utils.splitCommandString).getOrElse(Seq.empty) .foreach(opts = javaOpts += opts) val libraryPaths = Seq(sys.props.get(spark.driver.extraLibraryPath), sys.props.get(spark.driver.libraryPath)).flatten if (libraryPaths.nonEmpty) { prefixEnv = Some(Utils.libraryPathEnvPrefix(libraryPaths)) --- End diff -- by the way we should log a warning or something if the we're in this case and we detect `spark.yarn.am.extraJavaOptions`, since that won't actually take effect. @tgravescs do you think it's better to log the warning or fail fast by throwing an exception here? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/3409#discussion_r21763663 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala --- @@ -352,12 +352,28 @@ private[spark] trait ClientBase extends Logging { if (isLaunchingDriver) { sparkConf.getOption(spark.driver.extraJavaOptions) .orElse(sys.env.get(SPARK_JAVA_OPTS)) +.map(Utils.splitCommandString).getOrElse(Seq.empty) .foreach(opts = javaOpts += opts) val libraryPaths = Seq(sys.props.get(spark.driver.extraLibraryPath), sys.props.get(spark.driver.libraryPath)).flatten if (libraryPaths.nonEmpty) { prefixEnv = Some(Utils.libraryPathEnvPrefix(libraryPaths)) --- End diff -- I'm not strictly against the warning, but that would print a log for every app whenever someone tries to configure things by default for all use cases by setting both options in spark-defaults.conf... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66828213 LGTM. I can go either way on the warning. It might be useful just to print a single line warning stating its not going to be used. 1 log line on executor startup doesn't seem to bad if they do include both in the spark-defaults.conf --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66855013 Warning is OK to me, but I am against throwing exception. Will update it with logWarning(spark.yarn.am.extraJavaOptions will not take effect in driver mode) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66855785 [Test build #24421 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24421/consoleFull) for PR 3409 at commit [`08f44a7`](https://github.com/apache/spark/commit/08f44a7a6c8086071fc9e136bf3ffd2b02a2dd27). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66859390 [Test build #24421 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24421/consoleFull) for PR 3409 at commit [`08f44a7`](https://github.com/apache/spark/commit/08f44a7a6c8086071fc9e136bf3ffd2b02a2dd27). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class Analyzer(catalog: Catalog, registry: FunctionRegistry, caseSensitive: Boolean)` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66859399 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24421/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66860869 retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66860916 [Test build #24427 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24427/consoleFull) for PR 3409 at commit [`08f44a7`](https://github.com/apache/spark/commit/08f44a7a6c8086071fc9e136bf3ffd2b02a2dd27). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66863203 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24427/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66863200 [Test build #24427 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24427/consoleFull) for PR 3409 at commit [`08f44a7`](https://github.com/apache/spark/commit/08f44a7a6c8086071fc9e136bf3ffd2b02a2dd27). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class Analyzer(catalog: Catalog, registry: FunctionRegistry, caseSensitive: Boolean)` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/3409#discussion_r21707097 --- Diff: docs/running-on-yarn.md --- @@ -139,6 +139,15 @@ Most of the configs are the same for Spark on YARN as for other deployment modes The maximum number of threads to use in the application master for launching executor containers. /td /tr +tr + tdcodespark.yarn.am.extraJavaOptions/code/td + td(none)/td + td +A string of extra JVM options to pass to the Application Master in yarn-client mode. For instance, specific +system properties. Note that it is complementary to spark.driver.extraJavaOptions, which are only passed to +the Application Master in yarn-cluster mode. --- End diff -- Let's avoid using `yarn-client` and `yarn-cluster` here. I think in general it's a bad idea to cram the deploy mode into the master string (yes we do it for `--master` but we should really deprecate that). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/3409#discussion_r21707186 --- Diff: docs/running-on-yarn.md --- @@ -139,6 +139,15 @@ Most of the configs are the same for Spark on YARN as for other deployment modes The maximum number of threads to use in the application master for launching executor containers. /td /tr +tr + tdcodespark.yarn.am.extraJavaOptions/code/td + td(none)/td + td +A string of extra JVM options to pass to the Application Master in yarn-client mode. For instance, specific +system properties. Note that it is complementary to spark.driver.extraJavaOptions, which are only passed to +the Application Master in yarn-cluster mode. --- End diff -- Also, I think the latter sentences are confusing. What does it mean to be complementary to another config? We should just rephrase it as follows: ``` A string of extra JVM options to pass to the Yarn ApplicationMaster in client mode. In cluster mode, use spark.driver.extraJavaOptions instead. ``` No need to mention specific system properties --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/3409#discussion_r21707289 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala --- @@ -358,6 +358,21 @@ private[spark] trait ClientBase extends Logging { if (libraryPaths.nonEmpty) { prefixEnv = Some(Utils.libraryPathEnvPrefix(libraryPaths)) } +} else { + // Validate and include yarn am specific java options in yarn-client mode. + val amOptsKey = spark.yarn.am.extraJavaOptions + val amOpts = sparkConf.getOption(amOptsKey) + amOpts.foreach { opts = +if (opts.contains(-Dspark)) { + val msg = s$amOptsKey is not allowed to set Spark options (was '$opts'). + throw new SparkException(msg) +} +if (opts.contains(-Xmx) || opts.contains(-Xms)) { + val msg = s$amOptsKey is not allowed to alter memory settings (was '$opts'). + throw new SparkException(msg) +} +javaOpts += opts --- End diff -- what was the verdict on not using `Utils.splitCommandString` again? Why can't we do that here? I think we should because we should keep the format of what `spark.*.extraJavaOptions` accepts consistent --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66697105 @andrewor14 I thought you want to solve the comma-delimited string problem, and the splitCommandString is not for this purpose. I check the code again, and it does help to keep the options input consistent. Have made the change accordingly, including the fix for spark.driver.extraJavaOptions --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66697750 [Test build #24381 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24381/consoleFull) for PR 3409 at commit [`5a505d3`](https://github.com/apache/spark/commit/5a505d37f2d56efa830432ad800fb8ec71709230). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66712216 **[Test build #24381 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24381/consoleFull)** for PR 3409 at commit [`5a505d3`](https://github.com/apache/spark/commit/5a505d37f2d56efa830432ad800fb8ec71709230) after a configured wait of `120m`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66712224 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24381/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66716229 retest it please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66721135 test it please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66721562 retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66722004 [Test build #24386 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24386/consoleFull) for PR 3409 at commit [`5a505d3`](https://github.com/apache/spark/commit/5a505d37f2d56efa830432ad800fb8ec71709230). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66727442 [Test build #24386 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24386/consoleFull) for PR 3409 at commit [`5a505d3`](https://github.com/apache/spark/commit/5a505d37f2d56efa830432ad800fb8ec71709230). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66727444 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24386/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66464201 I'm in favor of spark.yarn.am.* and then documenting if it only applies to client mode also. @andrewor14 @sryza votes? Lets try to resolve this today. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66501671 Not my preferred option, but sure. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66509512 Yes I am also in favor of `spark.yarn.am.*` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66533517 spark.yarn.am.* sounds fine to me --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66554918 [Test build #24341 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24341/consoleFull) for PR 3409 at commit [`4ed43ad`](https://github.com/apache/spark/commit/4ed43ade4cbd58a237712a1ce8b5e059b505c1b1). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66561065 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24341/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66561062 [Test build #24341 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24341/consoleFull) for PR 3409 at commit [`4ed43ad`](https://github.com/apache/spark/commit/4ed43ade4cbd58a237712a1ce8b5e059b505c1b1). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66339357 Hey regarding the naming, I would actually prefer `spark.yarn.am.*` over `spark.yarn.clientmode.am.*`. Although I agree that it's clearer to have clientmode in there, it's incredibly verbose and inconsistent with the other similar configs that we have (i.e. `spark.driver.extraJavaOptions`, `spark.yarn.executor.memoryOverhead`). If we want to discourage the user from setting these in cluster mode, I think the better approach is to log a warning or even throw an exception if this happens, instead of cramming in `clientmode` in a config name that we can't ever change again. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66339412 By the way you will need to rebase this to master because we just removed support for yarn alpha. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66340028 How about a compromise: `spark.yarn.clientam.*`? I don't like just am because, while terse, it's ambiguous. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/3409#discussion_r21554124 --- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala --- @@ -360,6 +360,10 @@ private[spark] trait ClientBase extends Logging { } } +// include yarn am specific java options +sparkConf.getOption(spark.yarn.am.extraJavaOptions) + .foreach(opts = javaOpts += opts) --- End diff -- Sorry @zhzhan I don't understand. How we specify extra java options in the AM should be the same as how we specify extra java options in the executors or the driver. If it applies to the executor java options there, why doesn't it apply here? Also, why do you assume that the java options should be comma separated? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66341713 `clientam` is fine, though maybe we should camel case it `clientAm`. Though I still prefer keeping it terse as `spark.yarn.am.*` and just throw an exception or log a warning if the user sets this in cluster mode. Right now the name of the config itself is kind of tied to the details of how we implement the Spark on Yarn integration, and we won't be able to change this easily once it becomes public. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66342475 By the way, I just dug into the codebase and found that we already have the following: ``` spark.yarn.appMasterEnv spark.yarn.applicationMaster.waitTries ``` Both of which are documented. Perhaps we should collectively decide on the appropriate naming schemes before introducing new ones, and deprecate all the old ones in favor of the new ones. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66343795 the spark.yarn.applicationMaster.waitTries is being changed in a pull request to change it from tries to time and there is another one to add memory overhead param. I was trying to make all 3 of those use the same convention. So lets decide on it here and we can enforce in those. The only one not covered is the spark.yarn.appMasterEnv --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66362159 I prefer spark.yarn.am.* Currently because the yarn-client and yarn-cluster mode, it is very difficult to remove all the confusions. By the way, we can log warning, but not throw exception if in cluster mode, because user may just want to reuse the same configuration in both client and cluster mode. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66404905 Use spark.yarn.am.* and describe the scope of configuration items in docs is better. In cluster mode we should ignore the unused configs, probably also add a warning log. Throwing an exception is a bad idea. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66134213 Its a matter of whats more obvious to the user who doesn't necessarily read the documentation. Adding in clientmode hopefully helps the user realize this config only does something in yarn-client mode. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/3409#discussion_r21478911 --- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala --- @@ -358,6 +358,21 @@ private[spark] trait ClientBase extends Logging { if (libraryPaths.nonEmpty) { prefixEnv = Some(Utils.libraryPathEnvPrefix(libraryPaths)) } +} else { + // Validate and include yarn am specific java options in yarn-client mode. + val amOptsKey = spark.yarn.clientmode.am.extraJavaOptions + val amOpts = sparkConf.getOption(amOptsKey) + amOpts.map { javaOpts = --- End diff -- I'd just simplify this as: sparkConf.getOption(amOptsKey).foreach { opts = // validate // javaOpts += opts } Hint: `map()` is more expensive than `foreach()` in general (because it returns something, unlike foreach). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66174011 LGTM aside from minor style issue. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66202544 [Test build #24232 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24232/consoleFull) for PR 3409 at commit [`e3f9abe`](https://github.com/apache/spark/commit/e3f9abeaa82018835cd9a7055adba0dabc451a24). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66212407 [Test build #24232 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24232/consoleFull) for PR 3409 at commit [`e3f9abe`](https://github.com/apache/spark/commit/e3f9abeaa82018835cd9a7055adba0dabc451a24). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-66212417 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24232/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65939827 If we use `spark.yarn.clientmode.am.extraJavaOptions`, then should tell users Application Master related configurations contains `clientmode` is only active on yarn-client mode. AM will use corresponding items on yarn-cluster mode. If we use `spark.yarn.am.extraJavaOptions` instead, we should tell users same thing. Why don't we keep the name shorter? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65812731 I'm personally not a fan of executorLauncher. Cluster mode also launches executors and users shouldn't really have to know executorLauncher = client mode. If you want to change the name then we should be explicit this is for yarn client mode only then (spark.yarn.clientmode.am.extraJavaOptions or similar). Otherwise I don't see how you get around some sort of confusion. You can make it apply to both but then you have to define presendence between it and driver.extraJavOptions and if users have both set and don't realize the presendence they get unexpected behavior. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65826773 @tgravescs that makes sense. clientmode.am sounds good to me. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65858141 [Test build #24194 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24194/consoleFull) for PR 3409 at commit [`dea1692`](https://github.com/apache/spark/commit/dea169261811c5eef436a43dd6f1a3df68fbde3e). * This patch **does not merge cleanly**. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65862808 [Test build #24196 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24196/consoleFull) for PR 3409 at commit [`8963552`](https://github.com/apache/spark/commit/8963552ae6ffe14c34e45ff5be562de10154e05b). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65870516 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24194/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65870513 [Test build #24194 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24194/consoleFull) for PR 3409 at commit [`dea1692`](https://github.com/apache/spark/commit/dea169261811c5eef436a43dd6f1a3df68fbde3e). * This patch **passes all tests**. * This patch **does not merge cleanly**. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65871549 [Test build #24196 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24196/consoleFull) for PR 3409 at commit [`8963552`](https://github.com/apache/spark/commit/8963552ae6ffe14c34e45ff5be562de10154e05b). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65871555 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24196/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/3409#discussion_r21309157 --- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala --- @@ -360,6 +360,10 @@ private[spark] trait ClientBase extends Logging { } } +// include yarn am specific java options +sparkConf.getOption(spark.yarn.am.extraJavaOptions) + .foreach(opts = javaOpts += opts) --- End diff -- so currently this affects both cluster and client mode since driver.extraJavaOptions applies in cluster mode I think we should make this only apply in client mode. Otherwise we should define precendence between it and the driver.extraJavaOptions in driver mode or potentially error if they aren't set. It seems most straight forward to only have it apply in client mode but I'm open to thoughts - @vanzin @andrewor14 . Also we should run this through the check like in SparkConf for spark.executor.extraJavaOptions to make sure no spark configs or -Xmx is set. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65651760 Agree with @tgravescs, adding spark.yarn.am.extraClassPath and spark.yarn.am.extraLibraryPath together would be better. @zhzhan You can also check https://issues.apache.org/jira/browse/SPARK-4181. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65730190 [Test build #24154 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24154/consoleFull) for PR 3409 at commit [`092a25f`](https://github.com/apache/spark/commit/092a25ffb37b0f0edf773fbe0c088d4d0765c1b9). * This patch **does not merge cleanly**. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65734485 [Test build #24156 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24156/consoleFull) for PR 3409 at commit [`90d5dff`](https://github.com/apache/spark/commit/90d5dff3f5e463b3d507afd339f6e15b9476cbda). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65739136 [Test build #24154 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24154/consoleFull) for PR 3409 at commit [`092a25f`](https://github.com/apache/spark/commit/092a25ffb37b0f0edf773fbe0c088d4d0765c1b9). * This patch **passes all tests**. * This patch **does not merge cleanly**. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65739140 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24154/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65740495 [Test build #24156 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24156/consoleFull) for PR 3409 at commit [`90d5dff`](https://github.com/apache/spark/commit/90d5dff3f5e463b3d507afd339f6e15b9476cbda). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65740504 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24156/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65744107 If this only applies in client mode can we call is spark.yarn.executorLauncher.extraJavaOptions? am would make me think it applies in cluster mode as well. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65241674 memory should not be specified through the java options (I filed a separate jira to support that - I would expect to have something like spark.yarn.am.memory for that). Like I mentioned in one of my original comments we should do a check like SparkConf has. I can see valid use cases for wanting to set the java options or classpath for an application master differently then the driver and I don't see any harm in allowing users to do that. Most users wouldn't have to set anything for these. The reason I brought up doing all 3 at once is I don't want to do one and then 2 weeks later someone ask for the other. I would rather have it consistent across executor/driver/am. But since there has been so much discussion back and forth perhaps just the java options for now and see if there is a need for the others. I will file a separate jira to look at making the driver.extra* apply similarly across the board also. I find it inconsistent that the driver.extraClassPath applies to AM but the other don't. @vanzin I see what you were getting at now. You were just wondering if we should do something specific to hadoop configs. The thing is you can do a lot of stuff in the configs and its deployment specific. Did you have something in mind? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65117625 @tgravescs @andrewor14 This is yarn specific change, and the patch is quite simple, but important for hadoop compatibility. Please take a look and let me know if you have any concerns. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65123496 One thing you mention in the jira is classpath and how it reads it from mapred-site.xml. Don't you want to add a equivalent of spark.executor.extraClassPath for the AM then too? classpath should be done through config like that and not straight though the javaOptions. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65124155 hmm, actually looking at the code some more it appears that the driver.extraClassPath applies to both client and cluster mode. We should really be consistent across these configs. (the third one is spark.driver.extraLibraryPath). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/3409#discussion_r21114689 --- Diff: docs/running-on-yarn.md --- @@ -132,6 +132,15 @@ Most of the configs are the same for Spark on YARN as for other deployment modes The maximum number of threads to use in the application master for launching executor containers. /td /tr +tr + tdcodespark.yarn.am.extraJavaOptions/code/td + td(none)/td + td +The java options passed to the Application Master process launched on YARN in both yarn-client +and yarn-cluster mdoe. It is different from spark.driver.extraJavaOptions, as the driver options --- End diff -- spelling of mode. It would also be nice to have the initial text similar to the text for spark.executor.extraJavaOptions. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65131243 So I think we should do all 3 of these configs: spark.yarn.am.extraJavaOptions, spark.yarn.am.extraClassPath, and spark.yarn.am.extraLibraryPath and make then behave similarly. We should change the driver.extraClassPath to only apply in cluster mode so it matches the behavior of the others. I also think it might be easiest to only have these apply in client mode. Otherwise we have a conflict between driver.extra* and am.extra* and we would either error or define the precendence each one takes. Although I would be ok with it applying to both client and cluster mode if someone has specific concerns as long as we define the behavior. There is also code in SparkConf that validates the user isn't trying to set spark configs and memory size through the java options, we should probably run this through similar validations. I know that would expand this jiras scope but I think it would be best to be done together. thoughts or concerns? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65132612 I'm still a little confused about what exactly @zhzhan is trying to achieve, but pending my reading of the bug / code again, what would be the point of exposing things like `extraClassPath` and `extraLibraryPath` to the client-mode AM? That AM is not running user code, so it shouldn't need any extra entries in its classpath or native lib path, right? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65137532 @vanzin This specific patch is to pass extra java options to AM. The main reason is that AM parses mapred-site.xml and need the system property to correctly do it. Please take a look at the pull description on why this is needed. @tgravescs I agree with @vanzin, and don't see the scenario where AM need extraClassPah and extraLibraryPath. Can you please leverage it a little more? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65138030 @zhzhan I guess my main confusion is why do you need to add user-visible options for that instead of just automatically propagating these properties somehow? Users do not need to set these properties today, so this would be a usability regression if these properties are actually needed on the AM side. Also, it's always weird to talk about the AM since there are two different modes that behave differently (client mode, where the AM is mostly doing not much, and cluster, where the usual `spark.driver.*` options apply to the AM). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65140498 Oh I think I misread the jira. I thought you were wanting to modify the classpath but you are really just wanting to specify jvm options that are used in things like mapred-site.xml for the classpath. I do think its weird that we are not consistent across the use of driver.extra* variables though. There are definitely cases where the client AM doesn't need the same settings as the driver. For instance memory size. The AM doesn't need to be very large at all but the driver may need to be. ( I have different jira for that case). Similarly I could see java options being different between these and really the classpath also. If you are running in yarn client mode, the driver may have a bunch of other things in its classpath that don't apply to the application master at all. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65140650 @vanzin As described in the PR, if the mapred-site.xml use the system property, it will break spark. It is not usability regression, as it does not impact any current behavior. How to use it depends on different hadoop distribution, and I don't think it can be automatically propagated, as it is related to how different hadoop distributions wrap their specific mapred-site.xml. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65142485 In that case wouldn't it break for all the containers, not just for the AM? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65143311 @vanzin Only AM parse the mapred-site.xml/yarn-site.xml, etc. In the meantime, even if executor needs extra java options, there have been a configuration (spark.executor.extraJavaOptions) to specify it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65143862 @zhzhan which place in ApplicationMaster are you saying parses mapred-site.xml? I thought you were referring to the ClientBase where it calls populateHadoopClasspath but it doesn't sound like it. Perhaps its the use of one of our other YARN calls? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65144413 @tgravescs those files will be parsed when you create a `YarnConfiguration` object. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/3409#discussion_r21124376 --- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala --- @@ -360,6 +360,10 @@ private[spark] trait ClientBase extends Logging { } } +// include yarn am specific java options +sparkConf.getOption(spark.yarn.am.extraJavaOptions) + .foreach(opts = javaOpts += opts) --- End diff -- you will need to first break this up into a list of java options. Right now this adds the whole comma-delimited string as one giant java option. See how this is done in `SparkDeploySchedulerBackend` using `Utils.splitCommandString` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65144982 add to whitelist --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65144899 @zhzhan sorry I should have expanded a little bit on my questions. What I'm trying to get at is two things: * if mapred-site.xml supports referencing system properties, the same most probably applies to hdfs-site.xml and core-site.xml, right? And those are read by the executors. * it feels weird to add an AM-specific option that would be used in two different contexts. In client mode, you'd add these extra arguments to the AM. In cluster mode, you'd add these plus `spark.driver.extraJavaOptions`. And if the bullet above is correct, you'd need to duplicate the same properties in the executor-related config option. So I wonder if there shouldn't be a single option that could apply to all three? Also, also assuming the first bullet above is correct, then this ceases to be exclusive to Yarn, and both standalone and mesos could benefit from the same feature. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65146117 @andrewor14 Thanks for the comments. I noticed that spark.driver.extraJavaOptions also have the same problem then, I will fix both of them. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65146066 [Test build #24005 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24005/consoleFull) for PR 3409 at commit [`bc5a9ae`](https://github.com/apache/spark/commit/bc5a9ae330ce1092899adf711dc214651468ff43). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65146732 I understand they are parsed when you create YarnConfiguration, I'm wondering what the spark applicationMaster is using from those files that is causing this particular issue? mapred-site.xml keeps being mentioned and we should be minimizing and eliminating the use of mapred-site.xml whereever possible. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/3409#discussion_r21125348 --- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala --- @@ -360,6 +360,10 @@ private[spark] trait ClientBase extends Logging { } } +// include yarn am specific java options +sparkConf.getOption(spark.yarn.am.extraJavaOptions) + .foreach(opts = javaOpts += opts) --- End diff -- Just FYI, this works because of how this is used in Yarn. Yarn doesn't just do a `System.exec()`, it creates a shell script that then executes this command. So this string gets added to the command as is, so breaking it up just means it will be rebuilt the same way again in the shell script later on. That being said, yeah, it's kinda fishy to depend on Yarn's internal behavior like that, so it's probably safer to change this (and the other place Zhan identified). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65149255 @vanzin I'm not really following where you are going with the questions? Spark started out having many configs be the same across everything but there are many cases they could and should be different so split them apart again. Having things separate allow for more flexibility. Yes you may have to duplicate a setting but it shouldn't be a big deal to put it in 2 places in the configs. If other files (hdfs-site, core-site) support them they can currently be set via executor.extraJavaOptions and driver.extraJavaOptions. The only thing it can't be set on is the ApplicationMaster in client mode. I agree that its a bit weird that the user has to do 2 different things between client and cluster mode but the fact is they run differently. Until we can fix that there are things that might be different. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org