[GitHub] spark pull request #22252: [SPARK-25261][MINOR][DOC] correct the default uni...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/22252#discussion_r213783925

--- Diff: docs/configuration.md ---
```
@@ -152,7 +152,7 @@ of the most common options to set are:
 spark.driver.memory
 1g
-Amount of memory to use for the driver process, i.e. where SparkContext is initialized, in MiB
+Amount of memory to use for the driver process, i.e. where SparkContext is initialized, in bytes
```
--- End diff --

I took a look at the history of this code, and the only constant here is how all places have been really confused about the units over time. It seems to me that things shifted back and forth a bit, and that the behavior now actually differs depending on the cluster manager: YARN and k8s default to MiB, while others default to bytes.

I think it's just safer not to say what the default unit is, and to encourage people to identify it explicitly; I think that's what's being done for a lot of new time- and size-based configs. The "no unit" approach was always meant to be only for backwards compatibility anyway.

---

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
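Following that advice, a spark-defaults.conf sketch with explicit units (the values here are illustrative, not recommendations):

```
# spark-defaults.conf -- always give memory sizes an explicit unit,
# so the value means the same thing on every cluster manager.
spark.driver.memory    2g
spark.executor.memory  4g
```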
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/22252#discussion_r213774632

--- Diff: docs/configuration.md ---
```
@@ -152,7 +152,7 @@ of the most common options to set are:
 spark.driver.memory
 1g
-Amount of memory to use for the driver process, i.e. where SparkContext is initialized, in MiB
+Amount of memory to use for the driver process, i.e. where SparkContext is initialized, in bytes
```
--- End diff --

Oh, hm.

```
$ ./bin/spark-shell --master local --driver-memory=1024
Error occurred during initialization of VM
Too small initial heap
$ ./bin/spark-shell --master local --conf spark.driver.memory=1024
Error occurred during initialization of VM
Too small initial heap
```

Although the option referenced above is used, it's used in YARN and K8S, not in this code path. Paging @vanzin for an opinion? I'd prefer to change docs rather than behavior, but it looks like we might have divergent behavior here.
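The error above is consistent with JVM behavior: on this code path the raw value ends up as the driver JVM's heap size, and a JVM memory size without a suffix is read as plain bytes, so `1024` means a 1024-byte heap. A minimal Java sketch of that suffix convention (hypothetical names, not Spark's or the JVM's actual code):

```java
// Sketch of JVM-style memory-string parsing: a bare number means BYTES.
// This is why --driver-memory=1024 is rejected with "Too small initial heap".
public class JvmStyleMemory {
    public static long toBytes(String s) {
        String v = s.trim().toLowerCase();
        char last = v.charAt(v.length() - 1);
        String digits = v.substring(0, v.length() - 1);
        switch (last) {
            case 'k': return Long.parseLong(digits) << 10;  // KiB
            case 'm': return Long.parseLong(digits) << 20;  // MiB
            case 'g': return Long.parseLong(digits) << 30;  // GiB
            default:  return Long.parseLong(v);             // no suffix: bytes
        }
    }
}
```

Under this convention `toBytes("1024")` is 1024 bytes, while `toBytes("1g")` is 1073741824 bytes.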
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/22252#discussion_r213708177

--- Diff: docs/configuration.md ---
```
@@ -152,7 +152,7 @@ of the most common options to set are:
 spark.driver.memory
 1g
-Amount of memory to use for the driver process, i.e. where SparkContext is initialized, in MiB
+Amount of memory to use for the driver process, i.e. where SparkContext is initialized, in bytes
```
--- End diff --

I think I got the point you want to report, @ivoson. IIUC, this is a bug in the code, not in the doc: we should also make `spark.driver.memory=1024` mean MiB. Since that would change the original behavior, maybe we can announce it in the migration guide? cc @srowen @HyukjinKwon.
Github user ivoson commented on a diff in the pull request: https://github.com/apache/spark/pull/22252#discussion_r213546414

--- Diff: docs/configuration.md ---
```
@@ -152,7 +152,7 @@ of the most common options to set are:
 spark.driver.memory
 1g
-Amount of memory to use for the driver process, i.e. where SparkContext is initialized, in MiB
+Amount of memory to use for the driver process, i.e. where SparkContext is initialized, in bytes
```
--- End diff --

If we start a job with conf spark.executor.memory=1024 or spark.driver.memory=1024, it currently means 1024 bytes. So I think we should update the doc to avoid confusion for Spark users.
Github user ivoson commented on a diff in the pull request: https://github.com/apache/spark/pull/22252#discussion_r213544524

--- Diff: docs/configuration.md ---
```
@@ -152,7 +152,7 @@ of the most common options to set are:
 spark.driver.memory
 1g
-Amount of memory to use for the driver process, i.e. where SparkContext is initialized, in MiB
+Amount of memory to use for the driver process, i.e. where SparkContext is initialized, in bytes
```
--- End diff --

@xuanyuanking @HyukjinKwon @srowen thanks for your reply. I've also noticed the code above, but `DRIVER_MEMORY` and `EXECUTOR_MEMORY` in config/package.scala are never used; maybe they are there for future use, I think. The code below shows how the conf is used for now, please take a look:

- https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkContext.scala#L465
- https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/Utils.scala#L1130
- https://github.com/ivoson/spark/blob/master/launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java#L265
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/22252#discussion_r213524321

--- Diff: docs/configuration.md ---
```
@@ -152,7 +152,7 @@ of the most common options to set are:
 spark.driver.memory
 1g
-Amount of memory to use for the driver process, i.e. where SparkContext is initialized, in MiB
+Amount of memory to use for the driver process, i.e. where SparkContext is initialized, in bytes
```
--- End diff --

MiB is correct in both cases, according to the code. I believe this PR should be closed.
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22252#discussion_r213522085

--- Diff: docs/configuration.md ---
```
@@ -152,7 +152,7 @@ of the most common options to set are:
 spark.driver.memory
 1g
-Amount of memory to use for the driver process, i.e. where SparkContext is initialized, in MiB
+Amount of memory to use for the driver process, i.e. where SparkContext is initialized, in bytes
```
--- End diff --

Why is MiB wrong, btw?
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/22252#discussion_r213343952

--- Diff: docs/configuration.md ---
```
@@ -152,7 +152,7 @@ of the most common options to set are:
 spark.driver.memory
 1g
-Amount of memory to use for the driver process, i.e. where SparkContext is initialized, in MiB
+Amount of memory to use for the driver process, i.e. where SparkContext is initialized, in bytes
```
--- End diff --

Check the config code here:
https://github.com/apache/spark/blob/99d2e4e00711cffbfaee8cb3da9b6b3feab8ff18/core/src/main/scala/org/apache/spark/internal/config/package.scala#L40-L43
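For contrast, the config entries at that link are declared with a MiB default unit, so a value read through that path treats a suffix-less `1024` as 1024 MiB rather than 1024 bytes. A hypothetical Java sketch of that convention (illustrative names, not Spark's actual config code):

```java
// Sketch of a config entry whose default unit is MiB: a bare number is
// interpreted as MiB, so "1024" and "1g" denote the same amount of memory.
public class MiBDefaultMemory {
    public static long toMiB(String s) {
        String v = s.trim().toLowerCase();
        char last = v.charAt(v.length() - 1);
        String digits = v.substring(0, v.length() - 1);
        switch (last) {
            case 'g': return Long.parseLong(digits) * 1024L; // GiB -> MiB
            case 'm': return Long.parseLong(digits);         // already MiB
            default:  return Long.parseLong(v);              // no suffix: MiB
        }
    }
}
```

The divergence discussed in this thread is exactly the gap between this reading and the bytes-by-default reading on the other code path.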
GitHub user ivoson opened a pull request: https://github.com/apache/spark/pull/22252

[SPARK-25261][MINOR][DOC] Correct the default unit for spark.executor|driver.memory as described in configuration.md

## What changes were proposed in this pull request?

As described in [SPARK-25261](https://issues.apache.org/jira/projects/SPARK/issues/SPARK-25261), the unit of spark.executor.memory and spark.driver.memory is bytes if no unit is specified, while in https://spark.apache.org/docs/latest/configuration.html#application-properties they are described as MiB, which may lead to some misunderstandings.

## How was this patch tested?

N/A

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ivoson/spark branch-correct-configuration

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22252.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22252

commit 8f360b97e9f0afff7e6627df15cc17f103a5c2b3
Author: Huang Tengfei
Date: 2018-08-28T07:28:06Z

    update configuration.md, correct the default unit of spark.executor.memory and spark.driver.memory.
    Change-Id: I2935b93559aa3e016bbab7a083c8c24bbdc6f685

commit 3eb3b66a52435366f258cbb40d01d8f3b0141bff
Author: huangtengfei02
Date: 2018-08-28T08:29:25Z

    fix style issue
    Change-Id: I73e82b8bd07064d874bf76df52cb661097532884