[jira] [Commented] (SPARK-27289) spark-submit explicit configuration does not take effect but Spark UI shows it's effective
[ https://issues.apache.org/jira/browse/SPARK-27289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814094#comment-16814094 ]

KaiXu commented on SPARK-27289:
-------------------------------

I have verified that the intermediate data is written to the spark.local.dir configured in spark-defaults.conf, while the value set through --conf is the one shown on the Web UI. That means the value passed through --conf does not override the value in spark-defaults.conf. What's more, the value shown on the Web UI is not the one actually in effect (the UI shows the value from --conf, but the real working directory is the one from spark-defaults.conf).

[~Udbhav Agrawal], I'm using Spark 2.3.3; I don't know whether that matters.

> spark-submit explicit configuration does not take effect but Spark UI shows it's effective
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-27289
>                 URL: https://issues.apache.org/jira/browse/SPARK-27289
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy, Documentation, Spark Submit, Web UI
>    Affects Versions: 2.3.3
>            Reporter: KaiXu
>            Priority: Minor
>         Attachments: Capture.PNG
>
>
> The [doc|https://spark.apache.org/docs/latest/submitting-applications.html] says that "In general, configuration values explicitly set on a {{SparkConf}} take the highest precedence, then flags passed to {{spark-submit}}, then values in the defaults file", but when spark.local.dir is set through --conf with spark-submit, Spark still uses the value from ${SPARK_HOME}/conf/spark-defaults.conf. What's more, the environment variables in the Spark runtime UI show the value from --conf, which is really misleading.
> e.g.
> I submit my application through the command:
> /opt/spark233/bin/spark-submit --properties-file /opt/spark.conf --conf spark.local.dir=/tmp/spark_local -v --class org.apache.spark.examples.mllib.SparseNaiveBayes --master spark://bdw-slave20:7077 /opt/sparkbench/assembly/target/sparkbench-assembly-7.1-SNAPSHOT-dist.jar hdfs://bdw-slave20:8020/Bayes/Input
>
> The spark.local.dir in ${SPARK_HOME}/conf/spark-defaults.conf is:
> spark.local.dir=/mnt/nvme1/spark_local
> When the application is running, I found that the intermediate shuffle data was written to /mnt/nvme1/spark_local, which is set through ${SPARK_HOME}/conf/spark-defaults.conf, but the Web UI shows the environment value spark.local.dir=/tmp/spark_local.
> The spark-submit verbose output also shows spark.local.dir=/tmp/spark_local, which is misleading.
>
> !image-2019-03-27-10-59-38-377.png!
> spark-submit verbose:
>
> Spark properties used, including those specified through
> --conf and those from the properties file /opt/spark.conf:
> (spark.local.dir,/tmp/spark_local)
> (spark.default.parallelism,132)
> (spark.driver.memory,10g)
> (spark.executor.memory,352g)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
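The precedence rule quoted from the documentation in the issue can be illustrated with a small sketch. This is only a model of the documented ordering (defaults file, then spark-submit flags, then values set on a SparkConf), not Spark's actual implementation; the function name `resolve_conf` and the dictionary layout are hypothetical and chosen for illustration. The two paths are the ones given in the report.

```python
from typing import Dict


def resolve_conf(
    defaults_file: Dict[str, str],
    submit_flags: Dict[str, str],
    spark_conf: Dict[str, str],
) -> Dict[str, str]:
    """Merge three configuration layers in the documented precedence order.

    Hypothetical helper: later layers win, so values set on a SparkConf
    override spark-submit flags, which override the defaults file.
    """
    merged: Dict[str, str] = {}
    for layer in (defaults_file, submit_flags, spark_conf):
        merged.update(layer)
    return merged


# Values taken from the report.
defaults = {"spark.local.dir": "/mnt/nvme1/spark_local"}
flags = {"spark.local.dir": "/tmp/spark_local"}

effective = resolve_conf(defaults, flags, spark_conf={})
print(effective["spark.local.dir"])  # prints /tmp/spark_local
```

Under the documented rule the effective value would be /tmp/spark_local, which matches what the Web UI and the verbose output display; the report is that shuffle data nevertheless landed in /mnt/nvme1/spark_local.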
[jira] [Commented] (SPARK-27289) spark-submit explicit configuration does not take effect but Spark UI shows it's effective
[ https://issues.apache.org/jira/browse/SPARK-27289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812242#comment-16812242 ]

Udbhav Agrawal commented on SPARK-27289:
----------------------------------------

Yes, the intermediate data is written to the spark.local.dir configured through the --conf parameter when running spark-submit; it overrides the one you have set in spark-defaults.conf.
[jira] [Commented] (SPARK-27289) spark-submit explicit configuration does not take effect but Spark UI shows it's effective
[ https://issues.apache.org/jira/browse/SPARK-27289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812103#comment-16812103 ]

KaiXu commented on SPARK-27289:
-------------------------------

Did you check where the intermediate shuffle data was written when changing spark.local.dir? BTW, changes in spark-defaults.conf seem to need a restart to take effect.
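One way to check where intermediate data was actually written is to look for block manager directories under each candidate location while the application runs. The sketch below assumes that Spark creates directories whose names start with `blockmgr-` somewhere beneath the local directory; the helper name `find_block_manager_dirs` is hypothetical, and the two paths are the ones from the report. It would need to be run on the worker nodes, since that is where executors write shuffle files.

```python
import glob
import os
from typing import Dict, List


def find_block_manager_dirs(candidates: List[str]) -> Dict[str, List[str]]:
    """Return, per candidate root, the blockmgr-* directories found beneath it.

    Assumption: Spark's local scratch directories contain subdirectories
    named blockmgr-<id>, possibly nested under other Spark-created directories.
    """
    found: Dict[str, List[str]] = {}
    for root in candidates:
        pattern = os.path.join(root, "**", "blockmgr-*")
        found[root] = sorted(
            path for path in glob.glob(pattern, recursive=True) if os.path.isdir(path)
        )
    return found


# Paths from the report: the defaults-file value and the --conf value.
for root, dirs in find_block_manager_dirs(
    ["/mnt/nvme1/spark_local", "/tmp/spark_local"]
).items():
    print(root, "->", len(dirs), "block manager dir(s)")
```

A root that reports zero directories during a running shuffle is probably not the one in use, whatever the Web UI displays.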
[jira] [Commented] (SPARK-27289) spark-submit explicit configuration does not take effect but Spark UI shows it's effective
[ https://issues.apache.org/jira/browse/SPARK-27289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810627#comment-16810627 ]

Udbhav Agrawal commented on SPARK-27289:
----------------------------------------

[~KaiXu] for me it works correctly, and I could not reproduce this.
[jira] [Commented] (SPARK-27289) spark-submit explicit configuration does not take effect but Spark UI shows it's effective
[ https://issues.apache.org/jira/browse/SPARK-27289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802441#comment-16802441 ]

Udbhav Agrawal commented on SPARK-27289:
----------------------------------------

Thanks, I will check this issue.

> spark-submit explicit configuration does not take effect but Spark UI shows it's effective
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-27289
>                 URL: https://issues.apache.org/jira/browse/SPARK-27289
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy, Documentation, Spark Submit, Web UI
>    Affects Versions: 2.3.3
>            Reporter: KaiXu
>            Priority: Major
>         Attachments: Capture.PNG