[ https://issues.apache.org/jira/browse/SPARK-32227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-32227:
------------------------------------

    Assignee:     (was: Apache Spark)

> Bug in load-spark-env.cmd with Spark 3.0.0
> -------------------------------------------
>
>                 Key: SPARK-32227
>                 URL: https://issues.apache.org/jira/browse/SPARK-32227
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell
>    Affects Versions: 3.0.0
>         Environment: Windows 10
>            Reporter: Ihor Bobak
>            Priority: Major
>             Fix For: 3.0.1
>
>         Attachments: load-spark-env.cmd
>
> spark-env.cmd, which is located in conf, is not loaded by load-spark-env.cmd.
>
> *How to reproduce:*
> 1) Download Spark 3.0.0 without Hadoop and extract it.
> 2) Put a file conf/spark-env.cmd in place with the following contents (the paths
> match where my Hadoop is installed, C:\opt\hadoop\hadoop-3.2.1; you may need to
> change them):
>
>     SET JAVA_HOME=C:\opt\Java\jdk1.8.0_241
>     SET HADOOP_HOME=C:\opt\hadoop\hadoop-3.2.1
>     SET HADOOP_CONF_DIR=C:\opt\hadoop\hadoop-3.2.1\conf
>     SET SPARK_DIST_CLASSPATH=C:\opt\hadoop\hadoop-3.2.1\etc\hadoop;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\common;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\common\lib\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\common\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\hdfs;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\hdfs\lib\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\hdfs\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\yarn;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\yarn\lib\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\yarn\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\mapreduce\lib\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\mapreduce\*
>
> 3) Go to the bin directory and run pyspark. You will get an error that log4j
> cannot be found, among others. The reason is that the environment was never
> loaded, so Spark does not see where Hadoop and all of its jars are.
>
> *How to fix:*
> Take load-spark-env.cmd from Spark version 2.4.3, and everything will work.
> [UPDATE]: I attached a fixed version of load-spark-env.cmd that works fine.
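> For reference, the restored 2.4.3-style loading logic can be sketched roughly as
> follows. This is a minimal illustration, not the exact attached patch; the
> SPARK_ENV_LOADED guard and the SPARK_CONF_DIR default are assumed from the 2.4.3
> script:
>
>     rem Load the user's spark-env.cmd once, if it exists.
>     if [%SPARK_ENV_LOADED%] == [] (
>       set SPARK_ENV_LOADED=1
>
>       rem Default SPARK_CONF_DIR to the conf dir next to this script.
>       if [%SPARK_CONF_DIR%] == [] (
>         set SPARK_CONF_DIR=%~dp0..\conf
>       )
>
>       call :LoadSparkEnv
>     )
>
>     rem ... (Scala version detection etc. omitted) ...
>     exit /b 0
>
>     :LoadSparkEnv
>     if exist "%SPARK_CONF_DIR%\spark-env.cmd" (
>       call "%SPARK_CONF_DIR%\spark-env.cmd"
>     )
>     exit /b 0
>
> The key point is that spark-env.cmd is invoked via call to a batch label, so
> the variables it SETs remain visible to the calling scripts.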
> *What is the difference?*
> I am not an expert in Windows batch, but the 2.4.3 script defines a function:
>
>     :LoadSparkEnv
>     if exist "%SPARK_CONF_DIR%\spark-env.cmd" (
>       call "%SPARK_CONF_DIR%\spark-env.cmd"
>     )
>
> and then calls it; restoring that (as it was in 2.4.3) fixes the problem.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)