[
https://issues.apache.org/jira/browse/SPARK-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375707#comment-14375707
]
vijay commented on SPARK-6435:
------------------------------
I tested this on Linux with the 1.3.0 release and it works fine, so this is apparently a
Windows-specific issue: on Windows only the first jar is picked up.
It appears to be a problem with parsing the command line, introduced by the
change to the Windows scripts between 1.2.0 and 1.3.0. A simple fix to
bin\windows-utils.cmd resolves the issue.
I ran this command to test with 'real' jars:
{code}
%SPARK_HOME%\bin\spark-shell --master local --jars c:\code\elasticsearch-1.4.2\lib\lucene-core-4.10.2.jar,c:\temp\guava-14.0.1.jar
{code}
Here are some snippets from the console; note that only the first jar is added,
so I can load classes from the first jar but not the second:
{code}
15/03/23 10:57:41 INFO SparkUI: Started SparkUI at http://vgarla-t440P.fritz.box:4040
15/03/23 10:57:41 INFO SparkContext: Added JAR file:/c:/code/elasticsearch-1.4.2/lib/lucene-core-4.10.2.jar at http://192.168.178.41:54601/jars/lucene-core-4.10.2.jar with timestamp 1427104661969
15/03/23 10:57:42 INFO Executor: Starting executor ID <driver> on host localhost
...

scala> import org.apache.lucene.util.IOUtils
import org.apache.lucene.util.IOUtils

scala> import com.google.common.base.Strings
<console>:20: error: object Strings is not a member of package com.google.common.base
{code}
Looking at the command line in jvisualvm, I see that only the first jar is added:
{code}
Main class: org.apache.spark.deploy.SparkSubmit
Arguments: --class org.apache.spark.repl.Main --master local --jars c:\code\elasticsearch-1.4.2\lib\lucene-core-4.10.2.jar spark-shell c:\temp\guava-14.0.1.jar
{code}
In Spark 1.2.0, spark-shell2.cmd just passed the arguments "as is" to the java
command line:
{code}
cmd /V /E /C %SPARK_HOME%\bin\spark-submit.cmd --class org.apache.spark.repl.Main %* spark-shell
{code}
In Spark 1.3.0, spark-shell2.cmd calls windows-utils.cmd to parse the arguments
into SUBMISSION_OPTS and APPLICATION_OPTS. Only the first jar in the list
passed to --jars makes it into SUBMISSION_OPTS; the later jars end up in
APPLICATION_OPTS:
{code}
call %SPARK_HOME%\bin\windows-utils.cmd %*
if %ERRORLEVEL% equ 1 (
  call :usage
  exit /b 1
)
echo SUBMISSION_OPTS=%SUBMISSION_OPTS%
echo APPLICATION_OPTS=%APPLICATION_OPTS%
cmd /V /E /C %SPARK_HOME%\bin\spark-submit.cmd --class org.apache.spark.repl.Main %SUBMISSION_OPTS% spark-shell %APPLICATION_OPTS%
{code}
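For context, windows-utils.cmd walks the parameters one at a time and routes recognized options (plus the single token that follows them) into SUBMISSION_OPTS, with everything else falling into APPLICATION_OPTS. The shape is roughly the following - an illustrative sketch only, not the actual script:
{code}
@echo off
rem Illustrative sketch only, NOT the actual windows-utils.cmd: any token
rem starting with "--" is treated as an option that consumes exactly one
rem following token as its value.
set SUBMISSION_OPTS=
set APPLICATION_OPTS=
:loop
if [x%1]==[x] goto :done
echo %1 | findstr /b /c:"--" >nul
if %ERRORLEVEL% equ 0 (
  rem option plus its single value go to the submission options
  set SUBMISSION_OPTS=%SUBMISSION_OPTS% %1 %2
  shift
) else (
  rem everything else is treated as an application argument
  set APPLICATION_OPTS=%APPLICATION_OPTS% %1
)
shift
goto :loop
:done
echo SUBMISSION_OPTS=%SUBMISSION_OPTS%
echo APPLICATION_OPTS=%APPLICATION_OPTS%
{code}
With the unquoted jar list above, the loop sees --jars followed by two separate tokens, so only the first jar rides along with --jars and the second falls into APPLICATION_OPTS, which matches the argument order seen in jvisualvm above.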
The problem is that, by the time the command-line arguments reach
windows-utils.cmd, the Windows command processor has already split the
comma-separated list into distinct arguments. The Windows way of saying "treat
this as a single argument" is to surround it in double quotes. However, when I
surround the jar list in quotes, I get an error:
{code}
%SPARK_HOME%\bin\spark-shell --master local --jars "c:\code\elasticsearch-1.4.2\lib\lucene-core-4.10.2.jar,c:\temp\guava-14.0.1.jar"
c:\temp\guava-14.0.1.jar""=="x" was unexpected at this time.
{code}
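The splitting itself is easy to see outside of Spark. A minimal sketch (a hypothetical echoargs.cmd, not part of Spark) shows that cmd splits the batch parameters %1, %2, ... on commas, while %* keeps the command tail intact - which is consistent with the 1.2.0 script, which simply forwarded %*, not hitting the problem:
{code}
@echo off
rem echoargs.cmd - hypothetical helper, not part of Spark.
rem Batch parameters %1, %2, ... are split on commas (as well as spaces),
rem but %* preserves the command tail as typed.
echo all : %*
echo arg1: %1
echo arg2: %2
echo arg3: %3
{code}
Calling echoargs.cmd --jars one.jar,two.jar prints the full list under "all" but puts one.jar and two.jar into separate parameters; quoting the list as "one.jar,two.jar" keeps it together in %2.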
Digging in, I see the error above is caused by this line from windows-utils.cmd:
{code}
if "x%2"=="x" (
{code}
Replacing the quotes with square brackets does the trick:
{code}
if [x%2]==[x] (
{code}
Now the command line is processed correctly.
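For reference, the failure mode can be reproduced in isolation. Here is a minimal sketch (a hypothetical quotetest.cmd, not part of Spark) of why the quoted comparison chokes when the parameter itself carries double quotes, while the bracket form does not:
{code}
@echo off
rem quotetest.cmd - hypothetical illustration, not part of Spark.
rem Call as: quotetest.cmd --jars "one.jar,two.jar"
rem
rem The quoted test below would expand to
rem   if "x"one.jar,two.jar""=="x" echo no second argument
rem which cmd cannot parse ("... was unexpected at this time"):
rem if "x%2"=="x" echo no second argument

rem Square brackets are ordinary characters, so quotes inside %2 stay balanced:
if [x%2]==[x] (
  echo no second argument
) else (
  echo second argument is %2
)
{code}
Calling it with a quoted comma-separated list exercises exactly the case that tripped up windows-utils.cmd.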
> spark-shell --jars option does not add all jars to classpath
> ------------------------------------------------------------
>
> Key: SPARK-6435
> URL: https://issues.apache.org/jira/browse/SPARK-6435
> Project: Spark
> Issue Type: Bug
> Components: Spark Shell
> Affects Versions: 1.3.0
> Environment: Win64
> Reporter: vijay
>
> Not all jars supplied via the --jars option will be added to the driver (and
> presumably executor) classpath. The first jar(s) will be added, but not all.
> To reproduce this, just add a few jars (I tested 5) to the --jars option, and
> then try to import a class from the last jar. This fails. A simple
> reproducer:
> Create a bunch of dummy jars:
> jar cfM jar1.jar log.txt
> jar cfM jar2.jar log.txt
> jar cfM jar3.jar log.txt
> jar cfM jar4.jar log.txt
> Start the spark-shell with the dummy jars and guava at the end:
> %SPARK_HOME%\bin\spark-shell --master local --jars jar1.jar,jar2.jar,jar3.jar,jar4.jar,c:\code\lib\guava-14.0.1.jar
> In the shell, try importing from guava; you'll get an error:
> {code}
> scala> import com.google.common.base.Strings
> <console>:19: error: object Strings is not a member of package com.google.common.base
> import com.google.common.base.Strings
> ^
> {code}