[ https://issues.apache.org/jira/browse/SPARK-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049895#comment-14049895 ]
Matthew Farrellee commented on SPARK-1850:
------------------------------------------
[~andrewor14] -
I think this should be closed as resolved by SPARK-2242.
The current output for the error is:
{noformat}
$ ./dist/bin/pyspark
Python 2.7.5 (default, Feb 19 2014, 13:47:28)
[GCC 4.8.2 20131212 (Red Hat 4.8.2-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Traceback (most recent call last):
  File "/home/matt/Documents/Repositories/spark/dist/python/pyspark/shell.py", line 43, in <module>
    sc = SparkContext(appName="PySparkShell", pyFiles=add_files)
  File "/home/matt/Documents/Repositories/spark/dist/python/pyspark/context.py", line 95, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway)
  File "/home/matt/Documents/Repositories/spark/dist/python/pyspark/context.py", line 191, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "/home/matt/Documents/Repositories/spark/dist/python/pyspark/java_gateway.py", line 66, in launch_gateway
    raise Exception(error_msg)
Exception: Launching GatewayServer failed with exit code 1! (Warning: unexpected output detected.)
Found multiple Spark assembly jars in
/home/matt/Documents/Repositories/spark/dist/lib:
spark-assembly-1.1.0-SNAPSHOT-hadoop1.0.4-.jar
spark-assembly-1.1.0-SNAPSHOT-hadoop1.0.4.jar
Please remove all but one jar.
{noformat}
> Bad exception if multiple jars exist when running PySpark
> ---------------------------------------------------------
>
> Key: SPARK-1850
> URL: https://issues.apache.org/jira/browse/SPARK-1850
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 1.0.0
> Reporter: Andrew Or
> Fix For: 1.0.1
>
>
> {code}
> Found multiple Spark assembly jars in
> /Users/andrew/Documents/dev/andrew-spark/assembly/target/scala-2.10:
> Traceback (most recent call last):
>   File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/shell.py", line 43, in <module>
>     sc = SparkContext(os.environ.get("MASTER", "local[*]"), "PySparkShell", pyFiles=add_files)
>   File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/context.py", line 94, in __init__
>     SparkContext._ensure_initialized(self, gateway=gateway)
>   File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/context.py", line 180, in _ensure_initialized
>     SparkContext._gateway = gateway or launch_gateway()
>   File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/java_gateway.py", line 49, in launch_gateway
>     gateway_port = int(proc.stdout.readline())
> ValueError: invalid literal for int() with base 10:
> 'spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4-deps.jar\n'
> {code}
> It's trying to read the Java gateway port as an int from the subprocess's
> STDOUT. What it actually read, however, was an error message, which is
> clearly not an int. We should differentiate between these cases and
> propagate the original message when the output isn't an int; as it
> stands, this exception is not very helpful.
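For reference, here is a minimal sketch of the defensive parse the description asks for. It assumes only the names visible in the stack traces above (proc, launch_gateway); the read_gateway_port helper and the exact message format are hypothetical, not the actual java_gateway.py code.
{code}
# Hypothetical sketch, not the actual java_gateway.py implementation:
# parse the gateway port defensively instead of calling int() on
# whatever the subprocess printed first.
def read_gateway_port(proc):
    line = proc.stdout.readline().rstrip()
    try:
        # Normal case: the launcher prints the Py4J gateway port first.
        return int(line)
    except ValueError:
        # The subprocess printed an error message (e.g. "Found multiple
        # Spark assembly jars ...") instead of a port number. Propagate
        # that message rather than the unhelpful ValueError.
        remaining = proc.stdout.read()
        raise Exception("Launching GatewayServer failed!\n%s\n%s" % (line, remaining))
{code}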