SparkContext with error from PySpark
Hi Team,

I was trying to execute PySpark code on a cluster and it gives me the following error. (When I run the same job locally it works fine.)

Error from python worker:
  /usr/lib/spark-1.2.0-bin-hadoop2.3/python/pyspark/context.py:209: Warning: 'with' will become a reserved keyword in Python 2.6
Traceback (most recent call last):
  File "/home/beehive/toolchain/x86_64-unknown-linux-gnu/python-2.5.2/lib/python2.5/runpy.py", line 85, in run_module
    loader = get_loader(mod_name)
  File "/home/beehive/toolchain/x86_64-unknown-linux-gnu/python-2.5.2/lib/python2.5/pkgutil.py", line 456, in get_loader
    return find_loader(fullname)
  File "/home/beehive/toolchain/x86_64-unknown-linux-gnu/python-2.5.2/lib/python2.5/pkgutil.py", line 466, in find_loader
    for importer in iter_importers(fullname):
  File "/home/beehive/toolchain/x86_64-unknown-linux-gnu/python-2.5.2/lib/python2.5/pkgutil.py", line 422, in iter_importers
    __import__(pkg)
  File "/usr/lib/spark-1.2.0-bin-hadoop2.3/python/pyspark/__init__.py", line 41, in <module>
    from pyspark.context import SparkContext
  File "/usr/lib/spark-1.2.0-bin-hadoop2.3/python/pyspark/context.py", line 209
    with SparkContext._lock:
       ^
SyntaxError: invalid syntax

PYTHONPATH was:
  /usr/lib/spark-1.2.0-bin-hadoop2.3/python:/usr/lib/spark-1.2.0-bin-hadoop2.3/python/lib/py4j-0.8.2.1-src.zip:/usr/lib/spark-1.2.0-bin-hadoop2.3/lib/spark-assembly-1.2.0-hadoop2.3.0.jar:/usr/lib/spark-1.2.0-bin-hadoop2.3/sbin/../python/lib/py4j-0.8.2.1-src.zip:/usr/lib/spark-1.2.0-bin-hadoop2.3/sbin/../python:/home/beehive/bin/utils/primitives:/home/beehive/bin/utils/pylogger:/home/beehive/bin/utils/asterScript:/home/beehive/bin/lib:/home/beehive/bin/utils/init:/home/beehive/installer/packages:/home/beehive/ncli

java.io.EOFException
  at java.io.DataInputStream.readInt(DataInputStream.java:392)
  at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:163)
  at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:86)
  at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
  at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:102)
  at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
  at org.apache.spark.scheduler.Task.run(Task.scala:56)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:722)

14/12/31 04:49:58 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 1, aster4, NODE_LOCAL, 1321 bytes)
14/12/31 04:49:58 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on aster4:43309 (size: 3.8 KB, free: 265.0 MB)
14/12/31 04:49:59 INFO TaskSetManager: Lost task 0.1 in stage 0.0 (TID 1) on executor aster4: org.apache.spark.SparkException (

Any clue how to resolve this?

Best regards,
Jagan

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkContext-with-error-from-PySpark-tp20907.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Re: SparkContext with error from PySpark
The Python installed on your cluster is 2.5. You need at least 2.6.

Eric Friedman

On Dec 30, 2014, at 7:45 AM, Jaggu <jagana...@gmail.com> wrote:

> Hi Team, I was trying to execute PySpark code on a cluster and it gives me the following error. (When I run the same job locally it works fine.)
> Error from python worker: /usr/lib/spark-1.2.0-bin-hadoop2.3/python/pyspark/context.py:209: Warning: 'with' will become a reserved keyword in Python 2.6
> [...]
>   File "/usr/lib/spark-1.2.0-bin-hadoop2.3/python/pyspark/context.py", line 209
>     with SparkContext._lock:
> SyntaxError: invalid syntax
> [rest of the traceback and executor logs quoted in the original message above]
> Any clue how to resolve this?
> Best regards, Jagan
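For context, a hedged illustration of why line 209 of pyspark/context.py is a SyntaxError under Python 2.5: the 'with' statement is only available in 2.5 behind a __future__ import, and the traceback shows that PySpark's context.py relies on the unconditional 2.6+ form. The snippet below is a stand-alone sketch, not PySpark's actual code beyond the quoted line:

    # Python 2.5 only parses 'with' if this __future__ import is present;
    # from Python 2.6 onward the statement is available unconditionally.
    from __future__ import with_statement

    import threading

    _lock = threading.Lock()  # stand-in for SparkContext._lock

    # This is the construct used at pyspark/context.py line 209; without the
    # __future__ import it is a SyntaxError under a 2.5 interpreter.
    with _lock:
        pass  # critical section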
Re: SparkContext with error from PySpark
Hi,

I am using Anaconda Python. Is there any way to specify which Python to use for running PySpark on a cluster?

Best regards,
Jagan

On Tue, Dec 30, 2014 at 6:27 PM, Eric Friedman <eric.d.fried...@gmail.com> wrote:

> The Python installed on your cluster is 2.5. You need at least 2.6.
> Eric Friedman
> [earlier quoted message and traceback trimmed]
--
JAGANADH G
http://jaganadhg.in
ILUGCBE
http://ilugcbe.org.in
Re: SparkContext with error from PySpark
To configure the Python executable used by PySpark, see the "Using the Shell" section for Python in the Spark Programming Guide:

https://spark.apache.org/docs/latest/programming-guide.html#using-the-shell

You can set the PYSPARK_PYTHON environment variable to choose the Python executable that will be used on the driver and the executors. In addition, you can set PYSPARK_DRIVER_PYTHON to use a different Python executable only on the driver (this is useful if you want to use IPython on the driver but not on the executors).
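For illustration, a minimal sketch of both approaches. The Anaconda paths below are assumptions; substitute your actual install location, which must exist on every worker node:

    # Sketch only: pointing PySpark at an Anaconda interpreter.
    # Documented approach: export the variables in the shell (or in
    # conf/spark-env.sh on each node) before launching pyspark / spark-submit:
    #
    #   export PYSPARK_PYTHON=/opt/anaconda/bin/python          # driver and workers
    #   export PYSPARK_DRIVER_PYTHON=/opt/anaconda/bin/ipython  # driver only (optional)
    #   spark-submit my_job.py
    #
    # Setting PYSPARK_PYTHON from the driver script before the SparkContext is
    # created may also work, since PySpark reads it when the context is built;
    # treat that as an assumption and verify it on your cluster.
    import os
    os.environ["PYSPARK_PYTHON"] = "/opt/anaconda/bin/python"  # hypothetical path

    from pyspark import SparkContext

    sc = SparkContext(appName="anaconda-python-check")
    print(sc.parallelize(range(100)).sum())  # 4950 if the workers start correctly
    sc.stop()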
On Tue, Dec 30, 2014 at 11:13 AM, JAGANADH G <jagana...@gmail.com> wrote:

> Hi, I am using Anaconda Python. Is there any way to specify which Python to use for running PySpark on a cluster?
> Best regards, Jagan
> [earlier quoted messages, traceback, and signature trimmed]