Kind of strange, because we haven’t updated CloudPickle AFAIK. Is this a package 
you added on the PYTHONPATH? How did you set the path? Was it in 
conf/spark-env.sh?
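
If it was, note that spark-env.sh has to be present on every node (it isn’t 
propagated automatically), and the variable has to be exported so the worker 
processes inherit it. Something like this, with a hypothetical path:

    export PYTHONPATH=/path/to/your/packages:$PYTHONPATH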
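
Another option, as a workaround rather than a diagnosis, is to ship the package 
with the job itself so the workers don’t depend on their local PYTHONPATH at 
all. A minimal sketch, assuming you zip up the volatility package first (the 
archive path and master URL below are placeholders):

    from pyspark import SparkContext

    # Ship volatility.zip to every worker; cloudpickle's subimport can then
    # find volatility.atm_impl_vol when it unpickles the function.
    sc = SparkContext("spark://master:7077", "atm-vol-test",
                      pyFiles=["/path/to/volatility.zip"])

    # Or, on an already-created context:
    # sc.addPyFile("/path/to/volatility.zip")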

Matei

On Apr 10, 2014, at 7:39 AM, aazout <albert.az...@velos.io> wrote:

> I am getting a Python ImportError on a Spark standalone cluster. I have set
> the PYTHONPATH on both the master and the slave, and the package imports
> properly when I run the PySpark command line on both machines. The failure
> only happens during master-slave communication. Here is the error: 
> 
> 14/04/10 13:40:19 INFO scheduler.TaskSetManager: Loss was due to
> org.apache.spark.api.python.PythonException: Traceback (most recent call
> last): 
>  File "/root/spark/python/pyspark/worker.py", line 73, in main 
>    command = pickleSer._read_with_length(infile) 
>  File "/root/spark/python/pyspark/serializers.py", line 137, in
> _read_with_length 
>    return self.loads(obj) 
>  File "/root/spark/python/pyspark/cloudpickle.py", line 810, in subimport 
>    __import__(name) 
> ImportError: ('No module named volatility.atm_impl_vol', <function subimport
> at 0xa36050>, ('volatility.atm_impl_vol',)) 
> 
> Any ideas?
> 
> 
> 
> -----
> CEO / Velos (velos.io)
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-0-9-1-PySpark-ImportError-tp4068.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
