Hi Marcelo, quick question.

I am using Spark 1.3 in YARN client mode. It works well, provided I manually
pip-install all the third-party libraries (numpy, etc.) on the executor
nodes.



So does the SPARK-5479 fix in 1.5 which you mentioned address this as well?
Thanks.


On Thu, Jun 25, 2015 at 2:22 PM, Marcelo Vanzin <van...@cloudera.com> wrote:

> That sounds like SPARK-5479 which is not in 1.4...
>
> On Thu, Jun 25, 2015 at 12:17 PM, Elkhan Dadashov <elkhan8...@gmail.com>
> wrote:
>
>> In addition to my previous emails: when I try to execute this command from
>> the command line:
>>
>> ./bin/spark-submit --verbose --master yarn-cluster --py-files
>>  mypython/libs/numpy-1.9.2.zip --deploy-mode cluster
>> mypython/scripts/kmeans.py /kmeans_data.txt 5 1.0
>>
>>
>> - numpy-1.9.2.zip is the downloaded numpy package
>> - kmeans.py is the default example that ships with Spark 1.4
>> - kmeans_data.txt is the default data file that ships with Spark 1.4
>>
>>
>> it fails, saying that it could not find numpy:
>>
>> File "kmeans.py", line 31, in <module>
>>     import numpy
>> ImportError: No module named numpy
>>
>> Has anyone run a Python Spark application in yarn-cluster mode that needs
>> third-party Python modules to be shipped with it?
>>
>> What configuration or installation steps are needed before running a
>> Python Spark job with third-party dependencies on yarn-cluster?
>>
>> Thanks in advance.
>>
>>
> --
> Marcelo
>
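For context on why the zip approach fails for numpy specifically: as far as I
understand, `--py-files` works by placing the archive on the Python path of
the executors, relying on Python's built-in ability to import pure-Python
modules directly from a zip file. Packages with compiled C extensions (numpy,
scipy) cannot be loaded that way, which is why they have to be pip-installed
on each node. A minimal sketch of the underlying mechanism, plain Python with
no Spark and a made-up package name (`mypkg`):

```python
import os
import sys
import tempfile
import zipfile

# Build a tiny pure-Python package and zip it up; this stands in
# for an archive you would pass to spark-submit via --py-files.
workdir = tempfile.mkdtemp()
pkg_dir = os.path.join(workdir, "mypkg")
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("ANSWER = 42\n")

archive = os.path.join(workdir, "deps.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.write(os.path.join(pkg_dir, "__init__.py"), "mypkg/__init__.py")

# Python can import pure-Python code straight from a zip on sys.path;
# this is the mechanism --py-files relies on. A package containing
# compiled .so/.pyd extension modules cannot be imported this way.
sys.path.insert(0, archive)
import mypkg

print(mypkg.ANSWER)  # -> 42
```

So a zipped pure-Python dependency ships fine with `--py-files`, while numpy
needs a real install (or a pre-built environment) on every executor node.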