Limiting Pyspark.daemons

2016-07-04 Thread ar7
Hi,

I am currently using PySpark 1.6.1 on my cluster. When a PySpark application
is run, the load on the workers seems to go higher than what was allotted to
them. When I ran top, I noticed that there were far too many pyspark.daemon
processes running. There was another mail thread regarding the same issue:

https://mail-archives.apache.org/mod_mbox/spark-user/201606.mbox/%3ccao429hvi3drc-ojemue3x4q1vdzt61htbyeacagtre9yrhs...@mail.gmail.com%3E

I followed what was suggested there, i.e. reduced the number of executor
cores and the number of executors per node to 1, but the number of
pyspark.daemon processes is still not coming down. It looks like there is
initially one pyspark.daemon process, and this in turn spawns as many
pyspark.daemon processes as there are cores on the machine.
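
For reference, this is roughly how those settings are being applied (the app
name below is just a placeholder, and I pass the equivalent options through
spark-submit as well):

from pyspark import SparkConf, SparkContext

# Illustrative values only: one core per executor, one executor per node.
conf = (SparkConf()
        .setAppName("example-app")              # placeholder name
        .set("spark.executor.cores", "1")
        .set("spark.executor.instances", "1"))  # YARN-style executor count

sc = SparkContext(conf=conf)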

Any help is appreciated :)

Thanks,
Ashwin Raaghav.



Adding h5 files in a zip to use with PySpark

2016-06-15 Thread ar7
I am using PySpark 1.6.1 for my Spark application. I have additional modules
which I am loading using the --py-files argument. I also have an h5 file
which one of the modules needs to access in order to initialize ApolloNet.

Is there any way the modules could access the h5 file if I put it in the same
archive? I tried this approach, but it threw an error because the file is not
present on every worker. The one solution I can think of is copying the file
to each of the workers, but I want to know if there is a better way to do it.
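
The manual copy I mentioned could presumably also be done through Spark
itself with SparkContext.addFile and SparkFiles.get, which ship the file to
every worker. A rough, untested sketch (the file name and path below are
placeholders):

from pyspark import SparkContext, SparkFiles

sc = SparkContext(appName="example-app")  # placeholder name

# Ship the h5 file so that every executor receives a local copy.
sc.addFile("/path/to/weights.h5")  # placeholder path

def resolve_weights_path(_):
    # Resolve the worker-local copy of the shipped file inside a task.
    return SparkFiles.get("weights.h5")

print(sc.parallelize(range(4)).map(resolve_weights_path).distinct().collect())

Would something along these lines be the right approach, or is there a better
way?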


