Try to figure out what the env vars and arguments of the worker JVM and
Python process are. Maybe you'll get a clue.
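
For example, a minimal sketch assuming the daemon shows up as
"pyspark.daemon" in the process list (the PID is a placeholder):

# find the PIDs of the pyspark daemon processes
ps -ef | grep pyspark.daemon

# dump one daemon's command-line arguments (NUL-separated in /proc)
tr '\0' '\n' < /proc/<PID>/cmdline

# dump its environment variables
tr '\0' '\n' < /proc/<PID>/environ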

On Mon, Jul 4, 2016 at 11:42 AM Mathieu Longtin <math...@closetwork.org>
wrote:

> I started with a download of 1.6.0. These days, we use a self-compiled
> 1.6.2.
>
> On Mon, Jul 4, 2016 at 11:39 AM Ashwin Raaghav <ashraag...@gmail.com>
> wrote:
>
>> I am trying to think of possible reasons why this could be happening. If
>> the cores are multi-threaded, could that affect the daemons? Was your
>> Spark built from source or downloaded as a binary? Though that shouldn't
>> technically change anything.
>>
>> On Mon, Jul 4, 2016 at 9:03 PM, Mathieu Longtin <math...@closetwork.org>
>> wrote:
>>
>>> 1.6.1.
>>>
>>> I have no idea. SPARK_WORKER_CORES should do the same.
>>>
>>> On Mon, Jul 4, 2016 at 11:24 AM Ashwin Raaghav <ashraag...@gmail.com>
>>> wrote:
>>>
>>>> Which version of Spark are you using? 1.6.1?
>>>>
>>>> Any ideas as to why it is not working in ours?
>>>>
>>>> On Mon, Jul 4, 2016 at 8:51 PM, Mathieu Longtin <math...@closetwork.org
>>>> > wrote:
>>>>
>>>>> 16.
>>>>>
>>>>> On Mon, Jul 4, 2016 at 11:16 AM Ashwin Raaghav <ashraag...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I tried what you suggested and started the slave using the following
>>>>>> command:
>>>>>>
>>>>>> start-slave.sh --cores 1 <master>
>>>>>>
>>>>>> But it still seems to start as many pyspark daemons as the number of
>>>>>> cores in the node (1 parent and 3 workers). Limiting it via the
>>>>>> spark-env.sh file by setting SPARK_WORKER_CORES=1 also didn't help.
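>>>>>>
>>>>>> For reference, this is roughly what I added to conf/spark-env.sh (a
>>>>>> sketch of what I tried):
>>>>>>
>>>>>> export SPARK_WORKER_CORES=1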
>>>>>>
>>>>>> When you said it helped you and limited it to 2 processes in your
>>>>>> cluster, how many cores did each machine have?
>>>>>>
>>>>>> On Mon, Jul 4, 2016 at 8:22 PM, Mathieu Longtin <
>>>>>> math...@closetwork.org> wrote:
>>>>>>
>>>>>>> It depends on what you want to do:
>>>>>>>
>>>>>>> If, on any given server, you don't want Spark to use more than one
>>>>>>> core, use this to start the workers:
>>>>>>>
>>>>>>> $SPARK_HOME/sbin/start-slave.sh --cores=1
>>>>>>>
>>>>>>> If you have a bunch of servers dedicated to Spark, but you don't
>>>>>>> want a driver to use more than one core per server, then setting
>>>>>>> spark.executor.cores=1 tells it not to use more than one core per
>>>>>>> server. However, it seems it will still start as many pyspark
>>>>>>> daemons as there are cores, but maybe not use them.
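>>>>>>>
>>>>>>> For example, a minimal sketch of the second option (the master URL
>>>>>>> and application file are placeholders):
>>>>>>>
>>>>>>> spark-submit --master spark://master-host:7077 \
>>>>>>>     --conf spark.executor.cores=1 your_app.py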
>>>>>>>
>>>>>>> On Mon, Jul 4, 2016 at 10:44 AM Ashwin Raaghav <ashraag...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Mathieu,
>>>>>>>>
>>>>>>>> Isn't that the same as setting "spark.executor.cores" to 1? And how
>>>>>>>> can I specify "--cores=1" from the application?
>>>>>>>>
>>>>>>>> On Mon, Jul 4, 2016 at 8:06 PM, Mathieu Longtin <
>>>>>>>> math...@closetwork.org> wrote:
>>>>>>>>
>>>>>>>>> When running the executor, pass --cores=1. We use this and I only
>>>>>>>>> see 2 pyspark processes; one seems to be the parent of the other
>>>>>>>>> and is idle.
>>>>>>>>>
>>>>>>>>> In your case, are all the pyspark processes working?
>>>>>>>>>
>>>>>>>>> On Mon, Jul 4, 2016 at 3:15 AM ar7 <ashraag...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I am currently using PySpark 1.6.1 in my cluster. When a pyspark
>>>>>>>>>> application is run, the load on the workers seems to go higher
>>>>>>>>>> than what was allocated. When I ran top, I noticed that there
>>>>>>>>>> were too many pyspark.daemon processes running. There was another
>>>>>>>>>> mail thread about the same issue:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> https://mail-archives.apache.org/mod_mbox/spark-user/201606.mbox/%3ccao429hvi3drc-ojemue3x4q1vdzt61htbyeacagtre9yrhs...@mail.gmail.com%3E
>>>>>>>>>>
>>>>>>>>>> I followed what was mentioned there, i.e. reduced the number of
>>>>>>>>>> executor cores and the number of executors on one node to 1. But
>>>>>>>>>> the number of pyspark.daemon processes is still not coming down.
>>>>>>>>>> It looks like there is initially one pyspark.daemon process,
>>>>>>>>>> which in turn spawns as many pyspark.daemon processes as there
>>>>>>>>>> are cores in the machine.
>>>>>>>>>>
>>>>>>>>>> Any help is appreciated :)
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Ashwin Raaghav.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Mathieu Longtin
>>>>>>>>> 1-514-803-8977
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Ashwin Raaghav
>>>>>>>>
>>>>>>> --
>>>>>>> Mathieu Longtin
>>>>>>> 1-514-803-8977
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>>
>>>>>> Ashwin Raaghav
>>>>>>
>>>>> --
>>>>> Mathieu Longtin
>>>>> 1-514-803-8977
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>>
>>>> Ashwin Raaghav
>>>>
>>> --
>>> Mathieu Longtin
>>> 1-514-803-8977
>>>
>>
>>
>>
>> --
>> Regards,
>>
>> Ashwin Raaghav
>>
> --
> Mathieu Longtin
> 1-514-803-8977
>
-- 
Mathieu Longtin
1-514-803-8977
