On 31.12.2012 at 09:34, Semi wrote:

> Memory is not the problem; the problem is CPU load.
> Every Python process starts 2 other processes, and this stalls the nodes.
> For example: 16-CPU nodes run 48 Python processes.

The question was: since you suggested that your user request "-l mem_free=4G", 
this implies that the available memory is taken into account when additional 
processes are started, but that does not seem to happen.

You will have to investigate what is running on the machine in question with:

$ ps -e f

(f without a leading dash; it displays the process tree.) There may also be 
additional processes in uninterruptible kernel sleep (state "D"), which 
increase the load as well.
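
To pick out the state "D" processes without reading the whole tree by eye, 
something like the following might help. It is only a sketch: it assumes a 
Linux procps "ps" that accepts "-eo pid,stat,comm", and the function names 
are made up for illustration.

```python
import subprocess


def parse_ps_states(ps_output):
    """Parse `ps -eo pid,stat,comm` output into (pid, state, command) tuples."""
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        # Skip the header and any malformed line: the first field must be a PID.
        if len(parts) == 3 and parts[0].isdigit():
            rows.append((int(parts[0]), parts[1], parts[2].strip()))
    return rows


def uninterruptible(rows):
    """Keep only processes whose state starts with 'D' (uninterruptible sleep)."""
    return [row for row in rows if row[1].startswith("D")]


def snapshot():
    """Run ps once and return the processes currently in state D."""
    out = subprocess.run(
        ["ps", "-eo", "pid,stat,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return uninterruptible(parse_ps_states(out))
```

Calling snapshot() on the loaded node lists the candidates; an empty list 
suggests the load comes from runnable processes instead.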

If all of the load comes from his Python script, the script needs to be 
adjusted to run serially instead of using all available cores in the machine.
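
As a rough illustration of such an adjustment: Grid Engine sets NSLOTS in the 
job environment to the number of granted slots, so the script could size 
itself from that. The sketch below assumes the script uses multiprocessing, 
which I don't know; square() and run() are placeholders for his real work.

```python
import os
from multiprocessing import Pool


def granted_slots(environ=os.environ):
    """Slots granted by Grid Engine via NSLOTS; 1 if unset or not a number."""
    try:
        return max(1, int(environ.get("NSLOTS", "1")))
    except ValueError:
        return 1


def square(x):
    # Placeholder for the user's real per-item work (hypothetical).
    return x * x


def run(items, environ=os.environ):
    """Process items with no more workers than the job was granted."""
    workers = granted_slots(environ)
    if workers == 1:
        # Strictly serial: no child processes are started at all.
        return [square(x) for x in items]
    with Pool(processes=workers) as pool:
        return pool.map(square, items)
```

A job submitted without a parallel environment gets one slot, so this runs 
serially there and never occupies more cores than the scheduler assigned.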

-- Reuti


> On 30-Dec-12 20:47, Reuti wrote:
>> On 29.12.2012 at 07:56, Semi wrote:
>> 
>>> A user is running Python, which drives the load on 16-CPU nodes up to 64-66 (as shown by uptime).
>>> I suggested that he use qsub -l mem_free=4G, but it doesn't help.
>> How is the memory available right now in the machine related to the number 
>> of processes your application starts?
>> 
>> -- Reuti
>> 
>> 
>>> I would not like to change the general grid configuration.
>>> How can the user do this himself using qsub?
>>> _______________________________________________
>>> users mailing list
>>> [email protected]
>>> https://gridengine.org/mailman/listinfo/users
> 

