Ursula Winkler <[email protected]> writes: > Ursula Winkler wrote: >> Reuti wrote: >> >>> Am 11.04.2012 um 11:15 schrieb Ursula Winkler: >>> >>> >>>> Reuti wrote: >>>> >>>>> This could also be a problem of the MPI implementations. Which one do you >>>>> use - you use a plain mpiexec? >>>>> >>>>> -- Reuti >>>>> >>>>> >>>> I found that setting "MV2_ENABLE_AFFINITY=0" could be an option. But this >>>> should be set per default. Is this right? >>>> >>> I wouldn't bet on it. I found some sites where they suggest to set it to >>> zero. >>> >>> http://www.osc.edu/supercomputing/faq.shtml >>> >>> -- Reuti >>> >> >> Thanks. I told my users they should try it out. I hope it helps. >> > > Unfortunately it did not help. So, any ideas?
I don't know what's happening here in detail, but I can explain the general picture if it isn't documented for mvapich.

First of all, core binding matters for performance, particularly on NUMA systems, and you should _not_ leave it to the operating system. It sounds as if that's not the issue here, though: mvapich has just done the binding badly.

What should happen on nodes which run multiple jobs is that gridengine binds each job to specific cores (see -binding for qsub, e.g. http://arc.liv.ac.uk/SGE/htmlman/htmlman1/submit.html for up-to-date doc); there's a job script sketch at the end of this mail. As far as I know, you need the SGE from the site in my sig (http://arc.liv.ac.uk/SGE/) to get the behaviour where different numbers of cores can be bound on different hosts ("linear:slots"), if that matters. You also need that one, or another version based on the hwloc library, for binding to work properly on recent hardware or non-Linux kernels.

The gridengine binding (which gridengine keeps track of) separates jobs from each other, and it should be noticed by the MPI, which should then bind the individual processes to the cores it has been given. I don't know mvapich, but I know it uses hwloc, so it should be able to do this properly, like openmpi does (modulo issues with recent hardware, sigh). I thought mvapich would do the right thing automatically -- openmpi is often said to look bad performance-wise because it doesn't do core binding by default.

If your MPI jobs have exclusive access to the nodes, it's simpler: the MPI system can do the binding itself without worrying about what else is running (e.g. the old paffinity_alone setting in openmpi).
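For that exclusive-node case, the old openmpi invocation was roughly as below -- again only a sketch: mpi_paffinity_alone is the old parameter (newer openmpi versions spell their binding options differently), and $NSLOTS and ./a.out are placeholders:

    # whole node(s) to ourselves: let openmpi bind each rank itself
    mpiexec --mca mpi_paffinity_alone 1 -n $NSLOTS ./a.out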
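And to make the shared-node case concrete, the sort of job script I have in mind looks roughly like this. It's only a sketch: the PE name "mpi", the 16 slots and ./a.out are made up, "linear:slots" needs the SGE from my sig (stock 6.2u5 wants a fixed count such as linear:4), it assumes a tightly integrated PE so mpiexec finds its hosts, and whether MVAPICH2 then behaves sensibly with MV2_ENABLE_AFFINITY=0 is exactly the bit to check against its documentation:

    #!/bin/sh
    #$ -cwd
    #$ -pe mpi 16
    # Have gridengine pick, record and set specific cores for this job on
    # each host, so jobs sharing a node don't land on the same cores.
    #$ -binding linear:slots
    # Keep MVAPICH2 from overriding that with its own affinity; the ranks
    # then at least inherit the core mask gridengine set (ideally the MPI
    # would bind each rank within it).
    export MV2_ENABLE_AFFINITY=0
    mpiexec -n $NSLOTS ./a.out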
-- 
Community Grid Engine: http://arc.liv.ac.uk/SGE/