Reuti <[email protected]> writes:

> On 27.02.2013 at 20:56, Mikael Brandström Durling wrote:
>> <snip>
>>> 
>>> In case you look deeper into the issue, it's also worth noting that there 
>>> is no option to specify the target queue for `qrsh -inherit` in case you 
>>> get slots from different queues on the slave system:
>>> 
>>> https://arc.liv.ac.uk/trac/SGE/ticket/813
>> 
>> Ok. This could lead to incompatible changes to the -inherit behaviour, if 
>> the caller to `qrsh -inherit` has to specify the queue requested. On the 
>> other hand, I have seen cases where an OMPI job has been allotted slots from 
>> two different queues on an exec host, which has resulted in ompi launching 
>> two `qrsh -inherit` to the same host.

In my limited experience, you really don't want to split parallel jobs
across queues (and you only add queues if there's something you have to
hang off them).

I don't really understand what the complaint is here otherwise.  OMPI
with h_vmem enforced works reasonably well for us (with a single queue).

> This was a bug; it has been fixed as of Open MPI 1.5.5.
>
> https://svn.open-mpi.org/trac/ompi/changeset/26163
>
> Open MPI now always adds up all slots for a machine, even if they come 
> from different queues.

You'll still get potential confusion from different TMPDIRs, though.  I
never established whether there was any problem replacing the queue name
with the cell name in TMPDIR construction, but I have a patch lying
around to do it.
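
To illustrate the point, here is a rough sketch rather than anything
taken from the SGE or Open MPI sources. It assumes the usual
$PE_HOSTFILE layout (host, slots, queue instance, processor range) and
assumes TMPDIR is built as <base>/<job_id>.<task_id>.<queue_name>. The
host names, queue names, job id and the helper functions are all made
up for the example. Summing slots per host gives one entry per machine,
but the job still ends up with one TMPDIR per queue on that machine.

```python
from collections import defaultdict


def parse_pe_hostfile(text):
    """Return (host, slots, queue) tuples from pe_hostfile-style text.

    Assumed line layout: 'node01 4 short.q@node01 UNDEFINED'.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        host, slots, queue_instance = fields[0], int(fields[1]), fields[2]
        # The queue instance is 'queue@host'; keep only the queue name.
        entries.append((host, slots, queue_instance.split("@", 1)[0]))
    return entries


def slots_per_host(entries):
    """Add up all slots for each host, regardless of queue."""
    totals = defaultdict(int)
    for host, slots, _queue in entries:
        totals[host] += slots
    return dict(totals)


def tmpdirs_per_host(entries, job_id, task_id=1, base="/tmp"):
    """List the distinct TMPDIRs per host under the assumed naming scheme."""
    dirs = defaultdict(set)
    for host, _slots, queue in entries:
        dirs[host].add(f"{base}/{job_id}.{task_id}.{queue}")
    return {host: sorted(paths) for host, paths in dirs.items()}


# Hypothetical allocation: node01 gets slots from two queues.
SAMPLE = """\
node01 4 short.q@node01 UNDEFINED
node01 2 long.q@node01 UNDEFINED
node02 6 short.q@node02 UNDEFINED
"""

entries = parse_pe_hostfile(SAMPLE)
print(slots_per_host(entries))
print(tmpdirs_per_host(entries, job_id=4711))
```

With this sample, node01 is counted once with six slots, yet two
different TMPDIR paths exist for it. Using the cell name instead of the
queue name in the path would collapse them into one.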

>> I'll think of this and add it as a comment to the ticket. Is that
>> trac instance at arc.liv.ac.uk the best place, even though we are
>> running OGS? I suppose so?

I'd be happy to have reports that might improve SGE (if I or someone
else understands the issue), but I'm afraid I've been flamed for trying
to help OGS users.

-- 
Community Grid Engine:  http://arc.liv.ac.uk/SGE/

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users