On 11/15/2010 11:08 AM, Chris Jewell wrote:
Sorry, I am still trying to grok all of your email and what problem you
are trying to solve. So is the issue that you want two jobs with
processes on the same node to be able to bind their processes to different
resources? Like core 1 for the first job and cores 2 and 3 for the 2nd job?

--td
That's exactly it.  Each MPI process needs to be bound to 1 processor in a way 
that reflects GE's slot allocation scheme.

I actually don't think that I got it.  So you give two cases:

Case 1:

$ qsub -pe mpi 8 -binding pe linear:1 myScript.com

and my pe_hostfile looks like:

exec6.cluster.stats.local 2 batch.q@exec6.cluster.stats.local  0,1
exec1.cluster.stats.local 1 batch.q@exec1.cluster.stats.local  0,1
exec7.cluster.stats.local 1 batch.q@exec7.cluster.stats.local  0,1
exec5.cluster.stats.local 1 batch.q@exec5.cluster.stats.local  0,1
exec4.cluster.stats.local 1 batch.q@exec4.cluster.stats.local  0,1
exec3.cluster.stats.local 1 batch.q@exec3.cluster.stats.local  0,1
exec2.cluster.stats.local 1 batch.q@exec2.cluster.stats.local  0,1


Case 2:

Notice that, because I have specified the -binding pe linear:1, each execution 
node binds processes for the job_id to one core.  If I have -binding pe 
linear:2, I get:

exec6.cluster.stats.local 2 batch.q@exec6.cluster.stats.local  0,1:0,2
exec1.cluster.stats.local 1 batch.q@exec1.cluster.stats.local  0,1:0,2
exec7.cluster.stats.local 1 batch.q@exec7.cluster.stats.local  0,1:0,2
exec4.cluster.stats.local 1 batch.q@exec4.cluster.stats.local  0,1:0,2
exec3.cluster.stats.local 1 batch.q@exec3.cluster.stats.local  0,1:0,2
exec2.cluster.stats.local 1 batch.q@exec2.cluster.stats.local  0,1:0,2
exec5.cluster.stats.local 1 batch.q@exec5.cluster.stats.local  0,1:0,2

Is your complaint really the fact that exec6 has been allocated two slots but there seems to be only one slot's worth of resources allocated to it (i.e., in Case 1 exec6 has only 1 core and in Case 2 it has 2, where maybe you'd expect 2 and 4 cores allocated, respectively)?
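
To make the question concrete, here is a minimal Python sketch. It is not part of Grid Engine or Open MPI, and the helper names parse_pe_hostfile_line and underbound_hosts are made up for illustration. It assumes each pe_hostfile line has four whitespace-separated columns (host, slot count, queue, and a colon-separated list of socket,core pairs), and it assumes that "linear:N" is meant to give every slot N cores. It lists the hosts that were given fewer cores than that:

```python
def parse_pe_hostfile_line(line):
    """Split one pe_hostfile line into (host, slots, queue, cores).

    Assumed layout: host, slot count, queue, "socket,core[:socket,core...]".
    """
    host, slots, queue, binding = line.split()
    cores = [tuple(int(x) for x in pair.split(","))
             for pair in binding.split(":")]
    return host, int(slots), queue, cores


def underbound_hosts(lines, cores_per_slot):
    """Return (host, expected_cores, actual_cores) for each host that was
    given fewer cores than slots * cores_per_slot (an assumed expectation)."""
    short = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        host, slots, _queue, cores = parse_pe_hostfile_line(line)
        expected = slots * cores_per_slot
        if len(cores) < expected:
            short.append((host, expected, len(cores)))
    return short


# First two lines of Case 1 (-binding pe linear:1)
case1 = [
    "exec6.cluster.stats.local 2 batch.q@exec6.cluster.stats.local  0,1",
    "exec1.cluster.stats.local 1 batch.q@exec1.cluster.stats.local  0,1",
]
# First two lines of Case 2 (-binding pe linear:2)
case2 = [
    "exec6.cluster.stats.local 2 batch.q@exec6.cluster.stats.local  0,1:0,2",
    "exec1.cluster.stats.local 1 batch.q@exec1.cluster.stats.local  0,1:0,2",
]

print(underbound_hosts(case1, cores_per_slot=1))
print(underbound_hosts(case2, cores_per_slot=2))
```

Under those assumptions only exec6 is reported in both cases: 2 cores expected but 1 bound in Case 1, and 4 expected but 2 bound in Case 2. The single-slot hosts get what they asked for.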

--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>


