Re: [OMPI users] restricting a job to a set of hosts

2012-07-28 Thread Erik Nelson
Reuti,

> -nolocal is IMO an option where you want to execute the `mpirun` on your
> local login machine and want the MPI processes to be allocated somewhere
> in the cluster, in case you don't have any queuing system around to manage
> the resources.

Yes, this is exactly my understanding of the -nolocal option. Otherwise, by
specifying an 'image set' of processors, everything gets 'mapped' to some
subset of the processors in that image set.
Again, thanks for your response.



Re: [OMPI users] restricting a job to a set of hosts

2012-07-27 Thread Reuti
Am 27.07.2012 um 03:21 schrieb Ralph Castain:

> Application processes will *only* be placed on nodes included in the 
> allocation. The -nolocal flag is intended to ensure that no application 
> processes are started on the same node as mpirun in the case where that node 
> is included in the allocation. This happens, for example, with Torque, where 
> mpirun is executed on one of the allocated nodes.

But the behavior is the same in Torque and SGE. The jobscript is executed on 
one of the selected exechosts (neither the submit host nor the qmaster host, 
unless they are exechosts too), so that host is eligible to be used as well. 
In neither case should -nolocal be used.

-nolocal is IMO an option where you want to execute the `mpirun` on your local 
login machine and want the MPI processes to be allocated somewhere in the 
cluster, in case you don't have any queuing system around to manage the 
resources.

-- Reuti

> I believe SGE doesn't do that - and so the allocation won't include the 
> submit host, in which case you don't need -nolocal.

Re: [OMPI users] restricting a job to a set of hosts

2012-07-26 Thread Erik Nelson
I see. Thank you both for the prompt replies.


Re: [OMPI users] restricting a job to a set of hosts

2012-07-26 Thread Ralph Castain
Application processes will *only* be placed on nodes included in the 
allocation. The -nolocal flag is intended to ensure that no application 
processes are started on the same node as mpirun in the case where that node is 
included in the allocation. This happens, for example, with Torque, where 
mpirun is executed on one of the allocated nodes.

I believe SGE doesn't do that - and so the allocation won't include the submit 
host, in which case you don't need -nolocal.
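
As a rough illustration of the point above, a job script for SGE would simply leave -nolocal out. The sketch below assumes Open MPI was built with SGE support, so that mpirun picks up the granted hosts on its own. NSLOTS is the slot count SGE exports inside a parallel job; the fallback of 101 and the RUN switch are hypothetical additions so the script can be tried outside the queue.

```shell
#!/bin/sh
# Sketch of jobfile.job without -nolocal.
# NSLOTS is set by SGE inside a parallel job; the fallback of 101 only
# exists so this sketch can be run outside the queue.
: "${NSLOTS:=101}"

# No -nolocal: the node running this script is part of the allocation
# and should host MPI processes like any other granted node.
cmd="mpirun -np $NSLOTS ./executable"

# Dry run by default. RUN=1 is a hypothetical switch for a real launch.
if [ "${RUN:-0}" = "1" ]; then
    $cmd
else
    echo "$cmd"
fi
```

Run outside a job, this only prints the command it would execute, which makes it easy to confirm that the slot count follows the `-pe mpich 101` request rather than a hard-coded number.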





Re: [OMPI users] restricting a job to a set of hosts

2012-07-26 Thread Erik Nelson
I was under the impression that the -nolocal option keeps processes off the
submit host (since there may be hundreds or thousands of jobs submitted at
any time, and we don't want this host to be overloaded).

My understanding of what you said in your last email is that, by listing the
hosts, I automatically send all processes (parent and child, or master and
slave if you prefer) to the specified list of hosts.

Reading your email below, it looks like this was the correct understanding.




Re: [OMPI users] restricting a job to a set of hosts

2012-07-26 Thread Reuti
Am 26.07.2012 um 23:58 schrieb Erik Nelson:

> Reuti,
> 
> Thank you. Our queue is backed up, so it will take a little while before I 
> can try this. 
> 
> I assume that by specifying the nodes this way, I don't need (and it would 
> confuse 
> the system) to add -nolocal. In other words, qsub will try to put the parent 
> node 
> somewhere in this set. 
> 
> Is this the idea?

It depends on what you mean by "parent node". I assume you mean the submit 
host. This is never included in any selection SGE creates unless it's an 
execution host too.

The master host of the parallel job (i.e. the one where the jobscript with the 
`mpiexec` is running) will be used as a normal machine from MPI's point of view.
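
One way to see this from inside a job is to look at the host list SGE hands to the parallel environment. The following is only a sketch: it assumes the usual `$PE_HOSTFILE` layout (hostname, slots, queue, processor range), the sample hostnames and slot counts are made up, and `read_hostfile`, `total_slots` and `host_listed` are hypothetical helper names.

```shell
#!/bin/sh
# Fabricated sample in the assumed $PE_HOSTFILE layout:
# hostname, slots, queue instance, processor range.
sample_hostfile='compute-5-1 8 all.q@compute-5-1 UNDEFINED
compute-5-2 8 all.q@compute-5-2 UNDEFINED
compute-5-3 5 all.q@compute-5-3 UNDEFINED'

# Use the real file inside a job, otherwise fall back to the sample.
read_hostfile() {
    if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
        cat "$PE_HOSTFILE"
    else
        printf '%s\n' "$sample_hostfile"
    fi
}

# Sum the slot counts in column 2 of the list read from stdin.
total_slots() {
    awk '{ n += $2 } END { print n + 0 }'
}

# Succeed if the given hostname appears in column 1 of the list on stdin.
host_listed() {
    awk -v h="$1" '$1 == h { found = 1 } END { exit !found }'
}

echo "slots granted: $(read_hostfile | total_slots)"
if read_hostfile | host_listed "$(uname -n)"; then
    echo "this host is part of the allocation"
else
    echo "this host is not in the allocation (expected outside a job)"
fi
```

Inside a real job the comparison may need adjusting if the hostfile uses fully qualified names while `uname -n` returns short ones, or the other way round.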

-- Reuti






Re: [OMPI users] restricting a job to a set of hosts

2012-07-26 Thread Reuti
Am 26.07.2012 um 23:48 schrieb Reuti:

> Am 26.07.2012 um 23:33 schrieb Erik Nelson:
> 
>> I have a purely parallel job that runs ~100 processes. Each process has 
>> ~identical 
>> overhead so the speed of the program is dominated by the slowest processor.
>> 
>> For this reason, I would like to restrict the job to a specific set of 
>> identical (fast)
>> processors on our cluster.
>> 
>> I read the FAQ on -host and -hostfile, but it is still unclear to me what 
>> effect these directives will have in a queuing environment.
>> 
>> Currently, I submit the job using the "qsub" command in the "sge" 
>> environment as:
>> 
>>     qsub -pe mpich 101 jobfile.job
>> 
>> where jobfile contains the command
>> 
>>     mpirun -np 101 -nolocal ./executable
> 
> I would leave -nolocal out here.
> 
> $ qsub -l "h=compute-5-[1-9]|compute-5-1[0-9]|compute-5-2[0-9]|compute-5-3[0-2]" -pe mpich 101 jobfile.job

Or shorter:

$ qsub -l "h=compute-5*&(*-[1-9]|*-[1-2][0-9]|*-3[0-2])" -pe mpich 101 jobfile.job
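
Before submitting, it can help to check which hostnames such a range expression is meant to cover. The sketch below mirrors the four alternatives of the longer expression quoted above using plain shell glob patterns. It assumes the bracket ranges in the SGE request behave like shell globs here, which is worth confirming on your installation, and `in_fast_set` is a hypothetical helper name.

```shell
#!/bin/sh
# in_fast_set mirrors the four alternatives of the longer host expression
# with shell glob patterns; it does not query SGE at all.
in_fast_set() {
    case $1 in
        compute-5-[1-9]|compute-5-1[0-9]|compute-5-2[0-9]|compute-5-3[0-2])
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Count how many of compute-5-0 .. compute-5-40 fall inside the set.
matched=0
i=0
while [ "$i" -le 40 ]; do
    if in_fast_set "compute-5-$i"; then
        matched=$((matched + 1))
    fi
    i=$((i + 1))
done
echo "hosts matched in 0..40: $matched"
```

With these patterns, compute-5-1 through compute-5-32 are accepted, while compute-5-0 and compute-5-33 onward are rejected.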

-- Reuti






Re: [OMPI users] restricting a job to a set of hosts

2012-07-26 Thread Reuti
Am 26.07.2012 um 23:33 schrieb Erik Nelson:

> I have a purely parallel job that runs ~100 processes. Each process has 
> ~identical 
> overhead so the speed of the program is dominated by the slowest processor.
>  
> For this reason, I would like to restrict the job to a specific set of 
> identical (fast)
> processors on our cluster.
> 
> I read the FAQ on -host and -hostfile, but it is still unclear to me what 
> effect these directives will have in a queuing environment.
> 
> Currently, I submit the job using the "qsub" command in the "sge" environment 
> as :
> 
> qsub -pe mpich 101 jobfile.job
> 
> where jobfile contains the command
> 
> mpirun -np 101 -nolocal ./executable

I would leave -nolocal out here.

$ qsub -l "h=compute-5-[1-9]|compute-5-1[0-9]|compute-5-2[0-9]|compute-5-3[0-2]" -pe mpich 101 jobfile.job
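
If the pattern syntax feels error-prone, the same request can be spelled out as an explicit list of hostnames joined with `|`. This is only a sketch: `build_host_request` is a hypothetical helper, not part of SGE, and the script prints the qsub line instead of submitting anything.

```shell
#!/bin/sh
# Build "compute-5-1|compute-5-2|...|compute-5-32" for a hostname request.
# build_host_request is a hypothetical helper, not an SGE command.
build_host_request() {
    prefix=$1
    first=$2
    last=$3
    list=""
    i=$first
    while [ "$i" -le "$last" ]; do
        # Prepend a "|" separator for every entry except the first.
        list="${list:+$list|}${prefix}${i}"
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}

hosts=$(build_host_request compute-5- 1 32)

# Dry run: print the submission command rather than executing it.
echo "qsub -l \"h=$hosts\" -pe mpich 101 jobfile.job"
```

The explicit list is longer than the bracket expression but leaves no doubt about which 32 nodes are being requested.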

-- Reuti


> I would like to restrict the job to nodes compute-5-1 to compute-5-32 on our 
> machine, 
> each containing 8 cpu's (slots). How do I go about this?
> 
> Thanks, Erik
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




[OMPI users] restricting a job to a set of hosts

2012-07-26 Thread Erik Nelson
I have a purely parallel job that runs ~100 processes. Each process has
roughly identical overhead, so the speed of the program is dominated by the
slowest processor.

For this reason, I would like to restrict the job to a specific set of
identical (fast) processors on our cluster.

I read the FAQ on -host and -hostfile, but it is still unclear to me what
effect these directives will have in a queuing environment.

Currently, I submit the job using the "qsub" command in the "sge"
environment as:

    qsub -pe mpich 101 jobfile.job

where jobfile.job contains the command

    mpirun -np 101 -nolocal ./executable

I would like to restrict the job to nodes compute-5-1 to compute-5-32 on
our machine, each containing 8 CPUs (slots). How do I go about this?

Thanks, Erik

-- 
Erik Nelson

Howard Hughes Medical Institute
6001 Forest Park Blvd., Room ND10.124
Dallas, Texas 75235-9050

p : 214 645 5981
f : 214 645 5948