I was under the impression that the -nolocal option keeps processes off
the submit host (since there may be hundreds or thousands of jobs
submitted at any time, and we don't want this host to be overloaded).
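
One way to check this from inside a running job, rather than relying on
-nolocal, might be to compare the submit host against the hosts the
scheduler actually granted. The sketch below is only an illustration and
rests on a few assumptions: that SGE exports PE_HOSTFILE (one line per
granted host, hostname in the first column) and SGE_O_HOST (the submit
host) into the job environment, and that both use the same form of the
hostname. The function names are hypothetical helpers, not part of SGE
or Open MPI.

```shell
#!/bin/sh
# List the unique hosts named in an SGE PE hostfile.
# Assumption: the hostname is the first whitespace-separated field.
granted_hosts() {
    awk 'NF { print $1 }' "$1" | sort -u
}

# Succeed (exit status 0) if host $2 appears in hostfile $1.
# Names are compared exactly, so short and fully qualified names
# would have to be normalized first.
host_in_allocation() {
    granted_hosts "$1" | grep -Fxq -- "$2"
}

# Only report when both variables are present, i.e. inside a job.
if [ -n "${PE_HOSTFILE:-}" ] && [ -n "${SGE_O_HOST:-}" ]; then
    if host_in_allocation "$PE_HOSTFILE" "$SGE_O_HOST"; then
        echo "submit host $SGE_O_HOST is part of this job's allocation"
    else
        echo "submit host $SGE_O_HOST is not part of this job's allocation"
    fi
fi
```

Placed near the top of jobfile.job, this would just print one line to
the job's output file; outside of a job it prints nothing.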

My understanding of what you said in your last email is that, by listing
the hosts, I automatically send all processes (parent and child, or
master and slave if you prefer) to the specified list of hosts.

Reading your email below, it looks like this was the correct understanding.
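
As a sanity check on the host list itself, the alternation you suggested
in the quoted qsub line below can be tried out locally before the job
gets through the queue. This is a rough sketch only: it uses the shell's
own `case` globbing as a stand-in for SGE's matching (an assumption; I
have not verified that SGE treats every pattern identically), and the
function names are made up for the illustration.

```shell
#!/bin/sh
# Succeed if hostname $1 matches the alternation from the qsub -l h=... request.
# Shell `case` patterns are used here as a stand-in for SGE's matching.
in_requested_set() {
    case "$1" in
        compute-5-[1-9]|compute-5-1[0-9]|compute-5-2[0-9]|compute-5-3[0-2])
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Count how many of compute-5-0 .. compute-5-40 match the pattern.
count_matches() {
    n=0
    i=0
    while [ "$i" -le 40 ]; do
        if in_requested_set "compute-5-$i"; then
            n=$((n + 1))
        fi
        i=$((i + 1))
    done
    echo "$n"
}

matches=$(count_matches)
echo "matching hosts: $matches, slots at 8 per host: $((matches * 8))"
```

Under shell globbing this matches compute-5-1 through compute-5-32 and
nothing else in that range, i.e. 32 hosts and 256 slots at 8 slots per
host, which is more than the 101 slots the job requests.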


On Thu, Jul 26, 2012 at 5:20 PM, Reuti <re...@staff.uni-marburg.de> wrote:

> On 26.07.2012 at 23:58, Erik Nelson wrote:
>
> > Reuti,
> >
> > Thank you. Our queue is backed up, so it will take a little while
> > before I can try this.
> >
> > I assume that by specifying the nodes this way, I don't need (and it
> > would confuse the system) to add -nolocal. In other words, qsub will
> > try to put the parent node somewhere in this set.
> >
> > Is this the idea?
>
> Depends what you refer to by "parent node". I assume you mean the submit
> host. This is never included in any created selection of SGE unless it's an
> execution host too.
>
> The master host of the parallel job (i.e. the one where the jobscript with
> the `mpiexec` is running) will be used as a normal machine from MPI's point
> of view.
>
> -- Reuti
>
>
> > Erik
> >
> >
> > On Thu, Jul 26, 2012 at 4:48 PM, Reuti <re...@staff.uni-marburg.de> wrote:
> > On 26.07.2012 at 23:33, Erik Nelson wrote:
> >
> > > I have a purely parallel job that runs ~100 processes. Each process
> > > has ~identical overhead so the speed of the program is dominated by
> > > the slowest processor.
> > >
> > > For this reason, I would like to restrict the job to a specific set
> > > of identical (fast) processors on our cluster.
> > >
> > > I read the FAQ on -hosts and -hostfile, but it is still unclear to
> > > me what effect these directives will have in a queuing environment.
> > >
> > > Currently, I submit the job using the "qsub" command in the "sge"
> > > environment as:
> > >
> > >             qsub -pe mpich 101 jobfile.job
> > >
> > > where jobfile contains the command
> > >
> > >             mpirun -np 101 -nolocal ./executable
> >
> > I would leave -nolocal out here.
> >
> > $ qsub -l "h=compute-5-[1-9]|compute-5-1[0-9]|compute-5-2[0-9]|compute-5-3[0-2]" \
> >        -pe mpich 101 jobfile.job
> >
> > -- Reuti
> >
> >
> > > I would like to restrict the job to nodes compute-5-1 to compute-5-32
> > > on our machine, each containing 8 CPUs (slots). How do I go about this?
> > >
> > > Thanks, Erik
> > >
> > > --
> > > Erik Nelson
> > >
> > > Howard Hughes Medical Institute
> > > 6001 Forest Park Blvd., Room ND10.124
> > > Dallas, Texas 75235-9050
> > >
> > > p : 214 645 5981
> > > f : 214 645 5948
> > > _______________________________________________
> > > users mailing list
> > > us...@open-mpi.org
> > > http://www.open-mpi.org/mailman/listinfo.cgi/users



-- 
Erik Nelson

Howard Hughes Medical Institute
6001 Forest Park Blvd., Room ND10.124
Dallas, Texas 75235-9050

p : 214 645 5981
f : 214 645 5948
