On 05.06.2012 at 04:29, Rayson Ho wrote:

> On Mon, Jun 4, 2012 at 9:08 PM, Joseph Farran <[email protected]> wrote:
>> Telling OGE to use the "mpich" Parallel Environment requesting 64 slots
>> (cores).
>>
>> Is there a way of having a generic/default PE environment so one can simply
>> say:
>>
>> #$ -pe 64
>
> You can do something similar by defining a generic PE and you can then
> use a generic name.
>
> Keep in mind that a PE is a somewhat overloaded interface, and it
> defines a lot of things:
>
> http://gridscheduler.sourceforge.net/htmlman/htmlman5/sge_pe.html
>
> So if for example the mpich PE

In fact: Open MPI and MPICH2 can share the same PE nowadays. You can also use it for HP-MPI, but you would need to convert the list of machines therein to a format HP-MPI understands (recent versions can accept one in the MPICH1 format). This way, you could even reformat the list of machines into many other formats and create a couple of machine files in $TMPDIR. The user then just needs to pick the correct one for his application; you could call the file "hpmpi_machines" or the like. As $TMPDIR is removed by SGE after the job anyway, the bunch of generated machine files won't matter.

Such a generic PE won't work if daemons need to be started in the start/stop_proc_args (as for PVM, LAM-MPI or MPICH2 versions < 1.3). Nevertheless, you could use some kind of flag to select one of the startup mechanisms. Drawback: if you touch the PE scripts and mess them up, all parallel applications will fail. Therefore I would prefer one PE for each kind of such daemon startup (I say "would" because daemons are no longer used in recent versions).

> and a Hadoop PE both are tight (ie.
> OGS/GE can control the slave tasks, perform job control & accounting,
> etc), then you can in theory use a common PE for both. And while we
> are on this topic, if you want a tight PE, you will need to use qrsh
> to invoke remote tasks.

..., unless it is used already by the parallel library by default, as with Open MPI and MPICH2 if they detect that they run inside OGE.

> Also, a generic "threaded" PE can be used for OpenMP applications,
> Intel TBB programs, & user-threaded applications as the PE definition
> are likely to be very similar.
>
>
>
>> In our environment, there are *many* users with their own parallel type of
>> programs and I like to have a generic PE being the default, but I don't know
>> if you can get away with not specifying a PE name?
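
To illustrate the machine-file idea mentioned further up, here is a minimal sketch of what a start_proc_args script for a generic PE might do. It assumes the usual $PE_HOSTFILE layout of one "host slots queue processor-range" line per node. The function names and the output file names ("machines", "machines.host_slots") are made up for this example and are not part of SGE; which format a given MPI library wants should be checked against that library's documentation.

```shell
#!/bin/sh
# Hypothetical helper for a generic PE's start_proc_args.
# Function and file names are illustrative, not part of SGE.

# MPICH1-style machine file: one line per slot, host name repeated.
# $1 = SGE host file with lines "host slots queue processor-range"
pe_to_mpich1() {
    while read -r host slots _rest; do
        [ -n "$host" ] || continue      # skip empty lines
        i=0
        while [ "$i" -lt "$slots" ]; do
            echo "$host"
            i=$((i + 1))
        done
    done < "$1"
}

# Compact variant: one "host:slots" line per node.
pe_to_host_slots() {
    while read -r host slots _rest; do
        [ -n "$host" ] || continue
        echo "$host:$slots"
    done < "$1"
}

# Inside a job, SGE sets both variables; outside a job nothing is written.
if [ -n "$PE_HOSTFILE" ] && [ -n "$TMPDIR" ]; then
    pe_to_mpich1 "$PE_HOSTFILE" > "$TMPDIR/machines"
    pe_to_host_slots "$PE_HOSTFILE" > "$TMPDIR/machines.host_slots"
fi
```

The job script would then point its launcher at whichever file in $TMPDIR matches its MPI library. Since SGE removes $TMPDIR after the job, no cleanup step is needed.
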
>
> I haven't tried it myself, but looks like you can use a JSV or a qsub
> wrapper for it if you think you really don't want to specify a PE:
>
> http://gridscheduler.sourceforge.net/htmlman/htmlman1/jsv.html
>
> But I would probably tell my users that it is a required parameter and
> thus tell them not to be lazy!

Yes, but the PE requested by the user must already exist (even with a slot count of 0), as the `qsub` parameters are verified before *and* after the call to a JSV. (This could be a point of discussion: if you want to correct/change/adjust the resource requests in a JSV, OGE should verify the parameters only *after* the adjustment.)

-- Reuti

>> One other question. We have a cluster here running sge6.2 and it has a PE
>> for mpich with "Allocation Rule" set to "Round Robin". Should this not be
>> set to "Fill Up" to fill up all cores on the first node before continuing
>> with the next node? I am trying to understand if this was an
>> oversight when it was set up or if there is a reason for this?
>
> It depends on the MPI application - some applications perform better
> when some MPI tasks can talk to their peers via shared memory, while
> other MPI applications are better off when they can use more memory
> per node. So you may want to talk to the person who set up the cluster
> why he did it that way.
>
> BTW, we also have an archive of scheduler related blog entries at:
> http://wiki.gridengine.info/wiki/index.php/StephansBlog
>
> Rayson
>
>
>
>>
>> Thanks,
>> Joseph
>> _______________________________________________
>> users mailing list
>> [email protected]
>> https://gridengine.org/mailman/listinfo/users
