Got it. I just tried it out, and I think that might just do it. Thanks for your help.
> Subject: Re: [gridengine users] Creating an infiniband complex
> From: [email protected]
> Date: Wed, 18 Feb 2015 15:52:19 +0100
> CC: [email protected]
> To: [email protected]
>
> On 18.02.2015 at 15:35, Kevin Taylor <[email protected]> wrote:
>
> > I'm not sure if I'm getting the concept or not, but see if this is what you
> > mean.
> >
> > I have several parallel environments already to handle different software
> > application needs. One of the CFD environments uses hpmpi for its
> > processing (over IB), so we have an hpmpi PE set up for it.
> >
> > If I were to split that PE into something like hpmpi-ib1 and hpmpi-ib2, and
> > set up certain machines to use either hpmpi-ib1 or hpmpi-ib2, then when I
> > submit my job:
> >
> > qsub -q blah -pe "hpmpi-ib*" job.sh
> >
> > it'll just pick one?
> >
> > Now that I re-read what I wrote, it's pretty much the same thing you and
> > William are saying. Is there something else I forgot here?
>
> Besides the missing slot count in the above statement: no
>
> -- Reuti
>
> > > Subject: Re: [gridengine users] Creating an infiniband complex
> > > From: [email protected]
> > > Date: Wed, 18 Feb 2015 15:06:17 +0100
> > > CC: [email protected]
> > > To: [email protected]
> > >
> > > On 18.02.2015 at 13:13, Kevin Taylor <[email protected]> wrote:
> > >
> > > > I have several groups of machines that have infiniband on them and due
> > > > to history and physical locations, these groups of machines have
> > > > individual infiniband domains.
> > > >
> > > > What I've done right now (not in production) is create a boolean
> > > > complex for 'ib' and identify all of the nodes that contain infiniband.
> > > > I've also created a string complex called 'ibdomain' that has a name to
> > > > uniquely identify which systems connect to each other with IB.
> > > >
> > > > Is there a way that a user could just ask for 'ib' when submitting a
> > > > parallel job (I don't care where it goes as long as it has infiniband),
> > > > and have the grid engine tell the job the value of 'ibdomain'? Or keep
> > > > the job within systems on the same ibdomain?
> > >
> > > It should work to request one of the domains by specifying its name as a
> > > request with a specific string `qsub -l ibdomain=section2 ...`. But this
> > > may not be what you are looking for as you can't use a wildcard here (at
> > > least not with the effect to stay inside one domain).
> > >
> > > Instead of a complex which receives a certain number, it's easier to
> > > define one PE per domain with a suffix and then request like `qsub -pe
> > > "ib*" 16 ...` for any of the IB domains. Once a PE is selected, only
> > > slots belonging to this PE will be selected for this job.
> > >
> > > You can attach the PEs in a single queue by defining a list of PEs for
> > > individual nodes or per hostgroup (which might shorten the line).
> > >
> > > $ qconf -sq all.q
> > > ...
> > > pe_list make,[@ibhosts1=ib1 make],[@ibhosts2=ib2 make smp]
> > >
> > > (the default is used only for machines not listed further on in the
> > > list; it's not added to all automatically)
> > >
> > > -- Reuti
>
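For anyone reading this thread later: with one PE per IB domain, the job can work out which domain it landed in from the PE that was granted. Grid Engine sets PE and NSLOTS in the job's environment, so a job script along the following lines should be enough. This is only a sketch. The helper name ib_domain_from_pe is made up for illustration, and it assumes the PEs follow the hpmpi-ib1 / hpmpi-ib2 naming discussed in this thread, where the part after the last "-" identifies the domain.

```shell
#!/bin/sh
# job.sh: sketch of a job script that reports which IB domain it landed in.
# Grid Engine exports PE (name of the granted parallel environment) and
# NSLOTS (number of granted slots) into the job environment.
# Assumption: PE names look like hpmpi-ib1 / hpmpi-ib2, so the text after
# the last "-" names the IB domain. ib_domain_from_pe is a hypothetical helper.

ib_domain_from_pe() {
    # Strip everything up to and including the last "-" in the PE name,
    # e.g. hpmpi-ib2 becomes ib2. A name without "-" is returned unchanged.
    printf '%s\n' "${1##*-}"
}

# Fall back to placeholders when run outside of Grid Engine.
domain=$(ib_domain_from_pe "${PE:-unknown}")
echo "PE=${PE:-unknown} slots=${NSLOTS:-?} ibdomain=${domain}"
```

Submitted with the slot count included, e.g. `qsub -q blah -pe "hpmpi-ib*" 16 job.sh` (the 16 is just the figure from Reuti's example), the script would print the granted PE and the domain derived from it.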
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
