On 18.02.2015 at 13:13, Kevin Taylor <[email protected]> wrote:

> I have several groups of machines that have infiniband on them and due to
> history and physical locations, these groups of machines have individual
> infiniband domains.
>
> What I've done right now (not in production) is create a boolean complex for
> 'ib' and identify all of the nodes that contain infiniband. I've also created
> a string complex called 'ibdomain' that has a name to uniquely identify which
> systems connect to each other with IB.
>
> Is there a way that a user could just ask for 'ib' when submitting a parallel
> job (I don't care where it goes as long as it has infiniband), and have the
> grid engine tell the job the value of 'ibdomain'? Or keep the job within
> systems on the same ibdomain?

Requesting one of the domains by name with a plain string request should
work: `qsub -l ibdomain=section2 ...`. But this may not be what you are
looking for, as you can't use a wildcard here (at least not with the effect
of staying inside one domain).

Instead of a complex that is assigned a certain value, it's easier to define
one PE per domain, each with a distinguishing suffix, and then request any of
the IB domains with `qsub -pe "ib*" 16 ...`. Once a PE is selected, only
slots belonging to this PE will be used for the job.

You can attach the PEs to a single queue by defining a list of PEs for
individual nodes or per hostgroup (which might shorten the line):

$ qconf -sq all.q
...
pe_list               make,[@ibhosts1=ib1 make],[@ibhosts2=ib2 make smp]

(The default is used only for machines not listed later in the line; it is
not added to all of them automatically.)

-- Reuti
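
As for telling the job which domain it ended up in: Grid Engine exports the
name of the granted PE to a parallel job in the environment variable PE, so
the job script can read it from there. A minimal sketch, assuming the PEs
are named ib1, ib2, ... as in the pe_list example above; the helper
ib_domain_from_pe is a made-up name for illustration, not part of Grid
Engine:

```shell
#!/bin/sh
# Sketch of a job script for use with: qsub -pe "ib*" 16 job.sh
# Assumption: the PEs are named ib1, ib2, ... as in the pe_list example.

# Hypothetical helper: map a granted PE name to its IB domain suffix.
# Prints "none" for PEs that do not start with "ib" (e.g. make, smp).
ib_domain_from_pe() {
    case "$1" in
        ib?*) printf '%s\n' "${1#ib}" ;;
        *)    printf '%s\n' "none" ;;
    esac
}

# Grid Engine sets PE and NSLOTS in the environment of a parallel job;
# the fallbacks only keep the script usable outside of a job.
granted_pe="${PE:-}"
ib_domain=$(ib_domain_from_pe "$granted_pe")
echo "PE=${granted_pe:-unset} ibdomain=$ib_domain slots=${NSLOTS:-unknown}"
```

If the domain names are not simple suffixes, the case statement could map
each PE name to the matching 'ibdomain' string instead.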
