On 11 October 2011 23:33, Gerald Ragghianti <[email protected]> wrote:
>
>> Like the OP mentioned, one could use a consumable complex for 6.1. If you
>> add "complex_values network=16" to the queue, and "load_thresholds
>> network=15", it will be pushed to alarm state automatically and you can
>> avoid the load sensor. When you add a default consumption of 1, it works
>> out-of-the-box (it's only subtracted if it's attached to a queue).
>>
>> I.e. the other queue for normal jobs doesn't have it attached, and you
>> select the special multi-node queue by the requested PE.
>
> Unfortunately, I think there are two problems with this suggestion.
>
> 1. If I set network=16, then only 16 processors out of 48 will be usable
> by parallel jobs.
>
> 2. The use of a load threshold seems to prevent fill_up from working
> correctly, so even if I have network=48 for the queue complex and
> network=47 for the load threshold it will not use up all 48 slots before
> moving on to the next host. This seems to be due to the alarm state
> becoming active on the queues at inconsistent times during a single
> scheduling iteration. This would also affect the use of a custom load
> sensor, so I'm abandoning that idea.
>
> If we were to update to 6.2u5, what options would we then have?

My suggestion of a queue with an exclusive resource should work with 6.2u5.
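To make that concrete, here is a rough, untested sketch of the 6.2u5 exclusive
host access setup I have in mind (the complex name "exclusive", the PE name
"mpi", the host name node01 and the slot counts are just placeholders for your
own names):

    # qconf -mc -- add a consumable with the EXCL relational operator
    #name       shortcut  type  relop  requestable  consumable  default  urgency
    exclusive   excl      BOOL  EXCL   YES          YES         0        1000

    # qconf -me node01 -- offer the resource on each execution host
    complex_values    exclusive=true

    # a multi-node job then claims whole hosts, e.g. one 48-core node:
    qsub -pe mpi 48 -l exclusive=true job.sh

If I remember the 6.2u5 behaviour correctly, jobs that don't request
exclusive=true can still share a host as before; they just won't be scheduled
alongside a job that does, so the serial queue needs no changes.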
Assuming that each multi-node job running on a node consumes one context, you
should be able to generalise the solution by adding one such
multi-node/exclusive queue per context, plus a single queue for
serial/single-node PEs.
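Roughly, and with all queue and PE names made up (I haven't tried this on your
cluster), the layout I'm picturing is one multi-node queue and PE per context
on top of the host-level exclusive consumable sketched above, plus one ordinary
queue for everything else:

    # per-context multi-node queues, each with its own PE
    # (qconf -aq opens an editor; set pe_list, slots etc. there)
    qconf -aq multinode_ctx0.q    # pe_list mpi_ctx0
    qconf -aq multinode_ctx1.q    # pe_list mpi_ctx1, and so on per context

    # ordinary queue for serial / single-node PE jobs; nothing special attached
    qconf -aq serial.q

    # a multi-node job then selects both the context (via the PE) and
    # exclusive host access in one request:
    qsub -pe mpi_ctx1 32 -l exclusive=true job.sh

Since the serial queue has no extra complexes attached, plain jobs keep working
out of the box.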
