As the OP mentioned, one could use a consumable complex in 6.1. If you add
"complex_values network=16" to the queue along with "load_thresholds network=15",
the queue is pushed into alarm state automatically, so you can avoid the load
sensor. If you give the complex a default consumption of 1, it works out of the
box (the consumption is only subtracted where the complex is attached to a queue).

I.e., the other queues for normal jobs don't have it attached, and you select the
special multi-node queue via the requested PE.
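
For reference, the setup suggested above might look roughly like the fragment below. This is only a sketch: the complex name "network" and the values 16 and 15 come from the suggestion itself, while the shortcut "net", the INT type, the "<=" relop, and the requestable setting are assumptions chosen for illustration. The first part is a line in the complex list (as edited with "qconf -mc"), the second part is the corresponding pair of lines in the configuration of the multi-node queue (as edited with "qconf -mq").

```
# complex list entry (assumed shortcut, type, relop, and requestable setting)
#name     shortcut  type  relop  requestable  consumable  default  urgency
network   net       INT   <=     YES          YES         1        0

# multi-node queue configuration (values taken from the suggestion)
complex_values        network=16
load_thresholds       network=15
```

The queues for normal jobs would simply leave "network" out of their complex_values, so the default consumption is never subtracted there.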

Unfortunately, I think there are two problems with this suggestion.

1. If I set network=16, then only 16 processors out of 48 will be usable by parallel jobs.

2. The use of a load threshold seems to prevent fill_up from working correctly: even with network=48 for the queue complex and network=47 for the load threshold, the scheduler does not use up all 48 slots on a host before moving on to the next host. This seems to be because the alarm state becomes active on the queues at inconsistent times during a single scheduling iteration. A custom load sensor would be affected in the same way, so I'm abandoning that idea.

If we were to update to 6.2u5, what options would we then have?

--
Gerald Ragghianti

Office of Information Technology - High Performance Computing
Newton HPC Program http://newton.utk.edu/
The University of Tennessee, 2309 Kingston Pike, Knoxville, TN 37919
Phone: 865-974-2448

/-------------------------------------\
| One Contact       OIT: 865-974-9900 |
| Many Solutions         help.utk.edu |
\-------------------------------------/

