Hello,
I have searched the archive and Googled but cannot seem to find the answer to my particular problem. I have a 66-node blade cluster with InfiniBand. Each blade is dual socket, dual core, and there are 10 blades to a blade center. I am connecting 20 blades to each 24-port InfiniBand switch, so I have 4 IB switches: switches A, B, and C have 20 ports used on them, and switch D has 6 ports used. The remaining 4 ports connect back to a 5th IB switch as the concentrator. I am trying to define a policy such that jobs of 20 nodes or fewer will stick to a single IB switch, but partially used switches will still be available for scheduling. So I have put the following into my maui.cfg:

NODESETPOLICY ONEOF
NODESETPRIORITYTYPE BESTRESOURCE
NODESETATTRIBUTE FEATURE
NODESETDELAY 0
NODESETLIST switchA switchB switchC switchD

I am using Torque as my resource manager, and switchA, switchB, switchC, and switchD are set up in Torque's "nodes" file as features. If I submit three 16-node jobs, the jobs go to nodes 0-15 (switchA), nodes 20-35 (switchB), and nodes 40-55 (switchC), which is exactly what I want to happen. However, when I submit a 4th 16-node job, it sits in the queue until one of the first 3 finishes, even though there are still 18 nodes free.
        Looking through the Maui log file I see the following:

03/29 13:40:30 MJobSelectResourceSet(35201,1,1,SetList,NodeList,66)
03/29 13:40:30 INFO:     set[0] 0 0
03/29 13:40:30 INFO:     set[1] 1 80
03/29 13:40:30 INFO:     set[2] 2 80
03/29 13:40:30 INFO:     set[3] 3 80
03/29 13:40:30 INFO:     set[4] 4 24
03/29 13:40:30 INFO: 240 feasible tasks found for job 35201:0 in partition DEFAULT (64 Needed)
03/29 13:40:30 MJobGetINL(35201,FNL,INL,DEFAULT,NodeCount,TaskCount)

To me this says it is seeing the 4 entries from my node set list (4 processors per blade * 20 blades per switch = 80). However, it is only getting "240 feasible tasks found" instead of 264 (4 processors * 66 nodes), which means it isn't picking up the 24 processors available on switchD. Further down in the log file I see:

03/29 13:40:30 INFO: idle resources (48 tasks/12 nodes) found with feasible list specified
03/29 13:40:30 INFO: insufficient idle tasks in partition DEFAULT for 35201:0: (48 of 64 available)
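
To double-check that these numbers match what I expect, here is a small Python sketch of the arithmetic. It is only my own illustration, not anything Maui runs; the dictionary and function names are made up, and the idle-node counts assume the three 16-node jobs landed on switches A, B, and C as described above.

```python
# Hypothetical illustration only; none of these names come from Maui.
TASKS_PER_NODE = 4  # dual socket, dual core blades

# Nodes still idle on each switch while three 16-node jobs are running
# on switchA, switchB, and switchC (assumed from the placement above).
idle_nodes = {"switchA": 4, "switchB": 4, "switchC": 4, "switchD": 6}


def idle_tasks(switches):
    """Sum the idle tasks (processors) over the given switches."""
    return sum(idle_nodes[name] * TASKS_PER_NODE for name in switches)


needed = 16 * TASKS_PER_NODE                                # 16-node job
without_d = idle_tasks(["switchA", "switchB", "switchC"])   # what the log reports
with_d = idle_tasks(idle_nodes)                             # what I expected

print(f"needed={needed} without_d={without_d} with_d={with_d}")
print("fits without switchD:", without_d >= needed)
print("fits with switchD:   ", with_d >= needed)
```

The "48 of 64 available" in the log matches the case where only the leftover nodes on A, B, and C are counted. If the 6 idle nodes on switchD were counted as well, there would be enough idle tasks for the job.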

The other thing is: if I submit jobs of 16, 16, 16, and 12 nodes, they all run immediately. If I submit 16, 16, 16, 12, and 6, they also all run immediately. For some reason "switchD" isn't getting lumped in with the other switches as available nodes for the job. I am guessing it's a weighting issue because of the difference between "80" processors on A, B, C and "24" on D:

03/29 13:40:30 INFO:     set[1] 1 80
03/29 13:40:30 INFO:     set[2] 2 80
03/29 13:40:30 INFO:     set[3] 3 80
03/29 13:40:30 INFO:     set[4] 4 24
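
To spell out how the per-set numbers relate to the "240 feasible tasks" line, here is a short Python sketch. Again, this is just my own arithmetic with made-up names, assuming 4 processors per blade and the blade counts per switch given above.

```python
# Hypothetical illustration only; none of these names come from Maui.
TASKS_PER_NODE = 4  # dual socket, dual core blades

# Blades attached to each switch, as described in this post.
nodes_per_switch = {"switchA": 20, "switchB": 20, "switchC": 20, "switchD": 6}

# Per-set processor counts; these should match the set[1]..set[4] log lines.
capacity = {name: n * TASKS_PER_NODE for name, n in nodes_per_switch.items()}

total = sum(capacity.values())
reported_feasible = 240  # taken from the "feasible tasks found" log line
missing = total - reported_feasible

print(capacity)
print(f"total={total} reported={reported_feasible} missing={missing}")
print("missing equals switchD capacity:", missing == capacity["switchD"])
```

So the gap between the full cluster and the reported feasible tasks is exactly the size of switchD.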

Does anyone have any suggestions on how I would correctly compensate for this in maui.cfg?

        Thanks,
                -Brad Viviano
_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers