BTW, I just tried upgrading to maui-3.3.1 and I still have the same issue. Maui 
segfaults when I try to start the maui process with this one job in the queue.
--
Steven DuChene

From: DuChene, StevenX A 
Sent: Monday, November 28, 2011 4:05 PM
To: [email protected]
Subject: maui segfaults trying to schedule a job

This morning I discovered that the maui scheduler process was not running on 
one of our clusters as it should be. When I try to start the maui process as 
the maui user, I get a segmentation fault. The last few entries in the log 
file look like this:

11/28 15:45:24 MQueueSelectAllJobs(Q,HARD,ALL,JIList,DP,Msg)
11/28 15:45:24 INFO:     job '231' Priority:      605
11/28 15:45:24 INFO:     Cred:      0(00.0)  FS:      0(00.0)  Attr:      0(00.0)  Serv:    605(00.0)  Targ:      0(00.0)  Res:      0(00.0)  Us:      0(00.0)
11/28 15:45:24 MStatClearUsage([NONE],Active)
11/28 15:45:24 INFO:     total jobs selected (ALL): 1/1
11/28 15:45:24 MQueueSelectAllJobs(Q,SOFT,ALL,JIList,DP,Msg)
11/28 15:45:24 INFO:     job '231' Priority:      605
11/28 15:45:24 INFO:     Cred:      0(00.0)  FS:      0(00.0)  Attr:      0(00.0)  Serv:    605(00.0)  Targ:      0(00.0)  Res:      0(00.0)  Us:      0(00.0)
11/28 15:45:24 MStatClearUsage([NONE],Idle)
11/28 15:45:24 INFO:     total jobs selected (ALL): 1/1
11/28 15:45:24 MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,FALSE)
11/28 15:45:24 INFO:     total jobs selected in partition ALL: 1/1
11/28 15:45:24 MQueueScheduleRJobs(Q)
11/28 15:45:24 MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
11/28 15:45:24 INFO:     total jobs selected in partition ALL: 1/1
11/28 15:45:24 MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,DEFAULT,FReason,TRUE)
11/28 15:45:24 INFO:     total jobs selected in partition DEFAULT: 1/1
11/28 15:45:24 MQueueScheduleIJobs(Q,DEFAULT)
11/28 15:45:24 INFO:     156 feasible tasks found for job 231:0 in partition DEFAULT (39 Needed)
11/28 15:45:24 INFO:     156 feasible tasks found for job 231:1 in partition DEFAULT (39 Needed)
11/28 15:45:24 INFO:     156 feasible tasks found for job 231:2 in partition DEFAULT (39 Needed)
11/28 15:45:24 INFO:     156 feasible tasks found for job 231:3 in partition DEFAULT (39 Needed)
11/28 15:45:24 INFO:     156 feasible tasks found for job 231:4 in partition DEFAULT (16 Needed)
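
The "feasible tasks" lines above can be tallied with a small script. This is 
only a sketch and not part of Maui: the function name summarize_feasible and 
the regular expression are illustrative, and it assumes the :0 through :4 
suffixes are separate task requirements of the same job whose "Needed" counts 
can be added together.

```python
import re

# Matches entries such as:
#   156 feasible tasks found for job 231:0 in partition DEFAULT (39 Needed)
PATTERN = re.compile(
    r"(\d+) feasible tasks found for job (\d+):(\d+) "
    r"in partition (\S+) \((\d+) Needed\)"
)


def summarize_feasible(log_text):
    """Return {job_id: {"feasible": largest feasible count, "needed": summed needed}}."""
    # Mail clients hard-wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    summary = {}
    for feasible, job, _req, _partition, needed in PATTERN.findall(flat):
        entry = summary.setdefault(job, {"feasible": 0, "needed": 0})
        entry["feasible"] = max(entry["feasible"], int(feasible))
        # Assumption: per-requirement "Needed" values are additive for one job.
        entry["needed"] += int(needed)
    return summary
```

Run on the excerpt above, this gives 156 feasible tasks against 
39+39+39+39+16 = 172 needed for job 231. Whether Maui compares the numbers 
this way, and whether the difference has anything to do with the segfault, 
is not established by the log.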

Prior to the above entries there are a WHOLE BUNCH of entries similar to the 
ones shown below:

11/28 15:45:24 MUGetIndex(TJC,ValList,0)
11/28 15:45:24 MUGetIndex(TNJA,ValList,0)
11/28 15:45:24 MUGetIndex(TNJC,ValList,0)
11/28 15:45:24 MUGetIndex(TNXF,ValList,0)
11/28 15:45:24 MUGetIndex(TPSD,ValList,0)
11/28 15:45:24 MUGetIndex(TPSE,ValList,0)
11/28 15:45:24 MUGetIndex(TPSR,ValList,0)
11/28 15:45:24 MUGetIndex(TPSU,ValList,0)
11/28 15:45:24 MUGetIndex(TQM,ValList,0)
11/28 15:45:24 MUGetIndex(TQT,ValList,0)
11/28 15:45:24 MUGetIndex(TRT,ValList,0)
11/28 15:45:24 MUGetIndex(TXF,ValList,0)

There is only this one job in the queue, on a 256-node cluster running torque 
2.5.7 and maui 3.2.6p21.

I have tried starting the maui process within strace but I do not see any 
smoking gun in that strace output.

I can probably get maui to start if I qdel the job but I was sort of hoping to 
see what was causing the problem in case any additional debugging output was 
needed.
--
Steven DuChene
_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
