>Do you mean why isn't the job running, even though it seems that it *should*
>be running?
Exactly...

>If so, I would say post the output of qstat -f for the job, and checkjob -v

mahmood@srv1:~$ qstat -f 49153
Job Id: 49153.srv1
    Job_Name = bwaves
    Job_Owner = mahmood@srv1
    job_state = Q
    queue = Long
    server = srv1
    Checkpoint = u
    ctime = Mon Sep 12 19:55:29 2011
    Error_Path = srv1:/home/mahmood/multi2sim-3.0.3/410.bwave
        s/bwaves.e49153
    Hold_Types = n
    Join_Path = oe
    Keep_Files = n
    Mail_Points = a
    mtime = Mon Sep 12 19:55:29 2011
    Output_Path = srv1:/home/mahmood/multi2sim-3.0.3/410.bwav
        es/bwaves_128.out
    Priority = 0
    qtime = Mon Sep 12 19:55:29 2011
    Rerunable = True
    Resource_List.nodect = 1
    Resource_List.nodes = node2
    Resource_List.walltime = 960:00:00
    Variable_List = PBS_O_QUEUE=Long,PBS_O_HOME=/home/mahmood, ...
    etime = Mon Sep 12 19:55:29 2011
    submit_args = tor
    fault_tolerant = False

mahmood@srv1:~$ checkjob -v 49153
checking job 49153 (RM job '49153.srv1')

State: Idle
Creds: user:mahmood group:mahmood class:Long qos:DEFAULT
WallTime: 00:00:00 of 40:00:00:00
SubmitTime: Mon Sep 12 19:55:29
  (Time Queued Total: 00:39:24 Eligible: 00:39:24)

Total Tasks: 1

Req[0] TaskCount: 1 Partition: ALL
Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 0
Opsys: [NONE] Arch: [NONE] Features: [NONE]
Exec: '' ExecSize: 0 ImageSize: 0
Dedicated Resources Per Task: PROCS: 1
NodeAccess: SHARED
NodeCount: 0

IWD: [NONE] Executable: [NONE]
Bypass: 3 StartCount: 0
PartitionMask: [ALL]
Flags: HOSTLIST RESTARTABLE
HostList:
  [node2:1]
PE: 1.00 StartPriority: 147
job can run in partition DEFAULT (8 procs available. 1 procs required)

>which you seem to have manually selected in your qsub statement

Yes, as you can see, I requested node2:

    Resource_List.nodes = node2

and the output of "pbsnodes -l all" shows that this node is free:

mahmood@srv1:~$ pbsnodes -l all
srv1                 job-exclusive
node2                free
node3                job-exclusive
node4                free

Any idea about that?
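The cross-check done by eye above (which host the job requests versus what state pbsnodes reports for it) could be scripted. What follows is only a rough sketch in Python, not part of TORQUE or Maui: the function names parse_node_states, requested_hosts and report are made up for illustration, the sample text is copied from the outputs in this message, and it assumes Resource_List.nodes holds host names (optionally with ":ppn=..." and joined by "+") rather than a plain node count. It only compares the two TORQUE-side outputs; it does not explain why Maui leaves the job idle.

```python
# Sample text copied from the outputs quoted in this message.
PBSNODES_OUTPUT = """\
srv1                 job-exclusive
node2                free
node3                job-exclusive
node4                free
"""

QSTAT_OUTPUT = """\
Job Id: 49153.srv1
    job_state = Q
    Resource_List.nodect = 1
    Resource_List.nodes = node2
"""


def parse_node_states(pbsnodes_text):
    """Map node name -> state from `pbsnodes -l all` style text."""
    states = {}
    for line in pbsnodes_text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            states[parts[0]] = parts[1]
    return states


def requested_hosts(qstat_text):
    """Return host names from the Resource_List.nodes line of `qstat -f` text.

    Assumption: the value is a host list such as "node2" or
    "node2:ppn=1+node3", not a bare node count.
    """
    for line in qstat_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "Resource_List.nodes":
            return [spec.split(":")[0] for spec in value.strip().split("+")]
    return []


def report(qstat_text, pbsnodes_text):
    """Map each requested host to its reported state ("unknown" if absent)."""
    states = parse_node_states(pbsnodes_text)
    return {host: states.get(host, "unknown")
            for host in requested_hosts(qstat_text)}


if __name__ == "__main__":
    print(report(QSTAT_OUTPUT, PBSNODES_OUTPUT))  # {'node2': 'free'}
```

In practice the two strings would come from running qstat -f and pbsnodes -l all on the server (for example through subprocess) instead of being pasted in.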

// Naderan *Mahmood;

----- Original Message -----
From: Steve Crusan
To: Mahmood Naderan
Cc: maui
Sent: Monday, September 12, 2011 6:17 PM
Subject: Re: [Mauiusers] Job is in 'Q' but checkjob shows it is running (!)

On Sep 12, 2011, at 5:01 AM, Mahmood Naderan wrote:

> Hi,
> I sent this email to torque mailing list but seems that it is related to
> maui. So I restate the problem here.
>
> Can someone explain why the qstat shows a job in "Q" but checkjob says
> everything is normal?

Looking below, the job is queued in TORQUE, and idle in Maui (not running),
so everything is normal.

Do you mean why isn't the job running, even though it seems that it *should*
be running?

If so, I would say post the output of qstat -f for the job, and checkjob -v.
This seems to be more or less a scheduler configuration, or possibly an issue
with the node (which you seem to have manually selected in your qsub
statement).

> mahmood@srv1:416.gamess$ qstat 49003
> Job id                    Name             User            Time Use S Queue
> ------------------------- ---------------- --------------- -------- - -----
> 49003.srv1                gamess           mahmood                0 Q Long
>
>
> mahmood@srv1:416.gamess$ checkjob 49003
> checking job 49003
>
> State: Idle
> Creds: user:mahmood group:mahmood class:Long qos:DEFAULT
> WallTime: 00:00:00 of 40:00:00:00
> SubmitTime: Sun Sep 11 09:51:26
>   (Time Queued Total: 00:02:36 Eligible: 00:02:36)
>
> Total Tasks: 1
>
> Req[0] TaskCount: 1 Partition: ALL
> Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 0
> Opsys: [NONE] Arch: [NONE] Features: [NONE]
>
>
> IWD: [NONE] Executable: [NONE]
> Bypass: 0 StartCount: 0
> PartitionMask: [ALL]
> Flags: HOSTLIST RESTARTABLE
> HostList:
>   [hawk:1]
> PE: 1.00 StartPriority: 129
> job can run in partition DEFAULT (3 procs available. 1 procs required)
>
> Thanks
> // Naderan *Mahmood;
>
> _______________________________________________
> mauiusers mailing list
> http://www.supercluster.org/mailman/listinfo/mauiusers

----------------------
Steve Crusan
System Administrator
Center for Research Computing
University of Rochester
https://www.crc.rochester.edu/
