> 07/07/2009 17:42:34;0010;PBS_Server;Job;178.tornado.soe.cranfield.ac.uk;Exit_status=-2
> 07/07/2009 17:42:34;000d;PBS_Server;Job;178.tornado.soe.cranfield.ac.uk;Post job file processing error; job 178.tornado.soe.cranfield.ac.uk on host node8.soe.cranfield.ac.uk/0+node7.soe.cranfield.ac.uk/0+node6.soe.cranfield.ac.uk/0+node5.soe.cranfield.ac.uk/0+node4.soe.cranfield.ac.uk/0+node3.soe.cranfield.ac.uk/0+node2.soe.cranfield.ac.uk/0+node1.soe.cranfield.ac.uk/0
> 07/07/2009 17:42:34;0100;PBS_Server;Job;178.tornado.soe.cranfield.ac.uk;dequeuing from batch, state EXITING

This is the key set of lines here: "Post job file processing error." Check the syslog on this node and see if there is anything about pbs_mom having problems copying, moving, or renaming the files it was trying to process.

--Joe

________________________________
From: Jan Kowalik [mailto:kowalik....@gmail.com]
Sent: Tue 7/7/2009 1:06 PM
To: oscar-users@lists.sourceforge.net
Subject: Re: [Oscar-users] qsub problem

> It would help a lot if you can tell us which version of OSCAR you are using
> and which Linux distro you installed.

CentOS 4.3 and OSCAR 4.2, but I am not sure. How can I check it?

I have run my simple job again and checked the logs. This is what I got:

$ echo "sleep 60;date" | /opt/pbs/bin/qsub
178.tornado.soe.cranfield.ac.uk
$

=======================================================================
$ cat /var/spool/pbs/server_logs/20090707 | grep "07/07/2009 17:42"
...
07/07/2009 17:42:34;0008;PBS_Server;Job;178.tornado.soe.cranfield.ac.uk;Job Modified at request of r...@tornado.soe.cranfield.ac.uk
07/07/2009 17:42:34;0100;PBS_Server;Req;;Type RunJob request received from r...@tornado.soe.cranfield.ac.uk, sock=9
07/07/2009 17:42:34;0008;PBS_Server;Job;178.tornado.soe.cranfield.ac.uk;Job Run at request of r...@tornado.soe.cranfield.ac.uk
07/07/2009 17:42:34;0100;PBS_Server;Req;;Type JobObituary request received from pbs_...@node8.soe.cranfield.ac.uk, sock=11
07/07/2009 17:42:34;0010;PBS_Server;Job;178.tornado.soe.cranfield.ac.uk;Exit_status=-2
07/07/2009 17:42:34;000d;PBS_Server;Job;178.tornado.soe.cranfield.ac.uk;Post job file processing error; job 178.tornado.soe.cranfield.ac.uk on host node8.soe.cranfield.ac.uk/0+node7.soe.cranfield.ac.uk/0+node6.soe.cranfield.ac.uk/0+node5.soe.cranfield.ac.uk/0+node4.soe.cranfield.ac.uk/0+node3.soe.cranfield.ac.uk/0+node2.soe.cranfield.ac.uk/0+node1.soe.cranfield.ac.uk/0
07/07/2009 17:42:34;0100;PBS_Server;Job;178.tornado.soe.cranfield.ac.uk;dequeuing from batch, state EXITING
07/07/2009 17:42:34;0040;PBS_Server;Svr;tornado.soe.cranfield.ac.uk;Scheduler sent command term
07/07/2009 17:42:35;0100;PBS_Server;Req;;Type StatusNode request received from r...@tornado.soe.cranfield.ac.uk, sock=9
07/07/2009 17:42:35;0100;PBS_Server;Req;;Type StatusQueue request received from r...@tornado.soe.cranfield.ac.uk, sock=9
...
=======================================================================

=======================================================================
$ cat /var/spool/pbs/server_priv/accounting/20090707 | grep "178.tornado"
07/07/2009 17:42:33;Q;178.tornado.soe.cranfield.ac.uk;queue=batch
07/07/2009 17:42:34;S;178.tornado.soe.cranfield.ac.uk;user=janek group=janek jobname=STDIN queue=batch ctime=1246984953 qtime=1246984953 etime=1246984953 start=1246984954 exec_host=node8.soe.cranfield.ac.uk/0+node7.soe.cranfield.ac.uk/0+node6.soe.cranfield.ac.uk/0+node5.soe.cranfield.ac.uk/0+node4.soe.cranfield.ac.uk/0+node3.soe.cranfield.ac.uk/0+node2.soe.cranfield.ac.uk/0+node1.soe.cranfield.ac.uk/0 Resource_List.neednodes=node8.soe.cranfield.ac.uk+node7.soe.cranfield.ac.uk+node6.soe.cranfield.ac.uk+node5.soe.cranfield.ac.uk+node4.soe.cranfield.ac.uk+node3.soe.cranfield.ac.uk+node2.soe.cranfield.ac.uk+node1.soe.cranfield.ac.uk Resource_List.nodes=8
07/07/2009 17:42:34;E;178.tornado.soe.cranfield.ac.uk;user=janek group=janek jobname=STDIN queue=batch ctime=1246984953 qtime=1246984953 etime=1246984953 start=1246984954 exec_host=node8.soe.cranfield.ac.uk/0+node7.soe.cranfield.ac.uk/0+node6.soe.cranfield.ac.uk/0+node5.soe.cranfield.ac.uk/0+node4.soe.cranfield.ac.uk/0+node3.soe.cranfield.ac.uk/0+node2.soe.cranfield.ac.uk/0+node1.soe.cranfield.ac.uk/0 Resource_List.neednodes=node8.soe.cranfield.ac.uk+node7.soe.cranfield.ac.uk+node6.soe.cranfield.ac.uk+node5.soe.cranfield.ac.uk+node4.soe.cranfield.ac.uk+node3.soe.cranfield.ac.uk+node2.soe.cranfield.ac.uk+node1.soe.cranfield.ac.uk Resource_List.nodes=8 session=0 end=1246984954 Exit_status=-2
=======================================================================

Can you see anything in it that would help solve the issue?

Regards
--
Jan

------------------------------------------------------------------------------
Enter the BlackBerry Developer Challenge
This is your chance to win up to $100,000 in prizes! For a limited time,
vendors submitting new applications to BlackBerry App World(TM) will have
the opportunity to enter the BlackBerry Developer Challenge. See full prize
details at: http://p.sf.net/sfu/blackberry
_______________________________________________
Oscar-users mailing list
Oscar-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oscar-users
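
As a rough sketch of the checks discussed in this thread, the two shell helpers below pull a job's lines out of a PBS log file and read the Exit_status from its end ("E") record in an accounting file. The function names job_lines and job_exit_status are made up for illustration. The mom_logs and /var/log/messages paths in the usage comments are assumptions about a default layout and were not confirmed by anyone in the thread.

```shell
#!/bin/sh

# Print every line of a PBS log file that mentions the given job ID.
# Usage: job_lines JOBID LOGFILE
job_lines() {
    # -F treats the job ID as a fixed string, so its dots are not wildcards.
    grep -F -- "$1" "$2"
}

# Print the Exit_status value from a job's end ("E") record in a PBS
# accounting file. Records look like: date;type;jobid;key=value ...
# Usage: job_exit_status JOBID ACCOUNTING_FILE
job_exit_status() {
    awk -F';' -v id="$1" '
        $2 == "E" && $3 == id {
            # "Exit_status=" is 12 characters; print only what follows it.
            if (match($4, /Exit_status=-?[0-9]+/))
                print substr($4, RSTART + 12, RLENGTH - 12)
        }' "$2"
}

# Example use on the head node (paths as quoted earlier in the thread):
#   job_exit_status 178.tornado.soe.cranfield.ac.uk \
#       /var/spool/pbs/server_priv/accounting/20090707
#
# Example use on node8 (ASSUMED paths, adjust to the actual install):
#   job_lines 178.tornado /var/spool/pbs/mom_logs/20090707
#   grep pbs_mom /var/log/messages
```

Running job_lines against the pbs_mom log on the node that sent the JobObituary request keeps the output limited to the failing job, which makes any file copy or rename error easier to spot than in the full log.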