> 07/07/2009 17:42:34;0010;PBS_Server;Job;178.tornado.soe.cranfield.ac.uk;Exit_status=-2
> 07/07/2009 17:42:34;000d;PBS_Server;Job;178.tornado.soe.cranfield.ac.uk;Post job file processing error; job 178.tornado.soe.cranfield.ac.uk on host node8.soe.cranfield.ac.uk/0+node7.soe.cranfield.ac.uk/0+node6.soe.cranfield.ac.uk/0+node5.soe.cranfield.ac.uk/0+node4.soe.cranfield.ac.uk/0+node3.soe.cranfield.ac.uk/0+node2.soe.cranfield.ac.uk/0+node1.soe.cranfield.ac.uk/0
> 07/07/2009 17:42:34;0100;PBS_Server;Job;178.tornado.soe.cranfield.ac.uk;dequeuing from batch, state EXITING

This is the key set of lines here: "Post job file processing error."

Check the syslog on the first node in that host list (node8) and see whether
it says anything about pbs_mom having problems copying, moving, or renaming
the files it was trying to process.
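
A minimal sketch of that check, assuming syslog goes to /var/log/messages (the usual CentOS location) and that pbs_mom tags its syslog lines with "pbs_mom". The function name mom_syslog is made up for illustration and is not part of PBS or OSCAR. Run it on the node itself:

```shell
#!/bin/sh
# mom_syslog is a hypothetical helper, not part of PBS or OSCAR.
# Usage: mom_syslog LOGFILE JOBID
# Prints the lines in LOGFILE that mention both pbs_mom and the given job.
mom_syslog() {
    logfile=$1
    jobid=$2
    # Keep only pbs_mom lines, then match the job id as a fixed string
    # so the dots in the id are not treated as regex wildcards.
    grep 'pbs_mom' "$logfile" | grep -F "$jobid"
}

# Example call (the path is an assumption; adjust it to the node's syslog setup):
# mom_syslog /var/log/messages 178.tornado
```

If nothing matches, the function prints nothing; in that case it may be worth looking at all pbs_mom lines around 17:42 rather than only those naming the job.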
 
--Joe

________________________________

From: Jan Kowalik [mailto:kowalik....@gmail.com]
Sent: Tue 7/7/2009 1:06 PM
To: oscar-users@lists.sourceforge.net
Subject: Re: [Oscar-users] qsub problem



> It would help a lot if you can tell us which version of OSCAR you are using 
> and which Linux distro you installed.
CentOS 4.3
OSCAR 4.2, but I am not sure. How can I check it?


I have run my simple job again and checked logs, this is what I've got:

$ echo "sleep 60;date" | /opt/pbs/bin/qsub
178.tornado.soe.cranfield.ac.uk
$


=======================================================================
$ cat /var/spool/pbs/server_logs/20090707 | grep "07/07/2009 17:42"
...
07/07/2009 17:42:34;0008;PBS_Server;Job;178.tornado.soe.cranfield.ac.uk;Job Modified at request of r...@tornado.soe.cranfield.ac.uk
07/07/2009 17:42:34;0100;PBS_Server;Req;;Type RunJob request received from r...@tornado.soe.cranfield.ac.uk, sock=9
07/07/2009 17:42:34;0008;PBS_Server;Job;178.tornado.soe.cranfield.ac.uk;Job Run at request of r...@tornado.soe.cranfield.ac.uk
07/07/2009 17:42:34;0100;PBS_Server;Req;;Type JobObituary request received from pbs_...@node8.soe.cranfield.ac.uk, sock=11
07/07/2009 17:42:34;0010;PBS_Server;Job;178.tornado.soe.cranfield.ac.uk;Exit_status=-2
07/07/2009 17:42:34;000d;PBS_Server;Job;178.tornado.soe.cranfield.ac.uk;Post job file processing error; job 178.tornado.soe.cranfield.ac.uk on host node8.soe.cranfield.ac.uk/0+node7.soe.cranfield.ac.uk/0+node6.soe.cranfield.ac.uk/0+node5.soe.cranfield.ac.uk/0+node4.soe.cranfield.ac.uk/0+node3.soe.cranfield.ac.uk/0+node2.soe.cranfield.ac.uk/0+node1.soe.cranfield.ac.uk/0
07/07/2009 17:42:34;0100;PBS_Server;Job;178.tornado.soe.cranfield.ac.uk;dequeuing from batch, state EXITING
07/07/2009 17:42:34;0040;PBS_Server;Svr;tornado.soe.cranfield.ac.uk;Scheduler sent command term
07/07/2009 17:42:35;0100;PBS_Server;Req;;Type StatusNode request received from r...@tornado.soe.cranfield.ac.uk, sock=9
07/07/2009 17:42:35;0100;PBS_Server;Req;;Type StatusQueue request received from r...@tornado.soe.cranfield.ac.uk, sock=9
...
=======================================================================


=======================================================================
$ cat /var/spool/pbs/server_priv/accounting/20090707 | grep "178.tornado"

07/07/2009 17:42:33;Q;178.tornado.soe.cranfield.ac.uk;queue=batch
07/07/2009 17:42:34;S;178.tornado.soe.cranfield.ac.uk;user=janek
group=janek jobname=STDIN queue=batch ctime=1246984953
qtime=1246984953 etime=1246984953 start=1246984954
exec_host=node8.soe.cranfield.ac.uk/0+node7.soe.cranfield.ac.uk/0+node6.soe.cranfield.ac.uk/0+node5.soe.cranfield.ac.uk/0+node4.soe.cranfield.ac.uk/0+node3.soe.cranfield.ac.uk/0+node2.soe.cranfield.ac.uk/0+node1.soe.cranfield.ac.uk/0
Resource_List.neednodes=node8.soe.cranfield.ac.uk+node7.soe.cranfield.ac.uk+node6.soe.cranfield.ac.uk+node5.soe.cranfield.ac.uk+node4.soe.cranfield.ac.uk+node3.soe.cranfield.ac.uk+node2.soe.cranfield.ac.uk+node1.soe.cranfield.ac.uk
Resource_List.nodes=8
07/07/2009 17:42:34;E;178.tornado.soe.cranfield.ac.uk;user=janek
group=janek jobname=STDIN queue=batch ctime=1246984953
qtime=1246984953 etime=1246984953 start=1246984954
exec_host=node8.soe.cranfield.ac.uk/0+node7.soe.cranfield.ac.uk/0+node6.soe.cranfield.ac.uk/0+node5.soe.cranfield.ac.uk/0+node4.soe.cranfield.ac.uk/0+node3.soe.cranfield.ac.uk/0+node2.soe.cranfield.ac.uk/0+node1.soe.cranfield.ac.uk/0
Resource_List.neednodes=node8.soe.cranfield.ac.uk+node7.soe.cranfield.ac.uk+node6.soe.cranfield.ac.uk+node5.soe.cranfield.ac.uk+node4.soe.cranfield.ac.uk+node3.soe.cranfield.ac.uk+node2.soe.cranfield.ac.uk+node1.soe.cranfield.ac.uk
Resource_List.nodes=8 session=0 end=1246984954 Exit_status=-2

=======================================================================
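
The accounting records above are easier to read one field at a time. The following is a rough sketch only, assuming each record is a single line in the real file (the wrapping above looks like it came from the mail client) with the layout date;type;jobid;key=value pairs. The helper name acct_field is made up for illustration and is not a PBS command:

```shell
#!/bin/sh
# acct_field is a hypothetical helper, not part of PBS.
# Usage: acct_field FILE JOBID KEY
# Prints the value of KEY from the end-of-job ("E") record for JOBID.
acct_field() {
    awk -F';' -v job="$2" -v key="$3" '
        # Field 2 is the record type, field 3 the job id.
        $2 == "E" && index($3, job) == 1 {
            # Field 4 holds space-separated key=value pairs.
            n = split($4, kv, " ")
            for (i = 1; i <= n; i++) {
                eq = index(kv[i], "=")
                if (substr(kv[i], 1, eq - 1) == key)
                    print substr(kv[i], eq + 1)
            }
        }' "$1"
}

# Example calls against the file shown above:
# acct_field /var/spool/pbs/server_priv/accounting/20090707 178.tornado Exit_status
# acct_field /var/spool/pbs/server_priv/accounting/20090707 178.tornado session
```

For the E record above, the start and end values are identical (1246984954) and session=0, so the job appears to have ended within the same second it was started, without a session ever being recorded.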


Can you see anything in it that will help solve the issue?



Regards

--
Jan

------------------------------------------------------------------------------
Enter the BlackBerry Developer Challenge 
This is your chance to win up to $100,000 in prizes! For a limited time,
vendors submitting new applications to BlackBerry App World(TM) will have
the opportunity to enter the BlackBerry Developer Challenge. See full prize
details at: http://p.sf.net/sfu/blackberry
_______________________________________________
Oscar-users mailing list
Oscar-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oscar-users

