Hi everyone,

I used blcr-0.7.3 and TORQUE (torque-2.4.0-snap.200809111541.tar.gz) to
test the checkpoint/restart function, following the wiki:
http://www.clusterresources.com/wiki/doku.php?id=torque:2.6_job_checkpoint_and_restart


I ran into an interesting problem. When I qhold the job, the
checkpoint file is created at
/var/spool/torque/checkpoint/4817.node24.CK/ckpt.4817.node24.1221666102

but when I qrls the same job (4817), the pbs_mom daemon on the compute node
goes down (it appears to be killed by something). Any clues?

Thank you very much.


dolphin, qin
 
 
_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
