Hi everybody,
    
  I am using AIX LoadLeveler 3.1 to checkpoint my simple serial
application. The problem occurs while the checkpoint file is generated: the
ckpt file is created with an .err extension (here, stp.ckpt.err). When
restart_from_ckpt is set to yes in the job command file and I run the job,
the node simply removes the job from the queue, and I do not even get an
output file.
    
  I am posting my job command file and application here. Please reply if
anybody knows why the correct ckpt file is not generated, or how to debug
the problem. Thanks in advance.
    
    
  My job command file 
    
  # For First.c 
# @ job_type = serial 
# @ executable = first 
# @ output = stp.out 
# @ error = stp.err 
# @ class = general 
# @ checkpoint = yes 
# @ restart_from_ckpt = yes 
# @ ckpt_dir = /home/rtsg/crypt/ramakrishna/trial/ex/ 
# @ ckpt_file = stp.ckpt 
# @ restart_on_same_nodes = yes 
# @ requirements = Machine == "tf04" 
# @ wall_clock_limit = 5:00:00,4:30:00 
# @ queue 
      
  My application 
    
#include <stdio.h>
#include "llapi.h"

int main()
{
    int i;
    LL_ckpt_info ckpt_info;
    cr_error_t cp_error1;

    ckpt_info.version = LL_API_VERSION;
    ckpt_info.step_id = NULL;
    ckpt_info.ckptType = NULL;
    ckpt_info.waitType = NULL;
    ckpt_info.abort_sig = NULL;
    ckpt_info.cp_error_data = &cp_error1;
    ckpt_info.ckpt_rc = 0;
    ckpt_info.soft_limit = 0;
    ckpt_info.hard_limit = 0;

    for (i = 1; i < 4000; i++)
    {
        printf("%d\n", i);
        if (i == 2000)
            ll_init_ckpt(&ckpt_info);
    }
    return 0;
}



RAM....!
 
_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
