Ahh, never mind. Exit code 38 seems to be Input/output error.

On Mon, Apr 11, 2016 at 9:46 AM, Jagga Soorma <jagg...@gmail.com> wrote:
> Hi Guys,
>
> Where can I find out information on the slurm exit codes? Can't seem to
> find what exit code 38 means. I have a job that died with exit code 38 and
> going through the logs we are not able to figure out what might have gone
> wrong. Here are the logs in case we might be missing something:
>
> --
> ..snip..
> [301] energycounted = 0
> [301] getjoules_task energy = 0
> [301] Job 301 memory used:588956 limit:4194304 KB
> [301] getjoules_task energy = 0
> [301] removing task 0 pid 63380 from jobacct
> [301] task 0 (63380) exited with exit code 38.
> [301] task_p_post_term: 301.4294967294, task 0
> [301] cpu_freq_reset: #cpus reset = 0
> [301] Aggregated 1 task exit messages
> [301] sending task exit msg for 1 tasks status 9728
> [301] Before call to spank_fini()
> [301] After call to spank_fini()
> [301] job 301 completed with slurm_rc = 0, job_rc = 9728
> [301] sending REQUEST_COMPLETE_BATCH_SCRIPT, error:0 status 9728
> [301] Called _msg_socket_readable
> [301] false, shutdown
> [301] Message thread exited
> [301] done with job
> debug: task_p_slurmd_release_resources: 301
> debug3: state for jobid 301: ctime:1459111532 revoked:0 expires:0
> debug: credential for job 301 revoked
> debug3: Step from other job: jobid=49298 (this jobid=301)
> debug2: No steps in jobid 301 to send signal 999
> debug3: Step from other job: jobid=49298 (this jobid=301)
> debug2: No steps in jobid 301 to send signal 18
> debug3: Step from other job: jobid=49298 (this jobid=301)
> debug2: No steps in jobid 301 to send signal 15
> debug2: set revoke expiration for jobid 301 to 1459889405 UTS
> debug3: state for jobid 301: ctime:1459111532 revoked:1459888205
> expires:1459888205
> debug3: destroying job 301 state
> ..snip..
>
> # sjobexitmod -l 301
>        JobID    Account   NNodes        NodeList      State ExitCode DerivedExitCode        Comment
> ------------ ---------- -------- --------------- ---------- -------- --------------- --------------
>          301       prod        1         node234     FAILED     38:0             0:0             --
> --
>
> Thanks.
>
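
For anyone else reading these logs: the "status 9728" and "job_rc = 9728" values look like the raw wait status, and the "exit code 38" is its high byte (9728 = 38 * 256). Below is a minimal Python sketch of that decoding, assuming the usual Linux wait-status layout. The helper name decode_status is made up for illustration and is not part of Slurm. The last line only makes sense if the application exits with an errno value, which is also an assumption. Its text depends on the platform, and on Linux errno 38 is ENOSYS (Function not implemented) while EIO (Input/output error) is 5, so the "Input/output error" reading above may be worth double-checking.

```python
import os


def decode_status(status: int) -> tuple:
    """Split a raw wait status into (exit_code, signal).

    Assumes the conventional Linux encoding: the high byte holds the
    value passed to exit(), the low 7 bits hold the terminating signal.
    """
    exit_code = (status >> 8) & 0xFF
    signal = status & 0x7F
    return exit_code, signal


# 9728 is the "status" / "job_rc" value from the slurmd log above.
exit_code, signal = decode_status(9728)

# Same "code:signal" layout as the ExitCode column of sjobexitmod.
print(f"{exit_code}:{signal}")  # prints 38:0

# Only meaningful if the program exits with an errno value (an assumption);
# the message text is platform-dependent.
print(os.strerror(exit_code))
```

If the status had a nonzero low part instead, the job would have been killed by a signal and the exit code would not apply.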