Upgrade and see if you get different behavior, as this was fixed in
14.03.05 ;).
On 07/29/2014 12:26 PM, Bill Wichser wrote:
Lol. Missed that! 14.03.04
On 07/29/2014 02:01 PM, Danny Auble wrote:
14.03.05?
On July 29, 2014 8:41:25 AM PDT, Bill Wichser <[email protected]> wrote:
Version currently demonstrating this is: 14.03
Bill
On 07/25/2014 09:44 PM, Danny Auble wrote:
What version are you using?
On July 25, 2014 5:12:22 PM PDT, Bill Wichser <[email protected]> wrote:
Thanks. I knew that with our implementation of PBS it was always this
way. But there was no indication from the Slurm docs that the lower
7 bits (-128) also applied to Slurm.
My exit codes from sacct are always 137:0 and 139:0 from these jobs.
Bill
On 7/25/2014 6:22 PM, Danny Auble wrote:
Paul is correct,
Before 14.03.5 Slurm didn't obey POSIX convention but now does.
Basically if the job was signaled in some fashion the exit code is
increased by 128 to show this is the case.
As an example on the command line, if I do a simple sleep and ctrl-C
it the exit code would be 130
sleep 1000
^C
echo $?
130
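A minimal sketch of how such a status could be decoded in a POSIX
shell, assuming the 128 + signal convention described above. The
function name describe_status is made up for illustration; kill -l is
the standard shell way to turn a signal number into its name. Note that
a program can also call exit with a value above 128 on its own, so this
is only a heuristic.

```sh
# Hypothetical helper: interpret a shell exit status.
# Assumes statuses above 128 mean "terminated by signal (status - 128)".
describe_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        # kill -l <number> prints the signal name without the SIG prefix
        echo "killed by signal $sig (SIG$(kill -l "$sig"))"
    else
        echo "exited with code $status"
    fi
}

describe_status 130   # the ctrl-C case from the example: signal 2
describe_status 137   # signal 9
describe_status 139   # signal 11
```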
Before 14.03.5 srun wouldn't return just 15 in this case but we wanted
to be POSIX compliant so we modified it to increase the exit_code as
it should to be compliant.
What does sacct tell you on the jobs? For the exit code of 137 I would
expect you would get an ExitCode of 0:9, meaning you had an exit code
of 0 but it was signaled with a SIGKILL. For the 139 I would expect a
0:11, meaning a Seg Fault (SIGSEGV) happened just as Paul said.
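To make the two-part ExitCode field concrete, here is a rough sketch
that splits a value such as 0:9 or 137:0 into its exit code and signal
parts. The function name explain_exitcode is hypothetical, and the
fallback for a zero signal field simply applies the 128 + signal
convention as a guess; it is not something sacct itself reports.

```sh
# Hypothetical helper: explain an "exitcode:signal" pair as shown
# in the sacct ExitCode column.
explain_exitcode() {
    code=${1%%:*}   # text before the colon: the exit code
    sig=${1##*:}    # text after the colon: the signal number
    if [ "$sig" -ne 0 ]; then
        echo "exit code $code, terminated by signal $sig"
    elif [ "$code" -gt 128 ]; then
        # Assumption: a code above 128 with no signal recorded may be
        # 128 + signal, per the shell convention.
        echo "exit code $code, no signal recorded, possibly signal $((code - 128))"
    else
        echo "exit code $code, no signal"
    fi
}

explain_exitcode 0:9     # exit code 0, terminated by signal 9
explain_exitcode 0:11    # exit code 0, terminated by signal 11
explain_exitcode 137:0   # exit code 137, no signal recorded, possibly signal 9
```

The input values could come from something like
sacct -j <jobid> --format=JobID,ExitCode, with the job id filled in.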
Danny
On 07/25/2014 03:06 PM, Bill Wichser wrote:
I cannot find a clear explanation in the documentation of the exit
codes of jobs. I have a user experiencing exit codes of 137 and 139.
Can anyone help me to locate what this 8 bit unsigned integer
references?
Thanks,
Bill