Gerard Henry <[email protected]> writes:

> hello all,
>
> in 6.2u5, a job finished, but the program was killed (*) due to lack
> of resources (no more RAM):
> in sge err:
>    /local/export/sge/default/spool/charybde/job_scripts/14561: line
> 12: 25776 Killed ./benchruntime
>
> I ran it without SGE; this program needs more RAM than is available on
> the host. But is it normal that qacct says that everything is OK?
> failed       0
> exit_status  0
>
>
> Does it mean that I can only rely on a non-empty SGE err file to
> tell that the program exited abnormally?

It depends on the job script.  The shell typically reports only the
exit status of the last command it ran.  Most of the scripts in use here
annoyingly end by printing messages and don't save the exit code, so the
job's exit status is that of the final echo, not of the program.

  $ qsub -sync y
  false
  echo finished
  Your job 122544 ("STDIN") has been submitted
  Job 122544 exited with exit code 0.
  $ qacct -j 122544|grep exit_status
  exit_status  0                   
  $ qsub -sync y
  false
  Your job 122545 ("STDIN") has been submitted
  Job 122545 exited with exit code 1.
  $ qacct -j 122545|grep exit_status
  exit_status  1
  $
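
One way around this, as a minimal sketch rather than anything SGE
requires: save the program's status immediately after it runs, print
whatever messages you like, then hand the saved status back.  The names
run_payload and job_main are hypothetical; run_payload stands in for the
real program (./benchruntime in the report above) and simply fails here.

```shell
#!/bin/sh
# Sketch of a job script body that preserves the payload's exit status.
# run_payload and job_main are hypothetical names used for illustration.

run_payload() {
    false    # stand-in for the real program; simulates a failure
}

job_main() {
    run_payload
    status=$?    # save the status before another command overwrites $?
    echo "finished (payload exit status: $status)"
    return "$status"    # pass the saved status back to the caller
}

# Run in a subshell so this demonstration keeps going after the failure.
# A real job script would instead end with:  job_main; exit $?
( job_main ) || echo "job script would exit with status $?"
```

With the script ending in "exit $?" (or "exit $status"), the trailing
echo no longer hides the failure, so the exit_status that qacct records
should reflect the program rather than the last message printed.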