Reuti,

I was wondering about both: the exit status of qrsh being 0, and the queue being put into an error state. The output of qacct is:

$ qacct -j 152
==============================================================
qname        all.q
hostname     exec_1
group        root
owner        root
project      NONE
department   defaultdepartment
jobname      QRLOGIN
jobnumber    152
taskid       undefined
account      sge
priority     0
qsub_time    Tue Mar  5 04:35:32 2013
start_time   -/-
end_time     -/-
granted_pe   NONE
slots        1
failed       11  : before job
exit_status  0
ru_wallclock 0
ru_utime     0.000
ru_stime     0.000
ru_maxrss    0
ru_ixrss     0
ru_ismrss    0
ru_idrss     0
ru_isrss     0
ru_minflt    0
ru_majflt    0
ru_nswap     0
ru_inblock   0
ru_oublock   0
ru_msgsnd    0
ru_msgrcv    0
ru_nsignals  0
ru_nvcsw     0
ru_nivcsw    0
cpu          0.000
mem          0.000
io           0.000
iow          0.000
maxvmem      0.000
arid         undefined

However, qrsh does work from the master host, which is both a submit and an administration host, as is the host from which I ran the "failing" qrsh process.
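
Since the exit status of qrsh was 0 here, a script could look at the "failed" field of the accounting record instead. The following is only a minimal sketch: the helper name qacct_failed_code is hypothetical, and it assumes the qacct output has the same layout as the record above.

```shell
# Hypothetical helper (not part of Grid Engine): read `qacct -j <job_id>`
# output on stdin and print the numeric code from the "failed" line.
# Prints 0 if no "failed" line is present in the input.
qacct_failed_code() {
    awk '$1 == "failed" { print $2; found = 1; exit }
         END { if (!found) print 0 }'
}
```

It would be used as `qacct -j 152 | qacct_failed_code`; a non-zero result means the accounting record reports a failure even though qrsh itself returned 0.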

Thanks,

Ian

On Mon, 04 Mar 2013 17:36:47 -0000, Reuti <[email protected]> wrote:

On 04.03.2013 at 14:27, Ian Johnson wrote:

Dear All,

I built release 2011.11p1 of Open Grid Engine and I'm having a problem with qrsh not scheduling an interactive job on an execution host. Invoking:

$ qrsh -q all.q -verbose
local configuration broker not defined - using global configuration
Your job 152 ("QRLOGIN") has been submitted
waiting for interactive job to be scheduled ...
$ echo $?
0

And the exit status is 0!

However, the queue is left in an error state:

---------------------------------------------------------------------------------
all.q@exec_1 BIP 0/0/4 0.00 linux-x64 E queue all.q marked QERROR as result of job 152's failure at host exec_1
---------------------------------------------------------------------------------

Would anyone know what's going on here, or has anyone seen this behaviour before?

What created the error?

Are you now wondering about the exit code being zero, or about the queue being in an error state for an unknown reason? There might be something in the messages file of the qmaster or in the node-specific one.
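
As a rough sketch of where those messages files might be found: the helper below is hypothetical, and it assumes the default spool layout under $SGE_ROOT/$SGE_CELL/spool. A site can configure different spool directories (for example local spooling on the execution hosts), so the paths may not match this installation.

```shell
# Hypothetical helper: print the assumed default locations of the qmaster
# and execution-host messages files. The spool layout and the fallback
# values below are assumptions, not taken from this installation.
sge_messages_paths() {
    # $1: execution host name, e.g. exec_1
    root="${SGE_ROOT:-/opt/sge}"    # fallback path is an assumption
    cell="${SGE_CELL:-default}"     # "default" is the usual cell name
    printf '%s\n' \
        "$root/$cell/spool/qmaster/messages" \
        "$root/$cell/spool/$1/messages"
}
```

Running `sge_messages_paths exec_1` lists the two candidate files to inspect. In addition, `qstat -f -explain E` shows the reason recorded for a queue instance in error state.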

What was recorded in:

$ qacct -j 152

-- Reuti



--
Thank you,

Ian Johnson
Software Engineer

Capita Translation and Interpreting
Riverside Court, Huddersfield Road, Delph, Oldham, OL3 5FZ | Tel (UK): +44 845 367 7000 | Tel (US): +1 (800) 579-5010
| [email protected] | Skype ID: ian.johnson_als
www.capitatranslationinterpreting.com
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users


