Reuti,
There's no file staging installed. The job script is being copied to the
execution host.
The output file *is* being opened in ~smartmate but it is of zero length.
Thanks,
Ian
On Tue, 14 Jan 2014 17:18:06 -0000, Reuti <[email protected]>
wrote:
On 14.01.2014 at 18:04, Ian Johnson wrote:
Reuti,
There is no output from the script at all in the
~smartmate/job.sh.o[0-9]+ files. The home directory of the smartmate
user is local disk. However, grid engine is installed on an NFS share.
Do you have any file staging installed? Otherwise the output will not be
sent to the real home directory of the user. Also, the input files could
be missing on the execution host.
-- Reuti
Is there other information you require? Is there any way to get the
function call that is failing in shepherd, e.g. more verbose tracing?
Thanks,
Ian
On Tue, 14 Jan 2014 15:19:34 -0000, Reuti <[email protected]>
wrote:
Hi,
On 14.01.2014 at 15:19, Ian Johnson wrote:
I have a simple job, which echoes `date` to stdout, that I'm using to
test an Open Grid Engine installation. When I submit it with qsub as
root, the job runs successfully. However, when I submit it as a
non-superuser, in this case the smartmate user, qacct -j reports that
the job exited with exit status 11. The shepherd trace confirms this
(see below).
Do you have any output? 11 means "Resource temporarily unavailable",
which could mean it can't write to the (mounted?) home directory of
the user. How is the mount configured?
AFAICS the user is known, as otherwise you would face a different
error.
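
A minimal sketch of such a write check, assuming it is run as the job
owner (smartmate here) on the execution host. The helper name
check_writable and the probe file name are made up for illustration and
are not part of Grid Engine.

```shell
#!/bin/bash
# Hypothetical helper: report whether the current user can create a file
# in the given directory. It creates a temporary probe file and removes it.
check_writable() {
    local dir="$1"
    local probe
    if probe=$(mktemp "$dir/.sge_probe.XXXXXX" 2>/dev/null); then
        rm -f "$probe"
        echo "writable: $dir"
    else
        echo "NOT writable: $dir"
        return 1
    fi
}
```

After switching to the user (for example with `su - smartmate`),
`df -P "$HOME"` shows which filesystem backs the home directory, and
`check_writable "$HOME"` shows whether a file can be created there.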
-- Reuti
Would anyone have an idea as to what is going on? Thank you.
<shepherd_trace>
01/14/2014 14:08:56 [0:2723]: shepherd called with uid = 0, euid = 0
01/14/2014 14:08:56 [0:2723]: starting up 2011.11
01/14/2014 14:08:56 [0:2723]: setpgid(2723, 2723) returned 0
01/14/2014 14:08:56 [0:2723]: do_core_binding: "binding" parameter
not found in config file
01/14/2014 14:08:56 [0:2723]: no prolog script to start
01/14/2014 14:08:56 [0:2723]: parent: forked "job" with pid 2724
01/14/2014 14:08:56 [0:2724]: child: starting son(job,
/opt/capitati/ge2011.11/smartmate/spool/exec-1/job_scripts/32, 0);
01/14/2014 14:08:56 [0:2724]: pid=2724 pgrp=2724 sid=2724 old
pgrp=2723 getlogin()=root
01/14/2014 14:08:56 [0:2723]: parent: job-pid: 2724
01/14/2014 14:08:56 [0:2724]: reading passwd information for user
'smartmate'
01/14/2014 14:08:56 [0:2724]: setosjobid: uid = 0, euid = 0
01/14/2014 14:08:56 [0:2724]: setting limits
01/14/2014 14:08:56 [0:2724]: RLIMIT_CPU setting: (soft
18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
resulting: (soft 18446744073709551615(INFINITY), hard
18446744073709551615(INFINITY))
01/14/2014 14:08:56 [0:2724]: RLIMIT_FSIZE setting: (soft
18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
resulting: (soft 18446744073709551615(INFINITY), hard
18446744073709551615(INFINITY))
01/14/2014 14:08:56 [0:2724]: RLIMIT_DATA setting: (soft
18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
resulting: (soft 18446744073709551615(INFINITY), hard
18446744073709551615(INFINITY))
01/14/2014 14:08:56 [0:2724]: RLIMIT_STACK setting: (soft
18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
resulting: (soft 18446744073709551615(INFINITY), hard
18446744073709551615(INFINITY))
01/14/2014 14:08:56 [0:2724]: RLIMIT_CORE setting: (soft
18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
resulting: (soft 18446744073709551615(INFINITY), hard
18446744073709551615(INFINITY))
01/14/2014 14:08:56 [0:2724]: RLIMIT_VMEM/RLIMIT_AS setting: (soft
18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
resulting: (soft 18446744073709551615(INFINITY), hard
18446744073709551615(INFINITY))
01/14/2014 14:08:56 [0:2724]: RLIMIT_RSS setting: (soft
18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
resulting: (soft 18446744073709551615(INFINITY), hard
18446744073709551615(INFINITY))
01/14/2014 14:08:56 [0:2724]: setting environment
01/14/2014 14:08:56 [0:2724]: Initializing error file
01/14/2014 14:08:56 [0:2724]: switching to intermediate/target user
01/14/2014 14:08:56 [0:2723]: wait3 returned 2724 (status: 2816;
WIFSIGNALED: 0, WIFEXITED: 1, WEXITSTATUS: 11)
01/14/2014 14:08:56 [0:2723]: job exited with exit status 11
01/14/2014 14:08:56 [0:2723]: reaped "job" with pid 2724
01/14/2014 14:08:56 [0:2723]: job exited not due to signal
01/14/2014 14:08:56 [0:2723]: job exited with status 11
01/14/2014 14:08:56 [0:2723]: now sending signal KILL to pid -2724
01/14/2014 14:08:56 [0:2723]: writing usage file to "usage"
01/14/2014 14:08:56 [0:2723]: no tasker to notify
01/14/2014 14:08:56 [0:2723]: no epilog script to start
</shepherd_trace>
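
The "status: 2816" and "WEXITSTATUS: 11" values in the wait3 line of the
trace are the same number seen two ways. A small sketch of the
arithmetic, assuming the conventional Linux wait-status layout (exit
code in bits 8-15, terminating signal in the low 7 bits); the function
names are made up for illustration.

```shell
#!/bin/bash
# Decode a raw wait status as printed by the shepherd ("status: 2816").
# Assumes the usual Linux encoding of the status word.

# Exit code: bits 8-15 of the status word.
exit_status_of() {
    echo $(( ($1 >> 8) & 0xff ))
}

# Terminating signal: low 7 bits (0 means the process was not killed by a signal).
term_signal_of() {
    echo $(( $1 & 0x7f ))
}

exit_status_of 2816   # prints 11
term_signal_of 2816   # prints 0
```

This matches the trace: 2816 = 11 * 256, so the child exited normally
with code 11 and was not terminated by a signal.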
<job_script>
#!/bin/bash
#
#$ -j y
#
#$ -S /bin/bash
echo "Hello World"
echo `date`
</job_script>
Ian Johnson
Software Engineer
Capita Translation and Interpreting
Riverside Court, Huddersfield Road, Delph, Oldham, OL3 5FZ | Tel
(UK): +44 845 367 7000 | Tel (US): +1 (800) 579-5010
| [email protected] | Skype ID: ian.johnson_als
www.capitatranslationinterpreting.com
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users