On 14.01.2014 at 18:27, Ian Johnson wrote:

> Reuti,
> 
> There's no file staging installed. The job script is being copied to the 
> execution host.

Correct (for the job script itself).


> The output file *is* being opened in ~smartmate but it is of zero length.

I would assume that it is not created at all in this location, only on the 
nodes. Or do you mean the home directory on the nodes?
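
One way to check this, as a rough sketch only: run something like the 
following as the job owner, once on the submit host and once on the execution 
node, and compare the results. The function name check_output_dir is made up 
for illustration, and the job.sh.o* pattern is simply taken from the file 
names mentioned earlier in this thread.

```shell
#!/bin/bash
# Hypothetical helper (not part of Grid Engine): report whether a directory
# exists, whether the current user can write to it, and whether it contains
# any job.sh.o* output files.
check_output_dir() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir"
        return 2
    fi
    echo "writable: $dir"
    # List any job output files found there, including their sizes.
    ls -l "$dir"/job.sh.o* 2>/dev/null || echo "no job.sh.o* files in $dir"
    return 0
}
```

Calling check_output_dir "$HOME" on each host should show whether the 
zero-length file and the file the job actually writes to are in the same 
place.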

NB: Torque has file staging for the .o/.e files, but SGE does not.
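
Since SGE does not copy the output files back, one possible workaround is to 
point the job's output at a directory that both hosts can see. The sketch 
below is a variant of the test job with a -o directive added. The path 
/shared/sge-output is purely an assumption for illustration; it would need to 
be a directory mounted on the submit host and the execution nodes and 
writable by the job owner.

```shell
#!/bin/bash
#
#$ -j y
#$ -S /bin/bash
# ASSUMPTION: /shared/sge-output is a hypothetical shared directory; replace
# it with a path that exists on all hosts and is writable by the job owner.
#$ -o /shared/sge-output

# The body of the job: print a greeting, where and as whom it runs, and the date.
job_body() {
    echo "Hello World"
    echo "Running on $(uname -n) as $(id -un)"
    date
}

job_body
```

Printing the host name and user name makes it easier to confirm on which 
node, and under which account, the job actually ran.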

-- Reuti


> Thanks,
> 
> Ian
> 
> On Tue, 14 Jan 2014 17:18:06 -0000, Reuti <[email protected]> wrote:
> 
>> On 14.01.2014 at 18:04, Ian Johnson wrote:
>> 
>>> Reuti,
>>> 
>>> There is no output from the script at all in the ~smartmate/job.sh.o[0-9]+ 
>>> files. The home directory of the smartmate user is local disk. However, 
>>> grid engine is installed on an NFS share.
>> 
>> Do you have any file staging installed? Otherwise the output will not be 
>> sent to the real home directory of the user. Also, the input files could be 
>> missing on the execution host.
>> 
>> -- Reuti
>> 
>> 
>> 
>>> Is there other information you require? Is there any way to get the 
>>> function call that is failing in shepherd, e.g. more verbose tracing?
>>> 
>>> Thanks,
>>> 
>>> Ian
>>> 
>>> On Tue, 14 Jan 2014 15:19:34 -0000, Reuti <[email protected]> 
>>> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> On 14.01.2014 at 15:19, Ian Johnson wrote:
>>>> 
>>>>> I have a simple job, which echoes `date` to stdout, that I'm using to 
>>>>> test an Open Grid Engine installation. When I run qsub as root, the job 
>>>>> runs successfully. However, when I submit it as a non-superuser, in this 
>>>>> case the smartmate user, the output from qacct -j says that the job has 
>>>>> exited with exit status 11. The shepherd trace confirms this (see below).
>>>> 
>>>> Do you have any output? 11 means "Resource temporarily unavailable", which 
>>>> could mean it can't write to the (mounted?) home directory of the user. 
>>>> How is the mount configured?
>>>> 
>>>> AFAICS the user is known, as otherwise you would face a different error.
>>>> 
>>>> -- Reuti
>>>> 
>>>> 
>>>>> Would anyone have an idea as to what is going on? Thank you.
>>>>> 
>>>>> <shepherd_trace>
>>>>> 01/14/2014 14:08:56 [0:2723]: shepherd called with uid = 0, euid = 0
>>>>> 01/14/2014 14:08:56 [0:2723]: starting up 2011.11
>>>>> 01/14/2014 14:08:56 [0:2723]: setpgid(2723, 2723) returned 0
>>>>> 01/14/2014 14:08:56 [0:2723]: do_core_binding: "binding" parameter not 
>>>>> found in config file
>>>>> 01/14/2014 14:08:56 [0:2723]: no prolog script to start
>>>>> 01/14/2014 14:08:56 [0:2723]: parent: forked "job" with pid 2724
>>>>> 01/14/2014 14:08:56 [0:2724]: child: starting son(job, 
>>>>> /opt/capitati/ge2011.11/smartmate/spool/exec-1/job_scripts/32, 0);
>>>>> 01/14/2014 14:08:56 [0:2724]: pid=2724 pgrp=2724 sid=2724 old pgrp=2723 
>>>>> getlogin()=root
>>>>> 01/14/2014 14:08:56 [0:2723]: parent: job-pid: 2724
>>>>> 01/14/2014 14:08:56 [0:2724]: reading passwd information for user 
>>>>> 'smartmate'
>>>>> 01/14/2014 14:08:56 [0:2724]: setosjobid: uid = 0, euid = 0
>>>>> 01/14/2014 14:08:56 [0:2724]: setting limits
>>>>> 01/14/2014 14:08:56 [0:2724]: RLIMIT_CPU setting: (soft 
>>>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>>>> 18446744073709551615(INFINITY))
>>>>> 01/14/2014 14:08:56 [0:2724]: RLIMIT_FSIZE setting: (soft 
>>>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>>>> 18446744073709551615(INFINITY))
>>>>> 01/14/2014 14:08:56 [0:2724]: RLIMIT_DATA setting: (soft 
>>>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>>>> 18446744073709551615(INFINITY))
>>>>> 01/14/2014 14:08:56 [0:2724]: RLIMIT_STACK setting: (soft 
>>>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>>>> 18446744073709551615(INFINITY))
>>>>> 01/14/2014 14:08:56 [0:2724]: RLIMIT_CORE setting: (soft 
>>>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>>>> 18446744073709551615(INFINITY))
>>>>> 01/14/2014 14:08:56 [0:2724]: RLIMIT_VMEM/RLIMIT_AS setting: (soft 
>>>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>>>> 18446744073709551615(INFINITY))
>>>>> 01/14/2014 14:08:56 [0:2724]: RLIMIT_RSS setting: (soft 
>>>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>>>> 18446744073709551615(INFINITY))
>>>>> 01/14/2014 14:08:56 [0:2724]: setting environment
>>>>> 01/14/2014 14:08:56 [0:2724]: Initializing error file
>>>>> 01/14/2014 14:08:56 [0:2724]: switching to intermediate/target user
>>>>> 01/14/2014 14:08:56 [0:2723]: wait3 returned 2724 (status: 2816; 
>>>>> WIFSIGNALED: 0,  WIFEXITED: 1, WEXITSTATUS: 11)
>>>>> 01/14/2014 14:08:56 [0:2723]: job exited with exit status 11
>>>>> 01/14/2014 14:08:56 [0:2723]: reaped "job" with pid 2724
>>>>> 01/14/2014 14:08:56 [0:2723]: job exited not due to signal
>>>>> 01/14/2014 14:08:56 [0:2723]: job exited with status 11
>>>>> 01/14/2014 14:08:56 [0:2723]: now sending signal KILL to pid -2724
>>>>> 01/14/2014 14:08:56 [0:2723]: writing usage file to "usage"
>>>>> 01/14/2014 14:08:56 [0:2723]: no tasker to notify
>>>>> 01/14/2014 14:08:56 [0:2723]: no epilog script to start
>>>>> </shepherd_trace>
>>>>> 
>>>>> <job_script>
>>>>> #!/bin/bash
>>>>> #
>>>>> #$ -j y
>>>>> #
>>>>> #$ -S /bin/bash
>>>>> 
>>>>> echo "Hello World"
>>>>> echo `date`
>>>>> </job_script>
>>>>> 
>>>>> Ian Johnson
>>>>> Software Engineer
>>>>> 
>>>>> 
>>>>> Capita Translation and Interpreting
>>>>> Riverside Court, Huddersfield Road, Delph, Oldham, OL3 5FZ | Tel (UK): 
>>>>> +44 845 367 7000 | Tel (US): +1 (800) 579-5010
>>>>> | [email protected] | Skype ID: ian.johnson_als
>>>>> www.capitatranslationinterpreting.com
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> [email protected]
>>>>> https://gridengine.org/mailman/listinfo/users
>>>> 
>>> 
>>> 
>> 
> 
> 


