On 21.01.2014 at 10:55, Ian Johnson wrote:

> Reuti,
> 
> Writing to the /tmp directory produces the same results. However, looking in 
> /tmp, I saw some shepherd log files containing:
> 
> 01/21/2014 09:47:17 [1001:1001 2433]: PANIC: 
> creat(/opt/capitati/ge2011.11/smartmate/spool/exec-1/active_jobs/48.1/trace) 
> failed: Permission denied
> 01/21/2014 09:47:17 [1001:1001 2433]: PANIC: 
> creat(/opt/capitati/ge2011.11/smartmate/spool/exec-1/active_jobs/48.1/trace) 
> failed: Permission denied
> 01/21/2014 09:47:17 [1001:1001 2433]: PANIC: 
> creat(/opt/capitati/ge2011.11/smartmate/spool/exec-1/active_jobs/48.1/trace) 
> failed: Permission denied
> 01/21/2014 09:47:17 [1001:1001 2433]: PANIC: 
> creat(/opt/capitati/ge2011.11/smartmate/spool/exec-1/active_jobs/48.1/trace) 
> failed: Permission denied
> 01/21/2014 09:47:17 [1001:1001 2433]: PANIC: 
> creat(/opt/capitati/ge2011.11/smartmate/spool/exec-1/active_jobs/48.1/error) 
> failed: Permission denied
> 01/21/2014 09:47:17 [1001:1001 2433]: PANIC: 
> creat(/opt/capitati/ge2011.11/smartmate/spool/exec-1/active_jobs/48.1/exit_status)
>  failed: Permission denied
> 
> However, these files are created and contain data! UID and GID 1001 are the 
> smartmate user's UID and GID. Also, on my other installations of GE2011.11 
> the spool directory is owned by root:root.

I always set them to be owned by the SGE admin user and its group. Is root also 
your SGE admin user, or did you define a different one? For example:

$ cat /usr/sge/default/common/bootstrap 
# Version: 6.2u5
#
admin_user             sgeadmin
...
$ ls -lhd /var/spool/sge/
drwxr-xr-x 3 sgeadmin gridware 4.0K 2013-07-29 12:06 /var/spool/sge/

Are the permissions different at a higher level of the complete path? Are any 
ACLs in place on those directories?
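
One way to check each level is to walk the path from the root down and print mode, owner and group for every component. This is a minimal sketch, assuming GNU coreutils `stat` (with its `-c` format option) is available. The function name `walk_path` is made up for illustration.

```shell
#!/bin/bash
# walk_path: print mode, owner:group and name for every directory level of
# an absolute path. Hypothetical helper; assumes GNU coreutils `stat -c`.
walk_path() {
    local path=$1 current="" part
    local -a parts
    # Always report the root directory first.
    stat -c '%A %U:%G %n' /
    # Split the path on "/" and rebuild it one component at a time.
    IFS=/ read -r -a parts <<< "${path#/}"
    for part in "${parts[@]}"; do
        [ -n "$part" ] || continue
        current="$current/$part"
        # Stop at the first component that cannot be inspected.
        stat -c '%A %U:%G %n' "$current" || return 1
    done
}
```

It could be called as `walk_path /opt/capitati/ge2011.11/smartmate/spool/exec-1/active_jobs`, using the path from the PANIC messages. `stat` does not show ACLs; a trailing `+` in `ls -ld` output marks an entry that has one, and `getfacl` on that entry should list the details.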

-- Reuti


> Thanks,
> 
> Ian
> 
> 
> On Mon, 20 Jan 2014 17:49:25 -0000, Reuti <[email protected]> wrote:
> 
>> On 20.01.2014 at 18:04, Ian Johnson wrote:
>> 
>>> Same behaviour with a prolog script. Details follow...
>> 
>> Good (or not). What happens when you write to a local directory like /tmp?
>> 
>> -- Reuti
>> 
>> 
>>> The prolog script:
>>> 
>>> <prolog_script>
>>> #!/bin/bash
>>> 
>>> cd /opt/capitati/smartmate_data/test
>>> 
>>> echo "${JOB_ID}" > job.${JOB_ID}.prolog
>>> </prolog_script>
>>> 
>>> It is configured globally.
>>> 
>>> 
>>> 
>>> Running qsub as root yields:
>>> 
>>> The prolog script runs and outputs the following to 
>>> /opt/capitati/smartmate_data/test
>>> 
>>> -rw-rw----+ 1 root smartmate  3 Jan 20  2014 job.44.prolog
>>> 
>>> And the job script creates and writes its stdout and stderr files:
>>> 
>>> -rw-r-----+ 1 root smartmate  0 Jan 20  2014 job_root_err.log
>>> -rw-r-----+ 1 root smartmate 41 Jan 20  2014 job_root_std.log
>>> 
>>> 
>>> 
>>> Running qsub as smartmate user yields:
>>> 
>>> The prolog file is created and written to:
>>> 
>>> -rw-rw----+ 1 smartmate smartmate  3 Jan 20  2014 job.45.prolog
>>> 
>>> However, the job script fails:
>>> 
>>> <shepherd_trace>
>>> 01/20/2014 17:01:41 [0:2011]: shepherd called with uid = 0, euid = 0
>>> 01/20/2014 17:01:41 [0:2011]: starting up 2011.11
>>> 01/20/2014 17:01:41 [0:2011]: setpgid(2011, 2011) returned 0
>>> 01/20/2014 17:01:41 [0:2011]: do_core_binding: "binding" parameter not 
>>> found in config file
>>> 01/20/2014 17:01:41 [0:2011]: parent: forked "prolog" with pid 2012
>>> 01/20/2014 17:01:41 [0:2012]: child: starting son(prolog, 
>>> /opt/capitati/smartmate_data/test/prolog.sh, 0);
>>> 01/20/2014 17:01:41 [0:2011]: using signal delivery delay of 120 seconds
>>> 01/20/2014 17:01:41 [0:2012]: pid=2012 pgrp=2012 sid=2012 old pgrp=2011 
>>> getlogin()=<no login set>
>>> 01/20/2014 17:01:41 [0:2011]: parent: prolog-pid: 2012
>>> 01/20/2014 17:01:41 [0:2012]: reading passwd information for user 
>>> 'smartmate'
>>> 01/20/2014 17:01:41 [0:2012]: setting limits
>>> 01/20/2014 17:01:41 [0:2012]: setting environment
>>> 01/20/2014 17:01:41 [0:2012]: Initializing error file
>>> 01/20/2014 17:01:41 [0:2012]: switching to intermediate/target user
>>> 01/20/2014 17:01:41 [0:2011]: wait3 returned 2012 (status: 0; WIFSIGNALED: 
>>> 0,  WIFEXITED: 1, WEXITSTATUS: 0)
>>> 01/20/2014 17:01:41 [0:2011]: prolog exited with exit status 0
>>> 01/20/2014 17:01:41 [0:2011]: reaped "prolog" with pid 2012
>>> 01/20/2014 17:01:41 [0:2011]: prolog exited not due to signal
>>> 01/20/2014 17:01:41 [0:2011]: prolog exited with status 0
>>> 01/20/2014 17:01:41 [0:2011]: parent: forked "job" with pid 2013
>>> 01/20/2014 17:01:41 [0:2013]: child: starting son(job, 
>>> /opt/capitati/ge2011.11/smartmate/spool/exec-1/job_scripts/45, 0);
>>> 01/20/2014 17:01:41 [0:2013]: pid=2013 pgrp=2013 sid=2013 old pgrp=2011 
>>> getlogin()=<no login set>
>>> 01/20/2014 17:01:41 [0:2011]: parent: job-pid: 2013
>>> 01/20/2014 17:01:41 [0:2013]: reading passwd information for user 
>>> 'smartmate'
>>> 01/20/2014 17:01:41 [0:2013]: setosjobid: uid = 0, euid = 0
>>> 01/20/2014 17:01:41 [0:2013]: setting limits
>>> 01/20/2014 17:01:41 [0:2013]: RLIMIT_CPU setting: (soft 
>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>> 18446744073709551615(INFINITY))
>>> 01/20/2014 17:01:41 [0:2013]: RLIMIT_FSIZE setting: (soft 
>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>> 18446744073709551615(INFINITY))
>>> 01/20/2014 17:01:41 [0:2013]: RLIMIT_DATA setting: (soft 
>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>> 18446744073709551615(INFINITY))
>>> 01/20/2014 17:01:41 [0:2013]: RLIMIT_STACK setting: (soft 
>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>> 18446744073709551615(INFINITY))
>>> 01/20/2014 17:01:41 [0:2013]: RLIMIT_CORE setting: (soft 
>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>> 18446744073709551615(INFINITY))
>>> 01/20/2014 17:01:41 [0:2013]: RLIMIT_VMEM/RLIMIT_AS setting: (soft 
>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>> 18446744073709551615(INFINITY))
>>> 01/20/2014 17:01:41 [0:2013]: RLIMIT_RSS setting: (soft 
>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>> 18446744073709551615(INFINITY))
>>> 01/20/2014 17:01:41 [0:2013]: setting environment
>>> 01/20/2014 17:01:41 [0:2013]: Initializing error file
>>> 01/20/2014 17:01:41 [0:2013]: switching to intermediate/target user
>>> 01/20/2014 17:01:41 [0:2011]: wait3 returned 2013 (status: 2816; 
>>> WIFSIGNALED: 0,  WIFEXITED: 1, WEXITSTATUS: 11)
>>> 01/20/2014 17:01:41 [0:2011]: job exited with exit status 11
>>> 01/20/2014 17:01:41 [0:2011]: reaped "job" with pid 2013
>>> 01/20/2014 17:01:41 [0:2011]: job exited not due to signal
>>> 01/20/2014 17:01:41 [0:2011]: job exited with status 11
>>> 01/20/2014 17:01:41 [0:2011]: now sending signal KILL to pid -2013
>>> 01/20/2014 17:01:41 [0:2011]: writing usage file to "usage"
>>> 01/20/2014 17:01:41 [0:2011]: no tasker to notify
>>> 01/20/2014 17:01:41 [0:2011]: no epilog script to start
>>> </shepherd_trace>
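
The `status` value in the `wait3` lines is the raw wait status. Under the conventional Linux encoding the exit code sits in bits 8 to 15, which is how 2816 corresponds to the reported exit status 11 (and 6656 to 26 in the trace from the run without a prolog). This is a small sketch of that arithmetic; the function name `wexitstatus` is made up for illustration and simply mirrors the C macro `WEXITSTATUS`.

```shell
#!/bin/bash
# wexitstatus: extract the exit code from a raw wait status, mirroring the
# C macro WEXITSTATUS under the conventional Linux encoding (bits 8-15).
wexitstatus() {
    local status=$1
    echo $(( (status >> 8) & 0xff ))
}

wexitstatus 2816   # status from the exit-status-11 trace
wexitstatus 6656   # status from the exit-status-26 trace
```

This only decodes the number; it does not say what the shepherd means by the code.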
>>> 
>>> 
>>> Thanks,
>>> 
>>> Ian
>>> 
>>> 
>>> 
>>> On Mon, 20 Jan 2014 16:12:26 -0000, Reuti <[email protected]> 
>>> wrote:
>>> 
>>>> On 20.01.2014 at 17:08, Ian Johnson wrote:
>>>> 
>>>>> Reuti,
>>>>> 
>>>>> Inline...
>>>>> 
>>>>> On Mon, 20 Jan 2014 14:43:51 -0000, Reuti <[email protected]> 
>>>>> wrote:
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> On 20.01.2014 at 13:04, Ian Johnson wrote:
>>>>>> 
>>>>>>> Reuti,
>>>>>>> 
>>>>>>> The directory /opt/capitati/smartmate_data/test is now writable by the 
>>>>>>> smartmate user. Sorry, this was what caused exit status 26. I'm back to 
>>>>>>> exit status 11 again. Now both the o and e files are created in the 
>>>>>>> /opt/capitati/smartmate_data/test directory, but they are of zero length.
>>>>>> 
>>>>>> Is the directory in question mounted by autofs/Automounter, or is it a 
>>>>>> hard NFS mount of the user smartmate_data?
>>>>> 
>>>>> This is mounted at boot-time from the fstab file.
>>>>> 
>>>>>> 
>>>>>> - Any prolog on a queue or global level?
>>>>> 
>>>>> No prolog on any queue or global.
>>>>> 
>>>>>> - What user:group has the created (empty) file?
>>>>> 
>>>>> smartmate:smartmate
>>>>> 
>>>>>> - How are the users/IDs distributed to the nodes?
>>>>> 
>>>>> Users and groups are created locally on each exec node and the master 
>>>>> node. User and group names have identical IDs.
>>>>> 
>>>>> 
>>>>> This problem seems to stem from the fact that the smartmate user is *not* 
>>>>> a superuser. I think the problem occurs when the shepherd changes the UID 
>>>>> and GID in order to run the job script.
>>>> 
>>>> But in principle "smartmate" can write at this location?
>>>> 
>>>> To investigate further, you could define a small prolog running as a) root 
>>>> and then b) smartmate and write something. Does it show the same behavior?
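
A sketch of such a test prolog follows. It takes a target directory as its first argument, defaulting to /tmp; the function name and the marker file name are made up for illustration. It records which user the prolog actually runs as and whether that user can create a file in the directory. The user can, as far as I know, be selected with a `user@` prefix on the `prolog` entry in the cluster configuration; please check sge_conf(5) for your version.

```shell
#!/bin/bash
# Hypothetical test prolog: report the effective user and try to create a
# marker file in the target directory (first argument, default /tmp).
prolog_write_test() {
    local dir=${1:-/tmp}
    local marker="$dir/prolog_test.$(id -un).$$"
    echo "running as $(id -un) (uid=$(id -u), gid=$(id -g))"
    # The braces keep the shell's own redirection error out of the output.
    if { echo "written by $(id -un)" > "$marker"; } 2>/dev/null; then
        echo "write OK: $marker"
    else
        echo "write FAILED in $dir"
        return 1
    fi
}

# Never fail the prolog itself, so the job is still started either way.
prolog_write_test "${1:-/tmp}" || true
```

Running it once with each user and comparing the two outputs should show whether the write itself fails or only the later job start does.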
>>>> 
>>>> -- Reuti
>>>> 
>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Ian
>>>>> 
>>>>>> 
>>>>>> -- Reuti
>>>>>> 
>>>>>> 
>>>>>>> The spool directory is in /opt/capitati/ge2011.11/smartmate/spool which 
>>>>>>> is owned by root:root.
>>>>>>> 
>>>>>>> Could you guess as to where the shepherd code is failing using the 
>>>>>>> trace logs I sent last week? I've been looking through the shepherd 
>>>>>>> code but I can't see anything obvious.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> Ian
>>>>>>> 
>>>>>>> On Mon, 20 Jan 2014 11:52:46 -0000, Reuti <[email protected]> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> On 20.01.2014 at 12:11, Ian Johnson wrote:
>>>>>>>> 
>>>>>>>>> Reuti,
>>>>>>>>> 
>>>>>>>>> I have changed the qsub options to write stdout and stderr to an 
>>>>>>>>> NFS-mounted directory, and the job script is still not being executed. 
>>>>>>>>> Now the job is exiting, according to the shepherd trace, with exit 
>>>>>>>>> status 26. This time no o and e files are created.
>>>>>>>> 
>>>>>>>> The path /opt/capitati/smartmate_data/test/job_sm_out.log is writable 
>>>>>>>> (for the user) on the node and all directories in the path exist?
>>>>>>>> 
>>>>>>>> BTW: Is the spool directory local on each host (preferable) or in a 
>>>>>>>> shared /opt/capitati/?
>>>>>>>> 
>>>>>>>> -- Reuti
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> What does exit status 26 mean? And given the previous behaviour on a 
>>>>>>>>> local disk (job exit status 11), can you think of anything that is 
>>>>>>>>> preventing the non-superuser from executing jobs on execution nodes? 
>>>>>>>>> This is turning into a critical bug for us.
>>>>>>>>> 
>>>>>>>>> Thanks for your continued help,
>>>>>>>>> 
>>>>>>>>> Ian
>>>>>>>>> 
>>>>>>>>> <shepherd_trace>
>>>>>>>>> 01/20/2014 11:02:12 [0:1486]: shepherd called with uid = 0, euid = 0
>>>>>>>>> 01/20/2014 11:02:12 [0:1486]: starting up 2011.11
>>>>>>>>> 01/20/2014 11:02:12 [0:1486]: setpgid(1486, 1486) returned 0
>>>>>>>>> 01/20/2014 11:02:12 [0:1486]: do_core_binding: "binding" parameter 
>>>>>>>>> not found in config file
>>>>>>>>> 01/20/2014 11:02:12 [0:1486]: no prolog script to start
>>>>>>>>> 01/20/2014 11:02:12 [0:1486]: parent: forked "job" with pid 1487
>>>>>>>>> 01/20/2014 11:02:12 [0:1487]: child: starting son(job, 
>>>>>>>>> /opt/capitati/ge2011.11/smartmate/spool/exec-1/job_scripts/34, 0);
>>>>>>>>> 01/20/2014 11:02:12 [0:1487]: pid=1487 pgrp=1487 sid=1487 old 
>>>>>>>>> pgrp=1486 getlogin()=<no login set>
>>>>>>>>> 01/20/2014 11:02:12 [0:1486]: parent: job-pid: 1487
>>>>>>>>> 01/20/2014 11:02:12 [0:1487]: reading passwd information for user 
>>>>>>>>> 'smartmate'
>>>>>>>>> 01/20/2014 11:02:12 [0:1487]: setosjobid: uid = 0, euid = 0
>>>>>>>>> 01/20/2014 11:02:12 [0:1487]: setting limits
>>>>>>>>> 01/20/2014 11:02:12 [0:1487]: RLIMIT_CPU setting: (soft 
>>>>>>>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>>>>>>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>>>>>>>> 18446744073709551615(INFINITY))
>>>>>>>>> 01/20/2014 11:02:12 [0:1487]: RLIMIT_FSIZE setting: (soft 
>>>>>>>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>>>>>>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>>>>>>>> 18446744073709551615(INFINITY))
>>>>>>>>> 01/20/2014 11:02:12 [0:1487]: RLIMIT_DATA setting: (soft 
>>>>>>>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>>>>>>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>>>>>>>> 18446744073709551615(INFINITY))
>>>>>>>>> 01/20/2014 11:02:12 [0:1487]: RLIMIT_STACK setting: (soft 
>>>>>>>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>>>>>>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>>>>>>>> 18446744073709551615(INFINITY))
>>>>>>>>> 01/20/2014 11:02:12 [0:1487]: RLIMIT_CORE setting: (soft 
>>>>>>>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>>>>>>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>>>>>>>> 18446744073709551615(INFINITY))
>>>>>>>>> 01/20/2014 11:02:12 [0:1487]: RLIMIT_VMEM/RLIMIT_AS setting: (soft 
>>>>>>>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>>>>>>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>>>>>>>> 18446744073709551615(INFINITY))
>>>>>>>>> 01/20/2014 11:02:12 [0:1487]: RLIMIT_RSS setting: (soft 
>>>>>>>>> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) 
>>>>>>>>> resulting: (soft 18446744073709551615(INFINITY), hard 
>>>>>>>>> 18446744073709551615(INFINITY))
>>>>>>>>> 01/20/2014 11:02:12 [0:1487]: setting environment
>>>>>>>>> 01/20/2014 11:02:12 [0:1487]: Initializing error file
>>>>>>>>> 01/20/2014 11:02:12 [0:1487]: switching to intermediate/target user
>>>>>>>>> 01/20/2014 11:02:12 [0:1486]: wait3 returned 1487 (status: 6656; 
>>>>>>>>> WIFSIGNALED: 0,  WIFEXITED: 1, WEXITSTATUS: 26)
>>>>>>>>> 01/20/2014 11:02:12 [0:1486]: job exited with exit status 26
>>>>>>>>> 01/20/2014 11:02:12 [0:1486]: reaped "job" with pid 1487
>>>>>>>>> 01/20/2014 11:02:12 [0:1486]: job exited not due to signal
>>>>>>>>> 01/20/2014 11:02:12 [0:1486]: job exited with status 26
>>>>>>>>> 01/20/2014 11:02:12 [0:1486]: now sending signal KILL to pid -1487
>>>>>>>>> 01/20/2014 11:02:12 [0:1486]: writing usage file to "usage"
>>>>>>>>> 01/20/2014 11:02:12 [0:1486]: no tasker to notify
>>>>>>>>> 01/20/2014 11:02:12 [0:1486]: no epilog script to start
>>>>>>>>> </shepherd_trace>
>>>>>>>>> 
>>>>>>>>> <job_script>
>>>>>>>>> #!/bin/bash
>>>>>>>>> #
>>>>>>>>> #$ -j y
>>>>>>>>> #$ -o /opt/capitati/smartmate_data/test/job_sm_out.log
>>>>>>>>> #$ -e /opt/capitati/smartmate_data/test/job_sm_err.log
>>>>>>>>> #$ -S /bin/bash
>>>>>>>>> 
>>>>>>>>> echo "Hello World"
>>>>>>>>> echo `date`
>>>>>>>>> </job_script>
>>>>>>>>> 
>>>>>>>>> Ian Johnson
>>>>>>>>> Software Engineer
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Capita Translation and Interpreting
>>>>>>>>> Riverside Court, Huddersfield Road, Delph, Oldham, OL3 5FZ | Tel 
>>>>>>>>> (UK): +44 845 367 7000 | Tel (US): +1 (800) 579-5010
>>>>>>>>> | [email protected] | Skype ID: ian.johnson_als
>>>>>>>>> www.capitatranslationinterpreting.com
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 14 January 2014 18:34, Reuti <[email protected]> wrote:
>>>>>>>>> On 14.01.2014 at 18:27, Ian Johnson wrote:
>>>>>>>>> 
>>>>>>>>> > Reuti,
>>>>>>>>> >
>>>>>>>>> > There's no file staging installed. The job script is being copied 
>>>>>>>>> > to the execution host.
>>>>>>>>> 
>>>>>>>>> Correct (for the job script itself).
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> > The output file *is* being opened in ~smartmate but it is of zero 
>>>>>>>>> > length.
>>>>>>>>> 
>>>>>>>>> I would assume that it is not created at all in this location, only 
>>>>>>>>> on the nodes. Or do you mean the home directory on the nodes?
>>>>>>>>> 
>>>>>>>>> NB: In Torque there is a file staging for the .o/.e files, but not in 
>>>>>>>>> SGE.
>>>>>>>>> 
>>>>>>>>> -- Reuti
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> > Thanks,
>>>>>>>>> >
>>>>>>>>> > Ian
>>>>>>>>> >
>>>>>>>>> > On Tue, 14 Jan 2014 17:18:06 -0000, Reuti 
>>>>>>>>> > <[email protected]> wrote:
>>>>>>>>> >
>>>>>>>>> >> On 14.01.2014 at 18:04, Ian Johnson wrote:
>>>>>>>>> >>
>>>>>>>>> >>> Reuti,
>>>>>>>>> >>>
>>>>>>>>> >>> There is no output from the script at all in the 
>>>>>>>>> >>> ~smartmate/job.sh.o[0-9]+ files. The home directory of the 
>>>>>>>>> >>> smartmate user is local disk. However, grid engine is installed 
>>>>>>>>> >>> on an NFS share.
>>>>>>>>> >>
>>>>>>>>> >> Do you have any file staging installed? Otherwise the output will 
>>>>>>>>> >> not be sent to the real home directory of the user. The input 
>>>>>>>>> >> files could also be missing on the execution host.
>>>>>>>>> >>
>>>>>>>>> >> -- Reuti
>>>>>>>>> >>
>>>>>>>>> >>
>>>>>>>>> >>
>>>>>>>>> >>> Is there other information you require? Is there any way to get 
>>>>>>>>> >>> the function call that is failing in shepherd, e.g. more verbose 
>>>>>>>>> >>> tracing?
>>>>>>>>> >>>
>>>>>>>>> >>> Thanks,
>>>>>>>>> >>>
>>>>>>>>> >>> Ian
>>>>>>>>> >>>
>>>>>>>>> >>> On Tue, 14 Jan 2014 15:19:34 -0000, Reuti 
>>>>>>>>> >>> <[email protected]> wrote:
>>>>>>>>> >>>
>>>>>>>>> >>>> Hi,
>>>>>>>>> >>>>
>>>>>>>>> >>>> On 14.01.2014 at 15:19, Ian Johnson wrote:
>>>>>>>>> >>>>
>>>>>>>>> >>>>> I have a simple job, which echoes `date` to stdout, that I'm 
>>>>>>>>> >>>>> using to test an Open Grid Engine installation. Running qsub as 
>>>>>>>>> >>>>> root the job is run successfully. However, using another 
>>>>>>>>> >>>>> non-superuser, in this case smartmate user, the output from 
>>>>>>>>> >>>>> qacct -j says that the job has exited with exit status 11. The 
>>>>>>>>> >>>>> shepherd trace confirms this (see below).
>>>>>>>>> >>>>
>>>>>>>>> >>>> Do you have any output? 11 means "Resource temporarily 
>>>>>>>>> >>>> unavailable", which could mean it can't write to the (mounted?) 
>>>>>>>>> >>>> home directory of the user. How is the mount configured?
>>>>>>>>> >>>>
>>>>>>>>> >>>> AFAICS the user is known, as otherwise you would face a 
>>>>>>>>> >>>> different error.
>>>>>>>>> >>>>
>>>>>>>>> >>>> -- Reuti
>>>>>>>>> >>>>
>>>>>>>>> >>>>
>>>>>>>>> >>>>> Would anyone have an idea as to what is going on? Thank you.
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> <shepherd_trace>
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2723]: shepherd called with uid = 0, 
>>>>>>>>> >>>>> euid = 0
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2723]: starting up 2011.11
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2723]: setpgid(2723, 2723) returned 0
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2723]: do_core_binding: "binding" 
>>>>>>>>> >>>>> parameter not found in config file
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2723]: no prolog script to start
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2723]: parent: forked "job" with pid 2724
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2724]: child: starting son(job, 
>>>>>>>>> >>>>> /opt/capitati/ge2011.11/smartmate/spool/exec-1/job_scripts/32, 
>>>>>>>>> >>>>> 0);
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2724]: pid=2724 pgrp=2724 sid=2724 old 
>>>>>>>>> >>>>> pgrp=2723 getlogin()=root
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2723]: parent: job-pid: 2724
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2724]: reading passwd information for 
>>>>>>>>> >>>>> user 'smartmate'
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2724]: setosjobid: uid = 0, euid = 0
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2724]: setting limits
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2724]: RLIMIT_CPU setting: (soft 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY), hard 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY)) resulting: (soft 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY), hard 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY))
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2724]: RLIMIT_FSIZE setting: (soft 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY), hard 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY)) resulting: (soft 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY), hard 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY))
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2724]: RLIMIT_DATA setting: (soft 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY), hard 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY)) resulting: (soft 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY), hard 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY))
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2724]: RLIMIT_STACK setting: (soft 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY), hard 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY)) resulting: (soft 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY), hard 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY))
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2724]: RLIMIT_CORE setting: (soft 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY), hard 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY)) resulting: (soft 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY), hard 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY))
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2724]: RLIMIT_VMEM/RLIMIT_AS setting: 
>>>>>>>>> >>>>> (soft 18446744073709551615(INFINITY), hard 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY)) resulting: (soft 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY), hard 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY))
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2724]: RLIMIT_RSS setting: (soft 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY), hard 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY)) resulting: (soft 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY), hard 
>>>>>>>>> >>>>> 18446744073709551615(INFINITY))
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2724]: setting environment
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2724]: Initializing error file
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2724]: switching to intermediate/target 
>>>>>>>>> >>>>> user
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2723]: wait3 returned 2724 (status: 
>>>>>>>>> >>>>> 2816; WIFSIGNALED: 0,  WIFEXITED: 1, WEXITSTATUS: 11)
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2723]: job exited with exit status 11
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2723]: reaped "job" with pid 2724
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2723]: job exited not due to signal
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2723]: job exited with status 11
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2723]: now sending signal KILL to pid 
>>>>>>>>> >>>>> -2724
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2723]: writing usage file to "usage"
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2723]: no tasker to notify
>>>>>>>>> >>>>> 01/14/2014 14:08:56 [0:2723]: no epilog script to start
>>>>>>>>> >>>>> </shepherd_trace>
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> <job_script>
>>>>>>>>> >>>>> #!/bin/bash
>>>>>>>>> >>>>> #
>>>>>>>>> >>>>> #$ -j y
>>>>>>>>> >>>>> #
>>>>>>>>> >>>>> #$ -S /bin/bash
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> echo "Hello World"
>>>>>>>>> >>>>> echo `date`
>>>>>>>>> >>>>> </job_script>
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> _______________________________________________
>>>>>>>>> >>>>> users mailing list
>>>>>>>>> >>>>> [email protected]
>>>>>>>>> >>>>> https://gridengine.org/mailman/listinfo/users
>>>>>>>>> >>>>
>>>>>>>>> >>>
>>>>>>>>> >>>
>>>>>>>>> >>
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>> 
> 
> 


