Hi,

all of them mount /home from an LDAP server outside the cluster.

till

On 06/09/12 10:54, Reuti wrote:
> Hi,
>
> On 05.09.2012 at 21:50, Tillmann Stieger wrote:
>
>> Thanks for your quick reply. Yes, we have a small cluster with 4 servers.
>> The problem exists only on this machine, and there for all users.
>> The only difference is that this particular machine is the master host. The
>> other machines are just execution hosts, and one of them is also a shadow
>> master. Otherwise they are pretty much identical, software-wise.
>
> Is this machine mounting /home in a different way than the other machines?
> Is the home directory for all users inside the private cluster, or are all
> of them using a file server that is also available outside the cluster?
>
> -- Reuti
>
>> till
>>
>> On 9/5/12 5:09 PM, Reuti wrote:
>>> Hi,
>>>
>>> On 05.09.2012 at 15:06, Tillmann Stieger wrote:
>>>
>>>> I have been searching for a solution to my problem for quite a while, but
>>>> no luck so far. I hope that someone here can help me out.
>>>>
>>>> On one specific machine I always get the following error message:
>>>
>>> You mean you have a larger cluster and the problem exists only on one
>>> machine, and there for one or all users?
>>>
>>> Is this node configured differently from the others in some way?
>>>
>>> -- Reuti
>>>
>>>> Job 6941 caused action: Job 6941 set to ERROR
>>>>
>>>> User = xxx
>>>> Queue = all.q@xxx
>>>> Start Time = <unknown>
>>>> End Time = <unknown>
>>>> failed opening input/output file:09/05/2012 12:48:00 [1111:12201]: can't stat() "/home/xxx" as stdout_path
>>>> Shepherd trace:
>>>> 09/05/2012 12:48:00 [1111:12200]: shepherd called with uid = 1111, euid = 1111
>>>> 09/05/2012 12:48:00 [1111:12200]: starting up 2011.11
>>>> 09/05/2012 12:48:00 [1111:12200]: warning: starting not as superuser (uid=1111)
>>>> 09/05/2012 12:48:00 [1111:12200]: setpgid(12200, 12200) returned 0
>>>> 09/05/2012 12:48:00 [1111:12200]: do_core_binding: "binding" parameter not found in config file
>>>> 09/05/2012 12:48:00 [1111:12200]: no prolog script to start
>>>> 09/05/2012 12:48:00 [1111:12200]: parent: forked "job" with pid 12201
>>>> 09/05/2012 12:48:00 [1111:12200]: parent: job-pid: 12201
>>>> 09/05/2012 12:48:00 [1111:12201]: child: starting son(job, /opt/sge6_2/default/spool/crab2/job_scripts/6941, 0);
>>>> 09/05/2012 12:48:00 [1111:12201]: pid=12201 pgrp=12201 sid=12201 old pgrp=12200 getlogin()=<no login set>
>>>> 09/05/2012 12:48:00 [1111:12201]: reading passwd information for user 'xxx'
>>>> 09/05/2012 12:48:00 [1111:12201]: setosjobid: uid = 1111, euid = 1111
>>>> 09/05/2012 12:48:00 [1111:12201]: setting limits
>>>> 09/05/2012 12:48:00 [1111:12201]: RLIMIT_CPU setting: (soft 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) resulting: (soft 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
>>>> 09/05/2012 12:48:00 [1111:12201]: RLIMIT_FSIZE setting: (soft 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) resulting: (soft 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
>>>> 09/05/2012 12:48:00 [1111:12201]: RLIMIT_DATA setting: (soft 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) resulting: (soft 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
>>>> 09/05/2012 12:48:00 [1111:12201]: RLIMIT_STACK setting: (soft 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) resulting: (soft 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
>>>> 09/05/2012 12:48:00 [1111:12201]: RLIMIT_CORE setting: (soft 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) resulting: (soft 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
>>>> 09/05/2012 12:48:00 [1111:12201]: RLIMIT_VMEM/RLIMIT_AS setting: (soft 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) resulting: (soft 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
>>>> 09/05/2012 12:48:00 [1111:12201]: RLIMIT_RSS setting: (soft 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY)) resulting: (soft 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
>>>> 09/05/2012 12:48:00 [1111:12201]: setting environment
>>>> 09/05/2012 12:48:00 [1111:12201]: Initializing error file
>>>> 09/05/2012 12:48:00 [1111:12201]: switching to intermediate/target user
>>>> 09/05/2012 12:48:00 [1111:12201]: tried to change uid/gid without being root
>>>> 09/05/2012 12:48:00 [1111:12201]: try running further with uid=1111
>>>> 09/05/2012 12:48:00 [1111:12201]: closing all filedescriptors
>>>> 09/05/2012 12:48:00 [1111:12201]: further messages are in "error" and "trace"
>>>> 09/05/2012 12:48:00 [1111:12201]: can't stat() "/home/xxx" as stdout_path: Permission denied KRB5CCNAME=none uid=1111 gid=100 100
>>>> 09/05/2012 12:48:00 [1111:12200]: wait3 returned 12201 (status: 6656; WIFSIGNALED: 0, WIFEXITED: 1, WEXITSTATUS: 26)
>>>> 09/05/2012 12:48:00 [1111:12200]: job exited with exit status 26
>>>> 09/05/2012 12:48:00 [1111:12200]: reaped "job" with pid 12201
>>>> 09/05/2012 12:48:00 [1111:12200]: job exited not due to signal
>>>> 09/05/2012 12:48:00 [1111:12200]: job exited with status 26
>>>> 09/05/2012 12:48:00 [1111:12200]: now sending signal KILL to pid -12201
>>>> 09/05/2012 12:48:00 [1111:12200]: no tasker to notify
>>>> 09/05/2012 12:48:00 [1111:12200]: failed starting job
>>>> 09/05/2012 12:48:00 [1111:12200]: no epilog script to start
>>>>
>>>> Shepherd error:
>>>> 09/05/2012 12:48:00 [1111:12201]: can't stat() "/home/xxx" as stdout_path: Permission denied KRB5CCNAME=none uid=1111 gid=100 100
>>>>
>>>> Shepherd pe_hostfile:
>>>> xxx 1 all.q@xxx UNDEFINED
>>>>
>>>> Has someone experienced similar issues?
>>>>
>>>> I would really appreciate any advice. Thanks.
>>>> till
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> [email protected]
>>>> https://gridengine.org/mailman/listinfo/users
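
For anyone who wants to narrow down the "can't stat() ... Permission denied" line in the trace above, here is a minimal sketch of one way to do it. It is not from the thread and does not claim to identify the cause. It calls stat() on every prefix of a path and reports the first one that fails, so the output on the master host can be compared with the output on a working execution host when run as the affected user. The function names path_prefixes and first_unstatable are made up for this example.

```python
import os
import sys


def path_prefixes(path):
    """Return every prefix of an absolute path, e.g. /, /home, /home/xxx."""
    path = os.path.abspath(path)
    prefixes = [os.sep]
    current = os.sep
    for part in path.split(os.sep):
        if part:
            current = os.path.join(current, part)
            prefixes.append(current)
    return prefixes


def first_unstatable(path):
    """Return (prefix, error) for the first prefix that os.stat() rejects,
    or None if every prefix of the path can be stat()ed."""
    for prefix in path_prefixes(path):
        try:
            os.stat(prefix)
        except OSError as exc:
            return prefix, exc
    return None


if __name__ == "__main__":
    # Hypothetical usage: pass the path from the shepherd error as argument.
    target = sys.argv[1] if len(sys.argv) > 1 else os.path.expanduser("~")
    print("uid=%d gid=%d groups=%s" % (os.getuid(), os.getgid(), os.getgroups()))
    failure = first_unstatable(target)
    if failure is None:
        print("every component of %s can be stat()ed" % target)
    else:
        print("stat() failed at %s: %s" % failure)
```

Run as the job owner with the path from the error message as the only argument. The printed uid, gid and groups can be compared with the "uid=1111 gid=100 100" shown in the shepherd error.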
