Hi,

On 05.09.2012 at 15:06, Tillmann Stieger wrote:
> I have been searching for a solution to my problem for quite a while. But no
> luck so far. I hope that someone here can help me out.
>
> On one specific machine I always get the following error-message:

Do you mean that you have a larger cluster and the problem exists only on one machine? If so, does it affect one user there or all users? Is this node configured differently from the others in some way?

-- Reuti

> Job 6941 caused action: Job 6941 set to ERROR
>
> User = xxx
> Queue = all.q@xxx
> Start Time = <unknown>
> End Time = <unknown>
> failed opening input/output file:09/05/2012 12:48:00 [1111:12201]: can't
> stat() "/home/xxx" as stdout_path
> Shepherd trace:
> 09/05/2012 12:48:00 [1111:12200]: shepherd called with uid = 1111, euid = 1111
> 09/05/2012 12:48:00 [1111:12200]: starting up 2011.11
> 09/05/2012 12:48:00 [1111:12200]: warning: starting not as superuser
> (uid=1111)
> 09/05/2012 12:48:00 [1111:12200]: setpgid(12200, 12200) returned 0
> 09/05/2012 12:48:00 [1111:12200]: do_core_binding: "binding" parameter not
> found in config file
> 09/05/2012 12:48:00 [1111:12200]: no prolog script to start
> 09/05/2012 12:48:00 [1111:12200]: parent: forked "job" with pid 12201
> 09/05/2012 12:48:00 [1111:12200]: parent: job-pid: 12201
> 09/05/2012 12:48:00 [1111:12201]: child: starting son(job,
> /opt/sge6_2/default/spool/crab2/job_scripts/6941, 0);
> 09/05/2012 12:48:00 [1111:12201]: pid=12201 pgrp=12201 sid=12201 old
> pgrp=12200 getlogin()=<no login set>
> 09/05/2012 12:48:00 [1111:12201]: reading passwd information for user 'xxx'
> 09/05/2012 12:48:00 [1111:12201]: setosjobid: uid = 1111, euid = 1111
> 09/05/2012 12:48:00 [1111:12201]: setting limits
> 09/05/2012 12:48:00 [1111:12201]: RLIMIT_CPU setting: (soft
> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
> resulting: (soft 18446744073709551615(INFINITY), hard
> 18446744073709551615(INFINITY))
> 09/05/2012 12:48:00 [1111:12201]: RLIMIT_FSIZE setting: (soft
> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
> resulting: (soft 18446744073709551615(INFINITY), hard
> 18446744073709551615(INFINITY))
> 09/05/2012 12:48:00 [1111:12201]: RLIMIT_DATA setting: (soft
> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
> resulting: (soft 18446744073709551615(INFINITY), hard
> 18446744073709551615(INFINITY))
> 09/05/2012 12:48:00 [1111:12201]: RLIMIT_STACK setting: (soft
> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
> resulting: (soft 18446744073709551615(INFINITY), hard
> 18446744073709551615(INFINITY))
> 09/05/2012 12:48:00 [1111:12201]: RLIMIT_CORE setting: (soft
> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
> resulting: (soft 18446744073709551615(INFINITY), hard
> 18446744073709551615(INFINITY))
> 09/05/2012 12:48:00 [1111:12201]: RLIMIT_VMEM/RLIMIT_AS setting: (soft
> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
> resulting: (soft 18446744073709551615(INFINITY), hard
> 18446744073709551615(INFINITY))
> 09/05/2012 12:48:00 [1111:12201]: RLIMIT_RSS setting: (soft
> 18446744073709551615(INFINITY), hard 18446744073709551615(INFINITY))
> resulting: (soft 18446744073709551615(INFINITY), hard
> 18446744073709551615(INFINITY))
> 09/05/2012 12:48:00 [1111:12201]: setting environment
> 09/05/2012 12:48:00 [1111:12201]: Initializing error file
> 09/05/2012 12:48:00 [1111:12201]: switching to intermediate/target user
> 09/05/2012 12:48:00 [1111:12201]: tried to change uid/gid without being root
> 09/05/2012 12:48:00 [1111:12201]: try running further with uid=1111
> 09/05/2012 12:48:00 [1111:12201]: closing all filedescriptors
> 09/05/2012 12:48:00 [1111:12201]: further messages are in "error" and "trace"
> 09/05/2012 12:48:00 [1111:12201]: can't stat() "/home/xxx" as stdout_path:
> Permission denied KRB5CCNAME=none uid=1111 gid=100 100
> 09/05/2012 12:48:00 [1111:12200]: wait3 returned 12201 (status: 6656;
> WIFSIGNALED: 0, WIFEXITED: 1, WEXITSTATUS: 26)
> 09/05/2012 12:48:00 [1111:12200]: job exited with exit status 26
> 09/05/2012 12:48:00 [1111:12200]: reaped "job" with pid 12201
> 09/05/2012 12:48:00 [1111:12200]: job exited not due to signal
> 09/05/2012 12:48:00 [1111:12200]: job exited with status 26
> 09/05/2012 12:48:00 [1111:12200]: now sending signal KILL to pid -12201
> 09/05/2012 12:48:00 [1111:12200]: no tasker to notify
> 09/05/2012 12:48:00 [1111:12200]: failed starting job
> 09/05/2012 12:48:00 [1111:12200]: no epilog script to start
>
> Shepherd error:
> 09/05/2012 12:48:00 [1111:12201]: can't stat() "/home/xxx" as stdout_path:
> Permission denied KRB5CCNAME=none uid=1111 gid=100 100
>
> Shepherd pe_hostfile:
> xxx 1 all.q@xxx UNDEFINED
>
> Has someone experienced similar issues?
>
> I would really appreciate any advice. Thanks.
> till
>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
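
The shepherd error above is a stat() on the stdout path failing with "Permission denied" for uid=1111. stat() needs search (x) permission on every directory leading to the path, so it can help to check each path component in turn, as the job owner, on the affected node and on a working one. Below is a minimal Python sketch of such a check. It is only an illustration, not part of Grid Engine: the function name first_stat_failure is made up here, and "/home/xxx" is the anonymised path from the log.

```python
import errno
import os
from pathlib import Path


def first_stat_failure(path):
    """Walk from the filesystem root down to `path` and stat() each component.

    Returns None if every component can be stat()ed, otherwise a tuple
    (failing_component, errno_code). When the code is EACCES, the directory
    to inspect is usually the *parent* of the failing component, because
    stat() requires search permission on each directory in the path prefix.
    """
    target = Path(os.path.abspath(path))
    # Path.parents is ordered nearest-first, so reverse it to go root -> leaf.
    chain = list(reversed(target.parents)) + [target]
    for component in chain:
        try:
            os.stat(component)
        except OSError as exc:
            return str(component), exc.errno
    return None
```

Run as the affected user on the node in question, for example print(first_stat_failure("/home/xxx")) with the real path substituted. A result whose code equals errno.EACCES points at the permissions of the parent directory of the reported component; comparing the result with a node where jobs start normally may show what differs.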
