I am installing GE 8.0b for testing from the binary distribution (using
install_qmaster and install_execd). The installation seemed to go smoothly,
but when I submitted a test job (sleeper.sh), it did not run.
I suspect I missed something in the configuration (I accepted most of the
default options during installation).
Here are snippets from the log files (the master node is both the qmaster
and an execution host):

Qmaster log:
10/07/2011 12:15:26|  main|master|I|read job database with 0 entries in 0 seconds
10/07/2011 12:15:26|  main|master|E|error opening file "/usr/local/sge/default/spool/qmaster/./sharetree" for reading: No such file or directory
10/07/2011 12:15:26|  main|master|I|qmaster hard descriptor limit is set to 8192
10/07/2011 12:15:26|  main|master|I|qmaster soft descriptor limit is set to 8192
10/07/2011 12:15:26|  main|master|I|qmaster will use max. 8172 file descriptors for communication
10/07/2011 12:15:26|  main|master|I|qmaster will accept max. 99 dynamic event clients
10/07/2011 12:15:26|  main|master|I|starting up SGE 8.0.0b (lx-amd64)
10/07/2011 12:22:57|worker|master|W|job 8.1 failed on host master invalid execution state because: shepherd exited with exit status 127: invalid execution state

Execution log:
10/07/2011 12:22:38|  main|master|I|starting up SGE 8.0.0b (lx-amd64)
10/07/2011 12:22:57|  main|master|E|shepherd of job 8.1 exited with exit status = 127
10/07/2011 12:22:57|  main|master|E|abnormal termination of shepherd for job 8.1: no "exit_status" file
10/07/2011 12:22:57|  main|master|E|can't open file active_jobs/8.1/error: No such file or directory
10/07/2011 12:22:57|  main|master|E|can't open pid file "active_jobs/8.1/pid" for job 8.1
10/07/2011 12:22:57|  main|master|E|can't open usage file "active_jobs/8.1/usage" for job 8.1: No such file or directory
10/07/2011 12:22:57|  main|master|E|shepherd exited with exit status 127: invalid execution state
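
For anyone who wants to pull only the warnings and errors out of logs like
these, here is a minimal sketch in Python. It assumes each entry sits on a
single line (not wrapped by a mail client) and that the five pipe-separated
fields are timestamp, thread, host, level and message. Those field names are
my own labels, not taken from the Grid Engine documentation, and parse_line
and problems are hypothetical helpers, not part of any SGE tool.

```python
from typing import Iterable, List, NamedTuple, Optional


class LogEntry(NamedTuple):
    # Field names are assumed labels for the five pipe-separated columns.
    timestamp: str
    thread: str
    host: str
    level: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one log line; return None if it does not have five fields."""
    # Split at most four times so any pipes inside the message are kept.
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != 5:
        return None
    return LogEntry(*(p.strip() for p in parts))


def problems(lines: Iterable[str]) -> List[LogEntry]:
    """Return only the entries whose level is E (error) or W (warning)."""
    entries = (parse_line(line) for line in lines)
    return [e for e in entries if e is not None and e.level in ("E", "W")]


# Two lines copied from the execution log above.
sample = [
    "10/07/2011 12:22:38|  main|master|I|starting up SGE 8.0.0b (lx-amd64)",
    "10/07/2011 12:22:57|  main|master|E|shepherd of job 8.1 exited with exit status = 127",
]

for entry in problems(sample):
    print(entry.timestamp, entry.level, entry.message)
```

To run it against a real log, pass an open file object to problems()
instead of the sample list; lines that do not split into five fields are
skipped rather than raising an error.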

Both qmaster and execd are running:
root@master:/usr/local/sge/default/spool/qmaster[1113]> ps aguwx | grep sge
sgeadmin  7410  0.0  0.8 653128 34976 ?        Sl   12:15   0:00 /usr/local/sge/bin/lx-amd64/sge_qmaster
sgeadmin  7546  0.0  0.0 111580  2532 ?        Sl   12:22   0:00 /usr/local/sge/bin/lx-amd64/sge_execd
root      7669  0.0  0.0  61192   764 pts/1    S+   12:33   0:00 grep sge


Can someone help?
Thank you
