On 02.04.2011 at 01:41, William Deegan wrote:

> Greetings,
> 
> Here's what I did.
> 1) unpack ge tarballs into /opt/ge on all hosts
> 2) configure grid master
> 3) scp /opt/ge/default to all hosts
> 4) verify ssh works back and forth among all hosts as root

Do you need X11 forwarding?


> 5) run ./start_gui_installer -debug
> 6) Install all execution hosts

There is nothing to install, apart from adding the startup scripts to 
/etc/init.d with `chkconfig` or by other means. Then add the exechosts as 
administrative hosts and start the execd on each of them.
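
As a rough sketch of those two steps, assuming SGE_ROOT is /opt/ge and the 
cell is "default" as in your layout: `qconf -ah` adds an administrative host 
and the cell's `sgeexecd` script starts the daemon. The host names, the 
`start_exec_hosts` function and the `RUN` dry-run variable are placeholders 
made up for illustration, and using ssh as root only follows from your step 4.

```shell
#!/bin/sh
# Sketch only: paths follow the /opt/ge/default layout from this thread.
SGE_ROOT=${SGE_ROOT:-/opt/ge}
SGE_CELL=${SGE_CELL:-default}
# Hypothetical dry-run switch: by default commands are only printed.
# Export RUN as an empty string to really execute them.
RUN=${RUN-echo}

start_exec_hosts() {
    for host in "$@"; do
        # On the qmaster: register the host as an administrative host.
        $RUN qconf -ah "$host"
        # On the exec host: start the execution daemon via its startup script.
        $RUN ssh "root@$host" "$SGE_ROOT/$SGE_CELL/common/sgeexecd start"
    done
}

# Placeholder host names; replace them with your real exechosts.
start_exec_hosts exehost1 exehost2
```

Run it once as is to review the printed commands before executing anything.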


> This is shared nothing, so there are no filesystems shared among the systems.
> 
> Are there any other configurations which I need to do?
> 
> I did this a few months ago, but I'm wondering if I missed something this 
> time around.
> 
> qrsh, and qlogin work for some of the hosts.
> qsh works for most of the hosts.
> 
> I'm seeing errors like this on the qmaster host:
> 04/01/2011 16:22:48|schedu|qmasterhost|E|unable to find job 1197 from the scheduler order package
> 04/01/2011 16:23:03|schedu|qmasterhost |E|could not find job "1197" in master list
> 04/01/2011 16:23:03|schedu|qmasterhost |E|callback function for event "48. EVENT DEL JOB 1197.1" failed
> 
> And seeing messages like this on execution hosts:
> 04/01/2011 16:06:49|  main|exehost1|W|reaping job "1190" ptf complains: Job does not exist
> 04/01/2011 16:06:49|  main|exehost1|E|can't open file active_jobs/1190.1/error: No such file or directory

So the spool directories are local on all exechosts too? Are they in the 
standard location, where the SGE owner (often sgeadmin or similar) is able 
to write?
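
A quick way to check this on each exechost could look like the sketch below, 
run as the SGE owner. The `check_spool_dir` function is made up for 
illustration, and the path in the usage comment is only a guess at the default 
location; the real one is whatever `execd_spool_dir` says in the output of 
`qconf -sconf`.

```shell
#!/bin/sh
# Sketch only: reports whether a spool directory exists and is writable
# by the user running this script (run it as the SGE owner).
check_spool_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir"
        return 1
    fi
    echo "ok: $dir"
}

# Example call, assuming the default layout under /opt/ge/default:
# check_spool_dir "/opt/ge/default/spool/$(hostname)"
```

Keep in mind that root can write almost anywhere, so a check run as root says 
little about the SGE owner's permissions.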

-- Reuti


> Thanks,
> Bill
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users

