On 02.04.2011 at 01:41, William Deegan wrote:

> Greetings,
>
> Here's what I did.
> 1) unpack ge tarballs into /opt/ge on all hosts
> 2) configure grid master
> 3) scp /opt/ge/default to all hosts
> 4) verify ssh works back and forth among all hosts as root

Do you need X11 forwarding?

> 5) run ./start_gui_installer -debug
> 6) Install all execution hosts

There is nothing to install - well, besides adding the startup scripts to /etc/init.d with `chkconfig` or by other means. Then add the exechosts as administrative hosts and start the execd on each of them.

> This is shared nothing, so there are no filesystems shared among the systems.
>
> Are there any other configurations which I need to do?
>
> I did this a few months ago, but I'm wondering if I missed something this
> time around.
>
> qrsh, and qlogin work for some of the hosts.
> qsh works for most of the hosts.
>
> I'm seeing errors like this on the qmaster host:
> 04/01/2011 16:22:48|schedu|qmasterhost|E|unable to find job 1197 from the scheduler order package
> 04/01/2011 16:23:03|schedu|qmasterhost |E|could not find job "1197" in master list
> 04/01/2011 16:23:03|schedu|qmasterhost |E|callback function for event "48. EVENT DEL JOB 1197.1" failed
>
> And seeing messages like this on execution hosts:
> 04/01/2011 16:06:49| main|exehost1|W|reaping job "1190" ptf complains: Job does not exist
> 04/01/2011 16:06:49| main|exehost1|E|can't open file active_jobs/1190.1/error: No such file or directory

So the spool files are local on all exechosts as well? And in the standard location, where the SGE owner (often sgeadmin or similar) is able to write?

-- Reuti

> Thanks,
> Bill
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
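
To make the spool question above concrete, here is a minimal Python sketch of the check one could run on each exechost as the SGE owner. It is only an illustration, not part of Grid Engine: the function name `check_spool_dir` is made up for this example, and the path you pass in is an assumption, since the thread does not say where the execd spool directory lives (the `execd_spool_dir` entry shown by `qconf -sconf` should tell you). Note that when run as root the write test will almost always pass, so run it as the user the execd actually runs as.

```python
import os
import tempfile


def check_spool_dir(path):
    """Return a list of problems found with a local spool directory.

    An empty list means the directory exists and the current user can
    create (and remove) a file inside it.
    """
    problems = []
    if not os.path.isdir(path):
        problems.append(f"{path}: does not exist or is not a directory")
        return problems
    try:
        # Really create a file instead of trusting permission bits;
        # the temporary file is deleted again when the block exits.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError as exc:
        problems.append(f"{path}: cannot create files here ({exc})")
    return problems


# Example call (the path is an assumed location, adjust to your site):
#   print(check_spool_dir("/opt/ge/default/spool"))
```

An empty list on every exechost would suggest the "can't open file active_jobs/..." error is not a simple permissions or missing-directory problem, and the cause has to be looked for elsewhere.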
