Resolved this. One host had a bad entry for its own hostname in /etc/hosts.
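
For anyone who hits the same symptom, here is a rough sketch of how one might check a hosts file for this kind of problem. It is only an illustration: the function names (addresses_for, loopback_entries) are made up for this example, and it assumes the "bad entry" is the hostname being mapped to a loopback address, which is just one possible variant and not necessarily what was wrong on this node.

```python
import ipaddress
import socket


def addresses_for(hostname, hosts_text):
    """Return every address that a hosts-format text maps to hostname."""
    found = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        addr, names = fields[0], fields[1:]
        if hostname in names:
            found.append(addr)
    return found


def loopback_entries(addrs):
    """Return the subset of addrs that are loopback addresses."""
    bad = []
    for addr in addrs:
        try:
            if ipaddress.ip_address(addr).is_loopback:
                bad.append(addr)
        except ValueError:
            # Not a parseable IP address; ignore it in this sketch.
            continue
    return bad


if __name__ == "__main__":
    # Assumption: the node's own name is whatever socket.gethostname() reports.
    name = socket.gethostname()
    with open("/etc/hosts") as fh:
        addrs = addresses_for(name, fh.read())
    print(name, "->", addrs)
    print("loopback entries:", loopback_entries(addrs))
```

Run on the affected node, this lists every address /etc/hosts assigns to the node's own name, so a stale or loopback mapping is easy to spot. It matches names exactly, so a node known by both a short and a fully qualified name would need to be checked under each.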

-Bill
On Apr 1, 2011, at 11:41 AM, William Deegan wrote:

> Greetings,
> 
> I noticed one node (using Chris's ge-8.0.0.alph binaries) wouldn't take:
> qsh -l hostname=this_host
> 
> So I stopped the execd via:
> /etc/init.d/sgeexecd.BLAH stop
> 
> Then tried:
> /etc/init.d/sgeexecd.BLAH start
> 
> And it just sits there.
> 
> First time I did this I saw:
> 04/01/2011 10:26:05|  main|nafta9|W|can't register at qmaster "hotan2.oasys-ds.com": abort qmaster registration due to communication errors
> 04/01/2011 10:26:05|  main|nafta9|E|commlib error: got select error (Connection refused)
> 
> Killed it, and tried to start it again.
> Just hangs.
> 
> Any ideas?
> 
> -Bill

