The sge_execd process randomly stops on several Windows 2008 R2 x64 exec nodes 
while running jobs sent from the qmaster.  The only message in the qmaster 
messages file is a commlib error stating that it lost connectivity, and the 
exec node doesn't show any error in its own messages file.

My question is twofold.  First, why is it crashing?  Second, how can I have 
sge_execd restart automatically if it crashes?

I'm still trying to figure out how to get sge_execd to start automatically when 
the Windows exec node boots up.  Apparently this is a known problem, at least 
according to members of the Oracle Grid Engine forums.

Thanks!