Hi,

I've never seen this, but I would start with:
1) strace qmaster during restart to try to see at which point it is dying (e.g.,
loading a config file)
2) look for any reference to the name of the host you deleted in the spool
area and do some cleanup
3) clean out the jobs spool area
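
For step 2, the search could be sketched roughly as below. This is only a minimal sketch, assuming classic (flat-file) spooling, where the spool area is a tree of plain files; it would not see inside a Berkeley DB spool. The function name find_host_references and both of its arguments are made up for illustration. It only lists candidates and deletes nothing.

```python
import os


def find_host_references(spool_dir, hostname):
    """Return sorted paths under spool_dir whose name or contents mention hostname.

    Assumption: spool_dir is a flat-file spool tree (hypothetical path supplied
    by the caller) and hostname is the host that was removed with qconf -de.
    """
    needle = hostname.encode()
    hits = set()
    for root, dirs, files in os.walk(spool_dir):
        # Files or directories named after the host.
        for name in dirs + files:
            if hostname in name:
                hits.add(os.path.join(root, name))
        # Files whose contents mention the host.
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, "rb") as fh:
                    if needle in fh.read():
                        hits.add(path)
            except OSError:
                continue  # unreadable file: skip it rather than abort the scan
    return sorted(hits)
```

Calling find_host_references(spool_dir, "deleted-host") with your own qmaster spool directory and the real host name prints nothing by itself; it returns the list of paths to inspect by hand. It does a plain substring match, so a short name such as node1 will also match node10. I would copy the whole spool area somewhere safe before removing anything.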

HTH,
John

On Sat, 2018-11-10 at 16:23 -0500, Daniel Povey wrote:
Has anyone found this error, and managed to fix it?
I am in a very difficult situation.
I deleted a host (qconf -de hostname) thinking that the machine no longer 
existed, but it did exist, and there was a job in 'dr' state there.
After I attempted to force-delete that job (qdel -f job-id), the queue master 
died with out-of-memory, and now I can't restart qmaster.

So now I don't know how to fix it.  Am I just completely lost now?

Dan

_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users

