Hi,

I've never seen this, but I would start with:

1) strace qmaster during restart to try to see at which point it is dying (e.g., loading a config file)
2) look for any reference to the name of the host you deleted in the spool area and do some cleanup
3) clean out the jobs spool area
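
For step 2, something along these lines might help locate leftover references. It is only a rough sketch: the function name find_host_refs is made up for illustration, the spool path in the usage comment assumes classic (plain-text) spooling under the default location, and the strace line assumes the usual sge_qmaster binary location. Adjust all of these to your installation, and back up the spool directory before changing anything.

```shell
#!/bin/sh
# Hypothetical helper (not part of Grid Engine): list files under a spool
# directory that mention a given host name.
find_host_refs() {
    host="$1"
    spool_dir="$2"
    # -r: recurse into subdirectories
    # -l: print only the names of matching files
    # -F: treat the host name as a fixed string, so dots are not wildcards
    grep -rlF -- "$host" "$spool_dir"
}

# Example usage (paths are assumptions, not taken from this thread):
#   find_host_refs deleted-host.example.com "$SGE_ROOT/$SGE_CELL/spool/qmaster"
#
# For step 1, a trace of the restart could be captured with something like:
#   strace -f -o /tmp/qmaster.strace "$SGE_ROOT/bin/<arch>/sge_qmaster"
# (-f follows child processes, -o writes the trace to a file; <arch> is a
# placeholder for your architecture directory).
```

The function prints one matching file per line and returns non-zero if nothing matches. With BerkeleyDB spooling the data is not plain text, so a grep like this would probably not be meaningful.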

HTH,
John

On Sat, 2018-11-10 at 16:23 -0500, Daniel Povey wrote:

Has anyone found this error, and managed to fix it?

I am in a very difficult situation. I deleted a host (qconf -de hostname) thinking that the machine no longer existed, but it did exist, and there was a job in 'dr' state there. After I attempted to force-delete that job (qdel -f job-id), the queue master died with out-of-memory, and now I can't restart qmaster. So now I don't know how to fix it. Am I just completely lost now?

Dan

_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users