Cron jobs submitted to the queue may not have started, so you may want to check yours. Mine did not start up on the queue automatically. Since the load average on all but two of the 15 queue machines was below 1.00, I have a feeling many other people's jobs didn't start up either (ten are now above 1.00). I do have to say thank you... A daily job that normally takes 3 hours took only 40 minutes this time. I like it when other people's jobs aren't running :)
I also noticed I had dead jobs from July and August on there. They definitely weren't there before. The one job I had running before the troubles started was also dead (its log files had stopped being updated). I deleted the dead jobs, manually started one, and the cron jobs came to life. So I don't know whether removing the dead ones or manually starting one got things going.

Bryan

On Sat, Oct 11, 2014 at 10:42 AM, Marc A. Pelletier <[email protected]> wrote:
> On 10/11/2014 12:13 AM, Bryan White wrote:
> > The queue seems to be dead. No cron jobs have started for at least 4
> > hours. Anything I try, I receive:
>
> There were two corrupt entries in the job database, one of which
> outright /killed/ the gridengine master. I was able to purge both
> entries, and things are back to normal.
>
> That said, I have no idea how those entries got corrupted to begin with;
> so I'll be keeping a close eye on things.
>
> -- Marc
>
>
> _______________________________________________
> Labs-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/labs-l
>
