On 12/29/15 8:22 PM, Andrew Bogott wrote:
On 12/29/15 7:01 PM, Bryan White wrote:
Nothing of mine has run on the queue for ~90 minutes.
Output of 'qstat -f':
error: commlib error: got select error (Connection refused)
error: unable to send message to qmaster using port 6444 on host
"tools-grid-master.tools.eqiad.wmflabs": got send error
About 12000 jobs were scheduled over the course of about 90 minutes,
and the grid is overwhelmed -- we're working on untangling the mess.
Oops, my mistake, that was 12000000 jobs.
Bryan
_______________________________________________
Labs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/labs-l