On 12/29/15 8:22 PM, Andrew Bogott wrote:
On 12/29/15 7:01 PM, Bryan White wrote:
nothing of mine has run on the queue for ~90 minutes.

Output of 'qstat -f':
error: commlib error: got select error (Connection refused)
error: unable to send message to qmaster using port 6444 on host "tools-grid-master.tools.eqiad.wmflabs": got send error

12000 or so jobs were scheduled over the course of about 90 minutes and the grid is overwhelmed -- we're working on untangling the mess.

Oops, my mistake, that was 12000000 jobs.


Bryan


_______________________________________________
Labs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/labs-l


