Hi,

We've been tracking a bug in Gerrit recently where all of the threads tasked with servicing the stream-events command eventually get stuck. This causes all of the CI systems, including OpenStack's, to stop responding to events until the server is manually restarted.

We recently found that this had happened with connections from the netapp-ci account. I believe that Gerrit should be more resilient to these kinds of errors; however, because of the severe impact on the project when this happens, I have disabled the netapp-ci account until we find a solution to the problem.

Note that the Gerrit upgrade scheduled for Saturday May 9 will bring a new SSH server with it, and may have an impact on this issue.

If the netapp-ci operators have a moment to chat with us in #openstack-infra on Freenode, that would probably be the best way to work out a plan to debug the problem further.

Thanks, and sorry for the inconvenience,

Jim

_______________________________________________
OpenStack-Infra mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra
