On 17 May 2015, at 13:36, Vijay Bellur <[email protected]> wrote:
> On 05/17/2015 02:32 PM, Vijay Bellur wrote:
>> [Adding gluster-devel]
>> On 05/16/2015 11:31 PM, Niels de Vos wrote:
>>> On Sat, May 16, 2015 at 06:32:00PM +0200, Niels de Vos wrote:
>>>> It seems that many failures of the regression tests (at least for
>>>> NetBSD) are caused by failing to reconnect to the slave. Jenkins tries
>>>> to keep a control connection open to the slaves, and reconnects when the
>>>> connection terminates.
>>>>
>>>> I do not know why the connection is disrupted, but I can see that
>>>> Jenkins is not able to resolve the hostname of the slave. For example,
>>>> from (well, you have to find the older logs, Jenkins seems to have
>>>> automatically reconnected)
>>>> http://build.gluster.org/computer/nbslave72.cloud.gluster.org-v2/log :
>>>>
>>>> java.io.IOException: There was a problem while connecting to
>>>> nbslave71.cloud.gluster.org:22
>>>> ...
>>>> Caused by: java.net.UnknownHostException:
>>>> nbslave71.cloud.gluster.org: Name or service not known
>>>>
>>>>
>>>> The error in the console log of the regression test is less helpful, it
>>>> only states the disconnection failure:
>>>>
>>>>
>>>> http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/5408/console
>>>>
>>>
>>> In fact, this looks very much related to these reports:
>>>
>>> - https://issues.jenkins-ci.org/browse/JENKINS-19619 duplicate of 18879
>>> - https://issues.jenkins-ci.org/browse/JENKINS-18879
>>>
>>> This problem should be fixed in Jenkins 1.524 and newer. Time to upgrade
>>> Jenkins too?
>>
>> Yes, I have started an upgrade. Please expect a downtime for Jenkins
>> during the upgrade.
>>
>> I will update once the activity is complete.
>>
>
> Upgrade to Jenkins v1.613 is now complete and Jenkins seems to be largely
> doing fine. Several plugins of Jenkins have also been updated to their latest
> versions. During the course of the upgrade, I noticed that we were using the
> deprecated 'gerrit approve' interface to intimate status of a smoke run. Have
> changed that to use 'gerrit review' and this seems to have addressed the
> problem of smoke tests not reporting status back to gerrit.
>
> There were a few instances of Jenkins not being able to launch slaves through
> ssh but was later successful upon automatic retries. We will need to watch
> this behavior to see if this problem persists and comes in the way of normal
> functioning.
>
> Manu - can you please verify and report back if the NetBSD slaves work better
> with the upgraded Jenkins master?
>
> All - please drop a note on gluster-infra if you happen to notice problems
> with Jenkins.

Good stuff. :)

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

_______________________________________________
Gluster-devel mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-devel
