[JIRA] [remoting] (JENKINS-26947) Unattended wait in the remoting code

[email protected] (JIRA) Thu, 12 Feb 2015 14:36:52 -0800

Issue Type:	Bug
Assignee:	Unassigned
Attachments:	Dockerfile, launch.sh, stacktrace.txt
Components:	remoting
Created:	12/Feb/15 10:36 PM
Description:	I found a way to trigger a remoting problem using TCP fault injection with netem. I'm able to trigger this wait call at hudson.remoting.Request.call(Request.java:146):

{{
while(response==null && !channel.isInClosed())
    // I don't know exactly when this can happen, as pendingCalls are cleaned up by Channel,
    // but in production I've observed that in rare occasion it can block forever, even after a channel
    // is gone. So be defensive against that.
    wait(30*1000);
}}

When this wait is triggered, the running build is stuck and consumes an executor. It loops over and over on the wait.

To reproduce, set up an SSH slave using the attached Dockerfile, then set up netem on the docker0 bridge like this:

    tc qdisc add dev docker0 root netem
    tc qdisc change dev docker0 root netem corrupt 1

Run the job once before configuring netem: netem settings are applied to all network streams, so the build could otherwise fail while downloading Maven dependencies. I just launched a Maven build of an example project to trigger the problem. It might be a Maven-specific problem...

To remove the netem settings, just run:

    tc qdisc del dev docker0 root

I've attached the Dockerfile, the command I used to launch it, and a thread dump of a stuck Jenkins master.
Environment:	Linux
Project:	Jenkins
Priority:	Minor
Reporter:	Yoann Dubreuil
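
For readers unfamiliar with the pattern, the wait loop quoted in the description can be sketched in isolation as follows. This is a minimal, hypothetical illustration and not the actual hudson.remoting code: the class PendingRequest, its methods, and the overall deadline parameter are all assumptions introduced here. It shows why a loop of this shape only exits when a response arrives or the channel is flagged closed. The timed wait merely re-arms itself, so if neither condition ever changes (as appears to happen with the corrupted stream) the caller waits forever unless some extra bound, such as the assumed deadline below, is added.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a pending remote call; not the real Request class.
class PendingRequest {
    private Object response;        // set when the peer answers
    private boolean channelClosed;  // set when the channel is known to be closed

    synchronized void onResponse(Object r) {
        response = r;
        notifyAll();
    }

    synchronized void onChannelClosed() {
        channelClosed = true;
        notifyAll();
    }

    /**
     * Waits in slices of sliceMillis, like the wait(30*1000) loop in the report,
     * but gives up once deadlineMillis has elapsed (an assumption of this sketch).
     * Returns the response, or null on deadline, channel close, or interrupt.
     */
    synchronized Object await(long sliceMillis, long deadlineMillis) {
        if (sliceMillis <= 0) {
            // wait(0) would block indefinitely, so reject non-positive slices.
            throw new IllegalArgumentException("sliceMillis must be positive");
        }
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(deadlineMillis);
        while (response == null && !channelClosed) {
            long remaining = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
            if (remaining <= 0) {
                return null; // without this check the loop would simply wait again
            }
            try {
                wait(Math.min(sliceMillis, remaining));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return null;
            }
        }
        return response;
    }
}
```

With this sketch, calling await(10, 50) on a request that never receives a response returns null after roughly 50 ms, whereas the same loop without the deadline check would keep waiting in 10 ms slices indefinitely.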

