[JIRA] (JENKINS-56735) Builds hanging after pod start in version 1.14.9
Jesse Glick assigned an issue to Unassigned
Jenkins / JENKINS-56735 Builds hanging after pod start in version 1.14.9
Change By: Jesse Glick
Assignee: Carlos Sanchez
This message was sent by Atlassian Jira (v7.11.2#711002-sha1:fdc329d)
--
You received this message because you are subscribed to the Google Groups "Jenkins Issues" group. To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-issues+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-issues/JIRA.198376.1553505622000.12779.1563306201292%40Atlassian.JIRA. For more options, visit https://groups.google.com/d/optout.
[JIRA] (JENKINS-56735) Builds hanging after pod start in version 1.14.9
Jesse Glick updated an issue
Jenkins / JENKINS-56735 Builds hanging after pod start in version 1.14.9
Change By: Jesse Glick
Labels: jenkins kuberenetes-plugin kubernetes plugin
[JIRA] (JENKINS-56735) Builds hanging after pod start in version 1.14.9
Karol Gil commented on JENKINS-56735
Re: Builds hanging after pod start in version 1.14.9
We thought so, and we tested the fixed version 1.15.5 last week. Unfortunately, we still see a lot of "Interrupted while waiting for websocket connection, you should increase the Max connections to Kubernetes API" errors in the logs. We reverted to 1.14.9 and everything works smoothly once again.
[JIRA] (JENKINS-56735) Builds hanging after pod start in version 1.14.9
Jesse Glick commented on JENKINS-56735
Re: Builds hanging after pod start in version 1.14.9
Duplicate of JENKINS-55392 perhaps?
[JIRA] (JENKINS-56735) Builds hanging after pod start in version 1.14.9
Karol Gil updated an issue
Jenkins / JENKINS-56735 Builds hanging after pod start in version 1.14.9
Change By: Karol Gil

We updated the kubernetes plugin from version 1.14.3 to version 1.14.9. On version 1.14.3 everything ran smoothly, but after the upgrade to 1.14.9, when Jenkins is under heavy load (starting/running ~100 jobs/pods), we observe that builds get stuck right after pod start or after Git checkout (the first step of our test pipelines).

We have a step timeout of 30 minutes, and the jobs that were stuck could not be killed and remained stuck with the following log:
{code:java}
 > git checkout -f 073a008e8e8fdad44c3d637a8b9e5995277724ae
Cancelling nested steps due to timeout
Body did not finish within grace period; terminating with extreme prejudice
{code}
Only a hard kill by calling {{POST BUILD_URL/kill}} stopped the build.

Interestingly, some of those builds did fail and asked us to change the max connections to the k8s API:
{code:java}
Caused: java.io.IOException: Interrupted while waiting for websocket connection, you should increase the Max connections to Kubernetes API
{code}
We increased the setting gradually up to 6 (sic!) and it did not solve the issue in any way. On other Jenkins instances running under lighter load there is no sign of the issue whatsoever.

Can it be somehow related to the expiring kubernetes clients introduced in 1.14.5? I'm guessing that may be the case, because everything works smoothly for some time after a Jenkins restart (usually ~24 hours, but we had shorter time frames as well) and then everything gets stuck. After killing all jobs and restarting Jenkins everything went back to normal, again for a finite time. Downgrading to version 1.14.3 solved all the issues.

Please let me know if any additional information should be provided.
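For readers who need to script the hard kill mentioned in the report (POST BUILD_URL/kill), the following is a minimal sketch and not something taken from the issue itself. It only builds the request; the host name, user, and token are hypothetical placeholders, and it assumes the Jenkins instance accepts HTTP Basic authentication with a user API token. Instances configured differently (for example, requiring a CSRF crumb for the credentials in use) would need additional headers.

```python
import base64
import urllib.request


def build_kill_request(build_url, user, api_token):
    """Build (but do not send) a POST request to <build_url>/kill.

    Assumption: the instance accepts Basic auth with a user API token.
    """
    # Normalise the trailing slash so we never produce ".../42//kill".
    kill_url = build_url.rstrip("/") + "/kill"
    credentials = base64.b64encode(f"{user}:{api_token}".encode()).decode()
    return urllib.request.Request(
        kill_url,
        data=b"",  # empty body; the endpoint only needs the POST itself
        method="POST",
        headers={"Authorization": f"Basic {credentials}"},
    )


if __name__ == "__main__":
    # Hypothetical build URL and credentials, for illustration only.
    req = build_kill_request(
        "https://jenkins.example.com/job/demo/42/", "alice", "hypothetical-token"
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req, timeout=30)
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers before touching a live build.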
[JIRA] (JENKINS-56735) Builds hanging after pod start in version 1.14.9
Karol Gil created an issue
Jenkins / JENKINS-56735 Builds hanging after pod start in version 1.14.9
Issue Type: Bug
Assignee: Carlos Sanchez
Components: kubernetes-plugin
Created: 2019-03-25 09:20
Environment:
EKS version 1.11.8
Jenkins version 2.164.1
Kubernetes Plugin version 1.14.9
Java version 8
Jenkins installation from Helm chart
Jenkins running with the following JVM args: -Dkubernetes.websocket.ping.interval=3 -Dkubernetes.websocket.timeout=1
These were introduced due to reoccurring socket timeouts and did solve the issue.
Labels: kuberenetes-plugin kubernetes plugin jenkins
Priority: Major
Reporter: Karol Gil

We updated the kubernetes plugin from version 1.14.3 to version 1.14.9. On version 1.14.3 everything ran smoothly, but after the upgrade to 1.14.9, when Jenkins is under heavy load (starting/running ~100 jobs/pods), we observe that builds get stuck right after pod start or after Git checkout (the first step of our test pipelines).

We have a step timeout of 30 minutes, and the jobs that were stuck could not be killed and remained stuck with the following log:
 > git checkout -f 073a008e8e8fdad44c3d637a8b9e5995277724ae
Cancelling nested steps due to timeout
Body did not finish within grace period; terminating with extreme prejudice