[JIRA] [core] (JENKINS-30014) Builds hang, Jenkins still browseable
stephenconnolly commented on JENKINS-30014 Re: Builds hang, Jenkins still browseable

I'm looking at

"jenkins.util.Timer [#7]" Id=42 Group=main TIMED_WAITING on java.util.concurrent.FutureTask@29799b77
    at sun.misc.Unsafe.park(Native Method)
    - waiting on java.util.concurrent.FutureTask@29799b77
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
    at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:422)
    at java.util.concurrent.FutureTask.get(FutureTask.java:199)
    at com.microsoftopentechnologies.azure.util.ExecutionEngine.executeWithRetry(ExecutionEngine.java:31)
    at com.microsoftopentechnologies.azure.AzureCloudRetensionStrategy.check(AzureCloudRetensionStrategy.java:74)
    at com.microsoftopentechnologies.azure.AzureCloudRetensionStrategy.check(AzureCloudRetensionStrategy.java:32)
    at hudson.slaves.ComputerRetentionWork$1.run(ComputerRetentionWork.java:70)
    at hudson.model.Queue._withLock(Queue.java:1286)
    at hudson.model.Queue.withLock(Queue.java:1169)
    at hudson.slaves.ComputerRetentionWork.doRun(ComputerRetentionWork.java:61)
    at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:51)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

    Number of locked synchronizers = 2
    - java.util.concurrent.locks.ReentrantLock$NonfairSync@b996b1b
    - java.util.concurrent.ThreadPoolExecutor$Worker@c28941a

as the culprit.
Which leads me to https://github.com/jenkinsci/azure-slave-plugin/blob/5202aa9a54d3a04b9ce6c3485fa3626ea9decd69/src/main/java/com/microsoftopentechnologies/azure/AzureCloudRetensionStrategy.java#L40

And oh look, another cloud retention strategy that pays no attention to locking correctly... It should probably be something closer to https://github.com/jenkinsci/mansion-cloud-plugin/blob/master/src/main/java/com/cloudbees/jenkins/plugins/mtslavescloud/MansionRetentionStrategy.java or https://github.com/jenkinsci/docker-plugin/blob/master/docker-plugin/src/main/java/com/nirima/jenkins/plugins/docker/strategy/DockerCloudRetentionStrategy.java

In any case, the current implementation blocks while it cleans out the slave, which is just plain wrong. So this is more than likely not a deadlock per se but a timeout in Azure requests, with all builds blocked until that operation succeeds. (Of course, slaves are likely being spun up and torn down in the background because the queue is not making progress, and the result is an effective livelock, helped along by an external system that is slow to respond.)
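To illustrate the point about not blocking while the queue lock is held, here is a minimal plain-Java sketch, not the Azure plugin's code and not the Jenkins API. `QUEUE_LOCK` stands in for the lock taken by `hudson.model.Queue.withLock`, and `Agent`, `deleteRemoteVm`, `CLEANUP_POOL` and `timedCheckUnderLock` are hypothetical names invented for this example. The idea is that `check()` only hands the slow cloud call to another thread and returns, instead of waiting on `FutureTask.get` as the stack trace above shows.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

public class NonBlockingRetentionSketch {

    // Stand-in (assumption) for the lock held around retention checks.
    static final ReentrantLock QUEUE_LOCK = new ReentrantLock();

    // Hypothetical pool for slow calls to the cloud provider; daemon threads
    // so the sketch never keeps the JVM alive.
    static final ExecutorService CLEANUP_POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "agent-cleanup");
        t.setDaemon(true);
        return t;
    });

    // Hypothetical model of a cloud agent (slave); not a Jenkins class.
    static final class Agent {
        private final long slowCallMillis;
        private final AtomicBoolean cleanupScheduled = new AtomicBoolean(false);
        private final CountDownLatch cleanedUp = new CountDownLatch(1);

        Agent(long slowCallMillis) {
            this.slowCallMillis = slowCallMillis;
        }

        boolean isIdle() {
            return true; // always idle in this sketch
        }

        // Simulates a slow request to an external system.
        void deleteRemoteVm() throws InterruptedException {
            Thread.sleep(slowCallMillis);
        }

        // Waits up to timeoutMillis for the background cleanup to finish.
        boolean awaitCleanup(long timeoutMillis) {
            try {
                return cleanedUp.await(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    // Called with QUEUE_LOCK held. Schedules the slow work at most once and
    // returns immediately; the return value is "minutes until the next check".
    static long check(Agent agent) {
        if (agent.isIdle() && agent.cleanupScheduled.compareAndSet(false, true)) {
            CLEANUP_POOL.submit(() -> {
                try {
                    agent.deleteRemoteVm();
                    agent.cleanedUp.countDown();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        return 1;
    }

    // Runs check() under the lock and reports how long the lock was held (ms).
    static long timedCheckUnderLock(Agent agent) {
        QUEUE_LOCK.lock();
        long start = System.nanoTime();
        try {
            check(agent);
        } finally {
            QUEUE_LOCK.unlock();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        Agent agent = new Agent(1000);
        System.out.println("lock held for ~" + timedCheckUnderLock(agent) + " ms");
        System.out.println("cleanup finished: " + agent.awaitCleanup(5000));
    }
}
```

With this shape the lock is held only for the time it takes to submit a task, so a slow or timing-out external request delays that one agent's cleanup rather than every build that needs the queue.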
[JIRA] [core] (JENKINS-30014) Builds hang, Jenkins still browseable
Jay Kah commented on JENKINS-30014 Re: Builds hang, Jenkins still browseable

Daniel Beck, stephenconnolly: Hi guys, hate to ping it, but maybe this thread got lost somewhere. Any thoughts on where we should dig? Thanks.

This message was sent by Atlassian JIRA (v6.4.2#64017-sha1:e244265) -- You received this message because you are subscribed to the Google Groups "Jenkins Issues" group. To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-issues+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.
[JIRA] [core] (JENKINS-30014) Builds hang, Jenkins still browseable
Jay Kah edited a comment on JENKINS-30014 Re: Builds hang, Jenkins still browseable

This seems to be the culprit:

"Handling POST /job/Kibana/build from 10.0.0.7 : RequestHandlerThread[#60]" Id=3534 Group=main WAITING on java.util.concurrent.locks.ReentrantLock$NonfairSync@33e946f9 owned by "jenkins.util.Timer [#9]" Id=43
[JIRA] [core] (JENKINS-30014) Builds hang, Jenkins still browseable
Jay Kah updated an issue

Jenkins / JENKINS-30014 Builds hang, Jenkins still browseable
Change By: Jay Kah
Priority: changed from Major to Critical
[JIRA] [core] (JENKINS-30014) Builds hang, Jenkins still browseable
Jay Kah commented on JENKINS-30014 Re: Builds hang, Jenkins still browseable

This seems to be the culprit:
[JIRA] [core] (JENKINS-30014) Builds hang, Jenkins still browseable
Jay Kah edited a comment on JENKINS-30014 Re: Builds hang, Jenkins still browseable

This seems to be the culprit:

"Handling POST /job/Kibana/build from 10.0.0.7 : RequestHandlerThread[#60]" Id=3534 Group=main WAITING on java.util.concurrent.locks.ReentrantLock$NonfairSync@33e946f9 owned by "jenkins.util.Timer [#9]" Id=43
    at sun.misc.Unsafe.park(Native Method)
    - waiting on java.util.concurrent.locks.ReentrantLock$NonfairSync@33e946f9
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
    at hudson.model.Queue.schedule2(Queue.java:589)
    at hudson.model.Queue.schedule2(Queue.java:712)
    at jenkins.model.ParameterizedJobMixIn.doBuild(ParameterizedJobMixIn.java:199)
    at hudson.model.AbstractProject.doBuild(AbstractProject.java:1753)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.kohsuke.stapler.Function$InstanceFunction.invoke(Function.java:298)
    at org.kohsuke.stapler.Function.bindAndInvoke(Function.java:161)
    at org.kohsuke.stapler.Function.bindAndInvokeAndServeResponse(Function.java:96)
    at org.kohsuke.stapler.MetaClass$1.doDispatch(MetaClass.java:121)
    at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:53)
    at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:746)
    at org.kohsuke.stapler.Stapler.invoke(Stapler.java:876)
    at org.kohsuke.stapler.MetaClass$6.doDispatch(MetaClass.java:249)
    at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:53)
    at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:746)
    at org.kohsuke.stapler.Stapler.invoke(Stapler.java:876)
    at org.kohsuke.stapler.Stapler.invoke(Stapler.java:649)
    at org.kohsuke.stapler.Stapler.service(Stapler.java:238)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:686)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1494)
    at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:132)
    at org.jenkinsci.plugins.modernstatus.ModernStatusFilter.doFilter(ModernStatusFilter.java:52)
    at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:129)
    at hudson.util.PluginServletFilter.doFilter(PluginServletFilter.java:123)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
    at hudson.security.csrf.CrumbFilter.doFilter(CrumbFilter.java:49)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
    at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:84)
    at hudson.security.UnwrapSecurityExceptionFilter.doFilter(UnwrapSecurityExceptionFilter.java:51)
    at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
    at jenkins.security.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:117)
    at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
    at org.acegisecurity.providers.anonymous.AnonymousProcessingFilter.doFilter(AnonymousProcessingFilter.java:125)
    at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
    at org.acegisecurity.ui.rememberme.RememberMeProcessingFilter.doFilter(RememberMeProcessingFilter.java:142)
    at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
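The useful part of a dump like the one above is the "WAITING on ... owned by ..." header, since it names the thread that holds the lock. As a rough sketch of how that pairing can be pulled out programmatically, the following uses the standard `java.lang.management` API (`ThreadMXBean.dumpAllThreads`, `ThreadInfo.getLockName`, `ThreadInfo.getLockOwnerName`). The class name `LockOwnerReport`, the helper `demoReport` and the thread names `demo-owner` and `demo-waiter` are made up for illustration; this is not how Jenkins produces its thread dump page.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class LockOwnerReport {

    // Lists every thread that is waiting on a lock with a known owner,
    // in a form similar to the header line of a thread dump entry.
    static List<String> waitersWithOwners() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
            if (info.getLockName() != null && info.getLockOwnerName() != null) {
                report.add("\"" + info.getThreadName() + "\" " + info.getThreadState()
                        + " on " + info.getLockName()
                        + " owned by \"" + info.getLockOwnerName() + "\"");
            }
        }
        return report;
    }

    // Hypothetical demo: one thread holds a ReentrantLock while another
    // queues up behind it, then the report is taken and the lock released.
    static List<String> demoReport() {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread owner = new Thread(() -> {
            lock.lock();
            try {
                held.countDown();
                release.await(); // keep holding the lock until told to let go
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        }, "demo-owner");
        Thread waiter = new Thread(() -> {
            lock.lock();
            lock.unlock();
        }, "demo-waiter");
        owner.setDaemon(true);
        waiter.setDaemon(true);

        List<String> report = new ArrayList<>();
        try {
            owner.start();
            held.await();
            waiter.start();
            // Poll until the waiter is actually parked on the lock (max 5 s).
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (System.nanoTime() < deadline) {
                report = waitersWithOwners();
                if (report.stream().anyMatch(s -> s.contains("demo-waiter"))) {
                    break;
                }
                Thread.sleep(20);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            release.countDown();
        }
        return report;
    }

    public static void main(String[] args) {
        demoReport().forEach(System.out::println);
    }
}
```

Once the owner is known, its own stack (here the "jenkins.util.Timer" thread) is the place to look for whatever is keeping the lock held.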
[JIRA] [core] (JENKINS-30014) Builds hang, Jenkins still browseable
Daniel Beck assigned an issue to stephenconnolly

stephenconnolly Any idea? Could this be related to your queue fixes?

Jenkins / JENKINS-30014 Builds hang, Jenkins still browseable
Change By: Daniel Beck
Assignee: stephenconnolly
[JIRA] [core] (JENKINS-30014) Builds hang, Jenkins still browseable
Jay Kah created an issue

Jenkins / JENKINS-30014 Builds hang, Jenkins still browseable
Issue Type: Bug
Assignee: Unassigned
Attachments: jenkins-threaddump.txt
Components: core
Created: 18/Aug/15 8:12 PM
Environment:
    DISTRIB_ID=Ubuntu
    DISTRIB_RELEASE=14.04
    DISTRIB_CODENAME=trusty
    DISTRIB_DESCRIPTION=Ubuntu 14.04.2 LTS
    3.13.0-46-generic
    java version 1.7.0_79
    OpenJDK Runtime Environment (IcedTea 2.5.5) (7u79-2.5.5-0ubuntu0.14.04.2)
    OpenJDK 64-Bit Server VM (build 24.79-b02, mixed mode)
    Jenkins ver. 1.624
Priority: Major
Reporter: Jay Kah

At times, Jenkins jobs running on one of the slaves get stuck showing "estimated Time Remaining: NA" in the Dashboard sidebar. However, when the job is accessed directly, the correct status is displayed (job finished). This happens for both successful and unsuccessful builds.