[JIRA] (JENKINS-50405) runATH leads to deadlock of resource consumption for core PR builds

2018-04-06 Thread rarabaol...@cloudbees.com (JIRA)
Raul Arabaolaza updated JENKINS-50405

Jenkins / JENKINS-50405
runATH leads to deadlock of resource consumption for core PR builds

Change By: Raul Arabaolaza
Status: In Review -> Resolved
Resolution: Fixed

This message was sent by Atlassian JIRA (v7.3.0#73011-sha1:3c73d0e)

-- 
You received this message because you are subscribed to the Google Groups "Jenkins Issues" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-issues+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[JIRA] (JENKINS-50405) runATH leads to deadlock of resource consumption for core PR builds

2018-04-06 Thread ty...@monkeypox.org (JIRA)
R. Tyler Croy commented on JENKINS-50405

I believe this is safe to close up now


[JIRA] (JENKINS-50405) runATH leads to deadlock of resource consumption for core PR builds

2018-04-03 Thread rarabaol...@cloudbees.com (JIRA)
Raul Arabaolaza commented on JENKINS-50405

The PR is merged and ATH is working again, see here. Also, according to the logs, only three nodes are used (as expected): one for Linux, another for Windows, and the last one for ATH, so it seems the contention issue is also fixed.

I am going to keep this in review for two or three days just in case, and close it if no problems arise.


[JIRA] (JENKINS-50405) runATH leads to deadlock of resource consumption for core PR builds

2018-04-02 Thread rarabaol...@cloudbees.com (JIRA)
Raul Arabaolaza commented on JENKINS-50405

PR#34 should fix the issue. Basically, what I have done is change ensureInNode so that it takes a comma-separated list of labels and checks that each of those labels is individually present on the current node; if not, it allocates a new node with the labels joined by "&&".

Caveat: it is still unable to deal with complex label expressions, but that functionality is not needed to run on the current infra, as you can simply enclose the entire runATH call in a node("docker&") as the Jenkinsfile for core does. With the changes in #34, no node allocation will be done at all and no node will be blocked waiting for another one.
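The label handling described in this comment can be modelled roughly as follows. This is a minimal sketch in Python rather than the shared library's Groovy, and it is not the code from PR#34: the function name `node_to_allocate` and its return convention are hypothetical, and it assumes NODE_LABELS is a space-delimited list of label atoms.

```python
from typing import Optional


def node_to_allocate(node_labels: str, wanted_csv: str) -> Optional[str]:
    """Hypothetical model of the revised check.

    node_labels: value of NODE_LABELS on the current node (space-delimited atoms).
    wanted_csv:  comma-separated list of required labels.
    Returns None when the current node already carries every label,
    otherwise the label expression to request for a new node.
    """
    wanted = [label.strip() for label in wanted_csv.split(",") if label.strip()]
    present = set(node_labels.split())
    if all(label in present for label in wanted):
        return None  # already on a suitable node, no nested allocation
    return "&&".join(wanted)  # labels joined by "&&" for the new allocation


# Current node has both labels: nothing to allocate.
print(node_to_allocate("docker highmem linux", "docker, highmem"))  # None

# Current node lacks them: request a node matching both labels.
print(node_to_allocate("linux", "docker,highmem"))  # docker&&highmem
```

Because each label is compared as a whole atom, the check passes whenever the enclosing node already carries every requested label, which is what avoids the second, nested allocation.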


[JIRA] (JENKINS-50405) runATH leads to deadlock of resource consumption for core PR builds

2018-04-02 Thread rarabaol...@cloudbees.com (JIRA)
Raul Arabaolaza commented on JENKINS-50405

In the meantime, I can just make ensureInNode accept a list of labels and check that all of them are present on the current node.

Andrew Bayer, do you believe an ensureInNode step able to deal with label expressions could be an interesting addition to the workflow-durable-task-step plugin?


[JIRA] (JENKINS-50405) runATH leads to deadlock of resource consumption for core PR builds

2018-03-29 Thread andrew.ba...@gmail.com (JIRA)
Andrew Bayer commented on JENKINS-50405

So it appears that ensureInNode in runATH.groovy doesn't handle label expressions (e.g., docker && highmem), just label atoms (e.g., highmem), since it's looking for the literal string docker && highmem in the NODE_LABELS environment variable, which is just a space-delimited list of the individual label atoms on the node. So, for example, if the only two labels on the node are docker and highmem, then NODE_LABELS is docker highmem, which obviously doesn't contain docker && highmem.

This doesn't create a deadlock per se, but it does double up the executor usage per run, with one executor nested within another.

I'm not sure what the solution is, exactly. In this particular case it's actually pretty simple: just switch to highmem alone, since that'll get you the same thing as docker && highmem. But some smarter logic for determining what node you're on and what node you want to be on would be handy. However, that probably involves diving into the core label logic to do parsing, comparing, etc., and that is not shared library material.
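The mismatch described in the comment above can be reproduced in a few lines. This is only an illustrative sketch in Python, assuming the check amounts to a verbatim substring search of NODE_LABELS; the helper name `literal_check` is hypothetical and does not come from runATH.groovy.

```python
def literal_check(node_labels: str, wanted: str) -> bool:
    """Assumed model of the failing check: search NODE_LABELS for the wanted string verbatim."""
    return wanted in node_labels


# NODE_LABELS for a node whose only labels are docker and highmem.
node_labels = "docker highmem"

# A label expression never appears verbatim in the space-delimited list.
print(literal_check(node_labels, "docker && highmem"))  # False

# A single label atom does appear, so the check passes.
print(literal_check(node_labels, "highmem"))  # True
```

Under this assumption, a request for docker && highmem always looks unsatisfied, even on a node that has both labels, so a second executor gets requested.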


[JIRA] (JENKINS-50405) runATH leads to deadlock of resource consumption for core PR builds

2018-03-26 Thread scm_issue_l...@java.net (JIRA)
SCM/JIRA link daemon commented on JENKINS-50405

Code changed in jenkins
User: R. Tyler Croy
Path: Jenkinsfile
http://jenkins-ci.org/commit/jenkins/0ca03d89c7e3a2b7855965e79b84dac2c0052119
Log:
  Merge pull request #3371 from raul-arabaolaza/JENKINS-50405-Quick_fix

  JENKINS-50405 Run the entire thing in docker && highmem node

Compare: https://github.com/jenkinsci/jenkins/compare/9f599911f612...0ca03d89c7e3


[JIRA] (JENKINS-50405) runATH leads to deadlock of resource consumption for core PR builds

2018-03-26 Thread scm_issue_l...@java.net (JIRA)
SCM/JIRA link daemon commented on JENKINS-50405

Code changed in jenkins
User: Raul Arabaolaza
Path: Jenkinsfile
http://jenkins-ci.org/commit/jenkins/9f8b5d691e3d11d65625497a1b876e1d47c466d0
Log:
  JENKINS-50405 Run the entire thing in docker && highmem node


[JIRA] (JENKINS-50405) runATH leads to deadlock of resource consumption for core PR builds

2018-03-26 Thread rarabaol...@cloudbees.com (JIRA)
Raul Arabaolaza started work on JENKINS-50405

Change By: Raul Arabaolaza
Status: Open -> In Progress


[JIRA] (JENKINS-50405) runATH leads to deadlock of resource consumption for core PR builds

2018-03-26 Thread rarabaol...@cloudbees.com (JIRA)
Raul Arabaolaza updated JENKINS-50405

Jenkins / JENKINS-50405
runATH leads to deadlock of resource consumption for core PR builds

Change By: Raul Arabaolaza
Status: In Progress -> In Review


[JIRA] (JENKINS-50405) runATH leads to deadlock of resource consumption for core PR builds

2018-03-26 Thread ty...@monkeypox.org (JIRA)
R. Tyler Croy updated an issue

Jenkins / JENKINS-50405
runATH leads to deadlock of resource consumption for core PR builds

Change By: R. Tyler Croy
Component/s: essentials
Sprint: Essentials - Milestone 1


[JIRA] (JENKINS-50405) runATH leads to deadlock of resource consumption for core PR builds

2018-03-26 Thread rarabaol...@cloudbees.com (JIRA)
Raul Arabaolaza commented on JENKINS-50405

So, after a talk with R. Tyler Croy, we are not going to disable this yet. I am going to start thinking of a better way to orchestrate nodes so that I can minimize resource contention. It seems to have been running properly all week, and the problem was triggered by a bunch of very quick merges into core.


[JIRA] (JENKINS-50405) runATH leads to deadlock of resource consumption for core PR builds

2018-03-26 Thread rarabaol...@cloudbees.com (JIRA)
Raul Arabaolaza commented on JENKINS-50405

So, as a quick fix, I am going to create a quick PR to simply not run the ATH for the moment. As a better solution, I have to find a way to free the "linux" node while it is waiting for the docker ones, if possible; if not, just make sure the full runATH is executed on docker&


[JIRA] (JENKINS-50405) runATH leads to deadlock of resource consumption for core PR builds

2018-03-26 Thread ty...@monkeypox.org (JIRA)
R. Tyler Croy created an issue

Jenkins / JENKINS-50405
runATH leads to deadlock of resource consumption for core PR builds

Issue Type: Bug
Assignee: Raul Arabaolaza
Components: acceptance-test-harness
Created: 2018-03-26 14:28
Priority: Major
Reporter: R. Tyler Croy

This weekend we experienced a denial-of-service on ci.jenkins.io due to resource contention caused by the runATH step in the core Jenkinsfile.

Basically, an executor on the "linux" label was occupied while blocking and waiting for an executor on "docker&". When Jenkins couldn't provision "highmem" due to capacity issues, the runATH step blocked the "linux" executor indefinitely.

At the bottom of the Jenkinsfile for core is some code along these lines:

 

node('linux') {
  /* some setup */
  runATH()
}
 

In runATH(), the first ensureInNode statement ensures that the Pipeline only uses one node, since the execution is already on a node whose NODE_LABELS include "linux". When the second ensureInNode executes, it attempts to ensure that the execution is in docker&, which of course it is not. This causes Pipeline to block waiting for such a node while still occupying the outer "linux" node allocation.

This is kind of a big problem and will cause additional resource contention whenever more than one or two core PRs are merged in quick succession.
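The contention pattern can be illustrated with a rough static snapshot. This is a sketch in Python under simplifying assumptions: every build first takes one "linux" executor and then needs one "highmem" executor while still holding the first. The function name `executor_snapshot` and the executor counts are hypothetical and do not reflect the real capacity of ci.jenkins.io.

```python
def executor_snapshot(builds: int, linux_executors: int, highmem_executors: int) -> dict:
    """Count builds in each state when inner allocations nest inside outer ones."""
    # Builds that obtained the outer "linux" executor.
    holding_linux = min(builds, linux_executors)
    # Of those, builds that also obtained the inner "highmem" executor.
    running_ath = min(holding_linux, highmem_executors)
    return {
        "running_ath": running_ath,
        # These keep their "linux" executor while waiting for "highmem".
        "blocked_holding_linux": holding_linux - running_ath,
        # These cannot even start because no "linux" executor is free.
        "queued_for_linux": builds - holding_linux,
    }


# Hypothetical burst: five builds, four linux executors, no highmem capacity.
print(executor_snapshot(builds=5, linux_executors=4, highmem_executors=0))
# {'running_ath': 0, 'blocked_holding_linux': 4, 'queued_for_linux': 1}
```

With no "highmem" capacity, every "linux" executor ends up held by a build that cannot make progress, which matches the denial-of-service described above.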