Re: Multiple Pipelines in the queue using same executors and one goes offline

'Brownjay' via Jenkins Users Thu, 15 Feb 2018 12:38:24 -0800


I wanted to expand on, and partly answer, the questions I asked in the message quoted below.


   - One solution, which is probably the best one, is to add more A and B 
   resources dynamically. If you can spin up new nodes on demand then you're 
   guaranteed to have online, idle nodes which Jenkins can just take and use 
   in the pipeline. For me, though, nodes A and B are actual hardware, so I 
   can't add or remove them dynamically.
   - Another is to put a timeout around the whole pipeline so its runtime is 
   always capped. Then, when a node goes offline, the pipeline won't wait 
   indefinitely; it will be aborted once the timeout is reached.
   - The way I'm going to do it is with a while loop. On every iteration I'll 
   refresh the node list (adding nodes that have come online, removing nodes 
   that have gone offline) and check whether ANY of the nodes are busy 
   (running). If any are busy, I'll wait for some time and attempt the loop 
   again. I'll probably cap it at some max runtime, too.
   - Another way to do it, I think, is to use a while loop as above but, 
   instead of waiting for ALL resources to be available, create the branches 
   map only for the resources that are available (online and not running) on 
   that iteration and run just those with parallel, repeating until there is 
   nothing left to run. In this case, though, you'd have to wait for each 
   parallel to complete before you could attempt the next set of parallel 
   runs (as far as I know).
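
To make the last two ideas a bit more concrete, here is a rough sketch of the loop in plain Groovy. It is not tested inside Jenkins and it is not a Jenkins API: runInWaves, statusOf and runWave are names I made up for illustration. statusOf is assumed to return 'idle', 'busy' or 'offline' for a node name (in a real pipeline it could probably be backed by the node's Computer object, e.g. isOffline() and countBusy(), which would need script approval), and runWave is assumed to build the branches map for the given nodes and call parallel.

```groovy
// Sketch only: plain Groovy, no Jenkins classes, so it runs in any Groovy shell.
// statusOf and runWave are hypothetical callbacks, not Jenkins APIs.
def runInWaves(List<String> nodeNames, Closure statusOf, Closure runWave, int maxPolls = 10) {
    def pending = new ArrayList<String>(nodeNames)
    def done = []
    def skipped = []
    int polls = 0

    while (pending && polls < maxPolls) {
        polls++
        // Look up each node once per iteration so the decisions below are consistent.
        def states = pending.collectEntries { [(it): statusOf(it)] }

        // Nodes that went offline are dropped instead of being waited on forever.
        def offline = pending.findAll { states[it] == 'offline' }
        skipped.addAll(offline)
        pending.removeAll(offline)

        // Run only the nodes that are idle right now; busy ones stay pending.
        def idle = pending.findAll { states[it] == 'idle' }
        if (idle) {
            runWave(idle)   // in a pipeline: build the branches map and call parallel
            done.addAll(idle)
            pending.removeAll(idle)
        }
        // In a pipeline you would sleep here before polling again when nothing ran.
    }
    // Anything still pending was busy on every poll up to the cap.
    return [done: done, skipped: skipped, timedOut: pending]
}

// Example: B is offline and A is idle, so only A is handed to runWave.
def result = runInWaves(['A', 'B'],
                        { it == 'B' ? 'offline' : 'idle' },
                        { wave -> println "running ${wave}" })
println result
```

The maxPolls cap plays the role of the "max runtime" mentioned above. A wall-clock limit via the timeout step around the whole loop would work as well, and the two could be combined.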


On Monday, February 12, 2018 at 7:06:39 PM UTC-7, Brownjay wrote:
>
> Hi,
>
> I feel like this is certainly a situation someone has run into before but 
> I haven't been able to think up a non-trivial solution for my use case:
>
>    - Let's say I have two pipelines, P1 and P2, that do stuff and both 
>    need the same two executors/slaves/nodes A and B
>    - P1 enters the queue first and takes both A and B and starts doing 
>    its stuff in parallel (using parallel)
>    - Then P2 enters the queue while A and B are being used, still, by P1; 
>    A and B are still online but they're unavailable for P2 to use (you'd see 
>    it waiting for the executors to be available in the console, for example)
>    - However, B fails, and the node is taken offline on failure
>    - P1 completes, with A eventually passing and B having failed and taken 
>    its node offline.  Perhaps, depending on timing, P2 started using A once 
>    A passed.
>    - P2 is now waiting forever until node B is brought online again
>
> How can one check in the pipeline P2 that node B is offline and just break 
> out?  
>
> If the node was offline at the start of P2 then it's easy to check and 
> exclude it.  However, suppose B is online when P2 enters the queue, P2 sets 
> up the parallel runs of A and B and sits waiting for them to become 
> available, and then one of them goes offline.  How does the pipeline get 
> notified that the node is offline so it can move on and do whatever?  That 
> doesn't seem to happen automatically, and I can't figure out how to check 
> from inside a node block whether that node itself is offline (B checking 
> if it is itself offline).
>
> Here's a simple pipeline groovy script I made to help me figure out the 
> issue:
>
> // Branches for parallel node runs
> def branches = [:]
> // Nodes.  In real setup this would only contain nodes that are online
> // at the time the pipeline runs
> def node_names = ["A", "B"]
> // Short sleep time of 15 seconds.  Later, it'll get reduced to 5 if the
> // node name is B
> def sleep_time = 15
>
> // Loop through the nodes and create the data in the branch list to run
> // in parallel on at the end
> node_names.each { node_name ->
>     println node_name
>
>     branches["node_" + node_name] = {
>         //  Doing something like this doesn't do anything
>         //if (!isNodeOnline(node_name)) {
>         //    println "node name " + node_name + " is offline, returning"
>         //    return
>         //}
>
>         node(node_name) {
>             // If the node that is being looked at is B then set the sleep
>             // time to 5 so that it runs a shorter time than A.  Later, it's
>             // hardcoded to fail B and take it offline.  This way A stays in
>             // the queue running and B is done and offline.
>             def temp_sleep_time = sleep_time
>             if (node_name == "B") {
>                 temp_sleep_time = 5
>             }
>
>             timestamps {
>                 stage("pre-build") {
>                     println "Prebuilding " + node_name + "!"
>                     sleep time: temp_sleep_time, unit: 'SECONDS'
>                     println "Done with pre-build!"
>                 }
>                 stage("build") {
>                     println "Building " + node_name + "!"
>                     sleep time: temp_sleep_time, unit: 'SECONDS'
>                     println "Done with build!"
>                 }
>                 stage("post-build") {
>                     println "Post building " + node_name + "!"
>                     if (node_name == "B") {
>                         println "Taking node offline and failing build!"
>                         takeNodeOffline("Derp", "B")
>                         currentBuild.result = "FAILURE"
>                         done = true
>                         return
>                     }
>                     sleep time: temp_sleep_time, unit: 'SECONDS'
>                     println "Done with post-build!"
>                 }
>             }
>         }
>     }
> }
> parallel branches
>
> Is this possible? Am I missing something obvious?
>

To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-users/2556dbce-5083-43b0-9013-c29189100b87%40googlegroups.com.