[JIRA] (JENKINS-61304) EC2 Plugin terminates instances temporarily offline

2020-04-22 Thread umberto.nicole...@gmail.com (JIRA)
Umberto Nicoletti commented on JENKINS-61304

Re: EC2 Plugin terminates instances temporarily offline

Ara Yapejian, FYI: I've put together a PR: https://github.com/jenkinsci/ec2-plugin/pull/437



[JIRA] (JENKINS-61304) EC2 Plugin terminates instances temporarily offline

2020-03-02 Thread umberto.nicole...@gmail.com (JIRA)
Umberto Nicoletti created an issue

Jenkins / JENKINS-61304
EC2 Plugin terminates instances temporarily offline

Issue Type: Bug
Assignee: FABRIZIO MANFREDI
Components: ec2-plugin
Created: 2020-03-03 07:18
Environment: Jenkins 2.204.2, EC2 Plugin 1.49.1
Priority: Minor
Reporter: Umberto Nicoletti

The latest version of the plugin added code to catch instances that were started but then failed to connect to Jenkins. Previously such instances would stay up forever, incurring unnecessary costs. The code block that adds this functionality is: https://github.com/jenkinsci/ec2-plugin/blob/master/src/main/java/hudson/plugins/ec2/EC2RetentionStrategy.java#L167-L185

Note how the termination branch is guarded by the clause `computer.isOffline()`, which is unfortunately too broad. Looking up the definition of `computer.isOffline()` yields this code: `@Exported public boolean isOffline() { return temporarilyOffline || getChannel()==null; }`

This reveals that `temporarilyOffline` is considered too. As a result, an instance that has successfully connected but is, for example, under temporary maintenance will be abruptly terminated by the plugin. AFAICT there is no simple workaround, besides perhaps rewriting `if (computer.isOffline()){` to `if (computer.getChannel()==null){`.

I'll try to submit a patch ASAP.
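
To make the difference between the two guards concrete, here is a minimal, self-contained sketch. It does not use the real Jenkins classes: `FakeComputer`, `OfflineGuardSketch`, `broadGuard` and `narrowGuard` are hypothetical names invented for illustration, and only the body of `isOffline()` mirrors the definition quoted in the report.

```java
// Hypothetical stand-in for the relevant part of a Jenkins computer.
// This is NOT hudson.model.Computer; only isOffline() mirrors the quoted code.
class FakeComputer {
    private final boolean temporarilyOffline;
    private final Object channel; // null models "agent has no connection"

    FakeComputer(boolean temporarilyOffline, Object channel) {
        this.temporarilyOffline = temporarilyOffline;
        this.channel = channel;
    }

    Object getChannel() {
        return channel;
    }

    // Same expression as the definition quoted in the issue.
    boolean isOffline() {
        return temporarilyOffline || getChannel() == null;
    }
}

class OfflineGuardSketch {
    // Guard as described in the issue: also true for agents that are
    // connected but marked temporarily offline.
    static boolean broadGuard(FakeComputer c) {
        return c.isOffline();
    }

    // Guard proposed in the issue: true only when there is no channel.
    static boolean narrowGuard(FakeComputer c) {
        return c.getChannel() == null;
    }

    public static void main(String[] args) {
        // Connected agent put under temporary maintenance.
        FakeComputer maintenance = new FakeComputer(true, new Object());
        // Agent that was started but never connected.
        FakeComputer neverConnected = new FakeComputer(false, null);

        System.out.println("maintenance:    broad=" + broadGuard(maintenance)
                + " narrow=" + narrowGuard(maintenance));
        System.out.println("neverConnected: broad=" + broadGuard(neverConnected)
                + " narrow=" + narrowGuard(neverConnected));
    }
}
```

In this sketch both guards fire for the agent that never connected, but only the broad guard fires for the agent under maintenance, which is the case the report describes as being terminated by mistake.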
 

  
 
 
 
 

 
 
   

[JIRA] (JENKINS-57215) Plugin starts a worker and might immediately stop it, because of cached EC2Computer.getUptime()

2019-04-29 Thread umberto.nicole...@gmail.com (JIRA)
Umberto Nicoletti created an issue

Jenkins / JENKINS-57215
Plugin starts a worker and might immediately stop it, because of cached EC2Computer.getUptime()

Issue Type: Bug
Assignee: Umberto Nicoletti
Components: ec2-plugin
Created: 2019-04-29 08:35
Environment: Jenkins ver. 2.164.2, Amazon EC2 plugin 1.42
Priority: Minor
Reporter: Umberto Nicoletti

AFAIU there's a race condition in EC2RetentionStrategy.java#L99: a few lines below, the call to `computer.getUptime()` will return a cached value, whereas `computer.getState();` will not. If the worker was just started, the uptime may be calculated from the previous launch time rather than the current one, while the state correctly reports running. As a result of this inconsistency the plugin will end up stopping the instance, because the time elapsed since the previous launch is most likely longer than the idle timeout (which for us is 60 minutes).

This does not happen often. Perhaps we can just change `computer.getUptime()` to return the actual value rather than a cached value? Ideally, calls to `computer` methods should return a consistent view for all getters.

I'm willing to provide a PR if someone could provide guidance on the suggested solution. Thanks!
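
The inconsistency described above can be reproduced in isolation with a small sketch. Everything here is an assumption made for illustration: `FakeInstance`, `FakeEc2Computer`, `UptimeRaceSketch` and `wouldStop` are hypothetical names, the stop condition is a deliberately simplified version of an idle-timeout check, and none of it is the plugin's actual code.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the instance as the cloud API would report it.
class FakeInstance {
    String state;
    long launchTimeMillis;

    FakeInstance(String state, long launchTimeMillis) {
        this.state = state;
        this.launchTimeMillis = launchTimeMillis;
    }
}

// Hypothetical stand-in for the computer object; NOT the real EC2Computer.
class FakeEc2Computer {
    private final FakeInstance live;
    private final long cachedLaunchTimeMillis; // captured once, never refreshed

    FakeEc2Computer(FakeInstance live) {
        this.live = live;
        this.cachedLaunchTimeMillis = live.launchTimeMillis;
    }

    // Always reads the live state.
    String getState() {
        return live.state;
    }

    // Uptime based on the stale, cached launch time.
    long getCachedUptimeMillis(long nowMillis) {
        return nowMillis - cachedLaunchTimeMillis;
    }

    // Uptime based on the live launch time.
    long getFreshUptimeMillis(long nowMillis) {
        return nowMillis - live.launchTimeMillis;
    }
}

class UptimeRaceSketch {
    static final long IDLE_TIMEOUT_MILLIS = TimeUnit.MINUTES.toMillis(60);

    // Simplified, assumed stop condition: running and up longer than the timeout.
    static boolean wouldStop(String state, long uptimeMillis) {
        return "running".equals(state) && uptimeMillis >= IDLE_TIMEOUT_MILLIS;
    }

    public static void main(String[] args) {
        FakeInstance instance = new FakeInstance("stopped", 0L);
        FakeEc2Computer computer = new FakeEc2Computer(instance);

        // The worker is started again three hours after its first launch.
        long restart = TimeUnit.HOURS.toMillis(3);
        instance.state = "running";
        instance.launchTimeMillis = restart;

        // The retention check runs one minute after the restart.
        long now = restart + TimeUnit.MINUTES.toMillis(1);
        System.out.println("stop with cached uptime: "
                + wouldStop(computer.getState(), computer.getCachedUptimeMillis(now)));
        System.out.println("stop with fresh uptime:  "
                + wouldStop(computer.getState(), computer.getFreshUptimeMillis(now)));
    }
}
```

Reading the state and the launch time from the same source removes the mismatch in this sketch. Whether the plugin should refresh the cache or bypass it is the open question raised in the report.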