[JIRA] (JENKINS-61304) EC2 Plugin terminates instances temporarily offline
Umberto Nicoletti commented on JENKINS-61304, Re: EC2 Plugin terminates instances temporarily offline

Ara Yapejian FYI, I've put together a PR: https://github.com/jenkinsci/ec2-plugin/pull/437
[JIRA] (JENKINS-61304) EC2 Plugin terminates instances temporarily offline
Umberto Nicoletti created an issue: Jenkins / JENKINS-61304 EC2 Plugin terminates instances temporarily offline

Issue Type: Bug
Assignee: FABRIZIO MANFREDI
Components: ec2-plugin
Created: 2020-03-03 07:18
Environment: Jenkins 2.204.2, EC2 Plugin 1.49.1
Priority: Minor
Reporter: Umberto Nicoletti

The latest version of the plugin added code to catch instances that were started but then failed to connect to Jenkins. Previously such instances would stay up forever, incurring unnecessary costs.

The code block that adds this functionality is: https://github.com/jenkinsci/ec2-plugin/blob/master/src/main/java/hudson/plugins/ec2/EC2RetentionStrategy.java#L167-L185

Note how the termination branch is guarded by the clause `computer.isOffline()`, which is unfortunately too broad. Looking up the definition of `computer.isOffline()` yields this code:

    @Exported
    public boolean isOffline() {
        return temporarilyOffline || getChannel()==null;
    }

This reveals that temporarilyOffline is considered too. As a result, an instance that has successfully connected but is, for example, under temporary maintenance will be abruptly terminated by the plugin.

AFAICT there is no simple workaround, besides perhaps rewriting `if (computer.isOffline()){` to `if (computer.getChannel()==null){`

I'll try to submit a patch ASAP.
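To make the difference between the two guards concrete, here is a minimal sketch in plain Java. It does not use the Jenkins or EC2 plugin classes: `StubComputer`, `TerminationGuardSketch`, `broadGuard` and `narrowGuard` are hypothetical names invented for illustration, and the stub only mirrors the `temporarilyOffline || getChannel()==null` expression quoted in the issue.

```java
// Hypothetical stand-in for hudson.model.Computer; NOT the real Jenkins class.
// It only models the two inputs that the quoted isOffline() expression consults.
class StubComputer {
    private final boolean temporarilyOffline; // true when the node was marked offline on purpose
    private final Object channel;             // null until the agent has connected

    StubComputer(boolean temporarilyOffline, Object channel) {
        this.temporarilyOffline = temporarilyOffline;
        this.channel = channel;
    }

    Object getChannel() {
        return channel;
    }

    // Same boolean expression as the isOffline() definition quoted in the issue.
    boolean isOffline() {
        return temporarilyOffline || getChannel() == null;
    }
}

class TerminationGuardSketch {
    // Guard as described in the issue: true for any offline node,
    // including a connected node that is under temporary maintenance.
    static boolean broadGuard(StubComputer c) {
        return c.isOffline();
    }

    // Narrower guard suggested in the issue: true only when there is no channel,
    // i.e. the agent is not connected.
    static boolean narrowGuard(StubComputer c) {
        return c.getChannel() == null;
    }

    public static void main(String[] args) {
        StubComputer neverConnected = new StubComputer(false, null);
        StubComputer underMaintenance = new StubComputer(true, new Object());

        System.out.println("never connected:   broad=" + broadGuard(neverConnected)
                + " narrow=" + narrowGuard(neverConnected));
        System.out.println("under maintenance: broad=" + broadGuard(underMaintenance)
                + " narrow=" + narrowGuard(underMaintenance));
    }
}
```

In this model both guards flag the node that never connected, but only the broad guard flags the connected node that is temporarily offline, which is the case reported here.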
[JIRA] (JENKINS-57215) Plugin starts a worker and might immediately stop it, because of cached EC2Computer.getUptime()
Umberto Nicoletti created an issue: Jenkins / JENKINS-57215 Plugin starts a worker and might immediately stop it, because of cached EC2Computer.getUptime()

Issue Type: Bug
Assignee: Umberto Nicoletti
Components: ec2-plugin
Created: 2019-04-29 08:35
Environment: Jenkins ver. 2.164.2, Amazon EC2 plugin 1.42
Priority: Minor
Reporter: Umberto Nicoletti

AFAIU there's a race condition in EC2RetentionStrategy.java#L99: a few lines below, the call to `computer.getUptime()` will return a cached value, whereas `computer.getState()` will not.

If the worker was just started, the uptime may be calculated from the previous start time rather than the current one, while the state will correctly report running.

As a result of this inconsistency, the plugin will end up stopping the instance, because it falsely computes uptime from the previous launch time rather than the current one (the time elapsed since the previous launch is most likely more than the idle timeout, which for us is 60 minutes).

This does not happen often. Perhaps we can just change `computer.getUptime()` to return the actual value rather than a cached value? Ideally, calls to `computer` methods should return a consistent view for all getters.

I'm willing to provide a PR, if someone could provide guidance on the suggested solution.

Thanks!
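The inconsistency can be sketched with a simplified model in plain Java. This is not the plugin's code: `InstanceSnapshot`, `UptimeRaceSketch` and both `wouldStop...` methods are hypothetical names, and the stop decision is reduced to "state is running and uptime exceeds the idle timeout", which is an assumption made only to mirror the description above.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical model of one instance description; NOT a class from the EC2 plugin.
class InstanceSnapshot {
    final String state;       // e.g. "running" or "stopped"
    final Instant launchTime; // launch time reported together with that state

    InstanceSnapshot(String state, Instant launchTime) {
        this.state = state;
        this.launchTime = launchTime;
    }
}

class UptimeRaceSketch {
    // Mixed view: the state comes from a fresh snapshot, but the launch time
    // comes from a snapshot cached before the instance was restarted.
    static boolean wouldStopWithMixedView(InstanceSnapshot cached, InstanceSnapshot fresh,
                                          Instant now, Duration idleTimeout) {
        boolean running = "running".equals(fresh.state);
        Duration uptime = Duration.between(cached.launchTime, now);
        return running && uptime.compareTo(idleTimeout) > 0;
    }

    // Consistent view: state and launch time are read from the same snapshot.
    static boolean wouldStopWithConsistentView(InstanceSnapshot fresh,
                                               Instant now, Duration idleTimeout) {
        boolean running = "running".equals(fresh.state);
        Duration uptime = Duration.between(fresh.launchTime, now);
        return running && uptime.compareTo(idleTimeout) > 0;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2019-04-29T08:35:00Z");
        Duration idleTimeout = Duration.ofMinutes(60);

        // Cached before the restart: previous launch was five hours ago.
        InstanceSnapshot cached = new InstanceSnapshot("stopped", now.minus(Duration.ofHours(5)));
        // Fresh view: the instance was started one minute ago and is running.
        InstanceSnapshot fresh = new InstanceSnapshot("running", now.minus(Duration.ofMinutes(1)));

        System.out.println("mixed view stops instance:      "
                + wouldStopWithMixedView(cached, fresh, now, idleTimeout));
        System.out.println("consistent view stops instance: "
                + wouldStopWithConsistentView(fresh, now, idleTimeout));
    }
}
```

With these illustrative values, the mixed view decides to stop a worker that started one minute ago, while the consistent view does not. Reading all getters from one snapshot is one possible way to get the consistent view asked for in the issue.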