[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958812#comment-15958812
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9864:
--------------------------------------------

Github user abhinandanprateek commented on a diff in the pull request:

    https://github.com/apache/cloudstack/pull/2030#discussion_r110145877
  
    --- Diff: plugins/hypervisors/vmware/src/com/cloud/hypervisor/vmware/manager/VmwareManagerImpl.java ---
    @@ -550,15 +552,21 @@ public boolean needRecycle(String workerTag) {
                 return true;
             }
     
    -        // disable time-out check until we have found out a VMware API that can check if
    -        // there are pending tasks on the subject VM
    -        /*
    -                if(System.currentTimeMillis() - startTick > _hungWorkerTimeout) {
    -                    if(s_logger.isInfoEnabled())
    -                        s_logger.info("Worker VM expired, seconds elapsed: " + (System.currentTimeMillis() - startTick) / 1000);
    -                    return true;
    -                }
    -         */
    +        // this time-out check was disabled
    +        // "until we have found out a VMware API that can check if there are pending tasks on the subject VM"
    +        // but as we expire jobs and those stale worker VMs stay around untill an MS reboot we opt in to have them removed anyway
    +        Long hungWorkerTimeout = 2 * (AsyncJobManagerImpl.JobExpireMinutes.value() + AsyncJobManagerImpl.JobCancelThresholdMinutes.value()) * MILISECONDS_PER_MINUTE;
    +        Long letsSayNow = System.currentTimeMillis();
    +        if(s_vmwareCleanOldWorderVMs.value() && letsSayNow - startTick > hungWorkerTimeout) {
    +            if(s_logger.isInfoEnabled()) {
    +                s_logger.info("Worker VM expired, seconds elapsed: " + (System.currentTimeMillis() - startTick) / 1000);
    +            }
    --- End diff --
    
    For timeouts you may want to use Java's Duration (java.time.Duration); that is much cleaner.
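
A minimal sketch of how the timeout from the diff might look with java.time.Duration, offered only as an illustration of the suggestion and not as the code that was merged. The class name `WorkerTimeoutSketch` and the helpers `hungWorkerTimeout` and `isExpired` are hypothetical; the two minute values are passed in as plain numbers (the defaults from the issue description) rather than read from the CloudStack config keys.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration; none of these names exist in CloudStack.
public class WorkerTimeoutSketch {

    // 2 * (job expiry + job cancel threshold), kept as a Duration so that
    // no hand-written milliseconds-per-minute constant is needed.
    static Duration hungWorkerTimeout(long jobExpireMinutes, long jobCancelThresholdMinutes) {
        return Duration.ofMinutes(jobExpireMinutes)
                .plusMinutes(jobCancelThresholdMinutes)
                .multipliedBy(2);
    }

    // True when strictly more than 'timeout' has elapsed between start and now.
    static boolean isExpired(Instant start, Instant now, Duration timeout) {
        return Duration.between(start, now).compareTo(timeout) > 0;
    }

    public static void main(String[] args) {
        // Defaults quoted in the issue: job.expire.minutes = 1440,
        // job.cancel.threshold.minutes = 60.
        Duration timeout = hungWorkerTimeout(1440, 60);

        Instant start = Instant.parse("2017-04-01T00:00:00Z");
        Instant now = start.plus(Duration.ofHours(51));

        System.out.println("Timeout in minutes: " + timeout.toMinutes());
        System.out.println("Expired after 51 hours: " + isExpired(start, now, timeout));
        System.out.println("Seconds elapsed: " + Duration.between(start, now).getSeconds());
    }
}
```

With the default settings the margin works out to 2 * (1440 + 60) = 3000 minutes, or 50 hours. In the real method the start tick is a long in epoch milliseconds, so it would first have to be wrapped, for example with `Instant.ofEpochMilli(startTick)`.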


> cleanup stale worker VMs after job expiry time
> ----------------------------------------------
>
>                 Key: CLOUDSTACK-9864
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9864
>             Project: CloudStack
>          Issue Type: Improvement
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>          Components: VMware
>            Reporter: Daan Hoogland
>            Assignee: Daan Hoogland
>              Labels: vmware, vsphere, workers
>
> In the present code, cleaning worker VMs after a timeout is disabled, with the 
> documented reason that there is no API to query for related tasks in vCenter. 
> ACS has an expiry time for jobs and a cancel time for jobs.
> - Jobs that take longer than the expiry time will have their results 
> neglected.
> - Jobs that are cancelled are forcibly removed after the cancellation expiry 
> time.
> Any worker remaining after expiry + cancellation will surely be stale and can 
> be removed.
> As some administrators may not want this behaviour, there will be a setting, 
> false by default, that must be enabled before stale worker VMs are cleaned.
> Stale worker VMs will be cleaned after 2 * (expiry-time + cancellation-time) 
> as a safe margin.
> related settings:
> job.expire.minutes: 1440
> job.cancel.threshold.minutes: 60
> vmware.clean.old.worker.vms: false (new)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
