[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Vazquez updated CLOUDSTACK-9386:
----------------------------------------
    Description: 
h3. Introduction
In some production environments with multiple clusters, it was noticed that 
unused templates were consuming too much storage. It was discovered that 
template cleanup was not deleting marked templates on ESXi.

h3. Description of the problem
Suppose we have multiple clusters {{(c1, c2, ..., cN)}} in a data center and a 
template {{T}} from which we deploy VMs on {{c1}}.
Now suppose we expunge those VMs and no other VM instance from template {{T}} 
remains. The workflow is then:
# CloudStack marks the template for cleanup after {{storage.cleanup.interval}} 
seconds by setting {{marked_for_gc = 1}} in the {{template_spool_ref}} table 
for that template.
# After another {{storage.cleanup.interval}} seconds, a {{DestroyCommand}} is 
sent to delete the template from primary storage.
# In {{VmwareResource}}, the command is processed: it first picks a random 
cluster, say {{ci != c1}}, to look up the VM template (by the volume's path) 
and destroy it. Since the template is on {{c1}}, it cannot be found there and 
is not deleted. The entry in {{template_spool_ref}} is removed, but the actual 
template remains on the hypervisor side.
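To illustrate step 3, here is a minimal, self-contained toy model (this is *not* CloudStack or vSphere code; the cluster names, paths, and helpers are hypothetical) showing why a lookup scoped to one randomly chosen cluster frequently misses a template that lives on another cluster:

```java
import java.util.*;

public class RandomClusterLookup {
    // Hypothetical model: each cluster name maps to the set of template
    // paths stored on that cluster's datastores. Template T lives only on c1.
    static Map<String, Set<String>> buildClusters() {
        Map<String, Set<String>> clusters = new LinkedHashMap<>();
        clusters.put("c1", new HashSet<>(List.of("templates/T.vmdk")));
        clusters.put("c2", new HashSet<>());
        clusters.put("c3", new HashSet<>());
        return clusters;
    }

    // Buggy lookup, mirroring step 3: pick ONE random cluster and search
    // only there. Returns true only if the template happened to be found.
    static boolean destroyTemplate(Map<String, Set<String>> clusters,
                                   String volumePath, Random rnd) {
        List<String> names = new ArrayList<>(clusters.keySet());
        String picked = names.get(rnd.nextInt(names.size()));
        return clusters.get(picked).remove(volumePath);
    }

    public static void main(String[] args) {
        int misses = 0;
        for (int seed = 0; seed < 100; seed++) {
            if (!destroyTemplate(buildClusters(), "templates/T.vmdk",
                                 new Random(seed))) {
                misses++; // template left behind on the hypervisor
            }
        }
        System.out.println("misses out of 100 trials: " + misses);
    }
}
```

With N clusters, roughly (N-1)/N of the cleanup attempts pick the wrong cluster, yet CloudStack still deletes the {{template_spool_ref}} entry, so the orphaned template is never retried.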

h3. Proposed solution
We propose to fix the problem described in step 3 by not picking a random 
cluster to look up the VM template, but searching at the scope of the vSphere 
data center instead. This guarantees the VM template is deleted in every case, 
rather than depending on random cluster selection.
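A minimal sketch of the proposed approach, in a self-contained toy model (hypothetical names; the actual patch searches via the vSphere data center in {{VmwareResource}} rather than iterating clusters like this): searching every cluster in the data center finds the template regardless of where it resides.

```java
import java.util.*;

public class DatacenterWideLookup {
    // Hypothetical model: clusters of one vSphere data center, each holding
    // a set of template paths. Template T lives only on c1.
    static Map<String, Set<String>> buildClusters() {
        Map<String, Set<String>> clusters = new LinkedHashMap<>();
        clusters.put("c1", new HashSet<>(List.of("templates/T.vmdk")));
        clusters.put("c2", new HashSet<>());
        clusters.put("c3", new HashSet<>());
        return clusters;
    }

    // Proposed lookup: search at data-center scope (every cluster), so the
    // template is found and destroyed no matter which cluster hosts it.
    static boolean destroyTemplate(Map<String, Set<String>> clusters,
                                   String volumePath) {
        for (Set<String> templates : clusters.values()) {
            if (templates.remove(volumePath)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        boolean deleted = destroyTemplate(buildClusters(), "templates/T.vmdk");
        System.out.println("deleted=" + deleted); // always true in this model
    }
}
```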

  was:
h3. Introduction
In some production environments with multiple clusters, it was noticed that 
unused templates were consuming too much storage. It was discovered that 
template cleanup was not deleting marked templates on ESXi.

h3. Description of the problem
Suppose we have multiple clusters {{(c1, c2, ..., cN)}} in a data center and a 
template {{T}} from which we deploy VMs on {{c1}}.
Now suppose we expunge those VMs and no other VM instance from template {{T}} 
remains. The workflow is then:
# CloudStack marks the template for cleanup after {{storage.cleanup.interval}} 
seconds by setting {{marked_for_gc = 1}} in the {{template_spool_ref}} table 
for that template.
# After another {{storage.cleanup.interval}} seconds, a {{DestroyCommand}} is 
sent to delete the template from primary storage.
# In {{VmwareResource}}, the command is processed: it first picks a random 
cluster, say {{ci != c1}}, to look up the VM template (by the volume's path) 
and destroy it. Since the template is on {{c1}}, it cannot be found there and 
is not deleted. The entry in {{template_spool_ref}} is removed, but the actual 
template remains on the hypervisor side.

h3. Proposed solution
We propose to fix the problem described in step 3 by not picking a random 
cluster to look up the VM template, but using the data store instead. This 
guarantees the VM template is deleted in every case, rather than depending on 
random cluster selection.


> DS template copies don’t get deleted in VMware ESXi with multiple clusters 
> and zone wide storage
> ------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-9386
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9386
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>          Components: VMware
>    Affects Versions: 4.9.0
>            Reporter: Nicolas Vazquez
>            Assignee: Nicolas Vazquez
>             Fix For: 4.9.0
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
