[
https://issues.apache.org/jira/browse/CLOUDSTACK-9386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicolas Vazquez updated CLOUDSTACK-9386:
----------------------------------------
Description:
h3. Introduction
In some production environments with multiple clusters, unused templates were
found to be consuming too much storage. The cause was that template cleanup was
not deleting templates marked for cleanup on ESXi.
h3. Description of the problem
Suppose a data center has multiple clusters {{(c1, c2,...,cN)}} and a template
{{T}} from which we deploy VMs on {{c1}}.
Suppose we then expunge those VMs, and no other VM instance uses template
{{T}}. This was the actual workflow:
# CloudStack marks the template for cleanup after {{storage.cleanup.interval}}
seconds by setting {{marked_for_gc = 1}} in the {{template_spool_ref}} table
for that template.
# After another {{storage.cleanup.interval}} seconds, a {{DestroyCommand}} is
sent to delete the template from primary storage.
# {{VmwareResource}} processes the command. It first picks a random cluster,
say {{ci != c1}}, and looks there for the VM template (using the volume's path)
in order to destroy it. Because the template is on {{c1}}, it cannot be found
and is not deleted. The entry in {{template_spool_ref}} is deleted, but the
actual template on the hypervisor side is not.
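The failure described in the last step of the workflow can be illustrated with a small, self-contained Java model. This is only a sketch: the class {{OrphanedTemplateDemo}} and all of its members are hypothetical names made up for illustration and are not CloudStack or vSphere APIs. It assumes, as described above, that the database row is dropped whether or not the template was found on the hypervisor.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these names are CloudStack or vSphere APIs.
class OrphanedTemplateDemo {
    // Cluster name -> templates visible when searching only that cluster.
    final Map<String, Set<String>> templatesByCluster = new HashMap<>();
    // Stand-in for the rows of the template_spool_ref table.
    final Set<String> spoolRefRows = new HashSet<>();

    void addCluster(String cluster) {
        templatesByCluster.computeIfAbsent(cluster, k -> new HashSet<>());
    }

    void addTemplate(String cluster, String template) {
        templatesByCluster.computeIfAbsent(cluster, k -> new HashSet<>()).add(template);
        spoolRefRows.add(template);
    }

    // Mimics the reported behaviour: search a single cluster, then drop the row regardless.
    boolean destroyViaCluster(String cluster, String template) {
        Set<String> visible = templatesByCluster.getOrDefault(cluster, new HashSet<>());
        boolean deletedOnHypervisor = visible.remove(template);
        spoolRefRows.remove(template); // removed even when nothing was deleted
        return deletedOnHypervisor;
    }

    // A template is orphaned when it still exists on a cluster but has no row left.
    boolean isOrphaned(String template) {
        boolean onHypervisor = templatesByCluster.values().stream()
                .anyMatch(templates -> templates.contains(template));
        return onHypervisor && !spoolRefRows.contains(template);
    }

    public static void main(String[] args) {
        OrphanedTemplateDemo dc = new OrphanedTemplateDemo();
        dc.addTemplate("c1", "T");
        dc.addCluster("c2");

        // Assume the random pick landed on c2 instead of c1.
        boolean deleted = dc.destroyViaCluster("c2", "T");
        System.out.println("deleted on hypervisor: " + deleted);
        System.out.println("orphaned: " + dc.isOrphaned("T"));
    }
}
```

In this model the template stays on {{c1}} while its row is gone, which matches the storage leak reported in the introduction.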
h3. Proposed solution
We propose to address the problem described in step 3 by looking for the VM
template through the vSphere data center instead of a randomly picked cluster.
This ensures the VM template is deleted in every case, regardless of which
cluster would have been selected.
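The difference between the two lookup scopes can be sketched as follows. This is an illustrative toy model, not the actual patch: {{DatacenterLookupDemo}}, {{findInCluster}}, {{findInDatacenter}} and {{destroyViaDatacenter}} are hypothetical names, and the data center is assumed to be simply the union of its clusters.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Toy model only: none of these names are CloudStack or vSphere APIs.
class DatacenterLookupDemo {
    // Cluster name -> templates that live on that cluster.
    final Map<String, Set<String>> templatesByCluster = new LinkedHashMap<>();

    void addCluster(String cluster) {
        templatesByCluster.computeIfAbsent(cluster, k -> new HashSet<>());
    }

    void addTemplate(String cluster, String template) {
        templatesByCluster.computeIfAbsent(cluster, k -> new HashSet<>()).add(template);
    }

    // Cluster-scoped lookup: only the chosen cluster is searched.
    boolean findInCluster(String cluster, String template) {
        return templatesByCluster.getOrDefault(cluster, new HashSet<>()).contains(template);
    }

    // Datacenter-scoped lookup: every cluster is searched, so the result
    // does not depend on which cluster a caller happens to pick.
    Optional<String> findInDatacenter(String template) {
        return templatesByCluster.entrySet().stream()
                .filter(entry -> entry.getValue().contains(template))
                .map(Map.Entry::getKey)
                .findFirst();
    }

    // Deletes the template wherever it lives; returns true if something was deleted.
    boolean destroyViaDatacenter(String template) {
        Optional<String> owner = findInDatacenter(template);
        owner.ifPresent(cluster -> templatesByCluster.get(cluster).remove(template));
        return owner.isPresent();
    }

    public static void main(String[] args) {
        DatacenterLookupDemo dc = new DatacenterLookupDemo();
        dc.addCluster("c2");
        dc.addTemplate("c1", "T");

        System.out.println("found via c2: " + dc.findInCluster("c2", "T"));
        System.out.println("destroyed via data center: " + dc.destroyViaDatacenter("T"));
    }
}
```

Searching at data center scope trades a wider search for a result that is independent of cluster selection, which is the property the proposal relies on.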
was:
h3. Introduction
In some production environments with multiple clusters it was noticed that
unused templates were consuming too much storage. It was discovered that
template cleanup was not deleting marked templates on ESXi.
h3. Description of the problem
Suppose we have multiple clusters {{(c1, c2,...,cN)}} on a data center and
template {{T}} from which we deploy vms on {{c1}}.
Suppose now that we expunge those vms, and there's no other vm instance from
template {{T}}, so this was the actual workflow:
# CloudStack marks template for cleanup after {{storage.cleanup.interval}}
seconds, by setting {{marked_for_gc = 1}} on {{template_spool_ref}} table, for
that template.
# After another {{storage.cleanup.interval}} seconds a {{DestroyCommand}} will
be sent, to delete template from primary storage
# On {{VmwareResource}}, command is processed, and it first picks up a random
cluster, say {{ci != c1}} to look for vm template (using volume's path) and
destroy it. But, as template was on {{c1}} it cannot be found, so it won't be
deleted. Entry on {{template_spool_ref}} is deleted but not the actual template
on hypervisor side.
h3. Proposed solution
We propose a way to attack problem shown in point 3, by not picking up a random
cluster to look for vm but using data store. This way we make sure vm template
will be deleted in every case, and not depending on random cluster selection
> DS template copies don’t get deleted in VMware ESXi with multiple clusters
> and zone wide storage
> ------------------------------------------------------------------------------------------------
>
> Key: CLOUDSTACK-9386
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9386
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: VMware
> Affects Versions: 4.9.0
> Reporter: Nicolas Vazquez
> Assignee: Nicolas Vazquez
> Fix For: 4.9.0
>
>
> h3. Introduction
> In some production environments with multiple clusters, unused templates were
> found to be consuming too much storage. The cause was that template cleanup
> was not deleting templates marked for cleanup on ESXi.
> h3. Description of the problem
> Suppose a data center has multiple clusters {{(c1, c2,...,cN)}} and a
> template {{T}} from which we deploy VMs on {{c1}}.
> Suppose we then expunge those VMs, and no other VM instance uses template
> {{T}}. This was the actual workflow:
> # CloudStack marks the template for cleanup after {{storage.cleanup.interval}}
> seconds by setting {{marked_for_gc = 1}} in the {{template_spool_ref}} table
> for that template.
> # After another {{storage.cleanup.interval}} seconds, a {{DestroyCommand}} is
> sent to delete the template from primary storage.
> # {{VmwareResource}} processes the command. It first picks a random cluster,
> say {{ci != c1}}, and looks there for the VM template (using the volume's
> path) in order to destroy it. Because the template is on {{c1}}, it cannot be
> found and is not deleted. The entry in {{template_spool_ref}} is deleted, but
> the actual template on the hypervisor side is not.
> h3. Proposed solution
> We propose to address the problem described in step 3 by looking for the VM
> template through the vSphere data center instead of a randomly picked
> cluster. This ensures the VM template is deleted in every case, regardless of
> which cluster would have been selected.