[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Tutkowski reassigned CLOUDSTACK-9917:
------------------------------------------

    Assignee: Mike Tutkowski

> Root disks stranded on primary storage
> --------------------------------------
>
>                 Key: CLOUDSTACK-9917
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9917
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>          Components: Management Server
>    Affects Versions: 4.10.0.0
>         Environment: N/A
>            Reporter: Mike Tutkowski
>            Assignee: Mike Tutkowski
>            Priority: Blocker
>             Fix For: 4.10.0.0
>
>
> From dev@
>    OK, here’s the gist of the problem:
>    In StorageManagerImpl.cleanupStorage(boolean), the following line in 4.9
>        List<VolumeVO> vols = _volsDao.listVolumesToBeDestroyed(
>                new Date(System.currentTimeMillis() - ((long) StorageCleanupDelay.value() << 10)));
>    was changed to the following in 4.10
>                        // ROOT volumes will be destroyed as part of VM cleanup
>        List<VolumeVO> vols = _volsDao.listNonRootVolumesToBeDestroyed(
>                new Date(System.currentTimeMillis() - ((long) StorageCleanupDelay.value() << 10)));
>    This leads to a problem (for both managed and traditional storage) in the 
> following situation:
>    For example: Let’s say we have a system VM running on NFS primary storage. 
> We then put this primary storage into maintenance mode, which creates the 
> system VM (with the same name) on a different primary storage (we do not 
> create a new row in the cloud.vm_instance table for this VM). While this VM 
> works, the original root disk of the system VM remains on the original 
> primary storage and is not destroyed by the code in 
> StorageManagerImpl.cleanupStorage(boolean) in 4.10 because 4.10 (as shown 
> above) only asks for non-root volumes to consider for deletion. In the 4.9 
> version of the code, the original root disk is cleaned up in 
> StorageManagerImpl.cleanupStorage(boolean). 4.10 relies on a root disk 
> always being deleted when the VM it belongs to is deleted. The problem is 
> that, in a situation like this, the system VM doesn’t get deleted at this 
> point; it gets a new root disk that’s hosted by a different primary storage 
> (so now its original root disk is stranded).
>    Here is the ticket and the PR where the code change went in:
>    https://issues.apache.org/jira/browse/CLOUDSTACK-9660
>    https://github.com/apache/cloudstack/pull/1825
>    To me, this needs to be fixed before we release 4.10, so I am -1 on this 
> RC.
>    My suggestion would be to basically revert PR 1825 and to make use of just 
> bits and pieces of it.
>    For example, this part should be kept:
>    -    volService.expungeVolumeAsync(volFactory.getVolume(vol.getId()));
>    +    VolumeInfo volumeInfo = volFactory.getVolume(vol.getId());
>    +    if (volumeInfo != null) {
>    +        volService.expungeVolumeAsync(volumeInfo);
>    +    } else {
>    +        s_logger.debug("Volume " + vol.getUuid() + " is already destroyed");
>    +    }
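
The stranding described in the report can be shown in miniature with a simplified model. The sketch below is not CloudStack code: the class StrandedRootDiskSketch, the Vol type and the select helper are hypothetical stand-ins for the DAO queries quoted above. It also assumes the cleanup delay is expressed in seconds, in which case "<< 10" multiplies it by 1024 as a rough seconds-to-milliseconds conversion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Date;
import java.util.List;

// Hypothetical, simplified model; none of these types are CloudStack classes.
class StrandedRootDiskSketch {

    enum Type { ROOT, DATADISK }

    static final class Vol {
        final String name;
        final Type type;
        final Date destroyedAt; // when the volume was marked for destruction

        Vol(String name, Type type, Date destroyedAt) {
            this.name = name;
            this.type = type;
            this.destroyedAt = destroyedAt;
        }
    }

    // Same shape as the quoted cutoff expression. Assumption: the delay is in
    // seconds, so "<< 10" (times 1024) approximates a conversion to milliseconds.
    static Date cutoff(long nowMillis, int delaySeconds) {
        return new Date(nowMillis - ((long) delaySeconds << 10));
    }

    // includeRoot = true models the 4.9-style query (all volume types);
    // includeRoot = false models the 4.10-style query (non-root volumes only).
    static List<Vol> select(List<Vol> vols, Date limit, boolean includeRoot) {
        List<Vol> picked = new ArrayList<>();
        for (Vol v : vols) {
            boolean oldEnough = v.destroyedAt.before(limit);
            boolean typeAllowed = includeRoot || v.type != Type.ROOT;
            if (oldEnough && typeAllowed) {
                picked.add(v);
            }
        }
        return picked;
    }

    public static void main(String[] args) {
        long now = 1_000_000_000L;
        Date limit = cutoff(now, 86_400);

        List<Vol> vols = Arrays.asList(
                // Left behind when the system VM got a new root disk elsewhere.
                new Vol("old-root-of-system-vm", Type.ROOT, new Date(0L)),
                new Vol("old-data-disk", Type.DATADISK, new Date(0L)),
                // Too recent to be cleaned up under either query.
                new Vol("recent-data-disk", Type.DATADISK, new Date(now)));

        for (Vol v : select(vols, limit, true)) {
            System.out.println("4.9-style cleanup picks:  " + v.name);
        }
        for (Vol v : select(vols, limit, false)) {
            System.out.println("4.10-style cleanup picks: " + v.name);
        }
    }
}
```

In this model the old root disk is only ever selected by the 4.9-style query. Because the VM that owns it is never deleted, nothing else would remove it, which is the stranding the report describes.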



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
