andrijapanicsb commented on issue #4018: Snapshots GC from DB - needs 
refactoring and fixing snapshot_store_ref garbage
URL: https://github.com/apache/cloudstack/issues/4018#issuecomment-611693552
 
 
   Thinking aloud about the **correct workflow for different hypervisors**, 
after being busy with the snapshot issues on PR #3969 for 40+h...
   
   ## XS
   
   - XS snap issues: although we "try", we don't actually delete snaps older 
than the current/new one when it is created. We should fix this, so that all 
older snaps are deleted from Primary Storage together with their PRIMARY 
references in snap_store_ref. (Why are we chaining at all? We copy FULL snaps 
from Primary to Secondary anyway.) 
   GC will then pick up everything properly later (it does so in 4.13.1/4.14, 
as part of PR #3969), and the issue of Primary Storage snap garbage for XS 
would be solved. Note that testing by creating just one snap and deleting it 
is NOT valid: the PRIMARY row in snap_store_ref is deleted only for PREVIOUS 
snaps, and the very first snap has NO previous snaps.
   - GC does remove the last snap (which does have a PRIMARY reference in 
snap_store_ref) from the Primary Storage, so check that code and reuse it in 
createSnapshots, where the older snaps are supposedly deleted.
   - The failure to delete previous snaps (when a new snap is created) is 
perhaps a race condition: the PRIMARY row is deleted first, then the code 
looks for references to delete on Primary Storage and, since there are none, 
nothing gets deleted.
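
A minimal sketch of the ordering that would avoid the suspected race, using hypothetical in-memory stand-ins (a set for Primary Storage, a list of dicts for snap_store_ref). None of these names come from the CloudStack code base, and the sketch assumes that snapshot ids grow with creation time:

```python
def prune_older_snapshots(primary_storage, store_refs, volume_id, new_snapshot_id):
    """Remove every snapshot of the volume that is older than new_snapshot_id.

    The object on Primary Storage is removed first and its PRIMARY row only
    afterwards, so the row is still available to say what has to be deleted.
    """
    older = [
        ref for ref in store_refs
        if ref["volume_id"] == volume_id
        and ref["role"] == "PRIMARY"
        and ref["snapshot_id"] < new_snapshot_id  # assumption: ids increase over time
    ]
    removed = []
    for ref in older:
        primary_storage.discard(ref["path"])  # 1. delete the snap on Primary Storage
        store_refs.remove(ref)                # 2. only then drop its PRIMARY row
        removed.append(ref["snapshot_id"])
    return removed


# Hypothetical data: three chained snaps of volume 10, the newest being snap 3.
primary_storage = {"vdi-1", "vdi-2", "vdi-3"}
store_refs = [
    {"snapshot_id": 1, "volume_id": 10, "role": "PRIMARY", "path": "vdi-1"},
    {"snapshot_id": 2, "volume_id": 10, "role": "PRIMARY", "path": "vdi-2"},
    {"snapshot_id": 3, "volume_id": 10, "role": "PRIMARY", "path": "vdi-3"},
    {"snapshot_id": 1, "volume_id": 10, "role": "IMAGE", "path": "snapshots/1"},
]
removed = prune_older_snapshots(primary_storage, store_refs, 10, 3)
```

With this ordering only the newest snap and its PRIMARY row stay on Primary Storage, while the IMAGE rows are left for the normal deletion/GC path.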
   
   ## VMware (current state)
   - VMware - when a snapshot is created on Secondary Storage, nothing is left 
on Primary, although we leave the PRIMARY row in snap_store_ref.
   - Deleting the snapshot will delete it on Secondary Storage immediately and 
mark all its snap_store_ref rows as DESTROYED; the same happens for the snap 
row in the main snapshots table.
   - GC seems to look ONLY for IMAGE rows in the snap_store_ref table: those 
in the DESTROYED state whose corresponding row in the main snapshots table 
does NOT have a "removed" date set. It seems (conclusion from testing...) that 
it does NOT even look for PRIMARY references in the snap_store_ref table (was 
it not designed to?), as identical behaviour is observed for KVM and XS GC. So 
the logic is "remove from Secondary Storage if not already removed (it should 
be, unless the SSVM was down) and remove the IMAGE row from snap_store_ref".
   - ^^^ One more reason to "copy" the logic from XS and remove the PRIMARY 
row when the new snap is created (copied to Secondary Storage), so that we 
only keep IMAGE rows in the snap_store_ref table once the snap is created.
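
The GC selection observed above can be sketched against a toy SQLite database. The schema is a heavily simplified assumption (the column names `store_role`, `state`, `removed` and the upper-case values simply follow the wording of this comment), not the real CloudStack schema or query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE snapshots (id INTEGER PRIMARY KEY, removed TEXT);
CREATE TABLE snap_store_ref (
    id INTEGER PRIMARY KEY, snapshot_id INTEGER, store_role TEXT, state TEXT);
INSERT INTO snapshots VALUES (1, NULL), (2, '2020-04-09 10:00:00'), (3, NULL);
INSERT INTO snap_store_ref VALUES
    (1, 1, 'IMAGE',   'DESTROYED'),  -- picked up by GC
    (2, 1, 'PRIMARY', 'DESTROYED'),  -- never looked at: stays as garbage
    (3, 2, 'IMAGE',   'DESTROYED'),  -- snapshot already has a removed date
    (4, 3, 'IMAGE',   'READY');      -- snapshot still in use
""")


def gc_candidates(conn):
    """Ids of the rows GC appears to select: DESTROYED IMAGE rows whose
    snapshot has no removed date yet."""
    rows = conn.execute("""
        SELECT r.id
        FROM snap_store_ref r
        JOIN snapshots s ON s.id = r.snapshot_id
        WHERE r.store_role = 'IMAGE'
          AND r.state = 'DESTROYED'
          AND s.removed IS NULL
        ORDER BY r.id
    """).fetchall()
    return [row[0] for row in rows]


def leftover_primary_rows(conn):
    """DESTROYED PRIMARY rows, which the selection above never matches."""
    rows = conn.execute("""
        SELECT id FROM snap_store_ref
        WHERE store_role = 'PRIMARY' AND state = 'DESTROYED'
        ORDER BY id
    """).fetchall()
    return [row[0] for row in rows]


candidates = gc_candidates(conn)
leftovers = leftover_primary_rows(conn)
```

In this toy data set GC would only ever touch row 1, while row 2 (the PRIMARY reference of the same snapshot) is exactly the kind of garbage described in this issue.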
   
   ## KVM (current state)
   - KVM - the same happens as with VMware, i.e. the PRIMARY row is left for 
whatever reason when the snap is created, while there is NO snap on the QCOW2 
file - _but keep in mind that when NOT backing up to Sec Storage (works with 
KVM+Ceph only), we do need/have the PRIMARY row_
   - as GC seems not to be designed to remove PRIMARY ref rows(?), the row is 
never removed... (GC does, as it should, remove the image from Secondary 
Storage if it was not deleted previously, i.e. the SSVM was down during snap 
deletion, and removes only the IMAGE ref row)
   
   
   We still have the special case of "snapshot.backup.to.secondary=FALSE", 
which works for KVM+Ceph only; there, only the PRIMARY row is created in the 
first place. That seems to be working fine? (GC and deletion, in both cases, 
whether or not we back up to Secondary Storage.)
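
To make the proposed end state for KVM/VMware explicit, here is a small sketch of which snap_store_ref roles would be expected after a snapshot is created, depending on "snapshot.backup.to.secondary". The function names and the row layout are hypothetical and only illustrate the rule discussed above:

```python
def expected_roles(backup_to_secondary):
    """Roles that should remain in snap_store_ref once the snap is created."""
    # Backed up to Secondary Storage: only the IMAGE row should survive.
    # Not backed up (KVM+Ceph only): the PRIMARY row is the only reference.
    return {"IMAGE"} if backup_to_secondary else {"PRIMARY"}


def stale_refs(refs, backup_to_secondary):
    """Rows whose role should not be there under the proposed rule."""
    wanted = expected_roles(backup_to_secondary)
    return [ref for ref in refs if ref["role"] not in wanted]


# Hypothetical rows: snap 7 was backed up, snap 8 lives on Ceph only.
refs_after_backup = [
    {"snapshot_id": 7, "role": "PRIMARY"},
    {"snapshot_id": 7, "role": "IMAGE"},
]
refs_ceph_only = [{"snapshot_id": 8, "role": "PRIMARY"}]

stale_with_backup = stale_refs(refs_after_backup, True)
stale_without_backup = stale_refs(refs_ceph_only, False)
```

Under this rule the leftover PRIMARY row of snap 7 is flagged as stale, while the PRIMARY row of the Ceph-only snap 8 is kept.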
