At the blueprint review for the Trove team on April 14th, we reviewed "Allow 
volume snapshot as another way for data backuping" 
(https://blueprints.launchpad.net/trove/+spec/volume-snapshot).



The detailed specification is available at 
https://wiki.openstack.org/wiki/Trove/volume-data-snapshot-design



In this blueprint, the intent is to provide a mechanism by which a snapshot of 
the data store can be generated for the purpose of backup.



The theory is that the backup workflow would be in three steps:



1. flush database and mark read-only

2. take snapshot

3. revert database to normal operation
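
As a rough sketch only, the three steps could be wrapped so that step 3 always 
runs, even if the snapshot fails. Every name below (FakeDatastore, quiesced, 
backup_via_snapshot) is hypothetical and not taken from the blueprint or from 
Trove; the snapshot itself is just a callable supplied by the caller.

```python
from contextlib import contextmanager


class FakeDatastore:
    """Hypothetical stand-in for a datastore; it only records calls."""

    def __init__(self):
        self.read_only = False
        self.log = []

    def flush(self):
        self.log.append("flush")

    def set_read_only(self, value):
        self.read_only = value
        self.log.append("read_only=%s" % value)


@contextmanager
def quiesced(datastore):
    """Step 1 on entry; step 3 on exit, even if the body raises."""
    datastore.flush()
    datastore.set_read_only(True)
    try:
        yield datastore
    finally:
        datastore.set_read_only(False)


def backup_via_snapshot(datastore, take_snapshot):
    """Run the three-step workflow; take_snapshot is caller-supplied (step 2)."""
    with quiesced(datastore):
        return take_snapshot()
```

The point of the context manager is that a failed snapshot cannot leave the 
database stuck in read-only mode.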



In practice, with something like LVM, the database needs to be read-only only 
long enough to start the snapshot, since LVM ensures that the snapshot is 
consistent as of a single point in time. Even so, many potentially important 
details remain unaddressed.



IMHO the blueprint should be significantly revised before re-review.



(a) In particular, it would be good to understand exactly what the limitations 
of the proposed implementation are likely to be. This could include things like 
configurations that would work, storage engines that would be supported (or 
not), performance considerations, etc.



(b) It would also be important in this particular case to understand (for 
datastores that will be implemented in this iteration) how the system will 
achieve step 1 above.



(c) Given that this is supposed to be a feature that would work for multiple 
datastores, ease of implementing a new datastore should be given some 
consideration. It would be worthwhile if the blueprint described what 
supporting a datastore added in the future would involve.



One recommendation is that the blueprint provide a simple illustration of the 
framework being implemented here (that could be shared across datastores), as 
well as details (maybe only at pseudo-code level) of the implementation for the 
datastores that will be supported in the initial phase. One could easily 
conceive of a framework that provided default stubs that meaningfully responded 
with a "Not Supported" error; an implementer could then provide the appropriate 
code for the datastore in question and things would Just Work(tm).
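
To make the default-stub idea concrete, here is a minimal sketch. The class and 
method names are hypothetical and do not come from the blueprint or the Trove 
code base. The MySQL statements are real, but using them for step 1 is my 
assumption, not something the blueprint specifies.

```python
class SnapshotNotSupported(Exception):
    """Raised by the default stubs (hypothetical name)."""


class SnapshotStrategy:
    """Hypothetical base class shared across datastores."""

    datastore_name = "generic"

    def prepare_for_snapshot(self):
        # Default stub: datastores that do not override this are unsupported.
        raise SnapshotNotSupported(
            "%s: snapshot backup is Not Supported" % self.datastore_name)

    def resume_after_snapshot(self):
        raise SnapshotNotSupported(
            "%s: snapshot backup is Not Supported" % self.datastore_name)


class MySQLSnapshotStrategy(SnapshotStrategy):
    """Assumed MySQL implementation of steps 1 and 3."""

    datastore_name = "mysql"

    def __init__(self, execute):
        # execute: callable that runs one SQL statement. The read lock belongs
        # to the session, so both calls must go through the same connection.
        self._execute = execute

    def prepare_for_snapshot(self):
        self._execute("FLUSH TABLES WITH READ LOCK")

    def resume_after_snapshot(self):
        self._execute("UNLOCK TABLES")
```

A new datastore would then only need to subclass SnapshotStrategy and override 
the two methods; anything else keeps returning the "Not Supported" error.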



(d) Some more information about how the 'restore' would be orchestrated would 
be valuable. Information similar to that requested above for the 'backup' would 
be worth adding to the blueprint.



(e) With respect to 'restore', [this point was brought up during the review by 
dougshelley66 and vgnbkr] the "point in time recovery" blueprint review just 
before this one yielded a significant discussion around restoring a backup over 
the top of a running instance. The concern was that doing this could result in 
the loss of user data. In the end, that feature was removed from the blueprint. 
This blueprint seems to have the same property on restore: restoring the 
snapshot would eliminate the volume that currently houses the user's data. Why 
wouldn't that be a concern for this approach?



I'm not intending to summarize the discussion, nor present feedback from all 
parties to the review; others at the review had other concerns as well and I'll 
let them contribute those to this mail thread if they so desire.

-amrith

--

Amrith Kumar, CTO, Tesora

Twitter: @amrithkumar
Email: amr...@tesora.com
Skype: amrith.skype
Web: http://www.tesora.com

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
