Re: [openstack-dev] [cinder][nova] proper syncing of cinder volume state
John: States that the driver can/should do some cleanup work during the transition:

attaching -> available or error
detaching -> available or error
error -> available or error
deleting -> deleted or error_deleting

Also possibly wanted in future, but much harder:

backing_up -> available or error (need to make sure the backup service copes)
restoring -> error (need to make sure the backup service copes)

I haven't looked at the entire state space yet; these are the obvious ones off the top of my head.

On 1 December 2014 at 06:30, John Griffith wrote:
> On Fri, Nov 28, 2014 at 11:25 AM, D'Angelo, Scott wrote:
> > A Cinder blueprint has been submitted to allow the python-cinderclient to
> > involve the back end storage driver in resetting the state of a cinder
> > volume:
> >
> > https://blueprints.launchpad.net/cinder/+spec/reset-state-with-driver
> >
> > and the spec:
> >
> > https://review.openstack.org/#/c/134366
> >
> > This blueprint contains various use cases for a volume that may be listed in
> > the Cinder DataBase in state detaching|attaching|creating|deleting.
> >
> > The proposed solution involves augmenting the python-cinderclient command
> > ‘reset-state’, but other options are listed, including those that
> > involve Nova, since the state of a volume in the Nova XML found in
> > /etc/libvirt/qemu/.xml may also be out-of-sync with the
> > Cinder DB or storage back end.
> >
> > A related proposal for adding a new non-admin API for changing volume status
> > from ‘attaching’ to ‘error’ has also been proposed:
> >
> > https://review.openstack.org/#/c/137503/
> >
> > Some questions have arisen:
> >
> > 1) Should the ‘reset-state’ command be changed at all, since it was originally
> > just to modify the Cinder DB?
> >
> > 2) Should ‘reset-state’ be fixed to prevent the naïve admin from changing
> > the Cinder DB to be out-of-sync with the back end storage?
> >
> > 3) Should ‘reset-state’ be kept the same, but augmented with new options?
> >
> > 4) Should a new command be implemented, with possibly a new admin API to
> > properly sync state?
> >
> > 5) Should Nova be involved? If so, should this be done as a separate body of
> > work?
> >
> > This has proven to be a complex issue and there seems to be a good bit of
> > interest. Please provide feedback, comments, and suggestions.
> >
> > ___
> > OpenStack-dev mailing list
> > OpenStack-dev@lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
> Hey Scott,
>
> Thanks for posting this to the ML, I stated my opinion on the spec,
> but for completeness:
>
> My feeling is that reset-state has morphed into something entirely
> different than originally intended. That's actually great, nothing
> wrong there at all. I strongly disagree with the statements that
> "setting the status in the DB only is almost always the wrong thing to
> do". The whole point was to allow the state to be changed in the DB
> so the item could in most cases be deleted. There was never an intent
> (that I'm aware of) to make this some sort of uber resync and heal API
> call.
>
> All of that history aside, I think it would be great to add some
> driver interaction here. I am however very unclear on what that would
> actually include. For example, would you let a volume's state be
> changed from "Error-Attaching" to "In-Use" and just run through the
> process of retrying an attach? To me that seems like a bad idea. I'm
> much happier with the current state of changing the state from "Error"
> to "Available" (and NOTHING else) so that an operation can be retried,
> or the resource can be deleted. If you start allowing any state
> transition (which sadly we've started to do) you're almost never going
> to get things correct. This also covers almost every situation even
> though it means you have to explicitly retry operations or steps (I
> don't think that's a bad thing) and makes the code significantly more
> robust IMO (we have some issues lately with things being robust).
>
> My proposal would be to go back to limiting the things you can do with
> reset-state (basically make it so you can only release the resource back
> to available) and add the driver interaction to clean up any mess if
> possible. This could be a simple driver call added like
> "make_volume_available" whereby the driver just ensures that there are
> no attachments and, well... honestly nothing else comes to mind as
> being something the driver cares about here. The final option then
> being to add some more power to force-delete.
>
> Is there anything other than attach that matters from a driver? If
> people are talking error-recovery, that to me is a whole different
> topic, and frankly I think we need to spend more time preventing errors
> as opposed to trying to recover from them via new API calls.
>
> Curious to see if any other folks have input here?
>
> John
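[Editor's note: to make the transition list at the top of this reply concrete, here is a minimal sketch of a whitelist-style validation along the lines being discussed. The dictionary mirrors the transitions listed above; the function name and structure are hypothetical illustrations, not actual Cinder code.]

```python
# Hypothetical sketch of the safe cleanup transitions listed above.
# The mapping mirrors the thread's list; nothing here is real Cinder code.
ALLOWED_RESET_TRANSITIONS = {
    "attaching": {"available", "error"},
    "detaching": {"available", "error"},
    "error": {"available", "error"},
    "deleting": {"deleted", "error_deleting"},
    # Possibly wanted in future, but harder (backup service must cope):
    # "backing_up": {"available", "error"},
    # "restoring": {"error"},
}

def validate_reset(current, target):
    """Reject any reset-state request outside the whitelist."""
    allowed = ALLOWED_RESET_TRANSITIONS.get(current, set())
    if target not in allowed:
        raise ValueError(
            "reset-state from %r to %r is not a safe transition"
            % (current, target))
    return target

# A permitted transition passes through; anything else raises.
validate_reset("attaching", "available")
```

A table like this would also make John's point below enforceable: arbitrary state transitions are simply absent from the map, so they cannot be requested.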
Re: [openstack-dev] [cinder][nova] proper syncing of cinder volume state
On Fri, Nov 28, 2014 at 11:25 AM, D'Angelo, Scott wrote:
> A Cinder blueprint has been submitted to allow the python-cinderclient to
> involve the back end storage driver in resetting the state of a cinder
> volume:
>
> https://blueprints.launchpad.net/cinder/+spec/reset-state-with-driver
>
> and the spec:
>
> https://review.openstack.org/#/c/134366
>
> This blueprint contains various use cases for a volume that may be listed in
> the Cinder DataBase in state detaching|attaching|creating|deleting.
>
> The proposed solution involves augmenting the python-cinderclient command
> ‘reset-state’, but other options are listed, including those that
> involve Nova, since the state of a volume in the Nova XML found in
> /etc/libvirt/qemu/.xml may also be out-of-sync with the
> Cinder DB or storage back end.
>
> A related proposal for adding a new non-admin API for changing volume status
> from ‘attaching’ to ‘error’ has also been proposed:
>
> https://review.openstack.org/#/c/137503/
>
> Some questions have arisen:
>
> 1) Should the ‘reset-state’ command be changed at all, since it was originally
> just to modify the Cinder DB?
>
> 2) Should ‘reset-state’ be fixed to prevent the naïve admin from changing
> the Cinder DB to be out-of-sync with the back end storage?
>
> 3) Should ‘reset-state’ be kept the same, but augmented with new options?
>
> 4) Should a new command be implemented, with possibly a new admin API to
> properly sync state?
>
> 5) Should Nova be involved? If so, should this be done as a separate body of
> work?
>
> This has proven to be a complex issue and there seems to be a good bit of
> interest. Please provide feedback, comments, and suggestions.

Hey Scott,

Thanks for posting this to the ML, I stated my opinion on the spec, but for completeness:

My feeling is that reset-state has morphed into something entirely different than originally intended. That's actually great, nothing wrong there at all. I strongly disagree with the statements that "setting the status in the DB only is almost always the wrong thing to do". The whole point was to allow the state to be changed in the DB so the item could in most cases be deleted. There was never an intent (that I'm aware of) to make this some sort of uber resync and heal API call.

All of that history aside, I think it would be great to add some driver interaction here. I am however very unclear on what that would actually include. For example, would you let a volume's state be changed from "Error-Attaching" to "In-Use" and just run through the process of retrying an attach? To me that seems like a bad idea. I'm much happier with the current state of changing the state from "Error" to "Available" (and NOTHING else) so that an operation can be retried, or the resource can be deleted. If you start allowing any state transition (which sadly we've started to do) you're almost never going to get things correct. This also covers almost every situation even though it means you have to explicitly retry operations or steps (I don't think that's a bad thing) and makes the code significantly more robust IMO (we have some issues lately with things being robust).

My proposal would be to go back to limiting the things you can do with reset-state (basically make it so you can only release the resource back to available) and add the driver interaction to clean up any mess if possible. This could be a simple driver call added like "make_volume_available" whereby the driver just ensures that there are no attachments and, well... honestly nothing else comes to mind as being something the driver cares about here. The final option then being to add some more power to force-delete.

Is there anything other than attach that matters from a driver? If people are talking error-recovery, that to me is a whole different topic, and frankly I think we need to spend more time preventing errors as opposed to trying to recover from them via new API calls.

Curious to see if any other folks have input here?

John
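[Editor's note: the "make_volume_available" driver call John suggests could look roughly like the sketch below. The class and method are entirely hypothetical, an illustration of the proposed interface rather than any existing Cinder driver API.]

```python
# Hedged sketch of the suggested driver hook. A real driver would talk to
# the storage back end; this stand-in tracks attachments in memory only.
class FakeDriver:
    """Stand-in for a back-end volume driver."""

    def __init__(self):
        # Pretend the back end still records one stale attachment.
        self.attachments = {"vol-1": ["instance-a"]}

    def make_volume_available(self, volume_id):
        # Per the proposal: the driver just ensures there are no
        # attachments left on the back end, then the volume can safely
        # be released back to "available" in the DB.
        self.attachments.pop(volume_id, None)
        return "available"

driver = FakeDriver()
new_state = driver.make_volume_available("vol-1")
```

The appeal of this shape is that the only state the manager may set afterward is "available", which matches the proposal to forbid every other transition.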
Re: [openstack-dev] [cinder][nova] proper syncing of cinder volume state
A Cinder blueprint has been submitted to allow the python-cinderclient to involve the back end storage driver in resetting the state of a cinder volume:

https://blueprints.launchpad.net/cinder/+spec/reset-state-with-driver

and the spec:

https://review.openstack.org/#/c/134366

This blueprint contains various use cases for a volume that may be listed in the Cinder DataBase in state detaching|attaching|creating|deleting.

The proposed solution involves augmenting the python-cinderclient command 'reset-state', but other options are listed, including those that involve Nova, since the state of a volume in the Nova XML found in /etc/libvirt/qemu/.xml may also be out-of-sync with the Cinder DB or storage back end.

A related proposal for adding a new non-admin API for changing volume status from 'attaching' to 'error' has also been proposed:

https://review.openstack.org/#/c/137503/

Some questions have arisen:

1) Should the 'reset-state' command be changed at all, since it was originally just to modify the Cinder DB?
2) Should 'reset-state' be fixed to prevent the naïve admin from changing the Cinder DB to be out-of-sync with the back end storage?
3) Should 'reset-state' be kept the same, but augmented with new options?
4) Should a new command be implemented, with possibly a new admin API to properly sync state?
5) Should Nova be involved? If so, should this be done as a separate body of work?

This has proven to be a complex issue and there seems to be a good bit of interest. Please provide feedback, comments, and suggestions.
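[Editor's note: the hazard described in question 2 above is that today's reset-state updates only the Cinder database record, never the back end. A minimal illustration, with hypothetical data and function names that stand in for the real DB and driver:]

```python
# Illustrative only: show how a DB-only reset leaves the system out of sync.
db_record = {"id": "vol-1", "status": "attaching"}   # Cinder DB row
backend_state = {"vol-1": "attached"}                # what the back end believes

def naive_reset_state(record, new_status):
    # Mimics the current behaviour: only the DB row changes;
    # the storage back end is never consulted.
    record["status"] = new_status
    return record

naive_reset_state(db_record, "available")

# The DB now says "available" while the back end still holds an attachment.
out_of_sync = (db_record["status"] == "available"
               and backend_state["vol-1"] == "attached")
```

This is the gap the blueprint proposes to close by letting the driver participate in (or veto) the reset.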