Public bug reported: snapshot_volume_backed() in compute.API does not set a task_state during execution. However, in essence it does:

    if vm_state == ACTIVE:
        quiesce()
    snapshot()
    if vm_state == ACTIVE:
        unquiesce()

There is no exclusion here, though, which means a user could do:

    quiesce()
                    quiesce()
    snapshot()
                    snapshot()
    unquiesce()
                    --snapshot() now running after unquiesce -> corruption
                    unquiesce()

or:

    suspend()
    snapshot()
        NO QUIESCE (we're suspended)
        snapshot()
    resume()
        --snapshot() now running after resume -> corruption

Same goes for stop/start.

Note that snapshot_volume_backed() is a separate top-level entry point from snapshot(). snapshot() does not suffer from this problem, because it atomically sets the task state to IMAGE_SNAPSHOT_PENDING when running, which prevents the user from performing a concurrent operation on the instance.

I suggest that snapshot_volume_backed() should do the same.

** Affects: nova
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1619606

Title:
  snapshot_volume_backed races, could result in data corruption

Status in OpenStack Compute (nova):
  New
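
For illustration only, here is a minimal sketch of the kind of exclusion suggested in this report. It is a toy model, not nova code: ToyInstance, InstanceBusy, claim() and release() are hypothetical names invented for this example, and the quiesce/snapshot/unquiesce steps are only recorded in a list. The idea it tries to show is that the task state is set with an atomic check-and-set before any work starts, so a second snapshot (or a suspend/stop arriving mid-snapshot) is rejected instead of interleaving.

```python
import threading

# Name mirrors the task state mentioned in the report; the value is illustrative.
IMAGE_SNAPSHOT_PENDING = "image_snapshot_pending"


class InstanceBusy(Exception):
    """Hypothetical error: another operation already owns the instance."""


class ToyInstance:
    """Hypothetical stand-in for an instance record (not nova's Instance)."""

    def __init__(self, vm_state="active"):
        self.vm_state = vm_state
        self.task_state = None
        self.log = []  # records the steps that actually ran
        self._lock = threading.Lock()

    def claim(self, new_task_state):
        # Atomic check-and-set: succeed only if no task is in progress.
        with self._lock:
            if self.task_state is not None:
                raise InstanceBusy(self.task_state)
            self.task_state = new_task_state

    def release(self):
        with self._lock:
            self.task_state = None


def snapshot_volume_backed(instance):
    # Take ownership first; a concurrent caller fails here, before quiesce().
    instance.claim(IMAGE_SNAPSHOT_PENDING)
    try:
        quiesced = instance.vm_state == "active"
        if quiesced:
            instance.log.append("quiesce")
        instance.log.append("snapshot")
        if quiesced:
            instance.log.append("unquiesce")
    finally:
        # Always clear the task state, even if a step above raised.
        instance.release()


def suspend(instance):
    # Other top-level operations take the same claim, so they cannot
    # slip in between quiesce() and unquiesce() of a running snapshot.
    instance.claim("suspending")
    try:
        instance.vm_state = "suspended"
        instance.log.append("suspend")
    finally:
        instance.release()
```

With this in place, a second snapshot_volume_backed() or a suspend() issued while the first snapshot still holds the task state raises InstanceBusy rather than running against an unquiesced or resumed guest. How the real fix should surface that conflict to the API user is left open here.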