Thanks for diving on this grenade, Alex!

FWIW, I agree with all of your assessments. Just in case I am mistaken, I
summarize them as smaller updates > logical clocks > wall clocks (due to
imprecision and skew).

Given the small size of your patch [4], I'd say lets try to land that. It
is nicer to solve this problem with software rather than with db schema if
that is possible.

On Thu, Sep 25, 2014 at 9:21 AM, Alexander Tivelkov <>

> Hi folks!
> There is a serious issue [0] in the v2 API of Glance which may lead to
> race conditions during the concurrent updates of Images' metadata.
> It can be fixed in a number of ways, but we need to have some solution
> soon, as we are approaching rc1 release, and the race in image updates
> looks like a serious problem which has to be fixed in J, imho.
> A quick description of the problem:
> When the image-update is called (PUT /v2/images/%image_id%/) we get the
> image from the repository, which fetches a record from the DB and forms its
> content into an Image Domain Object ([1]), which is then modified (has its
> attributes updated) and passed through all the layers of our domain model.
> This object is not managed by the SQLAlchemy's session, so the
> modifications of its attributes are not tracked anywhere.
> When all the processing is done and the updated object is passed back to
> the DB repository, it serializes all the attributes of the image into a
> dict ([2]) and then this dict is used to create an UPDATE query for the
> database.
> As this serialization includes all the attribute of the object (rather
> then only the modified ones), the update query updates all the columns of
> the appropriate database row, putting there the values which were
> originally fetched when the processing began. This may obviously overwrite
> the values which could be written there by some other concurent request.
> There are two possible solutions to fix this problem.
> First, known as the optimistic concurrency control, checks if the
> appropriate database row was modified between the data fetching and data
> updates. In case of such modification the update operation reports a
> "conflict" and fails (and may be retried based on the updated data if
> needed). Modification detection is usually based on the timstamps, i.e. the
> query updates the row in database only if the timestamp there matches the
> timestamp of initially fetched data.
> I've introduced this approach in this patch [3], however it has a major
> flaw: I used the 'updated_at' attribute as a timestamp, and this attribute
> is mapped to a DateTime-typed column. In many RDBMS's (including
> MySql<5.6.4) this column stores values with per-second precision and does
> not store fractions of seconds. So, even if patch [3] is merged the race
> conditions may still occur if there are many updates happening at the same
> moment of time.
> A better approach would be to add a new column with int (or longint) type
> to store millisecond-based (or even microsecond-based) timestamps instead
> of (or additionally to) date-time based updated_at. But data model
> modification will require to add new migration etc, which is a major step
> and I don't know if we want to make it so close to the release.
> The second solution is to keep track of the changed attributes and
> properties for the image and do not include the unchanged ones into the
> UPDATE query, so nothing gets overwritten. This dramatically reduces the
> threat of races, as the updates of different properties do not interfere
> with each other. Also this is a usefull change regardless of the race
> itself: being able to differentiate between changed and unchanged
> attributes may have its own value for other purposes; the DB performance
> will also be better when updating just the needed fields instead of all of
> them.
> I've submitted a patch with this approach as well [4], but it still breaks
> some unittests and I am working to fix them right now.
> So, we need to decide which of these approaches (or their combination) to
> take: we may stick with optimistic locking on timestamp (and then decide if
> we are ok with a per-second timestamps or we need to add a new column),
> choose to track state of attributes or combine them together. So, could you
> folks please review patches [3] and [4] and come up with some ideas on them?
> Also, probably we should consider targeting [0] to juno-rc1 milestone to
> make sure that this bug is fixed in J. Do you guys think it is possible at
> this stage?
> Thanks!
> [0]
> [1]
> [2]
> [3]
> [4]
> --
> Regards,
> Alexander Tivelkov
> _______________________________________________
> OpenStack-dev mailing list
OpenStack-dev mailing list

Reply via email to