Re: [openstack-dev] [Glance] Concurrent update issue in Glance v2 API

2014-09-25 Thread Mark Washenberger
Thanks for diving on this grenade, Alex!

FWIW, I agree with all of your assessments. Just in case I am mistaken, I
summarize them as smaller updates > logical clocks > wall clocks (due to
imprecision and skew).

Given the small size of your patch [4], I'd say lets try to land that. It
is nicer to solve this problem with software rather than with db schema if
that is possible.

On Thu, Sep 25, 2014 at 9:21 AM, Alexander Tivelkov 
wrote:

> Hi folks!
>
> There is a serious issue [0] in the v2 API of Glance which may lead to
> race conditions during the concurrent updates of Images' metadata.
> It can be fixed in a number of ways, but we need to have some solution
> soon, as we are approaching rc1 release, and the race in image updates
> looks like a serious problem which has to be fixed in J, imho.
>
> A quick description of the problem:
> When the image-update is called (PUT /v2/images/%image_id%/) we get the
> image from the repository, which fetches a record from the DB and forms its
> content into an Image Domain Object ([1]), which is then modified (has its
> attributes updated) and passed through all the layers of our domain model.
> This object is not managed by the SQLAlchemy's session, so the
> modifications of its attributes are not tracked anywhere.
> When all the processing is done and the updated object is passed back to
> the DB repository, it serializes all the attributes of the image into a
> dict ([2]) and then this dict is used to create an UPDATE query for the
> database.
> As this serialization includes all the attribute of the object (rather
> then only the modified ones), the update query updates all the columns of
> the appropriate database row, putting there the values which were
> originally fetched when the processing began. This may obviously overwrite
> the values which could be written there by some other concurent request.
>
> There are two possible solutions to fix this problem.
> First, known as the optimistic concurrency control, checks if the
> appropriate database row was modified between the data fetching and data
> updates. In case of such modification the update operation reports a
> "conflict" and fails (and may be retried based on the updated data if
> needed). Modification detection is usually based on the timstamps, i.e. the
> query updates the row in database only if the timestamp there matches the
> timestamp of initially fetched data.
> I've introduced this approach in this patch [3], however it has a major
> flaw: I used the 'updated_at' attribute as a timestamp, and this attribute
> is mapped to a DateTime-typed column. In many RDBMS's (including
> MySql<5.6.4) this column stores values with per-second precision and does
> not store fractions of seconds. So, even if patch [3] is merged the race
> conditions may still occur if there are many updates happening at the same
> moment of time.
> A better approach would be to add a new column with int (or longint) type
> to store millisecond-based (or even microsecond-based) timestamps instead
> of (or additionally to) date-time based updated_at. But data model
> modification will require to add new migration etc, which is a major step
> and I don't know if we want to make it so close to the release.
>
> The second solution is to keep track of the changed attributes and
> properties for the image and do not include the unchanged ones into the
> UPDATE query, so nothing gets overwritten. This dramatically reduces the
> threat of races, as the updates of different properties do not interfere
> with each other. Also this is a usefull change regardless of the race
> itself: being able to differentiate between changed and unchanged
> attributes may have its own value for other purposes; the DB performance
> will also be better when updating just the needed fields instead of all of
> them.
> I've submitted a patch with this approach as well [4], but it still breaks
> some unittests and I am working to fix them right now.
>
> So, we need to decide which of these approaches (or their combination) to
> take: we may stick with optimistic locking on timestamp (and then decide if
> we are ok with a per-second timestamps or we need to add a new column),
> choose to track state of attributes or combine them together. So, could you
> folks please review patches [3] and [4] and come up with some ideas on them?
>
> Also, probably we should consider targeting [0] to juno-rc1 milestone to
> make sure that this bug is fixed in J. Do you guys think it is possible at
> this stage?
>
> Thanks!
>
>
> [0] https://bugs.launchpad.net/glance/+bug/1371728
> [1]
> https://github.com/openstack/glance/blob/master/glance/db/__init__.py#L74
> [2]
> https://github.com/openstack/glance/blob/master/glance/db/__init__.py#L169
> [3] https://review.openstack.org/#/c/122814/
> [4] https://review.openstack.org/#/c/123722/
>
> --
> Regards,
> Alexander Tivelkov
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lis

[openstack-dev] [Glance] Concurrent update issue in Glance v2 API

2014-09-25 Thread Alexander Tivelkov
Hi folks!

There is a serious issue [0] in the v2 API of Glance which may lead to race
conditions during the concurrent updates of Images' metadata.
It can be fixed in a number of ways, but we need to have some solution
soon, as we are approaching rc1 release, and the race in image updates
looks like a serious problem which has to be fixed in J, imho.

A quick description of the problem:
When the image-update is called (PUT /v2/images/%image_id%/) we get the
image from the repository, which fetches a record from the DB and forms its
content into an Image Domain Object ([1]), which is then modified (has its
attributes updated) and passed through all the layers of our domain model.
This object is not managed by the SQLAlchemy's session, so the
modifications of its attributes are not tracked anywhere.
When all the processing is done and the updated object is passed back to
the DB repository, it serializes all the attributes of the image into a
dict ([2]) and then this dict is used to create an UPDATE query for the
database.
As this serialization includes all the attribute of the object (rather then
only the modified ones), the update query updates all the columns of the
appropriate database row, putting there the values which were originally
fetched when the processing began. This may obviously overwrite the values
which could be written there by some other concurent request.

There are two possible solutions to fix this problem.
First, known as the optimistic concurrency control, checks if the
appropriate database row was modified between the data fetching and data
updates. In case of such modification the update operation reports a
"conflict" and fails (and may be retried based on the updated data if
needed). Modification detection is usually based on the timstamps, i.e. the
query updates the row in database only if the timestamp there matches the
timestamp of initially fetched data.
I've introduced this approach in this patch [3], however it has a major
flaw: I used the 'updated_at' attribute as a timestamp, and this attribute
is mapped to a DateTime-typed column. In many RDBMS's (including
MySql<5.6.4) this column stores values with per-second precision and does
not store fractions of seconds. So, even if patch [3] is merged the race
conditions may still occur if there are many updates happening at the same
moment of time.
A better approach would be to add a new column with int (or longint) type
to store millisecond-based (or even microsecond-based) timestamps instead
of (or additionally to) date-time based updated_at. But data model
modification will require to add new migration etc, which is a major step
and I don't know if we want to make it so close to the release.

The second solution is to keep track of the changed attributes and
properties for the image and do not include the unchanged ones into the
UPDATE query, so nothing gets overwritten. This dramatically reduces the
threat of races, as the updates of different properties do not interfere
with each other. Also this is a usefull change regardless of the race
itself: being able to differentiate between changed and unchanged
attributes may have its own value for other purposes; the DB performance
will also be better when updating just the needed fields instead of all of
them.
I've submitted a patch with this approach as well [4], but it still breaks
some unittests and I am working to fix them right now.

So, we need to decide which of these approaches (or their combination) to
take: we may stick with optimistic locking on timestamp (and then decide if
we are ok with a per-second timestamps or we need to add a new column),
choose to track state of attributes or combine them together. So, could you
folks please review patches [3] and [4] and come up with some ideas on them?

Also, probably we should consider targeting [0] to juno-rc1 milestone to
make sure that this bug is fixed in J. Do you guys think it is possible at
this stage?

Thanks!


[0] https://bugs.launchpad.net/glance/+bug/1371728
[1]
https://github.com/openstack/glance/blob/master/glance/db/__init__.py#L74
[2]
https://github.com/openstack/glance/blob/master/glance/db/__init__.py#L169
[3] https://review.openstack.org/#/c/122814/
[4] https://review.openstack.org/#/c/123722/

--
Regards,
Alexander Tivelkov
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev