Stephen O'Donnell created HDDS-10527:
----------------------------------------
Summary: KeyOverwrite with optimistic locking
Key: HDDS-10527
URL: https://issues.apache.org/jira/browse/HDDS-10527
Project: Apache Ozone
Issue Type: Bug
Reporter: Stephen O'Donnell
Assignee: Stephen O'Donnell
This change introduces the ability to re-create / overwrite a key in Ozone
using an optimistic locking technique.
Suppose you want to replace a key with a version that has some additional data
added somewhere in it, or to change its replication type from Ratis to EC.
To do this, you can read the current key data and write a new key with the same
name; on commitKey, the new key version becomes visible.
However, another client may delete the original key, or rewrite it, at the same
time, which can result in lost updates.
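The lost-update problem described above can be illustrated with a toy
in-memory model. This is only a sketch: the dictionary `key_table` and the
helpers `read_key` and `commit_key` are hypothetical stand-ins, not Ozone APIs.

```python
# Toy in-memory key table: key name -> bytes. Hypothetical, not the Ozone API.
key_table = {"vol/bucket/key1": b"v1"}


def read_key(name):
    return key_table[name]


def commit_key(name, data):
    # Unconditional commit: the last writer wins.
    key_table[name] = data


# Client A and client B both read the same version of the key.
a_view = read_key("vol/bucket/key1")
b_view = read_key("vol/bucket/key1")

# Each appends its own data and commits without checking for changes.
commit_key("vol/bucket/key1", a_view + b"+A")
commit_key("vol/bucket/key1", b_view + b"+B")

# Client A's update has been silently overwritten by client B.
final_value = key_table["vol/bucket/key1"]
```

Neither client sees an error, yet only one of the two updates survives.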
To replace a key in this way, the proposal is to use the existing objectID and
updateID on the key to ensure the key has not changed since it was read. The
flow would be:
1. Get the keyInfo for the current key.
2. Call the new bucket.overWriteKey() method, passing the details of the
existing key.
3. This call will adjust the keyArgs to pass two new fields, overwriteObjectID
and updateObjectID, which are taken from the objectID and updateID of the
existing key.
4. When OM receives the open key request, it checks that an existing key is
present having the passed keyname, objectID and updateID. If not, an error is
returned. Otherwise the key is added to the openKeyTable, storing the overwrite
IDs.
5. The data is written to the key as usual.
6. On key commit, the values stored in the openKey table for the overwrite IDs
are checked against the current key. If the current key is absent, or its IDs
have changed, the commit will fail and an error is returned. Otherwise the key
is committed as usual.
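The flow above can be sketched with a small in-memory model. This is an
illustration under stated assumptions, not the actual OM implementation: the
class `ToyOM`, its methods, the session IDs, and the single shared ID counter
are all hypothetical, and the way objectID and updateID are assigned on write
is a simplification chosen for the example.

```python
from dataclasses import dataclass
from itertools import count


class KeyModifiedError(Exception):
    """The key is missing or its IDs differ from the ones the client read."""


@dataclass(frozen=True)
class KeyInfo:
    name: str
    object_id: int
    update_id: int
    data: bytes


@dataclass(frozen=True)
class OpenKey:
    name: str
    overwrite_object_id: int
    overwrite_update_id: int


class ToyOM:
    """In-memory stand-in for OM (hypothetical, not the real Ozone classes)."""

    def __init__(self):
        self.key_table = {}       # key name -> KeyInfo
        self.open_key_table = {}  # session id -> OpenKey
        self._ids = count(1)      # one counter for all IDs, for simplicity

    def put(self, name, data):
        """Unconditional write, as done by a client that skips the check."""
        old = self.key_table.get(name)
        object_id = old.object_id if old else next(self._ids)
        self.key_table[name] = KeyInfo(name, object_id, next(self._ids), data)

    def _check(self, name, object_id, update_id):
        cur = self.key_table.get(name)
        if cur is None or (cur.object_id, cur.update_id) != (object_id, update_id):
            raise KeyModifiedError(name)
        return cur

    def open_for_overwrite(self, existing):
        """Step 4: verify the key is unchanged, then record the overwrite IDs."""
        self._check(existing.name, existing.object_id, existing.update_id)
        session = next(self._ids)
        self.open_key_table[session] = OpenKey(
            existing.name, existing.object_id, existing.update_id)
        return session

    def commit(self, session, data):
        """Step 6: re-check the stored IDs against the current key, then commit."""
        open_key = self.open_key_table.pop(session)
        cur = self._check(open_key.name, open_key.overwrite_object_id,
                          open_key.overwrite_update_id)
        self.key_table[cur.name] = KeyInfo(
            cur.name, cur.object_id, next(self._ids), data)


om = ToyOM()
om.put("key1", b"v1")

info = om.key_table["key1"]               # step 1: read the current key info
session = om.open_for_overwrite(info)     # steps 2-4: open with overwrite IDs
om.put("key1", b"from another client")    # a concurrent writer changes the key
try:
    om.commit(session, info.data + b"+more")   # step 6: IDs no longer match
    outcome = "committed"
except KeyModifiedError:
    outcome = "rejected"

# The client re-reads the key and retries; this time nothing interferes.
info = om.key_table["key1"]
session = om.open_for_overwrite(info)
om.commit(session, info.data + b"+more")
```

In this sketch the first commit is rejected because the updateID changed
between open and commit, and the retry succeeds. The overwrite IDs live only in
`open_key_table`; nothing extra is stored in `key_table`.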
This technique is similar to optimistic locking used in relational databases,
to avoid holding a lock on an object for a long period of time.
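For comparison, a common form of optimistic locking in a relational database
uses a version column and a conditional UPDATE. The sketch below uses Python's
built-in sqlite3 module; the table `doc`, its columns, and the helper
`update_if_unchanged` are made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE doc (id INTEGER PRIMARY KEY, body TEXT, version INTEGER)")
conn.execute("INSERT INTO doc VALUES (1, 'v1', 1)")


def update_if_unchanged(conn, doc_id, new_body, read_version):
    """Update the row only if its version still equals the one we read."""
    cur = conn.execute(
        "UPDATE doc SET body = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_body, doc_id, read_version),
    )
    return cur.rowcount == 1  # no row matched means someone else changed it


# Two writers read the same row and version, without holding any lock.
body, version = conn.execute(
    "SELECT body, version FROM doc WHERE id = 1").fetchone()

first = update_if_unchanged(conn, 1, body + "+A", version)   # version matches
second = update_if_unchanged(conn, 1, body + "+B", version)  # version is stale

row = conn.execute("SELECT body, version FROM doc WHERE id = 1").fetchone()
```

The second writer is told its update did not apply and can re-read and retry,
which plays the same role as the objectID and updateID check at commit time.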
Notably, there are no additional locks needed on OM and no additional calls or
RocksDB reads required to implement this; passing and storing the IDs in the
openKey table is all that is required. The overwrite IDs don't need to be
stored in the keyTable.
For now, this change adds the feature only for Object Store buckets.
Additionally, there is an open question over what to do about metadata and
ACLs: should they be copied from the existing key, or passed from the client?