Sadanand Shenoy created HDDS-7906:
-------------------------------------
Summary: Wait for checkpoint creation if snapshot in cache and not
committed to DB.
Key: HDDS-7906
URL: https://issues.apache.org/jira/browse/HDDS-7906
Project: Apache Ozone
Issue Type: Bug
Reporter: Sadanand Shenoy
Assignee: Sadanand Shenoy
For most OM requests (create, delete, rename, etc.), the flow in HA is as
follows:
Create OmRequest -> PreExecute -> validateAndUpdateCache -> finally update
the OM RocksDB.
validateAndUpdateCache takes a bucket lock for write operations and updates
the Table cache of the table the operation runs on. Apart from updating the
cache, it also adds the OMResponse to the DoubleBuffer, which is drained by an
always-running daemon thread that calls OzoneManagerDoubleBuffer#flush every
time a new entry is added to the buffer.
However, OM does not wait for the op to be flushed before returning the
response to the client. Instead, it sends the response as soon as the cache
and the DoubleBuffer are updated, and the bucket lock is not held during the
actual put to the DB.
Adding the entry to the cache is useful because it keeps subsequent reads from
going to the DB, where the flush might not have happened yet.
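The write path described above can be sketched as a toy model. This is only an
illustration, not Ozone code: the class CacheThenFlushSketch and its methods
write, read, flush and isInDb are hypothetical names, plain maps stand in for
the Table cache and RocksDB, and a single object lock stands in for the bucket
lock. In the real OM the flush runs on a separate daemon thread.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of the cache-then-flush write path; all names are hypothetical. */
public class CacheThenFlushSketch {
    // Stand-in for the Table cache.
    private final Map<String, String> tableCache = new ConcurrentHashMap<>();
    // Stand-in for the OM RocksDB.
    private final Map<String, String> db = new HashMap<>();
    // Stand-in for the DoubleBuffer of pending responses.
    private final Queue<Map.Entry<String, String>> doubleBuffer = new ArrayDeque<>();

    /** Plays the role of validateAndUpdateCache: update cache, queue, return. */
    public synchronized void write(String key, String value) {
        tableCache.put(key, value);
        doubleBuffer.add(Map.entry(key, value));
        // The caller gets its response here, before anything reaches the DB.
    }

    /** Reads check the cache first and fall back to the DB. */
    public synchronized String read(String key) {
        String cached = tableCache.get(key);
        return cached != null ? cached : db.get(key);
    }

    /** Plays the role of the flush: persist queued entries, then clean the cache. */
    public synchronized void flush() {
        Map.Entry<String, String> entry;
        while ((entry = doubleBuffer.poll()) != null) {
            db.put(entry.getKey(), entry.getValue());
            // Evict only if the cache still holds this value, so a newer write survives.
            tableCache.remove(entry.getKey(), entry.getValue());
        }
    }

    /** Test helper: has the key been persisted yet? */
    public synchronized boolean isInDb(String key) {
        return db.containsKey(key);
    }
}
```

In this model a read issued between write and flush is served from the cache,
which is the behaviour the examples below rely on.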
E.g.
*Rename Key*
t1 -> The renamed key is updated in the cache and the response is added to the
DoubleBuffer. The response is returned and the bucket lock is released, but
RocksDB is not yet updated with the renamed value.
t2 -> The client issues a read on the renamed path. The read first checks the
cache and sees the renamed value.
t3 -> The renamed value is written to the actual DB and the cache is cleaned up.
However, for *CreateSnapshot*
t1 -> The snapshot info is updated in the cache and the response is returned to
the client after the bucket lock is released.
t2 -> The client issues a read on the snapshot. The read finds the SnapshotInfo
object and the checkpoint dir location. However, the checkpoint is only created
when the actual RocksDB is updated in OMCreateSnapshotResponse#addToDBBatch, so
the read fails.
t3 -> The snapshotInfo is added to the DB and the checkpoint is created.
One fix I can think of is to wait for the checkpoint dir to be created during
the read if the snapshot info is in the cache.
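A minimal sketch of this wait, assuming the read path already knows the
checkpoint dir location from the snapshot info. The class CheckpointWaiter, the
method waitForCheckpointDir, and the timeout and poll interval parameters are
hypothetical and not part of Ozone. It polls with a bound so that a read cannot
hang forever if the flush never creates the checkpoint.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;

/** Hypothetical helper: bounded wait for a checkpoint directory to appear. */
public class CheckpointWaiter {

    /**
     * Polls until checkpointDir exists as a directory or the timeout elapses.
     *
     * @return true if the directory appeared in time, false otherwise
     */
    public static boolean waitForCheckpointDir(
            Path checkpointDir, Duration timeout, Duration pollInterval)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (Files.isDirectory(checkpointDir)) {
                return true;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false; // give up; the caller decides how to fail the read
            }
            // Sleep for the poll interval, but never past the deadline.
            long sleepMillis = Math.max(1L,
                Math.min(pollInterval.toMillis(), remainingNanos / 1_000_000L));
            Thread.sleep(sleepMillis);
        }
    }
}
```

A caller would presumably only take this path when the snapshot info was found
in the cache, and fail the read as it does today when the method returns false.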
Another solution would be to create the checkpoint in validateAndUpdateCache
itself.