[
https://issues.apache.org/jira/browse/HDDS-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hemant Kumar updated HDDS-8390:
-------------------------------
Status: Patch Available (was: In Progress)
> Synchronization between Snapshot Deletes/GC and other Snapshot jobs
> (read/diff)
> -------------------------------------------------------------------------------
>
> Key: HDDS-8390
> URL: https://issues.apache.org/jira/browse/HDDS-8390
> Project: Apache Ozone
> Issue Type: Sub-task
> Components: Snapshot
> Reporter: Prashant Pogde
> Assignee: Hemant Kumar
> Priority: Major
> Labels: pull-request-available
> Attachments: 35fdc3bd-cd0c-40f3-8fd7-2d8a8dc4643d.pdf
>
>
> We need proper synchronization between Snapshot delete/GC and other
> Snapshot jobs, e.g. reads from Snapshots and SnapDiff. SnapDiff is a
> particularly important case because it can be a long-running job, and
> Snapshot delete/GC can kick in while the job is in progress.
> We should also have uniform behavior across the cluster in case of a
> failover with concurrent SnapDiff/deletes. It must not happen that a leader
> OM node returns one result to a client but, after a failover, the new OM
> leader returns a different result.
> ---
> Thus, to prevent the client from getting a partial SnapDiff result without
> realizing it, and to avoid explicitly holding a lock, we want to use an
> approach similar to optimistic locking: check whether the snapshot is still
> ACTIVE towards the end of the request lifetime, when the SnapDiff service
> has already collected all the batch entries in a buffer. See the attachment
> for a timeline of the potential race condition:
> [^35fdc3bd-cd0c-40f3-8fd7-2d8a8dc4643d.pdf]
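
The optimistic check described above could look roughly like the sketch below. This is only an illustration under stated assumptions, not the code from the patch: the names SnapDiffSketch, Snapshot, SnapshotStatus, SnapshotNotActiveException and the afterCollect hook are all hypothetical, and the key lists stand in for whatever the SnapDiff service actually reads. The point is that no lock is held while the diff is collected; the snapshot state is re-validated once the buffer is full, and the buffered result is discarded if either snapshot is no longer ACTIVE.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; none of these names come from the Ozone code base.
class SnapDiffSketch {

    enum SnapshotStatus { ACTIVE, DELETED }

    /** Minimal stand-in for a snapshot: a name, a status, and its keys. */
    static final class Snapshot {
        final String name;
        final List<String> keys;
        final AtomicReference<SnapshotStatus> status =
            new AtomicReference<>(SnapshotStatus.ACTIVE);

        Snapshot(String name, List<String> keys) {
            this.name = name;
            this.keys = keys;
        }
    }

    /** Thrown when a snapshot stopped being ACTIVE during the request. */
    static final class SnapshotNotActiveException extends RuntimeException {
        SnapshotNotActiveException(String name) {
            super("Snapshot " + name + " is not ACTIVE");
        }
    }

    static void requireActive(Snapshot s) {
        if (s.status.get() != SnapshotStatus.ACTIVE) {
            throw new SnapshotNotActiveException(s.name);
        }
    }

    /**
     * Collects the diff into a buffer without holding a lock, then
     * re-checks both snapshots before returning the buffer.
     * afterCollect is a hook used here only to simulate a concurrent delete.
     */
    static List<String> diff(Snapshot from, Snapshot to, Runnable afterCollect) {
        // Fail fast if a snapshot is already gone.
        requireActive(from);
        requireActive(to);

        List<String> buffer = new ArrayList<>();
        for (String key : to.keys) {
            if (!from.keys.contains(key)) {
                buffer.add("+ " + key);
            }
        }
        for (String key : from.keys) {
            if (!to.keys.contains(key)) {
                buffer.add("- " + key);
            }
        }

        afterCollect.run();

        // Optimistic validation: the buffer is only trusted if both
        // snapshots are still ACTIVE at the end of the request.
        requireActive(from);
        requireActive(to);
        return buffer;
    }

    public static void main(String[] args) {
        Snapshot s1 = new Snapshot("s1", List.of("k1", "k2"));
        Snapshot s2 = new Snapshot("s2", List.of("k2", "k3"));

        // No concurrent delete: the buffered diff is returned.
        System.out.println(diff(s1, s2, () -> { }));

        // A delete lands while the diff is being collected: the request
        // fails instead of returning a possibly partial result.
        try {
            diff(s1, s2, () -> s2.status.set(SnapshotStatus.DELETED));
        } catch (SnapshotNotActiveException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this sketch the client sees either a complete diff or an explicit error, never a silently truncated result. If every OM node derives the ACTIVE state from the same replicated metadata, the same check should give the same answer after a failover, although that part is an assumption and is not shown here.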
--
This message was sent by Atlassian Jira
(v8.20.10#820010)