[ 
https://issues.apache.org/jira/browse/HDDS-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemant Kumar updated HDDS-10590:
--------------------------------
    Description: 
SstFilteringService marks the sstFilter flag for a snapshot and updates 
RocksDB, 
[code|https://github.com/apache/ozone/blob/2f2234c7b61714404399ada8f31b3fb4772b613a/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/SstFilteringService.java#L123].
 This may cause snapshot chain corruption (probably not after the fix: 
https://github.com/apache/ozone/pull/6443) or data inconsistency due to a race 
condition, because SstFilteringService updates snapshot info in parallel with 
SnapshotPurge and SnapshotProperty, which also update snapshot info. Even though 
SstFilteringService takes a lock before updating snapshotInfo, the SnapshotPurge 
and SnapshotProperty APIs don't take a lock; they rely on OMStateMachine 
processing each request sequentially.
In my opinion, every update to snapshotInfoTable should go through an API, but 
that is not possible for *SstFilteringService* because *SstFilteringService* 
runs on each OM independently. Hence it updates snapshotInfoTable directly.
So we need to introduce a lock for the SnapshotPurge and SetSnapshotProperty 
APIs, unless there is a better way.
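
A minimal sketch of the locking idea above, not Ozone code: every name in it 
(SnapshotInfoLockSketch, Info, Table, the deepCleaned flag) is hypothetical. It 
assumes each writer does a read-modify-write of the whole snapshot info row, 
and shows that routing both the background-service path and the request path 
through one shared lock keeps either writer from overwriting the other's flag 
with a stale copy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.UnaryOperator;

// Toy model only: these types are hypothetical and are not Ozone classes.
final class SnapshotInfoLockSketch {

  // Stand-in for one snapshotInfoTable row with two independently set flags.
  static final class Info {
    final boolean sstFiltered;
    final boolean deepCleaned;

    Info(boolean sstFiltered, boolean deepCleaned) {
      this.sstFiltered = sstFiltered;
      this.deepCleaned = deepCleaned;
    }
  }

  // Stand-in for snapshotInfoTable plus a single lock shared by all writers.
  static final class Table {
    private final Map<String, Info> rows = new ConcurrentHashMap<>();
    private final ReentrantLock lock = new ReentrantLock();

    void put(String key, Info info) {
      rows.put(key, info);
    }

    Info get(String key) {
      return rows.get(key);
    }

    // Read-modify-write under the shared lock, so a writer never stores a
    // copy of the row that is missing another writer's change.
    void update(String key, UnaryOperator<Info> change) {
      lock.lock();
      try {
        rows.put(key, change.apply(rows.get(key)));
      } finally {
        lock.unlock();
      }
    }
  }

  // Runs a service-like writer and a request-like writer concurrently.
  static Info runBothWriters(Table table, String key) {
    Thread filtering = new Thread(() ->
        table.update(key, old -> new Info(true, old.deepCleaned)));
    Thread setProperty = new Thread(() ->
        table.update(key, old -> new Info(old.sstFiltered, true)));
    filtering.start();
    setProperty.start();
    try {
      filtering.join();
      setProperty.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return table.get(key);
  }

  public static void main(String[] args) {
    Table table = new Table();
    table.put("snap1", new Info(false, false));
    Info result = runBothWriters(table, "snap1");
    System.out.println(result.sstFiltered + " " + result.deepCleaned);
  }
}
```

If only one of the two writers went through update() and the other did its 
own unlocked get/put, the unlocked writer could still store a stale row, which 
is the situation described above.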



  was:
SstFilteringService marks the sstFilter flag for a snapshot and updates 
RocksDB, 
[code|https://github.com/apache/ozone/blob/2f2234c7b61714404399ada8f31b3fb4772b613a/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/SstFilteringService.java#L123].
 This may cause snapshot chain corruption (probably not after the fix: 
https://github.com/apache/ozone/pull/6443) or data inconsistency due to a race 
condition, because SstFilteringService updates snapshot info in parallel with 
SnapshotPurge and SnapshotProperty, which also update snapshot info. Even though 
SstFilteringService takes a lock before updating snapshotInfo, the SnapshotPurge 
and SnapshotProperty APIs don't take a lock; they rely on OMStateMachine 
processing each request sequentially.
In my opinion, every update to snapshotInfoTable should go through an API, but 
that is not possible for `SstFilteringService` because `SstFilteringService` 
runs on each OM independently. Hence it updates snapshotInfoTable directly.
So we need to introduce a lock for the SnapshotPurge and SetSnapshotProperty 
APIs, unless there is a better way.




> SstFilteringService updating snapshotInfo directly
> --------------------------------------------------
>
>                 Key: HDDS-10590
>                 URL: https://issues.apache.org/jira/browse/HDDS-10590
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Hemant Kumar
>            Priority: Major


