[ 
https://issues.apache.org/jira/browse/HDDS-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swaminathan Balachandran updated HDDS-11443:
--------------------------------------------
    Description: 
An addCacheEntry call updates the cache directly. There is no rollback strategy 
in place to restore the cache to its state before the Ratis transaction when an 
error response is returned. If the cache is updated and an error response is 
returned, the cache can be left in an inconsistent state.

E.g. 

[https://github.com/apache/ozone/blob/831cd46ff48d75c0ea4f1c21f907deebbb5f3262/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/snapshot/OMSnapshotPurgeRequest.java#L130-L133]
 P.S. This could be happening in other requests as well.

The proposal is to have an uncommitted cache map that holds the delta entries. 
These entries would be merged into the main cache map if a success response is 
returned; otherwise they would be discarded.

The bigger problem is that this could impact reads as well, since 
readFromCache could be reading intermediate transaction data and returning 
inconsistent results. Reads from the cache should only return committed 
transaction data.
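
A minimal sketch of the proposal, for discussion only. StagedCache and its 
methods (stage, commit, rollback, readCommitted) are hypothetical names and 
not the existing Ozone TableCache API. The sketch assumes each transaction 
is applied by a single thread and leaves out deletion markers and epoch-based 
eviction.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a cache with a separate uncommitted delta map. */
final class StagedCache<K, V> {
  // Entries from transactions that returned a success response.
  private final Map<K, V> committed = new ConcurrentHashMap<>();
  // Pending delta entries, grouped by transaction index.
  private final Map<Long, Map<K, V>> uncommitted = new ConcurrentHashMap<>();

  /** Records an update for a transaction without exposing it to readers. */
  void stage(long txIndex, K key, V value) {
    // Assumption: one thread per transaction, so the inner map is a HashMap.
    uncommitted.computeIfAbsent(txIndex, i -> new HashMap<>()).put(key, value);
  }

  /** Success response: merge the transaction's deltas into the main map. */
  void commit(long txIndex) {
    Map<K, V> delta = uncommitted.remove(txIndex);
    if (delta != null) {
      committed.putAll(delta);
    }
  }

  /** Error response: drop the transaction's deltas; the main map is untouched. */
  void rollback(long txIndex) {
    uncommitted.remove(txIndex);
  }

  /** Readers only ever see data from committed transactions. */
  Optional<V> readCommitted(K key) {
    return Optional.ofNullable(committed.get(key));
  }
}
```

With this shape, a request handler would call stage(...) where it calls 
addCacheEntry today, then commit(...) or rollback(...) once the response type 
is known. A transaction that needs to read its own pending writes would need 
an extra lookup into its delta map, which the sketch does not cover.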

[~erose] [~hemantk] Correct me if I am wrong here.

cc: [~umamahesh] 

  was:
An addCacheEntry call directly updates the cache, there is no rollback strategy 
in place, which would rollback the cache data structure before the ratis 
transaction when an error response is returned. If the cache is updated and an 
error response is returned, the cache could be in an inconsistent state.

E.g. 

https://github.com/apache/ozone/blob/831cd46ff48d75c0ea4f1c21f907deebbb5f3262/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/snapshot/OMSnapshotPurgeRequest.java#L130-L133
 P.S. there could be happening in other requests as well.

The proposal is to have an uncommitted cache map which would be used to add the 
delta entries. These entries would be merged to the main cache map if a success 
response is returned otherwise the entries would be scrapped away. 

The bigger problem is that, this could impact reads as well. Since 
readFromCache could be reading from intermediate transaction data which could 
be returning inconsistent results. Read from cache should only send commited 
transaction data.

[~erose] [~hemantk] Correct me if I am wrong here.

cc: [~umamahesh] 


> Ozone Manager cache cannot rollback cache updates made by 
> validateAndUpdateCache in case of error response
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-11443
>                 URL: https://issues.apache.org/jira/browse/HDDS-11443
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Swaminathan Balachandran
>            Assignee: Swaminathan Balachandran
>            Priority: Major
>
> An addCacheEntry call updates the cache directly. There is no rollback 
> strategy in place to restore the cache to its state before the Ratis 
> transaction when an error response is returned. If the cache is updated and 
> an error response is returned, the cache can be left in an inconsistent 
> state.
> E.g. 
> [https://github.com/apache/ozone/blob/831cd46ff48d75c0ea4f1c21f907deebbb5f3262/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/snapshot/OMSnapshotPurgeRequest.java#L130-L133]
>  P.S. This could be happening in other requests as well.
> The proposal is to have an uncommitted cache map that holds the delta 
> entries. These entries would be merged into the main cache map if a success 
> response is returned; otherwise they would be discarded.
> The bigger problem is that this could impact reads as well, since 
> readFromCache could be reading intermediate transaction data and returning 
> inconsistent results. Reads from the cache should only return committed 
> transaction data.
> [~erose] [~hemantk] Correct me if I am wrong here.
> cc: [~umamahesh] 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
