[ 
https://issues.apache.org/jira/browse/HDDS-9876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-9876:
-----------------------------
    Description: 
This task is to resolve the issues in HDDS-9342.


HDDS-2680 introduced logic in OzoneManagerStateMachine that calculates the 
lastAppliedTermIndex from two maps, applyTransactionMap and 
ratisTransactionMap. Every write request that RATIS delivers through 
applyTransaction adds its trxLogIndex to applyTransactionMap. Every write 
request flushed by OzoneManagerDoubleBuffer#flushBatch has its trxLogIndex 
removed from applyTransactionMap when flushBatch calls 
ozoneManagerRatisSnapShot.updateLastAppliedIndex(flushedEpochs).

If a write request from RATIS does not go through 
OzoneManagerDoubleBuffer#flushBatch, its trxLogIndex stays in 
applyTransactionMap forever. Because lastAppliedIndex can only advance 
incrementally, any trxLogIndex that is not confirmed by an 
OzoneManagerDoubleBuffer flush stops lastAppliedIndex from growing past it. 
Write requests after the unconfirmed one can still be flushed, but their 
trxLogIndex values are added to ratisTransactionMap, which keeps growing. 
This explains the full GC and the huge gap between the snapshot index and 
the commit index.
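
The bookkeeping described above can be modelled in a few lines. This is only 
a simplified sketch, not the Ozone source: the class AppliedIndexTracker and 
its methods onApplyTransaction, onFlush, pendingFlushedCount and 
unconfirmedCount are hypothetical names, and the real 
OzoneManagerStateMachine also tracks terms and snapshots in more detail. The 
two map names are taken from the description.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

// Simplified model of the two-map bookkeeping; all names except the two
// map names are hypothetical and not taken from the Ozone code base.
class AppliedIndexTracker {
  // Applied by Ratis, not yet confirmed by a double-buffer flush: index -> term.
  private final ConcurrentSkipListMap<Long, Long> applyTransactionMap =
      new ConcurrentSkipListMap<>();
  // Flushed, but still waiting for an earlier index to be confirmed: index -> term.
  private final ConcurrentSkipListMap<Long, Long> ratisTransactionMap =
      new ConcurrentSkipListMap<>();
  private long lastAppliedIndex = 0;

  synchronized void onApplyTransaction(long index, long term) {
    applyTransactionMap.put(index, term);
  }

  synchronized void onFlush(List<Long> flushedIndexes) {
    for (long index : flushedIndexes) {
      Long term = applyTransactionMap.remove(index);
      if (term != null) {
        ratisTransactionMap.put(index, term);
      }
    }
    // Advance only while the next consecutive index has been flushed.
    while (ratisTransactionMap.remove(lastAppliedIndex + 1) != null) {
      lastAppliedIndex++;
    }
  }

  synchronized long getLastAppliedIndex() {
    return lastAppliedIndex;
  }

  synchronized int pendingFlushedCount() {
    return ratisTransactionMap.size();
  }

  synchronized int unconfirmedCount() {
    return applyTransactionMap.size();
  }
}

class AppliedIndexTrackerDemo {
  public static void main(String[] args) {
    AppliedIndexTracker tracker = new AppliedIndexTracker();
    for (long i = 1; i <= 5; i++) {
      tracker.onApplyTransaction(i, 1L);
    }
    // Index 2 fails before reaching the double buffer, so it is never flushed.
    tracker.onFlush(Arrays.asList(1L, 3L, 4L, 5L));
    System.out.println("lastAppliedIndex=" + tracker.getLastAppliedIndex()
        + " pendingFlushed=" + tracker.pendingFlushedCount()
        + " unconfirmed=" + tracker.unconfirmedCount());
  }
}
```

In this model a single index that is never flushed pins lastAppliedIndex at 
the index just before it, while every later flushed index accumulates in 
ratisTransactionMap.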

How can a write request end up not being confirmed by an 
OzoneManagerDoubleBuffer flush? Here is one case reproduced locally.
T1: create bucket1
T2: client1 sends a delete bucket "bucket1" request to OM. OM verifies that 
bucket1 exists, then sends the request to RATIS to handle it.
T3: client2 sends a create key "bucket1/key1" request to OM. OM verifies that 
bucket1 exists, then sends the request to RATIS.
T4: OzoneManagerStateMachine executes delete bucket "bucket1" successfully and 
returns the response to client1.
T5: OzoneManagerStateMachine executes the create key "bucket1/key1" request; 
"bucket1" cannot be found, so execution fails and a failure is returned to 
client2.
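
The timeline is a check-then-act race: both requests pass their existence 
check before either one is applied. A toy model of it follows. The class 
BucketRaceModel and its methods are hypothetical stand-ins for the OM 
metadata and the state machine, not Ozone APIs.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

// Toy model of the T1..T5 timeline; every name here is hypothetical.
class BucketRaceModel {
  private final Set<String> buckets = new HashSet<>();

  void createBucket(String bucket) {
    buckets.add(bucket);
  }

  // Existence check done before a request is handed to Ratis (T2, T3).
  boolean bucketExists(String bucket) {
    return buckets.contains(bucket);
  }

  // Applied later, in log order, by the state machine (T4).
  void applyDeleteBucket(String bucket) {
    buckets.remove(bucket);
  }

  // Applied later, in log order, by the state machine (T5).
  void applyCreateKey(String bucket, String key) throws IOException {
    if (!buckets.contains(bucket)) {
      throw new IOException("Bucket not found: " + bucket);
    }
    // A real implementation would record the key here.
  }

  public static void main(String[] args) {
    BucketRaceModel om = new BucketRaceModel();
    om.createBucket("bucket1");                                   // T1
    boolean deleteAccepted = om.bucketExists("bucket1");          // T2
    boolean createKeyAccepted = om.bucketExists("bucket1");       // T3
    System.out.println("accepted: " + deleteAccepted + ", " + createKeyAccepted);
    om.applyDeleteBucket("bucket1");                              // T4
    try {
      om.applyCreateKey("bucket1", "key1");                       // T5
    } catch (IOException e) {
      System.out.println("T5 failed: " + e.getMessage());
    }
  }
}
```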

In T5, the failure stack is

2023-10-18 19:04:10,131 [OM StateMachine ApplyTransaction Thread - 0] WARN org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine: Failed to write, Exception occurred 
BUCKET_NOT_FOUND org.apache.hadoop.ozone.om.exceptions.OMException: Bucket not found: s3v/prod-voyager
        at org.apache.hadoop.ozone.om.OzoneManagerUtils.reportNotFound(OzoneManagerUtils.java:87)
        at org.apache.hadoop.ozone.om.OzoneManagerUtils.getBucketInfo(OzoneManagerUtils.java:72)
        at org.apache.hadoop.ozone.om.OzoneManagerUtils.resolveBucketInfoLink(OzoneManagerUtils.java:148)
        at org.apache.hadoop.ozone.om.OzoneManagerUtils.getResolvedBucketInfo(OzoneManagerUtils.java:124)
        at org.apache.hadoop.ozone.om.OzoneManagerUtils.getBucketLayout(OzoneManagerUtils.java:106)
        at org.apache.hadoop.ozone.om.request.BucketLayoutAwareOMKeyRequestFactory.createRequest(BucketLayoutAwareOMKeyRequestFactory.java:230)
        at org.apache.hadoop.ozone.om.ratis.utils.OzoneManagerRatisUtils.createClientRequest(OzoneManagerRatisUtils.java:336)
        at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequest(OzoneManagerRequestHandler.java:380)
        at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:572)
        at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$1(OzoneManagerStateMachine.java:362)
        at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

In OzoneManagerStateMachine.runCommand, when an IOException is thrown from 
OzoneManagerRequestHandler.handleWriteRequest, runCommand constructs an 
OMResponse and returns it to the client, but it does not add the response to 
OzoneManagerDoubleBuffer. OzoneManagerDoubleBuffer is therefore not aware of 
this request or its trxLogIndex, and as a result the trxLogIndex stays in 
applyTransactionMap forever.
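
One possible shape of the fix named in the issue title is sketched below: the 
failure path also hands its response to the double buffer, so the 
trxLogIndex is confirmed on the next flush. This is an illustration only and 
not the actual patch. RunCommandSketch, WriteHandler and DoubleBufferModel 
are hypothetical stand-ins, and the model assumes for simplicity that 
runCommand itself queues the response on the success path as well.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; none of these types exist in Ozone.
class RunCommandSketch {
  // Hypothetical stand-in for a write handler that may fail.
  interface WriteHandler {
    String handle(String request) throws IOException;
  }

  // Hypothetical stand-in for the double buffer: it records the log index
  // of every response queued for the next flush.
  static final class DoubleBufferModel {
    final List<Long> queuedIndexes = new ArrayList<>();

    void add(String response, long trxLogIndex) {
      queuedIndexes.add(trxLogIndex);
    }
  }

  private final WriteHandler handler;
  private final DoubleBufferModel buffer;

  RunCommandSketch(WriteHandler handler, DoubleBufferModel buffer) {
    this.handler = handler;
    this.buffer = buffer;
  }

  String runCommand(String request, long trxLogIndex) {
    try {
      String response = handler.handle(request);
      buffer.add(response, trxLogIndex);
      return response;
    } catch (IOException e) {
      // Build the error response for the client and still queue it, so the
      // flush path sees this trxLogIndex and can confirm it.
      String errorResponse = "ERROR: " + e.getMessage();
      buffer.add(errorResponse, trxLogIndex);
      return errorResponse;
    }
  }
}
```

With the failed request queued, its index reaches the flush path like any 
other write, so lastAppliedIndex is no longer blocked behind it.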

  was:This task is to resolve the issues in HDDS-9342. 


> OzoneManagerStateMachine should put all failed write requests into 
> OzoneManagerDoubleBuffer  
> ---------------------------------------------------------------------------------------------
>
>                 Key: HDDS-9876
>                 URL: https://issues.apache.org/jira/browse/HDDS-9876
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Sammi Chen
>            Assignee: Sammi Chen
>            Priority: Major
>


