[ https://issues.apache.org/jira/browse/HDDS-9342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17796566#comment-17796566 ]
Sammi Chen edited comment on HDDS-9342 at 12/14/23 6:14 AM:
------------------------------------------------------------
[~sumitagrawl], the current implementation already persists the last applied
txid in the DB on each DB flush. By design, any transaction with an index
smaller than the last applied txid will not be applied (the issue reported in
this Jira is caused by a code bug).
Besides OM, RATIS also has a lastAppliedTermIndex field that keeps the last
applied txid. This lastAppliedTermIndex is used in many places in RATIS.
Without ratisTransactionMap to track the Ratis transactions,
lastAppliedTermIndex will not always be accurate. How much an inaccurate
lastAppliedTermIndex affects the RATIS logic will be an important factor in
deciding whether ratisTransactionMap can be removed.
was (Author: sammi):
[~sumitagrawl], the current implementation already persists the last applied
txid in the DB on each DB flush. By design, any transaction with an index
smaller than the last applied txid will not be applied (the issue reported in
this Jira is caused by a code bug).
Besides OM, RATIS also has a lastAppliedTermIndex field that keeps the last
applied txid. This lastAppliedTermIndex is used in many places in RATIS.
Without ratisTransactionMap to track the Ratis transactions,
lastAppliedTermIndex will not always be accurate. I don't know how much impact
that will have on the RATIS logic.
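The replay guard described in the comment (skip any transaction whose index is not newer than the persisted last applied txid) can be illustrated with a minimal Java sketch. This is not the Ozone or Ratis code: the class ReplayGuardSketch, its fields, and its methods are hypothetical names chosen for illustration, and treating an index equal to the last applied txid as already applied is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the OzoneManagerStateMachine implementation.
final class ReplayGuardSketch {
    // Stand-in for the last applied txid that is persisted on each DB flush.
    private long lastAppliedIndex;
    // Records which transaction indexes were actually applied.
    private final List<Long> applied = new ArrayList<>();

    ReplayGuardSketch(long persistedLastAppliedIndex) {
        this.lastAppliedIndex = persistedLastAppliedIndex;
    }

    // Applies the transaction only if its index is newer than the last applied one.
    // Assumption: an index equal to lastAppliedIndex counts as already applied.
    boolean applyTransaction(long index) {
        if (index <= lastAppliedIndex) {
            return false; // replayed transaction, skipped
        }
        applied.add(index);
        lastAppliedIndex = index;
        return true;
    }

    long getLastAppliedIndex() {
        return lastAppliedIndex;
    }

    List<Long> getApplied() {
        return applied;
    }

    public static void main(String[] args) {
        // Pretend the DB was last flushed at txid 100 before the restart.
        ReplayGuardSketch guard = new ReplayGuardSketch(100L);
        for (long index = 98L; index <= 102L; index++) {
            System.out.println(index + " applied=" + guard.applyTransaction(index));
        }
    }
}
```

In this sketch the guard depends entirely on the persisted index being accurate, which is why the accuracy of a field such as lastAppliedTermIndex matters for the discussion above.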
> OM restart failed due to transactionLogIndex smaller than current updateID
> --------------------------------------------------------------------------
>
> Key: HDDS-9342
> URL: https://issues.apache.org/jira/browse/HDDS-9342
> Project: Apache Ozone
> Issue Type: Bug
> Components: OM, OM HA
> Affects Versions: 1.3.0
> Reporter: Hongbing Wang
> Assignee: Sammi Chen
> Priority: Critical
> Attachments: HDDS-9342_testUpdateId.patch,
> HDDS-9342_testUpdateId_reproduce.patch, clipboard_image_1700795744614.png,
> om.shutdown-20230922.log
>
>
> OM restart failed; the logs are as follows:
> create failed:
> {noformat}
> java.lang.IllegalArgumentException: Trying to set updateID to 2901863625
> which is not greater than the current value of 2901863627 for
> OMKeyInfo{volume='vol-xxx', bucket='xxx', key='user/xxx/platform/xxx',
> dataSize='268435456', creationTime='1695088210914',
> objectID='-9223371293977687808', parentID='0', replication='RATIS/THREE',
> fileChecksum='null}
> at org.apache.hadoop.ozone.om.helpers.WithObjectID.setUpdateID(WithObjectID.java:105)
> at org.apache.hadoop.ozone.om.request.key.OMKeyRequest.prepareFileInfo(OMKeyRequest.java:665)
> at org.apache.hadoop.ozone.om.request.key.OMKeyRequest.prepareKeyInfo(OMKeyRequest.java:623)
> at org.apache.hadoop.ozone.om.request.file.OMFileCreateRequest.validateAndUpdateCache(OMFileCreateRequest.java:255)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequest(OzoneManagerRequestHandler.java:311)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerRouterRequestHandler.handleWriteRequest(OzoneManagerRouterRequestHandler.java:806)
> at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:535)
> at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$1(OzoneManagerStateMachine.java:326)
> at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> rename failed:
> {noformat}
> java.lang.IllegalArgumentException: Trying to set updateID to 2901863669
> which is not greater than the current value of 3076345041 for
> OMKeyInfo{volume='vol-xxx', bucket='xxx', key='checkative/xxx',
> dataSize='23124', creationTime='1695380440059',
> objectID='-9223371249310446848', parentID='0', replication='RATIS/THREE',
> fileChecksum='null}
> at org.apache.hadoop.ozone.om.helpers.WithObjectID.setUpdateID(WithObjectID.java:105)
> at org.apache.hadoop.ozone.om.request.key.OMKeyRenameRequest.validateAndUpdateCache(OMKeyRenameRequest.java:190)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequest(OzoneManagerRequestHandler.java:311)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerRouterRequestHandler.handleWriteRequest(OzoneManagerRouterRequestHandler.java:806)
> at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:535)
> at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$1(OzoneManagerStateMachine.java:326)
> at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}