[ 
https://issues.apache.org/jira/browse/HDDS-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502640#comment-17502640
 ] 

Siyao Meng commented on HDDS-6261:
----------------------------------

Thanks for the comment [~kuenishi].

Would you elaborate on the "resetting the number may harm ordering of versions" 
part? I'm not sure I get it.

IMO with HDDS-5461 currently, old versions are already cleared when bucket 
versioning is disabled (which is the default). Resetting the version to zero 
doesn't seem harmful in this case. Unless we are intending to track the number 
of times a key is overwritten even when bucket versioning is disabled?

> OM crashes when trying to overwrite a key during upgrade downgrade testing
> --------------------------------------------------------------------------
>
>                 Key: HDDS-6261
>                 URL: https://issues.apache.org/jira/browse/HDDS-6261
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: Ozone Manager
>    Affects Versions: 1.3.0
>            Reporter: Siyao Meng
>            Assignee: Siyao Meng
>            Priority: Blocker
>              Labels: pull-request-available
>
> While working on HDDS-6084 (related to upgrade acceptance testing), [~erose] 
> and I found that if:
> 1) a key is created with a new OM version (1.3.0)
> 2) downgrade to OM 1.1.0
> 3) try to overwrite the key created in (1)
> Step (3) will result in all 3 OMs crashing.
> The issue seems to be introduced in the unreleased branch of 1.3.0. The same 
> issue cannot be reproduced in 1.1.0 to 1.2.0 upgrade/downgrade testing (which 
> should exclude HDDS-5243 or HDDS-5393 as a potential root cause). This could 
> indicate some unreleased changes has broken the key versioning after the 
> downgrade. (Could well be an incompatible change. Need further investigation.)
> {code}
> om2_1    | 2022-02-03 21:36:15,228 [OM StateMachine ApplyTransaction Thread - 
> 0] ERROR ratis.OzoneManagerStateMachine: Terminating with exit status 1: 
> Request cmdType: CreateKey
> om2_1    | traceID: ""
> om2_1    | clientId: "client-72B024AF247D"
> om2_1    | userInfo {
> om2_1    |   userName: "dlfknslnfslf"
> om2_1    |   remoteAddress: "10.9.0.19"
> om2_1    |   hostName: "ha_s3g_1.ha_net"
> om2_1    | }
> om2_1    | version: 1
> om2_1    | createKeyRequest {
> om2_1    |   keyArgs {
> om2_1    |     volumeName: "s3v"
> om2_1    |     bucketName: "old1-bucket"
> om2_1    |     keyName: "key2"
> om2_1    |     dataSize: 17539
> om2_1    |     type: RATIS
> om2_1    |     factor: THREE
> om2_1    |     keyLocations {
> om2_1    |       blockID {
> om2_1    |         containerBlockID {
> om2_1    |           containerID: 1
> om2_1    |           localID: 107736214721200128
> om2_1    |         }
> om2_1    |         blockCommitSequenceId: 0
> om2_1    |       }
> om2_1    |       offset: 0
> om2_1    |       length: 268435456
> om2_1    |       createVersion: 0
> om2_1    |       pipeline {
> om2_1    |         members {
> om2_1    |           uuid: "b92bf4c8-3b0c-40b0-bb2b-05b6d3594e13"
> om2_1    |           ipAddress: "10.9.0.16"
> om2_1    |           hostName: "ha_dn2_1.ha_net"
> om2_1    |           ports {
> om2_1    |             name: "REPLICATION"
> om2_1    |             value: 9886
> om2_1    |           }
> om2_1    |           ports {
> om2_1    |             name: "RATIS"
> om2_1    |             value: 9858
> om2_1    |           }
> om2_1    |           ports {
> om2_1    |             name: "RATIS_ADMIN"
> om2_1    |             value: 9857
> om2_1    |           }
> om2_1    |           ports {
> om2_1    |             name: "RATIS_SERVER"
> om2_1    |             value: 9856
> om2_1    |           }
> om2_1    |           ports {
> om2_1    |             name: "STANDALONE"
> om2_1    |             value: 9859
> om2_1    |           }
> om2_1    |           networkName: "b92bf4c8-3b0c-40b0-bb2b-05b6d3594e13"
> om2_1    |           networkLocation: "/default-rack"
> om2_1    |           persistedOpState: IN_SERVICE
> om2_1    |           persistedOpStateExpiry: 0
> om2_1    |           uuid128 {
> om2_1    |             mostSigBits: -5103716611873029968
> om2_1    |             leastSigBits: -4959864281830437357
> om2_1    |           }
> om2_1    |         }
> om2_1    |         members {
> om2_1    |           uuid: "f0b7e615-d4ee-4ec4-a6b5-ec68b82c07e9"
> om2_1    |           ipAddress: "10.9.0.15"
> om2_1    |           hostName: "ha_dn1_1.ha_net"
> om2_1    |           ports {
> om2_1    |             name: "REPLICATION"
> om2_1    |             value: 9886
> om2_1    |           }
> om2_1    |           ports {
> om2_1    |             name: "RATIS"
> om2_1    |             value: 9858
> om2_1    |           }
> om2_1    |           ports {
> om2_1    |             name: "RATIS_ADMIN"
> om2_1    |             value: 9857
> om2_1    |           }
> om2_1    |           ports {
> om2_1    |             name: "RATIS_SERVER"
> om2_1    |             value: 9856
> om2_1    |           }
> om2_1    |           ports {
> om2_1    |             name: "STANDALONE"
> om2_1    |             value: 9859
> om2_1    |           }
> om2_1    |           networkName: "f0b7e615-d4ee-4ec4-a6b5-ec68b82c07e9"
> om2_1    |           networkLocation: "/default-rack"
> om2_1    |           persistedOpState: IN_SERVICE
> om2_1    |           persistedOpStateExpiry: 0
> om2_1    |           uuid128 {
> om2_1    |             mostSigBits: -1101158602427707708
> om2_1    |             leastSigBits: -6433976558118238231
> om2_1    |           }
> om2_1    |         }
> om2_1    |         members {
> om2_1    |           uuid: "c7912312-811d-469d-8c40-c739cd2a1e62"
> om2_1    |           ipAddress: "10.9.0.17"
> om2_1    |           hostName: "ha_dn3_1.ha_net"
> om2_1    |           ports {
> om2_1    |             name: "REPLICATION"
> om2_1    |             value: 9886
> om2_1    |           }
> om2_1    |           ports {
> om2_1    |             name: "RATIS"
> om2_1    |             value: 9858
> om2_1    |           }
> om2_1    |           ports {
> om2_1    |             name: "RATIS_ADMIN"
> om2_1    |             value: 9857
> om2_1    |           }
> om2_1    |           ports {
> om2_1    |             name: "RATIS_SERVER"
> om2_1    |             value: 9856
> om2_1    |           }
> om2_1    |           ports {
> om2_1    |             name: "STANDALONE"
> om2_1    |             value: 9859
> om2_1    |           }
> om2_1    |           networkName: "c7912312-811d-469d-8c40-c739cd2a1e62"
> om2_1    |           networkLocation: "/default-rack"
> om2_1    |           persistedOpState: IN_SERVICE
> om2_1    |           persistedOpStateExpiry: 0
> om2_1    |           uuid128 {
> om2_1    |             mostSigBits: -4066430426156284259
> om2_1    |             leastSigBits: -8340447458821005726
> om2_1    |           }
> om2_1    |         }
> om2_1    |         state: PIPELINE_OPEN
> om2_1    |         type: RATIS
> om2_1    |         factor: THREE
> om2_1    |         id {
> om2_1    |           id: "c0b6f272-9a39-4dc3-ada8-c3b833cc6e17"
> om2_1    |           uuid128 {
> om2_1    |             mostSigBits: -4560190998638408253
> om2_1    |             leastSigBits: -5933277313150194153
> om2_1    |           }
> om2_1    |         }
> om2_1    |         leaderID: "f0b7e615-d4ee-4ec4-a6b5-ec68b82c07e9"
> om2_1    |         creationTimeStamp: 1643924132125
> om2_1    |         suggestedLeaderID {
> om2_1    |           mostSigBits: -1101158602427707708
> om2_1    |           leastSigBits: -6433976558118238231
> om2_1    |         }
> om2_1    |         leaderID128 {
> om2_1    |           mostSigBits: -1101158602427707708
> om2_1    |           leastSigBits: -6433976558118238231
> om2_1    |         }
> om2_1    |       }
> om2_1    |       partNumber: 0
> om2_1    |     }
> om2_1    |     isMultipartKey: false
> om2_1    |     acls {
> om2_1    |       type: USER
> om2_1    |       name: "dlfknslnfslf"
> om2_1    |       rights: "\200"
> om2_1    |       aclScope: ACCESS
> om2_1    |     }
> om2_1    |     modificationTime: 1643924174840
> om2_1    |   }
> om2_1    |   clientID: 107736214722445312
> om2_1    | }
> om2_1    | failed with exception
> om2_1    | java.lang.IllegalArgumentException
> om2_1    |    at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:128)
> om2_1    |    at 
> org.apache.hadoop.ozone.om.helpers.OmKeyInfo.<init>(OmKeyInfo.java:81)
> om2_1    |    at 
> org.apache.hadoop.ozone.om.helpers.OmKeyInfo$Builder.build(OmKeyInfo.java:378)
> om2_1    |    at 
> org.apache.hadoop.ozone.om.helpers.OmKeyInfo.getFromProtobuf(OmKeyInfo.java:460)
> om2_1    |    at 
> org.apache.hadoop.ozone.om.codec.OmKeyInfoCodec.fromPersistedFormat(OmKeyInfoCodec.java:59)
> om2_1    |    at 
> org.apache.hadoop.ozone.om.codec.OmKeyInfoCodec.fromPersistedFormat(OmKeyInfoCodec.java:36)
> om2_1    |    at 
> org.apache.hadoop.hdds.utils.db.CodecRegistry.asObject(CodecRegistry.java:55)
> om2_1    |    at 
> org.apache.hadoop.hdds.utils.db.TypedTable.getFromTableIfExist(TypedTable.java:261)
> om2_1    |    at 
> org.apache.hadoop.hdds.utils.db.TypedTable.getIfExist(TypedTable.java:248)
> om2_1    |    at 
> org.apache.hadoop.ozone.om.request.key.OMKeyCreateRequest.validateAndUpdateCache(OMKeyCreateRequest.java:236)
> om2_1    |    at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequest(OzoneManagerRequestHandler.java:227)
> om2_1    |    at 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:415)
> om2_1    |    at 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$applyTransaction$1(OzoneManagerStateMachine.java:240)
> om2_1    |    at 
> java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
> om2_1    |    at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> om2_1    |    at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> om2_1    |    at java.base/java.lang.Thread.run(Thread.java:834)
> om2_1    | 2022-02-03 21:36:15,253 [shutdown-hook-0] INFO 
> om.OzoneManagerStarter: SHUTDOWN_MSG:
> om2_1    | /************************************************************
> om2_1    | SHUTDOWN_MSG: Shutting down OzoneManager at a250845831a7/10.9.0.12
> om2_1    | ************************************************************/
> {code}
> This is where OM is crashing in ozone 1.1.0 code:
> https://github.com/apache/ozone/blob/ozone-1.1.0/hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/OmKeyInfo.java#L81-L82



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to