[
https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ke Han updated HBASE-28583:
---------------------------
Summary: Upgrade from 2.5.8 to 3.0.0 crash with
InvalidProtocolBufferException: Message missing required fields:
old_table_schema (was: Upgrade from 2.5.8 to 3.0 crash with
InvalidProtocolBufferException: Message missing required fields:
old_table_schema)
> Upgrade from 2.5.8 to 3.0.0 crash with InvalidProtocolBufferException:
> Message missing required fields: old_table_schema
> ------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-28583
> URL: https://issues.apache.org/jira/browse/HBASE-28583
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 3.0.0, 2.5.8
> Reporter: Ke Han
> Priority: Major
> Attachments: hbase--master-033a47be7d1d.log, persistent.tar.gz
>
>
> When migrating data from a 2.5.8 cluster (1 HM, 2 RS, 1 HDFS) to a 3.0.0
> cluster (1 HM, 2 RS, 2 HDFS), I hit the following exception and the upgrade
> failed.
> {code:java}
> 2024-05-10T00:54:45,936 ERROR [master/hmaster:16000:becomeActiveMaster]
> master.HMaster: Failed to become active master
> org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException:
> Message missing required fields: old_table_schema
> at
> org.apache.hbase.thirdparty.com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:56)
> ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at
> org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.checkMessageInitialized(AbstractParser.java:45)
> ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at
> org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:97)
> ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at
> org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:102)
> ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at
> org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25)
> ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at
> org.apache.hbase.thirdparty.com.google.protobuf.Any.unpack(Any.java:118)
> ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at
> org.apache.hadoop.hbase.procedure2.ProcedureUtil$StateSerializer.deserialize(ProcedureUtil.java:125)
> ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.master.procedure.RestoreSnapshotProcedure.deserializeStateData(RestoreSnapshotProcedure.java:303)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:295)
> ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43)
> ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90)
> ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:517)
> ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:80)
> ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.load(ProcedureExecutor.java:344)
> ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:287)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:335)
> ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:666)
> ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1860)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1019)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155)
> ~[hbase-common-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362]
> 2024-05-10T00:54:45,937 ERROR [master/hmaster:16000:becomeActiveMaster]
> master.HMaster: ***** ABORTING master hmaster,16000,1715302475720: Unhandled
> exception. Starting shutdown. *****
> org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException:
> Message missing required fields: old_table_schema
> at
> org.apache.hbase.thirdparty.com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:56)
> ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at
> org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.checkMessageInitialized(AbstractParser.java:45)
> ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at
> org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:97)
> ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at
> org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:102)
> ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at
> org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25)
> ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at
> org.apache.hbase.thirdparty.com.google.protobuf.Any.unpack(Any.java:118)
> ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at
> org.apache.hadoop.hbase.procedure2.ProcedureUtil$StateSerializer.deserialize(ProcedureUtil.java:125)
> ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.master.procedure.RestoreSnapshotProcedure.deserializeStateData(RestoreSnapshotProcedure.java:303)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:295)
> ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43)
> ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90)
> ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:517)
> ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:80)
> ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.load(ProcedureExecutor.java:344)
> ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:287)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:335)
> ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:666)
> ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1860)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1019)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155)
> ~[hbase-common-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362]
> {code}
> h1. Reproduce
> This bug can be reproduced deterministically with the following steps:
> Start up an HBase 2.5.8 cluster (1 HM, 2 RS, 1 HDFS: hadoop 2.10.2).
> Execute the following commands in the HBase shell:
> {code:java}
> create 'tb1', {NAME => 'c0', VERSIONS => 1}
> snapshot 'tb1', 's1'
> disable 'tb1'
> restore_snapshot 's1'
> {code}
> Stop the 2.5.8 cluster, then start up the 3.0.0 cluster (commit:
> 516c89e8597fb6).
> The upgrade will fail with the above exception.
> h1. Root Cause
> This incompatibility between 2.5.8 and 3.0.0 is related to a newly added
> *required* field in the proto file: *_old_table_schema_*.
> 2.5.8
> {code:java}
> // hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
> message RestoreSnapshotStateData {
>   required UserInformation user_info = 1;
>   required SnapshotDescription snapshot = 2;
>   required TableSchema modified_table_schema = 3;
>   repeated RegionInfo region_info_for_restore = 4;
>   repeated RegionInfo region_info_for_remove = 5;
>   repeated RegionInfo region_info_for_add = 6;
>   repeated RestoreParentToChildRegionsPair parent_to_child_regions_pair_list = 7;
>   optional bool restore_acl = 8;
> }
> {code}
> 3.0.0 (516c89e8597fb6)
> {code:java}
> message RestoreSnapshotStateData {
>   required UserInformation user_info = 1;
>   required SnapshotDescription snapshot = 2;
>   required TableSchema modified_table_schema = 3;
>   repeated RegionInfo region_info_for_restore = 4;
>   repeated RegionInfo region_info_for_remove = 5;
>   repeated RegionInfo region_info_for_add = 6;
>   repeated RestoreParentToChildRegionsPair parent_to_child_regions_pair_list = 7;
>   optional bool restore_acl = 8;
>   required TableSchema old_table_schema = 9;
> }
> {code}
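> The state data that 2.5.8 persists for a RestoreSnapshotProcedure does not
> contain the old_table_schema field, so when the 3.0.0 master loads that
> procedure at startup, the required-field check fails during parsing.
> I am wondering whether the *_old_table_schema_* field really needs to be
> marked as required.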
>
> I attached (1) the master log file and (2) all log files in
> persistent.tar.gz.
> I am still trying to confirm the root cause and would appreciate any
> suggestions. Thank you!
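
The required-field failure described under Root Cause can be illustrated with a small toy model. This is only a sketch and not the protobuf or HBase API: the class `RequiredFieldSketch` and its methods `missingRequired` and `parse` are hypothetical names, and the set of fields written by a 2.5.8 master is an assumption based on the 2.5.8 proto definition quoted above. It shows why a message that is valid under the old schema is rejected once the reader's schema gains a new required field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of a required-field check. NOT the protobuf or HBase API. */
class RequiredFieldSketch {

  // Required fields of RestoreSnapshotStateData as listed in the 2.5.8 proto.
  static final List<String> REQUIRED_2_5_8 =
      Arrays.asList("user_info", "snapshot", "modified_table_schema");

  // Required fields as listed in the 3.0.0 proto (516c89e8597fb6).
  static final List<String> REQUIRED_3_0_0 =
      Arrays.asList("user_info", "snapshot", "modified_table_schema", "old_table_schema");

  /** Returns the required fields that the persisted message does not carry. */
  static List<String> missingRequired(List<String> required, Set<String> present) {
    List<String> missing = new ArrayList<>();
    for (String field : required) {
      if (!present.contains(field)) {
        missing.add(field);
      }
    }
    return missing;
  }

  /** Hypothetical stand-in for parsing: rejects a message lacking required fields. */
  static void parse(List<String> required, Set<String> present) {
    List<String> missing = missingRequired(required, present);
    if (!missing.isEmpty()) {
      throw new IllegalStateException(
          "Message missing required fields: " + String.join(", ", missing));
    }
  }

  public static void main(String[] args) {
    // Assumption: fields a 2.5.8 master wrote for a pending restore procedure.
    Set<String> writtenBy258 = new LinkedHashSet<>(
        Arrays.asList("user_info", "snapshot", "modified_table_schema", "restore_acl"));

    // The old schema accepts the persisted message.
    parse(REQUIRED_2_5_8, writtenBy258);

    // The new schema rejects the same bytes because field 9 was never written.
    try {
      parse(REQUIRED_3_0_0, writtenBy258);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

In this model, dropping `old_table_schema` from the 3.0.0 required list (that is, treating it as optional) lets the 2.5.8 message parse again. Whether the real procedure code can tolerate an absent `old_table_schema` is a separate question that the sketch does not answer.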