[jira] [Commented] (HBASE-28583) Upgrade from 2.5.9 to 3.0.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
[ https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17881918#comment-17881918 ] Ke Han commented on HBASE-28583: Hi [Duo Zhang|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=zhangduo], May I ask whether it's necessary to set old_table_schema to required? It crashes the upgrade process as long as there's a protobuf message of RestoreSnapshotStateData from the old version data. I feel it should be set to optional for backward compatibility. If that's the case, I can provide a PR to fix it. Thank you! > Upgrade from 2.5.9 to 3.0.0 crash with InvalidProtocolBufferException: > Message missing required fields: old_table_schema > > > Key: HBASE-28583 > URL: https://issues.apache.org/jira/browse/HBASE-28583 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 3.0.0, 2.5.9 >Reporter: Ke Han >Priority: Major > Attachments: hbase--master-e19f64f2bc73.log > > > When migrating data from 2.5.9 cluster (1HM, 2RS, 1 HDFS) to 3.0.0 (1 HM, 2 > RS, 2 HDFS), I met the following exception and the upgrade failed. 
> {code:java} > 2024-05-10T00:54:45,936 ERROR [master/hmaster:16000:becomeActiveMaster] > master.HMaster: Failed to become active master > org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: > Message missing required fields: old_table_schema > at > org.apache.hbase.thirdparty.com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:56) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.checkMessageInitialized(AbstractParser.java:45) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:97) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:102) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.Any.unpack(Any.java:118) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hadoop.hbase.procedure2.ProcedureUtil$StateSerializer.deserialize(ProcedureUtil.java:125) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.procedure.RestoreSnapshotProcedure.deserializeStateData(RestoreSnapshotProcedure.java:303) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:295) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > 
org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:517) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:80) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.load(ProcedureExecutor.java:344) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:287) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:335) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:666) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1860) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1019) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-bet
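The change Ke Han proposes in the comment above can be sketched as a one-line relaxation in the RestoreSnapshotStateData definition. This is a hypothetical sketch assuming proto2 syntax; the field number and the rest of the message in the actual MasterProcedure.proto may differ.

```proto
// Sketch only: the field number and surrounding fields are illustrative,
// not the actual MasterProcedure.proto contents.
message RestoreSnapshotStateData {
  // before: required TableSchema old_table_schema = 4;
  optional TableSchema old_table_schema = 4;
}
```

In proto2, relaxing required to optional is wire-compatible with messages written by old versions, but reader code would then need to guard with the generated hasOldTableSchema() accessor before calling getOldTableSchema(), since a message persisted by 2.5.x may legitimately omit the field.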
[jira] [Comment Edited] (HBASE-28583) Upgrade from 2.5.9 to 3.0.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
[ https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17881918#comment-17881918 ] Ke Han edited comment on HBASE-28583 at 9/16/24 5:25 AM: - Hi [Duo Zhang|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=zhangduo], May I ask whether it's necessary to set old_table_schema to be {_}required{_}? It crashes the upgrade process as long as there's a protobuf message of RestoreSnapshotStateData from the old version data. I feel it should be set to be _optional_ for backward compatibility. If that's the case, I can provide a PR to fix it. Thank you! was (Author: JIRAUSER289562): Hi [Duo Zhang|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=zhangduo], May I ask whether it's necessary to set old_table_schema to required? It crashes the upgrade process as long as there's a protobuf message of RestoreSnapshotStateData from the old version data. I feel it should be set to optional for backward compatibility. If that's the case, I can provide a PR to fix it. Thank you! > Upgrade from 2.5.9 to 3.0.0 crash with InvalidProtocolBufferException: > Message missing required fields: old_table_schema > > > Key: HBASE-28583 > URL: https://issues.apache.org/jira/browse/HBASE-28583 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 3.0.0, 2.5.9 >Reporter: Ke Han >Priority: Major > Attachments: hbase--master-e19f64f2bc73.log > > > When migrating data from 2.5.9 cluster (1HM, 2RS, 1 HDFS) to 3.0.0 (1 HM, 2 > RS, 2 HDFS), I met the following exception and the upgrade failed. 
> {code:java} > 2024-05-10T00:54:45,936 ERROR [master/hmaster:16000:becomeActiveMaster] > master.HMaster: Failed to become active master > org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: > Message missing required fields: old_table_schema > at > org.apache.hbase.thirdparty.com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:56) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.checkMessageInitialized(AbstractParser.java:45) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:97) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:102) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.Any.unpack(Any.java:118) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hadoop.hbase.procedure2.ProcedureUtil$StateSerializer.deserialize(ProcedureUtil.java:125) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.procedure.RestoreSnapshotProcedure.deserializeStateData(RestoreSnapshotProcedure.java:303) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:295) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > 
org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:517) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:80) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.load(ProcedureExecutor.java:344) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:287) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:335) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SN
[jira] [Updated] (HBASE-28187) NPE when flushing a non-existing column family
[ https://issues.apache.org/jira/browse/HBASE-28187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28187: --- Description: Flushing a column family that doesn't exist in the table causes an NPE ERROR in both the shell and the HMaster logs. h1. Reproduce Start up an HBase 2.5.9 cluster; executing the following commands with hbase shell on the HMaster node will lead to an NPE. (Can be reproduced deterministically) {code:java} create 'table', {NAME => 'cf1', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL'}, {NAME => 'cf2', VERSIONS => 4, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL'} incr 'table', 'row1', 'cf1:cell', 2 flush 'table', 'cf3'{code} The shell outputs {code:java} hbase:006:0> create 'table', {NAME => 'cf1', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL'}, {NAME => 'cf2', VERSIONS => 4, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL'} Created table table Took 2.1238 seconds => Hbase::Table - table hbase:007:0> hbase:008:0> incr 'table', 'row1', 'cf1:cell', 2 COUNTER VALUE = 2 Took 0.0131 seconds hbase:009:0> hbase:010:0> flush 'table', 'cf3' ERROR: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:479) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102) at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) Caused by: org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable: java.lang.NullPointerException at org.apache.hadoop.hbase.procedure.flush.RegionServerFlushTableProcedureManager$FlushTableSubprocedurePool.waitForOutstandingTasks(RegionServerFlushTableProcedureManager.java:274) at org.apache.hadoop.hbase.procedure.flush.FlushTableSubprocedure.flushRegions(FlushTableSubprocedure.java:115) at org.apache.hadoop.hbase.procedure.flush.FlushTableSubprocedure.acquireBarrier(FlushTableSubprocedure.java:126) at 
org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:160) at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:46) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) For usage try 'help "flush"' Took 12.1713 seconds {code} According to the _flush (flush.rb)_ command specification, a user can flush a specific column family. {code:java} Flush all regions in passed table or pass a region row to flush an individual region or a region server name whose format is 'host,port,startcode', to flush all its regions. You can also flush a single column family for all regions within a table, or for an specific region only. For example: hbase> flush 'TABLENAME' hbase> flush 'TABLENAME','FAMILYNAME' {code} In the above case, *cf3* is an incorrect input (a non-existing column family). If the user tries to flush it, the expected output is: # HBase rejects this operation # returns a prompt saying the column family doesn't exist {_}"{_}{_}{+}ERROR: Unknown CF...{+}".{_} In 2.6.0, the flush command gets stuck and runs into an NPE {code:java} java.lang.NullPointerException: null at org.apache.hadoop.hbase.regionserver.HRegion.logFatLineOnFlush(HRegion.java:2724) ~[hbase-server-2.6.0.jar:2.6.0] at org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2640) ~[hbase-server-2.6.0.jar:2.6.0] at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2587) ~[hbase-server-2.6.0.jar:2.6.0] {code} h1. Root Cause There's a missing check for whether the target column family to flush exists. was: Flush a columnfamily that doesn't exist in the table will cause NPE ERROR in both shell and the HMaster logs. h1. 
Reproduce Start up HBase 2.5.5 cluster, executing the following commands with hbase shell in HMaster node will lead to NPE. (Can be reproduced determinstically) {code:java} create 'table', {NAME => 'cf1', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL'}, {NAME => 'cf2', VERSIONS => 4, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL'} incr 'table', 'row1', 'cf1:cell', 2 flush 'table', 'cf3'{code} The shell outputs {code:java} hbase:006:0> create 'table', {NAME => 'cf1', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL'}, {NAME => 'cf2', VERSIONS => 4, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL'} Created table table Took 2.1238 seconds
[jira] [Updated] (HBASE-28187) NPE when flushing a non-existing column family
[ https://issues.apache.org/jira/browse/HBASE-28187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28187: --- Affects Version/s: 2.6.0 > NPE when flushing a non-existing column family > -- > > Key: HBASE-28187 > URL: https://issues.apache.org/jira/browse/HBASE-28187 > Project: HBase > Issue Type: Bug >Affects Versions: 2.6.0, 2.4.17, 2.5.5 >Reporter: Ke Han >Priority: Major > Labels: pull-request-available > > Flush a columnfamily that doesn't exist in the table will cause NPE ERROR in > both shell and the HMaster logs. > h1. Reproduce > Start up HBase 2.5.5 cluster, executing the following commands with hbase > shell in HMaster node will lead to NPE. (Can be reproduced determinstically) > {code:java} > create 'table', {NAME => 'cf1', VERSIONS => 2, COMPRESSION => 'GZ', > BLOOMFILTER => 'ROWCOL'}, {NAME => 'cf2', VERSIONS => 4, COMPRESSION => > 'NONE', BLOOMFILTER => 'ROWCOL'} > incr 'table', 'row1', 'cf1:cell', 2 > flush 'table', 'cf3'{code} > The shell outputs > {code:java} > hbase:006:0> create 'table', {NAME => 'cf1', VERSIONS => 2, COMPRESSION => > 'GZ', BLOOMFILTER => 'ROWCOL'}, {NAME => 'cf2', VERSIONS => 4, COMPRESSION => > 'NONE', BLOOMFILTER => 'ROWCOL'} > Created table table > Took 2.1238 seconds > > => Hbase::Table - table > hbase:007:0> > hbase:008:0> incr 'table', 'row1', 'cf1:cell', 2 > COUNTER VALUE = 2 > Took 0.0131 seconds > > hbase:009:0> > hbase:010:0> flush 'table', 'cf3' > ERROR: java.io.IOException: java.lang.NullPointerException > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:479) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) > at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102) > at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) > Caused by: > org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable: > java.lang.NullPointerException > at > 
org.apache.hadoop.hbase.procedure.flush.RegionServerFlushTableProcedureManager$FlushTableSubprocedurePool.waitForOutstandingTasks(RegionServerFlushTableProcedureManager.java:274) > at > org.apache.hadoop.hbase.procedure.flush.FlushTableSubprocedure.flushRegions(FlushTableSubprocedure.java:115) > at > org.apache.hadoop.hbase.procedure.flush.FlushTableSubprocedure.acquireBarrier(FlushTableSubprocedure.java:126) > at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:160) > at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:46) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:750) > For usage try 'help "flush"' > Took 12.1713 seconds > {code} > > According to the _flush (flush.rb)_ command specification, user can flush a > specific column family. > {code:java} > Flush all regions in passed table or pass a region row to > flush an individual region or a region server name whose format > is 'host,port,startcode', to flush all its regions. > You can also flush a single column family for all regions within a table, > or for an specific region only. > For example: > hbase> flush 'TABLENAME' > hbase> flush 'TABLENAME','FAMILYNAME' {code} > In the above case, *cf3* an incorrect input (non-existing column family). If > user tries to flush it, the expected output is: > # HBase rejects this operation > # returns a prompt saying the column family doesn't exist > {_}"{_}{_}{+}ERROR: Unknown CF...{+}".{_} > h1. Root Cause > There's a missing check for the whether the target flushing columnfamily > exists. -- This message was sent by Atlassian Jira (v8.20.10#820010)
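The root cause described above (no existence check for the requested family before the flush is scheduled) can be illustrated with a small standalone sketch. FlushFamilyCheck and validateFamily are hypothetical names for illustration, not HBase APIs; a real fix would consult the table's TableDescriptor before submitting the flush subprocedure.

```java
import java.util.Set;

// Hypothetical sketch of the missing validation: reject a flush request for
// an unknown column family up front, with a clear error, instead of letting
// a null store reference surface later as a NullPointerException.
public class FlushFamilyCheck {
    public static void validateFamily(Set<String> knownFamilies, String family) {
        if (family != null && !knownFamilies.contains(family)) {
            throw new IllegalArgumentException(
                "Unknown CF: column family '" + family + "' does not exist in table");
        }
    }

    public static void main(String[] args) {
        // Families from the repro: the table was created with cf1 and cf2 only.
        Set<String> families = Set.of("cf1", "cf2");
        validateFamily(families, "cf1"); // ok, family exists
        boolean rejected = false;
        try {
            validateFamily(families, "cf3"); // the repro's bad input
        } catch (IllegalArgumentException e) {
            rejected = true;
        }
        System.out.println(rejected ? "rejected" : "accepted"); // prints "rejected"
    }
}
```

With such a check in place, the shell would get the expected "ERROR: Unknown CF..." response rather than a wrapped NullPointerException from deep inside the flush procedure pool.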
[jira] [Commented] (HBASE-28812) Upgrade from 2.6.0 to 3.0.0 crashed
[ https://issues.apache.org/jira/browse/HBASE-28812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17880393#comment-17880393 ] Ke Han commented on HBASE-28812: [Duo Zhang|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=zhangduo] Thank you for the PR! I have applied the patch to a030e80998, and it's working. > Upgrade from 2.6.0 to 3.0.0 crashed > --- > > Key: HBASE-28812 > URL: https://issues.apache.org/jira/browse/HBASE-28812 > Project: HBase > Issue Type: Bug > Components: compatibility >Affects Versions: 3.0.0 >Reporter: Ke Han >Assignee: Duo Zhang >Priority: Major > Labels: pull-request-available, upgrade > Attachments: hbase--master-2d6e4fad2af5.log, > hbase--master-440ed844e077.log > > > I am trying to upgrade from 2.6.0 (stable release) to 3.0.0. I built 3.0.0 > using the following commit (a030e8099840e640684a68b6e4a79e7c1d5a6823) > {code:java} > commit a030e8099840e640684a68b6e4a79e7c1d5a6823 (HEAD -> branch-3, > upstream/branch-3) > Author: Ray Mattingly > Date: Mon Sep 2 04:38:29 2024 -0400 HBASE-28697 Don't clean bulk load > system entries until backup is complete (#6089) > > Co-authored-by: Ray Mattingly > {code} > However, the HMaster would crash during the upgrade process. > h1. Reproduce > Step1: Start up 2.6.0 cluster (1 HDFS, 1 HM, 1 RS) > Step2: Stop the entire cluster > Step3: Upgrade to 3.0.0 cluster. 
> HMaster will crash with the following error message > {code:java} > 2024-09-04T04:29:18,917 WARN [master/hmaster:16000:becomeActiveMaster] > regionserver.HRegion: Failed initialize of region= > master:store,,1.1595e783b53d99cd5eef43b6debb2682., starting to roll back > memstore > java.io.IOException: java.io.IOException: > org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile > Trailer from file > hdfs://master:8020/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/info/82c6d244b6244c179cdbafcead00ed75 > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1215) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1158) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1030) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:974) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7794) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7749) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:277) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:432) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:135) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > 
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1003) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) > ~[hbase-common-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at java.lang.Thread.run(Thread.java:833) ~[?:?] > Caused by: java.io.IOException: > org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile > Trailer from file > hdfs://master:8020/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/info/82c6d244b6244c179cdbafcead00ed75 > at > org.apache.hadoop.hbase.regionserver.StoreEngine.openStoreFiles(StoreEngine.java:289) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.StoreEngine.initialize(StoreEngine.java:339) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at >
[jira] [Updated] (HBASE-28583) Upgrade from 2.5.9 to 3.0.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
[ https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28583: --- Attachment: (was: hbase--master-033a47be7d1d.log) > Upgrade from 2.5.9 to 3.0.0 crash with InvalidProtocolBufferException: > Message missing required fields: old_table_schema > > > Key: HBASE-28583 > URL: https://issues.apache.org/jira/browse/HBASE-28583 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 3.0.0, 2.5.8 >Reporter: Ke Han >Priority: Major > Attachments: hbase--master-e19f64f2bc73.log > > > When migrating data from 2.5.9 cluster (1HM, 2RS, 1 HDFS) to 3.0.0 (1 HM, 2 > RS, 2 HDFS), I met the following exception and the upgrade failed. > {code:java} > 2024-05-10T00:54:45,936 ERROR [master/hmaster:16000:becomeActiveMaster] > master.HMaster: Failed to become active master > org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: > Message missing required fields: old_table_schema > at > org.apache.hbase.thirdparty.com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:56) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.checkMessageInitialized(AbstractParser.java:45) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:97) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:102) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.Any.unpack(Any.java:118) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > 
org.apache.hadoop.hbase.procedure2.ProcedureUtil$StateSerializer.deserialize(ProcedureUtil.java:125) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.procedure.RestoreSnapshotProcedure.deserializeStateData(RestoreSnapshotProcedure.java:303) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:295) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:517) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:80) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.load(ProcedureExecutor.java:344) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:287) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:335) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:666) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1860) > 
~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1019) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$
[jira] [Updated] (HBASE-28583) Upgrade from 2.5.9 to 3.0.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
[ https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28583: --- Attachment: hbase--master-e19f64f2bc73.log > Upgrade from 2.5.9 to 3.0.0 crash with InvalidProtocolBufferException: > Message missing required fields: old_table_schema > > > Key: HBASE-28583 > URL: https://issues.apache.org/jira/browse/HBASE-28583 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 3.0.0, 2.5.8 >Reporter: Ke Han >Priority: Major > Attachments: hbase--master-e19f64f2bc73.log > > > When migrating data from 2.5.9 cluster (1HM, 2RS, 1 HDFS) to 3.0.0 (1 HM, 2 > RS, 2 HDFS), I met the following exception and the upgrade failed. > {code:java} > 2024-05-10T00:54:45,936 ERROR [master/hmaster:16000:becomeActiveMaster] > master.HMaster: Failed to become active master > org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: > Message missing required fields: old_table_schema > at > org.apache.hbase.thirdparty.com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:56) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.checkMessageInitialized(AbstractParser.java:45) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:97) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:102) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.Any.unpack(Any.java:118) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > 
org.apache.hadoop.hbase.procedure2.ProcedureUtil$StateSerializer.deserialize(ProcedureUtil.java:125) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.procedure.RestoreSnapshotProcedure.deserializeStateData(RestoreSnapshotProcedure.java:303) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:295) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:517) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:80) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.load(ProcedureExecutor.java:344) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:287) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:335) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:666) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1860) > 
~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1019) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil
[jira] [Updated] (HBASE-28583) Upgrade from 2.5.9 to 3.0.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
[ https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28583: --- Summary: Upgrade from 2.5.9 to 3.0.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema (was: Upgrade from 2.5.8 to 3.0.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema) > Upgrade from 2.5.9 to 3.0.0 crash with InvalidProtocolBufferException: > Message missing required fields: old_table_schema > > > Key: HBASE-28583 > URL: https://issues.apache.org/jira/browse/HBASE-28583 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 3.0.0, 2.5.8 >Reporter: Ke Han >Priority: Major > Attachments: hbase--master-033a47be7d1d.log, persistent.tar.gz > > > When migrating data from 2.5.8 cluster (1HM, 2RS, 1 HDFS) to 3.0.0 (1 HM, 2 > RS, 2 HDFS), I met the following exception and the upgrade failed. > {code:java} > 2024-05-10T00:54:45,936 ERROR [master/hmaster:16000:becomeActiveMaster] > master.HMaster: Failed to become active master > org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: > Message missing required fields: old_table_schema > at > org.apache.hbase.thirdparty.com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:56) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.checkMessageInitialized(AbstractParser.java:45) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:97) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:102) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25) > 
~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.Any.unpack(Any.java:118) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hadoop.hbase.procedure2.ProcedureUtil$StateSerializer.deserialize(ProcedureUtil.java:125) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.procedure.RestoreSnapshotProcedure.deserializeStateData(RestoreSnapshotProcedure.java:303) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:295) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:517) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:80) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.load(ProcedureExecutor.java:344) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:287) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:335) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:666) > 
~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1860) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1019) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] >
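The comment above proposes relaxing `old_table_schema` from `required` to `optional` so that `RestoreSnapshotStateData` written by a 2.5.x master still parses after the upgrade. A rough proto2 sketch of that change follows; the field number and the elided surrounding fields are illustrative, not copied from MasterProcedure.proto:

```protobuf
// Illustrative sketch only (field number is hypothetical).
// In proto2, parsing fails with "Message missing required fields" when a
// required field is absent from the wire data, so a field added after
// 2.5.x must be optional for old procedure state to deserialize.
message RestoreSnapshotStateData {
  // ... existing fields elided ...
  optional TableSchema old_table_schema = 6;  // was: required
}
```

With `optional`, the 3.0.0 code must handle the field being unset (`hasOldTableSchema()` returning false) when replaying procedures persisted by the old version.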
[jira] [Updated] (HBASE-28815) Upgrade from 1.7.2 to 2.6.0 failed: HMaster aborted
[ https://issues.apache.org/jira/browse/HBASE-28815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28815: --- Component/s: master > Upgrade from 1.7.2 to 2.6.0 failed: HMaster aborted > --- > > Key: HBASE-28815 > URL: https://issues.apache.org/jira/browse/HBASE-28815 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 2.6.0 >Reporter: Ke Han >Priority: Major > > I am trying to migrate from 1.7.2 cluster to 2.6.0 (both are released > versions). However, I observed that the hmaster crashed during the upgrade > process. > h1. Reproduce > Step1: Start up 1.7.2 HBase cluster (1 HDFS, 1 HM, 1 RS). > Step2: Stop the 1.7.2 HBase cluster. > Step3: Upgrade to 2.6.0 HBase cluster. > HMaster will crash with the following exception > {code:java} > 2024-09-04T16:04:47,004 WARN [PEWorker-2] procedure.InitMetaProcedure: > Failed to init meta, suspend 1000secs > java.io.IOException: Meta table is not partial, please sideline this meta > directory or run HBCK to fix this meta table, e.g. 
rebuild the server > hostname in ZNode for the meta region > at > org.apache.hadoop.hbase.master.procedure.InitMetaProcedure.deleteMetaTableDirectoryIfPartial(InitMetaProcedure.java:199) > ~[hbase-server-2.6.0.jar:2.6.0] > at > org.apache.hadoop.hbase.master.procedure.InitMetaProcedure.writeFsLayout(InitMetaProcedure.java:78) > ~[hbase-server-2.6.0.jar:2.6.0] > at > org.apache.hadoop.hbase.master.procedure.InitMetaProcedure.executeFromState(InitMetaProcedure.java:102) > ~[hbase-server-2.6.0.jar:2.6.0] > at > org.apache.hadoop.hbase.master.procedure.InitMetaProcedure.executeFromState(InitMetaProcedure.java:54) > ~[hbase-server-2.6.0.jar:2.6.0] > at > org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188) > ~[hbase-procedure-2.6.0.jar:2.6.0] > at > org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:944) > ~[hbase-procedure-2.6.0.jar:2.6.0] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1766) > ~[hbase-procedure-2.6.0.jar:2.6.0] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1444) > ~[hbase-procedure-2.6.0.jar:2.6.0] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:77) > ~[hbase-procedure-2.6.0.jar:2.6.0] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:2092) > ~[hbase-procedure-2.6.0.jar:2.6.0] > at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:216) > ~[hbase-common-2.6.0.jar:2.6.0] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2119) > ~[hbase-procedure-2.6.0.jar:2.6.0] > 2024-09-04T16:04:47,005 INFO [PEWorker-2] procedure2.TimeoutExecutorThread: > ADDED pid=1, state=WAITING_TIMEOUT:INIT_META_WRITE_FS_LAYOUT, locked=true; > InitMetaProcedure table=hbase:meta; timeout=1000, timestamp=1725465888005 > 2024-09-04T16:04:48,045 ERROR 
[PEWorker-1] procedure2.ProcedureExecutor: Root > Procedure pid=1, state=FAILED:INIT_META_WRITE_FS_LAYOUT, > exception=org.apache.hadoop.hbase.exceptions.TimeoutIOException via > ProcedureExecutor:org.apache.hadoop.hbase.exceptions.TimeoutIOException: > Operation timed out after 1.0010 sec; InitMetaProcedure table=hbase:meta does > not support rollback but the execution failed and try to rollback, code bug? > org.apache.hadoop.hbase.procedure2.RemoteProcedureException: > org.apache.hadoop.hbase.exceptions.TimeoutIOException: Operation timed out > after 1.0010 sec > at > org.apache.hadoop.hbase.procedure2.Procedure.setFailure(Procedure.java:768) > ~[hbase-procedure-2.6.0.jar:2.6.0] > at > org.apache.hadoop.hbase.procedure2.Procedure.setTimeoutFailure(Procedure.java:797) > ~[hbase-procedure-2.6.0.jar:2.6.0] > at > org.apache.hadoop.hbase.procedure2.TimeoutExecutorThread.executeTimedoutProcedure(TimeoutExecutorThread.java:131) > ~[hbase-procedure-2.6.0.jar:2.6.0] > at > org.apache.hadoop.hbase.procedure2.TimeoutExecutorThread.execDelayedProcedure(TimeoutExecutorThread.java:109) > ~[hbase-procedure-2.6.0.jar:2.6.0] > at > org.apache.hadoop.hbase.procedure2.TimeoutExecutorThread.run(TimeoutExecutorThread.java:68) > ~[hbase-procedure-2.6.0.jar:2.6.0] > Caused by: org.apache.hadoop.hbase.exceptions.TimeoutIOException: Operation > timed out after 1.0010 sec > at > org.apache.hadoop.hbase.procedure2.Procedure.setTimeoutFailure(Procedure.java:798) > ~[hbase-procedure-2.6.0.jar:2.6.0] > ... 3 more > 2024-09-04T16:04:48,058 INFO [PEWorker-1] procedure2.Procedu
[jira] [Created] (HBASE-28815) Upgrade from 1.7.2 to 2.6.0 failed: HMaster aborted
Ke Han created HBASE-28815: -- Summary: Upgrade from 1.7.2 to 2.6.0 failed: HMaster aborted Key: HBASE-28815 URL: https://issues.apache.org/jira/browse/HBASE-28815 Project: HBase Issue Type: Bug Affects Versions: 2.6.0 Reporter: Ke Han I am trying to migrate from 1.7.2 cluster to 2.6.0 (both are released versions). However, I observed that the hmaster crashed during the upgrade process. h1. Reproduce Step1: Start up 1.7.2 HBase cluster (1 HDFS, 1 HM, 1 RS). Step2: Stop the 1.7.2 HBase cluster. Step3: Upgrade to 2.6.0 HBase cluster. HMaster will crash with the following exception {code:java} 2024-09-04T16:04:47,004 WARN [PEWorker-2] procedure.InitMetaProcedure: Failed to init meta, suspend 1000secs java.io.IOException: Meta table is not partial, please sideline this meta directory or run HBCK to fix this meta table, e.g. rebuild the server hostname in ZNode for the meta region at org.apache.hadoop.hbase.master.procedure.InitMetaProcedure.deleteMetaTableDirectoryIfPartial(InitMetaProcedure.java:199) ~[hbase-server-2.6.0.jar:2.6.0] at org.apache.hadoop.hbase.master.procedure.InitMetaProcedure.writeFsLayout(InitMetaProcedure.java:78) ~[hbase-server-2.6.0.jar:2.6.0] at org.apache.hadoop.hbase.master.procedure.InitMetaProcedure.executeFromState(InitMetaProcedure.java:102) ~[hbase-server-2.6.0.jar:2.6.0] at org.apache.hadoop.hbase.master.procedure.InitMetaProcedure.executeFromState(InitMetaProcedure.java:54) ~[hbase-server-2.6.0.jar:2.6.0] at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188) ~[hbase-procedure-2.6.0.jar:2.6.0] at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:944) ~[hbase-procedure-2.6.0.jar:2.6.0] at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1766) ~[hbase-procedure-2.6.0.jar:2.6.0] at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1444) ~[hbase-procedure-2.6.0.jar:2.6.0] at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:77) ~[hbase-procedure-2.6.0.jar:2.6.0] at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:2092) ~[hbase-procedure-2.6.0.jar:2.6.0] at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:216) ~[hbase-common-2.6.0.jar:2.6.0] at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2119) ~[hbase-procedure-2.6.0.jar:2.6.0] 2024-09-04T16:04:47,005 INFO [PEWorker-2] procedure2.TimeoutExecutorThread: ADDED pid=1, state=WAITING_TIMEOUT:INIT_META_WRITE_FS_LAYOUT, locked=true; InitMetaProcedure table=hbase:meta; timeout=1000, timestamp=1725465888005 2024-09-04T16:04:48,045 ERROR [PEWorker-1] procedure2.ProcedureExecutor: Root Procedure pid=1, state=FAILED:INIT_META_WRITE_FS_LAYOUT, exception=org.apache.hadoop.hbase.exceptions.TimeoutIOException via ProcedureExecutor:org.apache.hadoop.hbase.exceptions.TimeoutIOException: Operation timed out after 1.0010 sec; InitMetaProcedure table=hbase:meta does not support rollback but the execution failed and try to rollback, code bug? 
org.apache.hadoop.hbase.procedure2.RemoteProcedureException: org.apache.hadoop.hbase.exceptions.TimeoutIOException: Operation timed out after 1.0010 sec at org.apache.hadoop.hbase.procedure2.Procedure.setFailure(Procedure.java:768) ~[hbase-procedure-2.6.0.jar:2.6.0] at org.apache.hadoop.hbase.procedure2.Procedure.setTimeoutFailure(Procedure.java:797) ~[hbase-procedure-2.6.0.jar:2.6.0] at org.apache.hadoop.hbase.procedure2.TimeoutExecutorThread.executeTimedoutProcedure(TimeoutExecutorThread.java:131) ~[hbase-procedure-2.6.0.jar:2.6.0] at org.apache.hadoop.hbase.procedure2.TimeoutExecutorThread.execDelayedProcedure(TimeoutExecutorThread.java:109) ~[hbase-procedure-2.6.0.jar:2.6.0] at org.apache.hadoop.hbase.procedure2.TimeoutExecutorThread.run(TimeoutExecutorThread.java:68) ~[hbase-procedure-2.6.0.jar:2.6.0] Caused by: org.apache.hadoop.hbase.exceptions.TimeoutIOException: Operation timed out after 1.0010 sec at org.apache.hadoop.hbase.procedure2.Procedure.setTimeoutFailure(Procedure.java:798) ~[hbase-procedure-2.6.0.jar:2.6.0] ... 3 more 2024-09-04T16:04:48,058 INFO [PEWorker-1] procedure2.ProcedureExecutor: Rolled back pid=1, state=ROLLEDBACK, exception=org.apache.hadoop.hbase.exceptions.TimeoutIOException via ProcedureExecutor:org.apache.hadoop.hbase.exceptions.TimeoutIOException: Operation timed out after 1.0010 sec; InitMetaProcedure table=hbase:meta exec-time=1.4160 sec 2024-09-04T16:04:48,059 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Failed to become active master java.
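The IOException above asks the operator to "sideline this meta directory or run HBCK". A minimal sketch of the sideline step, assuming the default `hbase.rootdir` of `/hbase`; the target path is illustrative, and the directory should be backed up before any such move:

```shell
# Illustrative only: move (sideline) the meta table directory out of the
# way so InitMetaProcedure can rebuild the layout. Paths assume
# hbase.rootdir=/hbase; verify against your deployment before running.
hdfs dfs -mv /hbase/data/hbase/meta /hbase/.sideline-meta-backup
```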
[jira] [Comment Edited] (HBASE-28812) Upgrade from 2.6.0 to 3.0.0 crashed
[ https://issues.apache.org/jira/browse/HBASE-28812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17879317#comment-17879317 ] Ke Han edited comment on HBASE-28812 at 9/4/24 5:44 PM: [~zhangduo] Thank you for the reply! It seems yes. Upgrading from 2.6.0 to the commit before HBASE-28577 will succeed. I tested upgrading from 2.6.0 to the following 2 commits from master branch {code:java} Upgrade crashed. (The error log looks similar. I attached the failure log: hbase--master-440ed844e077.log) commit 419666b8eb8a881724fe6f65e8235a4220824e51 (HEAD) Author: lixiaobao <977734...@qq.com> Date: Wed May 22 18:34:42 2024 +0800 HBASE-28577 Remove deprecated methods in KeyValue (#5883) Co-authored-by: lixiaobao Co-authored-by: 李小保 Signed-off-by: Duo Zhang === Upgrade succeeded. commit 3b18ba664a6dcde344e13fe9305c272592195c03 Author: Nick Dimiduk Date: Wed May 22 10:05:54 2024 +0200 HBASE-28605 Add ErrorProne ban on Hadoop shaded thirdparty jars (#5918) This change results in this error on master at `3a3dd66e21`. ``` [WARNING] Rule 2: de.skuzzle.enforcer.restrictimports.rule.RestrictImports failed with message: Banned imports detected: Reason: Use shaded version in hbase-thirdparty{code} was (Author: JIRAUSER289562): [~zhangduo] Thank you for the reply! It seems yes. Upgrading from 2.6.0 to the commit before HBASE-28577 will succeed. I tested upgrading from 2.6.0 to the following 2 commits from master branch {code:java} Upgrade crashed. (I attached the failure log: hbase--master-440ed844e077.log commit 419666b8eb8a881724fe6f65e8235a4220824e51 (HEAD) Author: lixiaobao <977734...@qq.com> Date: Wed May 22 18:34:42 2024 +0800 HBASE-28577 Remove deprecated methods in KeyValue (#5883) Co-authored-by: lixiaobao Co-authored-by: 李小保 Signed-off-by: Duo Zhang === Upgrade succeeded. 
commit 3b18ba664a6dcde344e13fe9305c272592195c03 Author: Nick Dimiduk Date: Wed May 22 10:05:54 2024 +0200 HBASE-28605 Add ErrorProne ban on Hadoop shaded thirdparty jars (#5918) This change results in this error on master at `3a3dd66e21`. ``` [WARNING] Rule 2: de.skuzzle.enforcer.restrictimports.rule.RestrictImports failed with message: Banned imports detected: Reason: Use shaded version in hbase-thirdparty{code} > Upgrade from 2.6.0 to 3.0.0 crashed > --- > > Key: HBASE-28812 > URL: https://issues.apache.org/jira/browse/HBASE-28812 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 3.0.0 >Reporter: Ke Han >Priority: Major > Labels: upgrade > Attachments: hbase--master-2d6e4fad2af5.log, > hbase--master-440ed844e077.log > > > I am trying to upgrade from 2.6.0 (stable release) to 3.0.0. I built 3.0.0 > using the following commit (a030e8099840e640684a68b6e4a79e7c1d5a6823) > {code:java} > commit a030e8099840e640684a68b6e4a79e7c1d5a6823 (HEAD -> branch-3, > upstream/branch-3) > Author: Ray Mattingly > Date: Mon Sep 2 04:38:29 2024 -0400 HBASE-28697 Don't clean bulk load > system entries until backup is complete (#6089) > > Co-authored-by: Ray Mattingly > {code} > However, the HMaster would crash during the upgrade process. > h1. Reproduce > Step1: Start up 2.6.0 cluster (1 HDFS, 1 HM, 1 RS) > Step2: Stop the entire cluster > Step3: Upgrade to 3.0.0 cluster. 
> HMaster will crash with the following error message > {code:java} > 2024-09-04T04:29:18,917 WARN [master/hmaster:16000:becomeActiveMaster] > regionserver.HRegion: Failed initialize of region= > master:store,,1.1595e783b53d99cd5eef43b6debb2682., starting to roll back > memstore > java.io.IOException: java.io.IOException: > org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile > Trailer from file > hdfs://master:8020/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/info/82c6d244b6244c179cdbafcead00ed75 > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1215) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1158) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1030) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:974) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7794) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHO
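The two-commit check above (crash at HBASE-28577, success at the commit before it) is a manual bisection. The same search can be sketched with `git bisect`; `./try-upgrade.sh` is a hypothetical helper that starts the 2.6.0 cluster, stops it, brings up the freshly built master branch, and exits 0 iff the HMaster becomes active:

```shell
# Sketch, assuming a checkout of the hbase repo and the hypothetical
# try-upgrade.sh test script described in the lead-in.
git bisect start
git bisect bad  419666b8eb8a881724fe6f65e8235a4220824e51   # HBASE-28577: upgrade crashes
git bisect good 3b18ba664a6dcde344e13fe9305c272592195c03   # upgrade succeeds
git bisect run ./try-upgrade.sh
git bisect reset
```

With only two adjacent commits this is equivalent to the manual test, but the same script scales to an arbitrary commit range.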
[jira] [Comment Edited] (HBASE-28812) Upgrade from 2.6.0 to 3.0.0 crashed
[ https://issues.apache.org/jira/browse/HBASE-28812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17879317#comment-17879317 ] Ke Han edited comment on HBASE-28812 at 9/4/24 5:43 PM: [~zhangduo] Thank you for the reply! It seems yes. Upgrading from 2.6.0 to the commit before HBASE-28577 will succeed. I tested upgrading from 2.6.0 to the following 2 commits from master branch {code:java} Upgrade crashed. (I attached the failure log: hbase--master-440ed844e077.log commit 419666b8eb8a881724fe6f65e8235a4220824e51 (HEAD) Author: lixiaobao <977734...@qq.com> Date: Wed May 22 18:34:42 2024 +0800 HBASE-28577 Remove deprecated methods in KeyValue (#5883) Co-authored-by: lixiaobao Co-authored-by: 李小保 Signed-off-by: Duo Zhang === Upgrade succeeds commit 3b18ba664a6dcde344e13fe9305c272592195c03 Author: Nick Dimiduk Date: Wed May 22 10:05:54 2024 +0200 HBASE-28605 Add ErrorProne ban on Hadoop shaded thirdparty jars (#5918) This change results in this error on master at `3a3dd66e21`. ``` [WARNING] Rule 2: de.skuzzle.enforcer.restrictimports.rule.RestrictImports failed with message: Banned imports detected: Reason: Use shaded version in hbase-thirdparty{code} was (Author: JIRAUSER289562): [~zhangduo] Thank you for the reply! It seems yes. Upgrading from 2.6.0 to the commit before HBASE-28577 will succeed. I tested upgrading from 2.6.0 to the following 2 commits from master branch {code:java} Upgrade crashed. 
(I attached the failure log: hbase--master-440ed844e077.log commit 419666b8eb8a881724fe6f65e8235a4220824e51 (HEAD) Author: lixiaobao <977734...@qq.com> Date: Wed May 22 18:34:42 2024 +0800 HBASE-28577 Remove deprecated methods in KeyValue (#5883) Co-authored-by: lixiaobao Co-authored-by: 李小保 Signed-off-by: Duo Zhang === Upgrade succeeds commit 3b18ba664a6dcde344e13fe9305c272592195c03 Author: Nick Dimiduk Date: Wed May 22 10:05:54 2024 +0200 HBASE-28605 Add ErrorProne ban on Hadoop shaded thirdparty jars (#5918) This change results in this error on master at `3a3dd66e21`. ``` [WARNING] Rule 2: de.skuzzle.enforcer.restrictimports.rule.RestrictImports failed with message: Banned imports detected: Reason: Use shaded version in hbase-thirdparty{code} > Upgrade from 2.6.0 to 3.0.0 crashed > --- > > Key: HBASE-28812 > URL: https://issues.apache.org/jira/browse/HBASE-28812 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 3.0.0 >Reporter: Ke Han >Priority: Major > Labels: upgrade > Attachments: hbase--master-2d6e4fad2af5.log, > hbase--master-440ed844e077.log > > > I am trying to upgrade from 2.6.0 (stable release) to 3.0.0. I built 3.0.0 > using the following commit (a030e8099840e640684a68b6e4a79e7c1d5a6823) > {code:java} > commit a030e8099840e640684a68b6e4a79e7c1d5a6823 (HEAD -> branch-3, > upstream/branch-3) > Author: Ray Mattingly > Date: Mon Sep 2 04:38:29 2024 -0400 HBASE-28697 Don't clean bulk load > system entries until backup is complete (#6089) > > Co-authored-by: Ray Mattingly > {code} > However, the HMaster would crash during the upgrade process. > h1. Reproduce > Step1: Start up 2.6.0 cluster (1 HDFS, 1 HM, 1 RS) > Step2: Stop the entire cluster > Step3: Upgrade to 3.0.0 cluster. 
> HMaster will crash with the following error message > {code:java} > 2024-09-04T04:29:18,917 WARN [master/hmaster:16000:becomeActiveMaster] > regionserver.HRegion: Failed initialize of region= > master:store,,1.1595e783b53d99cd5eef43b6debb2682., starting to roll back > memstore > java.io.IOException: java.io.IOException: > org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile > Trailer from file > hdfs://master:8020/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/info/82c6d244b6244c179cdbafcead00ed75 > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1215) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1158) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1030) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:974) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7794) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.h
[jira] [Comment Edited] (HBASE-28812) Upgrade from 2.6.0 to 3.0.0 crashed
[ https://issues.apache.org/jira/browse/HBASE-28812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17879317#comment-17879317 ] Ke Han edited comment on HBASE-28812 at 9/4/24 5:42 PM: [~zhangduo] Thank you for the reply! It seems yes. Upgrading from 2.6.0 to the commit before HBASE-28577 will succeed. I tested upgrading from 2.6.0 to the following 2 commits from master branch {code:java} Upgrade crashed. (I attached the failure log: hbase--master-440ed844e077.log commit 419666b8eb8a881724fe6f65e8235a4220824e51 (HEAD) Author: lixiaobao <977734...@qq.com> Date: Wed May 22 18:34:42 2024 +0800 HBASE-28577 Remove deprecated methods in KeyValue (#5883) Co-authored-by: lixiaobao Co-authored-by: 李小保 Signed-off-by: Duo Zhang === Upgrade succeeds commit 3b18ba664a6dcde344e13fe9305c272592195c03 Author: Nick Dimiduk Date: Wed May 22 10:05:54 2024 +0200 HBASE-28605 Add ErrorProne ban on Hadoop shaded thirdparty jars (#5918) This change results in this error on master at `3a3dd66e21`. ``` [WARNING] Rule 2: de.skuzzle.enforcer.restrictimports.rule.RestrictImports failed with message: Banned imports detected: Reason: Use shaded version in hbase-thirdparty{code} was (Author: JIRAUSER289562): [~zhangduo] Thank you for the reply! It seems yes. Upgrading from 2.6.0 to the commit before HBASE-28577 will succeed. I tested upgrading from 2.6.0 to the following 2 commits from master branch |Upgrade crashed. 
(I attached the failure log: hbase--master-440ed844e077.log| commit 419666b8eb8a881724fe6f65e8235a4220824e51 (HEAD) Author: lixiaobao <977734...@qq.com> Date: Wed May 22 18:34:42 2024 +0800 HBASE-28577 Remove deprecated methods in KeyValue (#5883) Co-authored-by: lixiaobao Co-authored-by: 李小保 Signed-off-by: Duo Zhang | |Upgrade failed.| commit 3b18ba664a6dcde344e13fe9305c272592195c03 Author: Nick Dimiduk Date: Wed May 22 10:05:54 2024 +0200 HBASE-28605 Add ErrorProne ban on Hadoop shaded thirdparty jars (#5918) This change results in this error on master at `3a3dd66e21`. ``` [WARNING] Rule 2: de.skuzzle.enforcer.restrictimports.rule.RestrictImports failed with message: Banned imports detected: Reason: Use shaded version in hbase-thirdparty| > Upgrade from 2.6.0 to 3.0.0 crashed > --- > > Key: HBASE-28812 > URL: https://issues.apache.org/jira/browse/HBASE-28812 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 3.0.0 >Reporter: Ke Han >Priority: Major > Labels: upgrade > Attachments: hbase--master-2d6e4fad2af5.log, > hbase--master-440ed844e077.log > > > I am trying to upgrade from 2.6.0 (stable release) to 3.0.0. I built 3.0.0 > using the following commit (a030e8099840e640684a68b6e4a79e7c1d5a6823) > {code:java} > commit a030e8099840e640684a68b6e4a79e7c1d5a6823 (HEAD -> branch-3, > upstream/branch-3) > Author: Ray Mattingly > Date: Mon Sep 2 04:38:29 2024 -0400 HBASE-28697 Don't clean bulk load > system entries until backup is complete (#6089) > > Co-authored-by: Ray Mattingly > {code} > However, the HMaster would crash during the upgrade process. > h1. Reproduce > Step1: Start up 2.6.0 cluster (1 HDFS, 1 HM, 1 RS) > Step2: Stop the entire cluster > Step3: Upgrade to 3.0.0 cluster. 
> HMaster will crash with the following error message > {code:java} > 2024-09-04T04:29:18,917 WARN [master/hmaster:16000:becomeActiveMaster] > regionserver.HRegion: Failed initialize of region= > master:store,,1.1595e783b53d99cd5eef43b6debb2682., starting to roll back > memstore > java.io.IOException: java.io.IOException: > org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile > Trailer from file > hdfs://master:8020/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/info/82c6d244b6244c179cdbafcead00ed75 > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1215) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1158) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1030) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:974) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7794) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionser
[jira] [Updated] (HBASE-28812) Upgrade from 2.6.0 to 3.0.0 crashed
[ https://issues.apache.org/jira/browse/HBASE-28812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28812: --- Description: I am trying to upgrade from 2.6.0 (stable release) to 3.0.0. I built 3.0.0 using the following commit (a030e8099840e640684a68b6e4a79e7c1d5a6823) {code:java} commit a030e8099840e640684a68b6e4a79e7c1d5a6823 (HEAD -> branch-3, upstream/branch-3) Author: Ray Mattingly Date: Mon Sep 2 04:38:29 2024 -0400 HBASE-28697 Don't clean bulk load system entries until backup is complete (#6089) Co-authored-by: Ray Mattingly {code} However, the HMaster would crash during the upgrade process. h1. Reproduce Step1: Start up 2.6.0 cluster (1 HDFS, 1 HM, 1 RS) Step2: Stop the entire cluster Step3: Upgrade to 3.0.0 cluster. HMaster will crash with the following error message {code:java} 2024-09-04T04:29:18,917 WARN [master/hmaster:16000:becomeActiveMaster] regionserver.HRegion: Failed initialize of region= master:store,,1.1595e783b53d99cd5eef43b6debb2682., starting to roll back memstore java.io.IOException: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file hdfs://master:8020/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/info/82c6d244b6244c179cdbafcead00ed75 at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1215) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1158) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1030) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:974) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7794) 
~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7749) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:277) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:432) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:135) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1003) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) ~[hbase-common-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at java.lang.Thread.run(Thread.java:833) ~[?:?] 
Caused by: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file hdfs://master:8020/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/info/82c6d244b6244c179cdbafcead00ed75 at org.apache.hadoop.hbase.regionserver.StoreEngine.openStoreFiles(StoreEngine.java:289) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.regionserver.StoreEngine.initialize(StoreEngine.java:339) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.regionserver.HStore.(HStore.java:301) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:6924) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1181) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1178) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?] at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
[jira] [Updated] (HBASE-28812) Upgrade from 2.6.0 to 3.0.0 crashed
[ https://issues.apache.org/jira/browse/HBASE-28812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28812: --- Labels: upgrade (was: ) > Upgrade from 2.6.0 to 3.0.0 crashed > --- > > Key: HBASE-28812 > URL: https://issues.apache.org/jira/browse/HBASE-28812 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 3.0.0 >Reporter: Ke Han >Priority: Major > Labels: upgrade > Attachments: hbase--master-2d6e4fad2af5.log > > > I am trying to upgrade from 2.6.0 (stable release) to 3.0.0. I built 3.0.0 > using the following commit (a030e8099840e640684a68b6e4a79e7c1d5a6823) > {code:java} > commit a030e8099840e640684a68b6e4a79e7c1d5a6823 (HEAD -> branch-3, > upstream/branch-3) > Author: Ray Mattingly > Date: Mon Sep 2 04:38:29 2024 -0400 HBASE-28697 Don't clean bulk load > system entries until backup is complete (#6089) > > Co-authored-by: Ray Mattingly > {code} > h1. Reproduce > Start up 2.6.0 cluster (1 HDFS, 1 HM, 1 RS), stop the entire cluster and then > start up the 3.0.0 cluster. 
HMaster will crash with the following error > {code:java} > 2024-09-04T04:29:18,917 WARN [master/hmaster:16000:becomeActiveMaster] > regionserver.HRegion: Failed initialize of region= > master:store,,1.1595e783b53d99cd5eef43b6debb2682., starting to roll back > memstore > java.io.IOException: java.io.IOException: > org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile > Trailer from file > hdfs://master:8020/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/info/82c6d244b6244c179cdbafcead00ed75 > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1215) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1158) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1030) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:974) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7794) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7749) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:277) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:432) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:135) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > 
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1003) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) > ~[hbase-common-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at java.lang.Thread.run(Thread.java:833) ~[?:?] > Caused by: java.io.IOException: > org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile > Trailer from file > hdfs://master:8020/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/info/82c6d244b6244c179cdbafcead00ed75 > at > org.apache.hadoop.hbase.regionserver.StoreEngine.openStoreFiles(StoreEngine.java:289) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.StoreEngine.initialize(StoreEngine.java:339) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HStore.(HStore.java:301) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:6924) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java
[jira] [Updated] (HBASE-28660) list_namespace not working after an incorrect user input
[ https://issues.apache.org/jira/browse/HBASE-28660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28660: --- Description: When using hbase-3.0.0 or 2.6.0, there's a shell bug related to failure handling. If the user inputs an incorrect *list_namespace* command, the shell throws an exception. However, it has a side effect on the following *list_namespace* command: the result becomes empty (incorrect). h1. Reproduce Executing the following two commands reproduces this bug: * The first command is an incorrect list_namespace command, which causes an exception. * The second command is a correct list_namespace command; its return value is incorrect (empty). {code:java} list_namespace, 'ns.*' list_namespace{code} Here is the execution result. The return result of the second command is incorrect. {code:java} hbase:002:0> list_namespace, 'ns.*' Traceback (most recent call last): SyntaxError ((hbase):2: syntax error, unexpected end-of-file) list_namespace, 'ns.*' ^ hbase:003:0> list_namespace hbase:004:0> {code} The expected output of list_namespace is {code:java} hbase:001:0> list_namespace NAMESPACE default hbase 2 row(s) Took 0.6820 seconds {code} h1. Root Cause This could be a bug in the shell related to list_namespace. Restarting the shell restores normal functionality of the list_namespace command. was: When using hbase-3.0.0 or 2.6.0, there's a shell bug related to failure handling. If the user inputs an incorrect *list_namespace* command, the shell throws an exception. However, it has a side effect on the following *list_namespace* command: the result becomes empty (incorrect). h1. Reproduce Executing the following two commands reproduces this bug: * The first command is an incorrect list_namespace command, which causes an exception. * The second command is a correct list_namespace command; its return value is incorrect (empty). 
{code:java} list_namespace, 'ns.*' list_namespace{code} Here's the execution result The return result of the second command is incorrect. {code:java} hbase:002:0> list_namespace, 'ns.*' Traceback (most recent call last): SyntaxError ((hbase):2: syntax error, unexpected end-of-file) list_namespace, 'ns.*' ^ hbase:003:0> list_namespace hbase:004:0> {code} The expected output of list_namespace is {code:java} hbase:001:0> list_namespace NAMESPACE default hbase 2 row(s) Took 0.6820 seconds {code} h1. Root Cause This could be a bug in shell related to list_namespace. Restart the shell would make the shell functional again. > list_namespace not working after an incorrect user input > > > Key: HBASE-28660 > URL: https://issues.apache.org/jira/browse/HBASE-28660 > Project: HBase > Issue Type: Bug >Affects Versions: 2.6.0, 3.0.0-beta-2 >Reporter: Ke Han >Priority: Major > > When using hbase-3.0.0 or 2.6.0, there's a shell bug related to failure > handling. > If user inputs an incorrect *list_namespace* command, hshell throws an > exception. However, it has a side effect on the following *list_namespace* > command: the result becomes empty (incorrect). > h1. Reproduce > Execute the following 2 commands can reproduce this bug > * The first command is an incorrect list_namespace command, which causes and > exception. > * The second command is a correct list_namespace command, its return value > is incorrect (empty). > {code:java} > list_namespace, 'ns.*' > list_namespace{code} > Here's the execution result > The return result of the second command is incorrect. > {code:java} > hbase:002:0> list_namespace, 'ns.*' > Traceback (most recent call last): > SyntaxError ((hbase):2: syntax error, unexpected end-of-file) > list_namespace, 'ns.*' > ^ > hbase:003:0> list_namespace > hbase:004:0> {code} > The expected output of list_namespace is > {code:java} > hbase:001:0> list_namespace > NAMESPACE > > default
[jira] [Updated] (HBASE-28660) list_namespace not working after an incorrect user input
[ https://issues.apache.org/jira/browse/HBASE-28660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28660: --- Summary: list_namespace not working after an incorrect user input (was: list_namespace command not working after an incorrect user input) > list_namespace not working after an incorrect user input > > > Key: HBASE-28660 > URL: https://issues.apache.org/jira/browse/HBASE-28660 > Project: HBase > Issue Type: Bug >Affects Versions: 2.6.0, 3.0.0-beta-2 >Reporter: Ke Han >Priority: Major > > When using hbase-3.0.0 or 2.6.0, there's a shell bug related to failure > handling. > If user inputs an incorrect *list_namespace* command, hshell throws an > exception. However, it has a side effect on the following *list_namespace* > command: the result becomes empty (incorrect). > h1. Reproduce > Execute the following 2 commands can reproduce this bug > * The first command is an incorrect list_namespace command, which causes and > exception. > * The second command is a correct list_namespace command, its return value > is incorrect (empty). > {code:java} > list_namespace, 'ns.*' > list_namespace{code} > Here's the execution result > The return result of the second command is incorrect. > {code:java} > hbase:002:0> list_namespace, 'ns.*' > Traceback (most recent call last): > SyntaxError ((hbase):2: syntax error, unexpected end-of-file) > list_namespace, 'ns.*' > ^ > hbase:003:0> list_namespace > hbase:004:0> {code} > The expected output of list_namespace is > {code:java} > hbase:001:0> list_namespace > NAMESPACE > > default > > hbase > > 2 row(s) > Took 0.6820 seconds {code} > h1. Root Cause > This could be a bug in shell related to list_namespace. Restart the shell > would make the shell functional again. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HBASE-28660) list_namespace command not working after an incorrect user input
[ https://issues.apache.org/jira/browse/HBASE-28660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28660: --- Description: When using hbase-3.0.0 or 2.6.0, there's a shell bug related to failure handling. If user inputs an incorrect *list_namespace* command, hshell throws an exception. However, it has a side effect on the following *list_namespace* command: the result becomes empty (incorrect). h1. Reproduce Execute the following 2 commands can reproduce this bug * The first command is an incorrect list_namespace command, which causes and exception. * The second command is a correct list_namespace command, its return value is incorrect (empty). {code:java} list_namespace, 'ns.*' list_namespace{code} Here's the execution result The return result of the second command is incorrect. {code:java} hbase:002:0> list_namespace, 'ns.*' Traceback (most recent call last): SyntaxError ((hbase):2: syntax error, unexpected end-of-file) list_namespace, 'ns.*' ^ hbase:003:0> list_namespace hbase:004:0> {code} The expected output of list_namespace is {code:java} hbase:001:0> list_namespace NAMESPACE default hbase 2 row(s) Took 0.6820 seconds {code} h1. Root Cause This could be a bug in shell related to list_namespace. Restart the shell would make the shell functional again. was: When using hbase-3.0.0 or 2.6.0, there's a shell bug related to failure handling. If user inputs an incorrect *list_namespace* command, hshell throws an exception. However, it has a side effect on the following *list_namespace* command: the result becomes empty (incorrect). h1. Reproduce Execute the following 2 commands can reproduce this bug * The first command is an incorrect list_namespace command, which causes and exception. * The second command is a correct list_namespace command, its return value is incorrect (empty). 
{code:java} list_namespace, 'ns.*' list_namespace{code} Here's the execution results The first command returns correct, however, the third command returns empty. {code:java} hbase:002:0> list_namespace, 'ns.*' Traceback (most recent call last): SyntaxError ((hbase):2: syntax error, unexpected end-of-file) list_namespace, 'ns.*' ^ hbase:003:0> list_namespace hbase:004:0> {code} The correct output of list_namespace is {code:java} hbase:001:0> list_namespace NAMESPACE default hbase 2 row(s) Took 0.6820 seconds {code} h1. Root Cause This could be a bug in shell related to list_namespace. Restart the shell would make the shell functional again. > list_namespace command not working after an incorrect user input > > > Key: HBASE-28660 > URL: https://issues.apache.org/jira/browse/HBASE-28660 > Project: HBase > Issue Type: Bug >Affects Versions: 2.6.0, 3.0.0-beta-2 >Reporter: Ke Han >Priority: Major > > When using hbase-3.0.0 or 2.6.0, there's a shell bug related to failure > handling. > If user inputs an incorrect *list_namespace* command, hshell throws an > exception. However, it has a side effect on the following *list_namespace* > command: the result becomes empty (incorrect). > h1. Reproduce > Execute the following 2 commands can reproduce this bug > * The first command is an incorrect list_namespace command, which causes and > exception. > * The second command is a correct list_namespace command, its return value > is incorrect (empty). > {code:java} > list_namespace, 'ns.*' > list_namespace{code} > Here's the execution result > The return result of the second command is incorrect. > {code:java} > hbase:002:0> list_namespace, 'ns.*' > Traceback (most recent call last): > SyntaxError ((hbase):2: syntax error, unexpected end-of-file) > list_namespace, 'ns.*' > ^ > hbase:003:0> list_namespace > hbase:004:0> {code} > The expected output of list_namespace is > {code:java} > hbase:001:0> list_namespace > NAMESPACE > > default
[jira] [Created] (HBASE-28660) list_namespace command not working after an incorrect user input
Ke Han created HBASE-28660: -- Summary: list_namespace command not working after an incorrect user input Key: HBASE-28660 URL: https://issues.apache.org/jira/browse/HBASE-28660 Project: HBase Issue Type: Bug Affects Versions: 2.6.0, 3.0.0-beta-2 Reporter: Ke Han When using hbase-3.0.0 or 2.6.0, there's a shell bug related to failure handling. If the user inputs an incorrect *list_namespace* command, the shell throws an exception. However, it has a side effect on the following *list_namespace* command: the result becomes empty (incorrect). h1. Reproduce Executing the following two commands reproduces this bug: * The first command is an incorrect list_namespace command, which causes an exception. * The second command is a correct list_namespace command; its return value is incorrect (empty). {code:java} list_namespace, 'ns.*' list_namespace{code} Here is the execution result. The first command fails with a syntax error, and the second command incorrectly returns an empty result. {code:java} hbase:002:0> list_namespace, 'ns.*' Traceback (most recent call last): SyntaxError ((hbase):2: syntax error, unexpected end-of-file) list_namespace, 'ns.*' ^ hbase:003:0> list_namespace hbase:004:0> {code} The correct output of list_namespace is {code:java} hbase:001:0> list_namespace NAMESPACE default hbase 2 row(s) Took 0.6820 seconds {code} h1. Root Cause This could be a bug in the shell related to list_namespace. Restarting the shell makes it functional again. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28659) NPE in hmaster (setServerState function)
Ke Han created HBASE-28659: -- Summary: NPE in hmaster (setServerState function) Key: HBASE-28659 URL: https://issues.apache.org/jira/browse/HBASE-28659 Project: HBase Issue Type: Bug Components: master Affects Versions: 3.0.0-beta-2 Reporter: Ke Han Attachments: hbase--master-d16bb50815b7.log I encountered an NPE in the master node after migrating data from 2.5.8. {code:java} 2024-05-11T10:45:57,896 ERROR [PEWorker-15] procedure2.ProcedureExecutor: CODE-BUG: Uncaught runtime exception: pid=48, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, hasLock=true; ServerCrashProcedure hregion1,16020,1715424228375, splitWal=true, meta=true java.lang.NullPointerException: null at org.apache.hadoop.hbase.master.assignment.RegionStates.setServerState(RegionStates.java:409) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.master.assignment.RegionStates.logSplitting(RegionStates.java:435) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure.executeFromState(ServerCrashProcedure.java:226) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] 2024-05-11T10:45:57,918 ERROR [PEWorker-15] procedure2.ProcedureExecutor: Root Procedure pid=48, state=FAILED:SERVER_CRASH_SPLIT_LOGS, hasLock=true, exception=java.lang.NullPointerException via CODE-BUG: Uncaught runtime exception: pid=48, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, hasLock=true; ServerCrashProcedure hregion1,16020,1715424228375, splitWal=true, meta=true:java.lang.NullPointerException; ServerCrashProcedure hregion1,16020,1715424228375, splitWal=true, meta=true does not support rollback but the execution failed and try to rollback, code bug? 
org.apache.hadoop.hbase.procedure2.RemoteProcedureException: java.lang.NullPointerException at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1826) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1484) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1100(ProcedureExecutor.java:80) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] {code} h1. Reproduce This bug cannot be reproduced deterministically. I upgraded the HBase cluster from 2.5.8 to 3.0.0 (commit: 516c89e8597fb) with 4 nodes (1 HM, 2 RS, 1 HDFS). h1. Root Cause From the stack trace, the bug is in the setServerState function. {code:java} // hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStates.java private void setServerState(ServerName serverName, ServerState state) { ServerStateNode serverNode = getServerNode(serverName); synchronized (serverNode) { // NPE when serverNode is null! serverNode.setState(state); } } {code} serverNode can sometimes be null, and there is no null check: {code:java} /** Returns Pertinent ServerStateNode or NULL if none found (Do not make modifications). */ public ServerStateNode getServerNode(final ServerName serverName) { return serverMap.get(serverName); } {code} A potential fix is to add a null check for serverNode. However, how the cluster gets into this buggy state is unclear. I am running the workloads that could trigger the bug multiple times to see if I can find more information. I have attached the error log. -- This message was sent by Atlassian Jira (v8.20.10#820010)
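To illustrate the null-check fix suggested above, here is a minimal, self-contained sketch. The `ServerState`, `ServerStateNode`, and `RegionStatesSketch` types below are simplified stand-ins for the real HBase classes (the real `ServerName` key type and logging are omitted), so this is an illustration of the guard pattern, not the actual patch:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for the HBase types; NOT the real classes.
enum ServerState { ONLINE, SPLITTING, OFFLINE }

class ServerStateNode {
    private ServerState state = ServerState.ONLINE;
    synchronized void setState(ServerState s) { state = s; }
    synchronized ServerState getState() { return state; }
}

public class RegionStatesSketch {
    private final Map<String, ServerStateNode> serverMap = new HashMap<>();

    void addServer(String serverName) {
        serverMap.put(serverName, new ServerStateNode());
    }

    // May return null for an unknown server, as in the real getServerNode.
    ServerStateNode getServerNode(String serverName) {
        return serverMap.get(serverName);
    }

    // Guarded version of setServerState: when no node exists, skip the
    // update and report it to the caller instead of throwing the NPE.
    boolean setServerState(String serverName, ServerState state) {
        ServerStateNode serverNode = getServerNode(serverName);
        if (serverNode == null) {
            return false; // caller can log and decide how to proceed
        }
        synchronized (serverNode) {
            serverNode.setState(state);
        }
        return true;
    }

    public static void main(String[] args) {
        RegionStatesSketch states = new RegionStatesSketch();
        states.addServer("hregion1,16020,1715424228375");
        // Known server: the update succeeds.
        if (!states.setServerState("hregion1,16020,1715424228375", ServerState.SPLITTING)) {
            throw new AssertionError("expected update to succeed");
        }
        // Unknown server: previously an NPE, now a reported no-op.
        if (states.setServerState("unknown,16020,0", ServerState.SPLITTING)) {
            throw new AssertionError("expected update to be skipped");
        }
    }
}
```

Whether the right recovery is to skip silently, log a warning, or fail the procedure cleanly is a design decision for the real fix; the sketch only shows that the dereference must be guarded.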
[jira] [Created] (HBASE-28590) NPE after upgrade from 2.5.8 to 3.0.0
Ke Han created HBASE-28590: -- Summary: NPE after upgrade from 2.5.8 to 3.0.0 Key: HBASE-28590 URL: https://issues.apache.org/jira/browse/HBASE-28590 Project: HBase Issue Type: Bug Components: master Affects Versions: 3.0.0 Reporter: Ke Han Attachments: commands.txt, hbase--master-fc906f1808de.log, persistent.tar.gz When upgrade hbase cluster from 2.5.8 to 3.0.0 (commit: 516c89e8597fb6), I met the following NPE in master log. {code:java} 2024-05-11T02:17:47,293 ERROR [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] ipc.RpcServer: Unexpected throwable object java.lang.NullPointerException: null at org.apache.hadoop.hbase.master.MasterRpcServices.reportFileArchival(MasterRpcServices.java:2578) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:16463) ~[hbase-protocol-shaded-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:443) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] 2024-05-11T02:17:47,326 ERROR [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] ipc.RpcServer: Unexpected throwable object java.lang.NullPointerException: null at org.apache.hadoop.hbase.master.MasterRpcServices.reportFileArchival(MasterRpcServices.java:2578) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at 
org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:16463) ~[hbase-protocol-shaded-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:443) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] 2024-05-11T02:17:47,337 ERROR [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] ipc.RpcServer: Unexpected throwable object java.lang.NullPointerException: null at org.apache.hadoop.hbase.master.MasterRpcServices.reportFileArchival(MasterRpcServices.java:2578) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:16463) ~[hbase-protocol-shaded-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:443) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]{code} h1. Reproduce This bug cannot be reproduced deterministically but it happens pretty frequently (10% to trigger with the following steps. 1. 
Start up the 2.5.8 cluster with the default configuration (1 HM, 2 RS, 1 HDFS). 2. Execute the commands in commands.txt. 3. Stop the 2.5.8 cluster and upgrade to the 3.0.0 cluster with the default configuration (commit: 516c89e8597fb6, 1 HM, 2 RS, 1 HDFS). The error message will appear in the master log. I have attached (1) the commands to reproduce it, (2) the master log, and (3) the full error logs of all nodes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HBASE-28590) NPE after upgrade from 2.5.8 to 3.0.0
[ https://issues.apache.org/jira/browse/HBASE-28590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28590: --- Description: When upgrade hbase cluster from 2.5.8 to 3.0.0 (commit: 516c89e8597fb6), I met the following NPE in master log. {code:java} 2024-05-11T02:17:47,293 ERROR [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] ipc.RpcServer: Unexpected throwable object java.lang.NullPointerException: null at org.apache.hadoop.hbase.master.MasterRpcServices.reportFileArchival(MasterRpcServices.java:2578) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:16463) ~[hbase-protocol-shaded-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:443) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] 2024-05-11T02:17:47,326 ERROR [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] ipc.RpcServer: Unexpected throwable object java.lang.NullPointerException: null at org.apache.hadoop.hbase.master.MasterRpcServices.reportFileArchival(MasterRpcServices.java:2578) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:16463) ~[hbase-protocol-shaded-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at 
org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:443) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] 2024-05-11T02:17:47,337 ERROR [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] ipc.RpcServer: Unexpected throwable object java.lang.NullPointerException: null at org.apache.hadoop.hbase.master.MasterRpcServices.reportFileArchival(MasterRpcServices.java:2578) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:16463) ~[hbase-protocol-shaded-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:443) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]{code} h1. Reproduce This bug cannot be reproduced deterministically but it happens pretty frequently (10% to trigger with the following steps. 1. Start up 2.5.8 cluster with default configuration (1 HM, 2 RS, 1 HDFS) 2. Execute the commands in commands.txt 3. 
Stop the 2.5.8 cluster and upgrade to 3.0.0 cluster with default configuration (commit: 516c89e8597fb6, 1 HM, 2 RS, 1 HDFS) The error message will occur in master log. I attached (1) commands to reproduce it (2) master log and (3) full error logs of all nodes. was: When upgrade hbase cluster from 2.5.8 to 3.0.0 (commit: 516c89e8597fb6), I met the following NPE in master log. {code:java} 2024-05-11T02:17:47,293 ERROR [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] ipc.RpcServer: Unexpected throwable object java.lang.NullPointerException: null at org.apache.hadoop.hbase.master.MasterRpcServices.reportFileArchival(MasterRpcServices.java:2578) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:16463) ~[hbase-protocol-shaded-3.0.0-beta-2-
[jira] [Updated] (HBASE-28583) Upgrade from 2.5.8 to 3.0.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
[ https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28583: --- Summary: Upgrade from 2.5.8 to 3.0.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema (was: Upgrade from 2.5.8 to 3.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema) > Upgrade from 2.5.8 to 3.0.0 crash with InvalidProtocolBufferException: > Message missing required fields: old_table_schema > > > Key: HBASE-28583 > URL: https://issues.apache.org/jira/browse/HBASE-28583 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 3.0.0, 2.5.8 >Reporter: Ke Han >Priority: Major > Attachments: hbase--master-033a47be7d1d.log, persistent.tar.gz > > > When migrating data from 2.5.8 cluster (1HM, 2RS, 1 HDFS) to 3.0.0 (1 HM, 2 > RS, 2 HDFS), I met the following exception and the upgrade failed. > {code:java} > 2024-05-10T00:54:45,936 ERROR [master/hmaster:16000:becomeActiveMaster] > master.HMaster: Failed to become active master > org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: > Message missing required fields: old_table_schema > at > org.apache.hbase.thirdparty.com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:56) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.checkMessageInitialized(AbstractParser.java:45) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:97) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:102) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25) > 
~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.Any.unpack(Any.java:118) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hadoop.hbase.procedure2.ProcedureUtil$StateSerializer.deserialize(ProcedureUtil.java:125) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.procedure.RestoreSnapshotProcedure.deserializeStateData(RestoreSnapshotProcedure.java:303) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:295) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:517) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:80) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.load(ProcedureExecutor.java:344) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:287) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:335) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:666) > 
~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1860) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1019) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] >
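The failure above happens because `RestoreSnapshotStateData.old_table_schema` is declared with proto2's `required` modifier: a message serialized by an older version that never wrote that field fails the initialization check at parse time, which is exactly the `UninitializedMessageException` / `InvalidProtocolBufferException` in the trace. The following simplified pure-Python sketch mimics the proto2 length-delimited wire format to show the mechanism; it is not the real protobuf library or HBase code, and the field numbers and the `required_fields` helper are hypothetical illustrations.

```python
def encode_varint(n):
    """Encode a non-negative int as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)
        else:
            out.append(b)
            return bytes(out)

def encode_field(field_no, payload):
    """Encode one length-delimited field (wire type 2): tag, length, bytes."""
    return encode_varint((field_no << 3) | 2) + encode_varint(len(payload)) + payload

def decode_varint(buf, i):
    """Decode a varint starting at index i; return (value, next_index)."""
    shift = result = 0
    while True:
        b = buf[i]
        i += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, i
        shift += 7

def parse_message(buf, required_fields=()):
    """Parse length-delimited fields, then enforce proto2-style 'required'.

    A field that simply never appears on the wire (old writer) trips the
    required check in the new reader -- the backward-compatibility hazard.
    """
    fields, i = {}, 0
    while i < len(buf):
        key, i = decode_varint(buf, i)
        field_no, wire_type = key >> 3, key & 7
        if wire_type != 2:
            raise ValueError(f"unsupported wire type {wire_type} in this sketch")
        length, i = decode_varint(buf, i)
        fields[field_no] = buf[i:i + length]
        i += length
    missing = [f for f in required_fields if f not in fields]
    if missing:
        # Mirrors "Message missing required fields: old_table_schema"
        raise ValueError(f"Message missing required fields: {missing}")
    return fields

# An "old version" writer emits only field 1 (e.g. the snapshot descriptor);
# field 2 (playing the role of old_table_schema) is absent from the bytes.
old_bytes = encode_field(1, b"snapshot-desc")

parse_message(old_bytes, required_fields=(1,))     # optional field 2: parses fine
# parse_message(old_bytes, required_fields=(1, 2)) # required field 2: raises
```

This is why relaxing the field to `optional` (and null-checking it in `deserializeStateData`) lets the new master load procedure state written by 2.5.x, whereas `required` aborts the whole `becomeActiveMaster` path.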
[jira] [Updated] (HBASE-28583) Upgrade from 2.5.8 to 3.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
[ https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28583: --- Description: When migrating data from 2.5.8 cluster (1HM, 2RS, 1 HDFS) to 3.0.0 (1 HM, 2 RS, 2 HDFS), I met the following exception and the upgrade failed. {code:java} 2024-05-10T00:54:45,936 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Failed to become active master org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: old_table_schema at org.apache.hbase.thirdparty.com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:56) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.checkMessageInitialized(AbstractParser.java:45) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:97) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:102) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] at org.apache.hbase.thirdparty.com.google.protobuf.Any.unpack(Any.java:118) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] at org.apache.hadoop.hbase.procedure2.ProcedureUtil$StateSerializer.deserialize(ProcedureUtil.java:125) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.master.procedure.RestoreSnapshotProcedure.deserializeStateData(RestoreSnapshotProcedure.java:303) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:295) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at 
org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:517) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:80) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.load(ProcedureExecutor.java:344) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:287) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:335) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:666) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1860) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1019) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) 
~[hbase-common-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362] 2024-05-10T00:54:45,937 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: * ABORTING master hmaster,16000,1715302475720: Unhandled exception. Starting shutdown. * org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: old_table_schema at org.apache.hbase.thirdparty.com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:56) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.checkMessageInitialized(AbstractPar
[jira] [Updated] (HBASE-28583) Upgrade from 2.5.8 to 3.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
[ https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28583: --- Attachment: hbase--master-033a47be7d1d.log persistent.tar.gz
[jira] [Updated] (HBASE-28583) Upgrade from 2.5.8 to 3.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
[ https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28583: --- Attachment: (was: hbase--master-cc13b0df0f3a.log)
[jira] [Updated] (HBASE-28583) Upgrade from 2.5.8 to 3.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
[ https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28583: --- Attachment: (was: persistent.tar.gz) > Upgrade from 2.5.8 to 3.0 crash with InvalidProtocolBufferException: Message > missing required fields: old_table_schema > -- > > Key: HBASE-28583 > URL: https://issues.apache.org/jira/browse/HBASE-28583 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 3.0.0, 2.5.8 >Reporter: Ke Han >Priority: Major > > When migrating data from 2.5.8 cluster (1HM, 2RS, 1 HDFS) to 3.0.0 (1 HM, 2 > RS, 2 HDFS), I met the following exception and the upgrade failed. > {code:java} > 2024-05-10T00:54:45,936 ERROR [master/hmaster:16000:becomeActiveMaster] > master.HMaster: Failed to become active master > org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: > Message missing required fields: old_table_schema > at > org.apache.hbase.thirdparty.com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:56) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.checkMessageInitialized(AbstractParser.java:45) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:97) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:102) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hbase.thirdparty.com.google.protobuf.Any.unpack(Any.java:118) > ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7] > at > org.apache.hadoop.hbase.procedure2.ProcedureUtil$StateSerializer.deserialize(ProcedureUtil.java:125) > 
~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.procedure.RestoreSnapshotProcedure.deserializeStateData(RestoreSnapshotProcedure.java:303) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:295) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:517) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:80) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.load(ProcedureExecutor.java:344) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:287) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:335) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:666) > ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1860) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > 
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1019) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) > ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT] > at > org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) > ~[hbase-common-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-b
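The root cause is proto2 `required` semantics: the RestoreSnapshotStateData persisted by 2.5.x predates the `old_table_schema` field, so the 3.0.0 parser's initialization check rejects it. Below is a minimal Python sketch of that behavior — a simplified model of the protobuf runtime, not actual HBase or protobuf code; the field names just mirror the message above — showing why a `required` field breaks reads of old data while `optional` does not:

```python
# Simplified model of proto2 parsing: a message type declares which
# fields are required; parsing data written before a required field
# existed fails the initialization check, while an optional field
# simply comes back absent.

class InvalidProtocolBufferException(Exception):
    pass

def parse(serialized_fields, required_fields):
    """Stand-in for protobuf's checkMessageInitialized(): after
    decoding, every required field must be present or parsing fails."""
    missing = [f for f in required_fields if f not in serialized_fields]
    if missing:
        raise InvalidProtocolBufferException(
            "Message missing required fields: " + ", ".join(missing))
    return serialized_fields

# Data persisted by the 2.5.x master: no old_table_schema field yet.
old_state = {"snapshot": "snap1"}

# 3.0.0 schema with old_table_schema marked required: parse fails.
try:
    parse(old_state, required_fields=["snapshot", "old_table_schema"])
except InvalidProtocolBufferException as e:
    print(e)  # Message missing required fields: old_table_schema

# Same persisted data with the field marked optional: parses fine,
# and the absent field is simply unset.
msg = parse(old_state, required_fields=["snapshot"])
print(msg.get("old_table_schema"))  # None
```

This is why the comment thread suggests relaxing the field to `optional` for backward compatibility: old persisted messages then remain readable, and the new code can handle the unset case explicitly.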
[jira] [Updated] (HBASE-28583) Upgrade from 2.5.8 to 3.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
[ https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28583: --- Attachment: (was: commands.txt)
[jira] [Updated] (HBASE-28583) Upgrade from 2.5.8 to 3.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
[ https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28583: --- Attachment: commands.txt
[jira] [Updated] (HBASE-28583) Upgrade from 2.5.8 to 3.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
[ https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28583: --- Attachment: (was: commands_700.txt)
[jira] [Updated] (HBASE-28583) Upgrade from 2.5.8 to 3.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
[ https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28583: --- Description: When migrating data from 2.5.8 cluster (1HM, 2RS, 1 HDFS) to 3.0.0 (1 HM, 2 RS, 2 HDFS), I met the following exception and the upgrade failed. {code:java} 2024-05-09T20:16:20,638 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Failed to become active master org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: old_table_schema ... at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362] 2024-05-09T20:16:20,639 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: * ABORTING master hmaster,16000,1715285771112: Unhandled exception. Starting shutdown. * org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: old_table_schema
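The stack trace also shows why a single undeserializable procedure is fatal: the procedure store is replayed during becomeActiveMaster, and the exception propagates from deserializeStateData up through loadProcedures to HMaster, which aborts. A hedged Python sketch of that control flow — function names are illustrative stand-ins for the HBase call chain, not its actual code:

```python
# Illustrative control flow only: a parse failure on any one persisted
# procedure propagates out of the replay loop, so the master never
# finishes initialization and shuts itself down.

class InvalidProtocolBufferException(Exception):
    pass

def deserialize_state_data(proto):
    # Stand-in for RestoreSnapshotProcedure.deserializeStateData():
    # proto2 rejects the message when a required field is absent.
    if "old_table_schema" not in proto:
        raise InvalidProtocolBufferException(
            "Message missing required fields: old_table_schema")
    return proto

def load_procedures(stored):
    # Stand-in for ProcedureExecutor.loadProcedures(): replays every
    # persisted procedure; the first bad one aborts the whole load.
    return [deserialize_state_data(p) for p in stored]

def become_active_master(stored):
    try:
        load_procedures(stored)
        return "active"
    except InvalidProtocolBufferException:
        # Corresponds to "Failed to become active master" followed by
        # "ABORTING master ... Starting shutdown." in the log above.
        return "aborted"

# One pre-upgrade RestoreSnapshotProcedure among healthy entries is
# enough to abort the whole master.
print(become_active_master([{"old_table_schema": "s"}, {"snapshot": "snap1"}]))
# aborted
```

This matches the reporter's observation that the upgrade crashes whenever any RestoreSnapshotStateData message from the old version is still present in the procedure store.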
[jira] [Updated] (HBASE-28583) Upgrade from 2.5.8 to 3.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
[ https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28583: --- Attachment: (was: commands.txt)
[jira] [Updated] (HBASE-28583) Upgrade from 2.5.8 to 3.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
[ https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28583: --- Attachment: commands_700.txt
[jira] [Updated] (HBASE-28583) Upgrade from 2.5.8 to 3.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
[ https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ke Han updated HBASE-28583:
---
Description: 
When migrating data from a 2.5.8 cluster (1 HM, 2 RS, 1 HDFS) to 3.0.0 (1 HM, 2 RS, 2 HDFS), I met the following exception and the upgrade failed.

{code:java}
2024-05-09T20:16:20,638 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Failed to become active master
org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: old_table_schema
    at org.apache.hbase.thirdparty.com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:56) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.checkMessageInitialized(AbstractParser.java:45) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:97) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:102) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.Any.unpack(Any.java:118) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hadoop.hbase.procedure2.ProcedureUtil$StateSerializer.deserialize(ProcedureUtil.java:125) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.procedure.RestoreSnapshotProcedure.deserializeStateData(RestoreSnapshotProcedure.java:303) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:295) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:517) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:80) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.load(ProcedureExecutor.java:344) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:287) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:335) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:666) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1860) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1019) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) ~[hbase-common-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362]
2024-05-09T20:16:20,639 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: * ABORTING master hmaster,16000,1715285771112: Unhandled exception. Starting shutdown. *
org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: old_table_schema
    at org.apache.hbase.thirdparty.com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:56) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.checkMessageInitialized(AbstractPar
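For context on the failure above: a proto2 parser rejects any serialized message that lacks a field declared {{required}}, so RestoreSnapshotStateData written by a 2.5.x master (which never set {{old_table_schema}}) becomes unparseable once the 3.0.0 code expects that field. The sketch below illustrates the direction proposed in the comments (declaring the field {{optional}} for backward compatibility); the field number and message layout here are illustrative assumptions, not the actual contents of HBase's MasterProcedure.proto:

{code}
// Illustrative proto2 sketch -- the field number and surrounding
// layout are assumptions, not the real MasterProcedure.proto.
message RestoreSnapshotStateData {
  // Before: required TableSchema old_table_schema = N;
  // parsing 2.5.x data then fails with
  // "Message missing required fields: old_table_schema".
  optional TableSchema old_table_schema = 1;
}
{code}

With {{optional}}, the proto2-generated Java class exposes a presence accessor (of the form {{hasOldTableSchema()}}), so deserialization can branch on whether the field was set instead of throwing during master startup.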
[jira] [Created] (HBASE-28583) Upgrade from 2.5.8 to 3.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
Ke Han created HBASE-28583:
--
Summary: Upgrade from 2.5.8 to 3.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
Key: HBASE-28583
URL: https://issues.apache.org/jira/browse/HBASE-28583
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 2.5.8, 3.0.0
Reporter: Ke Han
Attachments: commands.txt, hbase--master-cc13b0df0f3a.log, persistent.tar.gz

When migrating data from a 2.5.8 cluster (1 HM, 2 RS, 1 HDFS) to 3.0.0 (1 HM, 2 RS, 2 HDFS), I met the following exception and the upgrade failed.

{code:java}
2024-05-09T20:16:20,638 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Failed to become active master
org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: old_table_schema
    at org.apache.hbase.thirdparty.com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:56) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.checkMessageInitialized(AbstractParser.java:45) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:97) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:102) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.Any.unpack(Any.java:118) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hadoop.hbase.procedure2.ProcedureUtil$StateSerializer.deserialize(ProcedureUtil.java:125) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.procedure.RestoreSnapshotProcedure.deserializeStateData(RestoreSnapshotProcedure.java:303) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:295) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:517) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:80) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.load(ProcedureExecutor.java:344) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:287) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:335) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:666) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1860) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1019) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) ~[hbase-common-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362]
2024-05-09T20:16:20,639 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: * ABORTING master hmaster,16000,1715285771112: Unhandled exception. Starting shutdown. *
org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: Message missing requi
[jira] [Resolved] (HBASE-28519) HMaster crash when upgrading from HBase-2.5.8 to HBase-3.0.0-beta-1
[ https://issues.apache.org/jira/browse/HBASE-28519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ke Han resolved HBASE-28519.
Fix Version/s: 3.0.0-beta-2
Resolution: Fixed

> HMaster crash when upgrading from HBase-2.5.8 to HBase-3.0.0-beta-1
> ---
>
> Key: HBASE-28519
> URL: https://issues.apache.org/jira/browse/HBASE-28519
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 3.0.0-alpha-1, 3.0.0-alpha-4, 3.0.0-beta-1, 2.5.8
> Reporter: Ke Han
> Priority: Critical
> Fix For: 3.0.0-beta-2
>
> Attachments: all_logs.tar.gz, hbase--master-64f850a4e287.log
>
>
> h1. Reproduce
> Step1: Start up HBase-2.5.8 cluster: 4 nodes: 1 HM, 2 RS, 1 HDFS (hadoop-2.10.2).
> Step2: Perform a full-stop upgrade to HBase-3.0.0-beta-1 cluster: 1 HM, 2 RS, 1 HDFS (hadoop-2.10.2). +(No command is needed before the upgrade)+
> HMaster aborts with the following exception
> {code:java}
> 2024-04-13T03:47:15,969 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Failed to become active master
> java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
>     at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:384) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
>     at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:324) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
>     at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1535) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1204) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2494) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) ~[hbase-common-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362]
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 2 actions: RetriesExhaustedException: 2 times, servers with issues:
>     at org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.makeError(BufferedMutatorOverAsyncBufferedMutator.java:107) ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.internalFlush(BufferedMutatorOverAsyncBufferedMutator.java:122) ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.close(BufferedMutatorOverAsyncBufferedMutator.java:166) ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hadoop.hbase.master.TableNamespaceManager.migrateNamespaceTable(TableNamespaceManager.java:92) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:122) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:61) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:252) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
>     at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1532) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
>     ... 5 more
> 2024-04-13T03:47:15,970 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Master server abort: loaded coprocessors are: [org.apache.hadoop.hbase.quotas.MasterQuotasObserver]
> 2024-04-13T03:47:15,970 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: * ABORTING master hmaster,16000,1712980015693: Unhandled exception. Starting shutdown. *
> java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
>     at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:384) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
>     at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.
[jira] [Commented] (HBASE-28519) HMaster crash when upgrading from HBase-2.5.8 to HBase-3.0.0-beta-1
[ https://issues.apache.org/jira/browse/HBASE-28519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17836827#comment-17836827 ]

Ke Han commented on HBASE-28519:
[~zhangduo] Thank you for the reply! I'll try upgrading to hbase with HBASE-28376.

> HMaster crash when upgrading from HBase-2.5.8 to HBase-3.0.0-beta-1
> ---
>
> Key: HBASE-28519
> URL: https://issues.apache.org/jira/browse/HBASE-28519
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 3.0.0-alpha-1, 3.0.0-alpha-4, 3.0.0-beta-1, 2.5.8
> Reporter: Ke Han
> Priority: Critical
> Attachments: all_logs.tar.gz, hbase--master-64f850a4e287.log
>
>
> h1. Reproduce
> Step1: Start up HBase-2.5.8 cluster: 4 nodes: 1 HM, 2 RS, 1 HDFS (hadoop-2.10.2).
> Step2: Perform a full-stop upgrade to HBase-3.0.0-beta-1 cluster: 1 HM, 2 RS, 1 HDFS (hadoop-2.10.2). +(No command is needed before the upgrade)+
> HMaster aborts with the following exception
> {code:java}
> 2024-04-13T03:47:15,969 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Failed to become active master
> java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
>     at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:384) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
>     at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:324) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
>     at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1535) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1204) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2494) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) ~[hbase-common-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362]
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 2 actions: RetriesExhaustedException: 2 times, servers with issues:
>     at org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.makeError(BufferedMutatorOverAsyncBufferedMutator.java:107) ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.internalFlush(BufferedMutatorOverAsyncBufferedMutator.java:122) ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.close(BufferedMutatorOverAsyncBufferedMutator.java:166) ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hadoop.hbase.master.TableNamespaceManager.migrateNamespaceTable(TableNamespaceManager.java:92) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:122) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:61) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
>     at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:252) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
>     at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1532) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
>     ... 5 more
> 2024-04-13T03:47:15,970 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Master server abort: loaded coprocessors are: [org.apache.hadoop.hbase.quotas.MasterQuotasObserver]
> 2024-04-13T03:47:15,970 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: * ABORTING master hmaster,16000,1712980015693: Unhandled exception. Starting shutdown. *
> java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
>     at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:384) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
>     at org.apache.hbase.thirdparty.com.google.common.util.concurrent.
[jira] [Updated] (HBASE-28519) HMaster crash when upgrading from HBase-2.5.8 to HBase-3.0.0-beta-1
[ https://issues.apache.org/jira/browse/HBASE-28519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ke Han updated HBASE-28519:
---
Description: 
h1. Reproduce
Step1: Start up HBase-2.5.8 cluster: 4 nodes: 1 HM, 2 RS, 1 HDFS (hadoop-2.10.2).
Step2: Perform a full-stop upgrade to HBase-3.0.0-beta-1 cluster: 1 HM, 2 RS, 1 HDFS (hadoop-2.10.2). +(No command is needed before the upgrade)+
HMaster aborts with the following exception
{code:java}
2024-04-13T03:47:15,969 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Failed to become active master
java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
    at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:384) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
    at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:324) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
    at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1535) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1204) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2494) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) ~[hbase-common-3.0.0-beta-1.jar:3.0.0-beta-1]
    at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362]
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 2 actions: RetriesExhaustedException: 2 times, servers with issues:
    at org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.makeError(BufferedMutatorOverAsyncBufferedMutator.java:107) ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.internalFlush(BufferedMutatorOverAsyncBufferedMutator.java:122) ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.close(BufferedMutatorOverAsyncBufferedMutator.java:166) ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.master.TableNamespaceManager.migrateNamespaceTable(TableNamespaceManager.java:92) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:122) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:61) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:252) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
    at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1532) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    ... 5 more
2024-04-13T03:47:15,970 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Master server abort: loaded coprocessors are: [org.apache.hadoop.hbase.quotas.MasterQuotasObserver]
2024-04-13T03:47:15,970 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: * ABORTING master hmaster,16000,1712980015693: Unhandled exception. Starting shutdown. *
java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
    at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:384) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
    at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:324) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
    at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1535) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1204) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2494) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) ~[hbase-common-3.0.0-beta-1.jar:3.0.0-beta-1] a
[jira] [Updated] (HBASE-28519) HMaster crash when upgrading from HBase-2.5.8 to HBase-3.0.0-beta-1
[ https://issues.apache.org/jira/browse/HBASE-28519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ke Han updated HBASE-28519:
---
Description: 
h1. Reproduce
Start up HBase-2.5.8 cluster: 4 nodes: 1 HM, 2 RS, 1 HDFS (hadoop-2.10.2). Then directly perform a full-stop upgrade to HBase-3.0.0-beta-1 cluster: 1 HM, 2 RS, 1 HDFS (hadoop-2.10.2). (No command is needed before the upgrade)
HMaster aborts with the following exception:
{code:java}
2024-04-13T03:47:15,969 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Failed to become active master
java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
    at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:384) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
    at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:324) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
    at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1535) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1204) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2494) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) ~[hbase-common-3.0.0-beta-1.jar:3.0.0-beta-1]
    at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362]
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 2 actions: RetriesExhaustedException: 2 times, servers with issues:
    at org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.makeError(BufferedMutatorOverAsyncBufferedMutator.java:107) ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.internalFlush(BufferedMutatorOverAsyncBufferedMutator.java:122) ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.close(BufferedMutatorOverAsyncBufferedMutator.java:166) ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.master.TableNamespaceManager.migrateNamespaceTable(TableNamespaceManager.java:92) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:122) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:61) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:252) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
    at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1532) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    ... 5 more
2024-04-13T03:47:15,970 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Master server abort: loaded coprocessors are: [org.apache.hadoop.hbase.quotas.MasterQuotasObserver]
2024-04-13T03:47:15,970 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: * ABORTING master hmaster,16000,1712980015693: Unhandled exception. Starting shutdown. *
java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
    at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:384) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
    at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:324) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
    at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1535) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1204) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2494) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
    at org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) ~[hbase-common-3.0.0-beta-1.jar:3.0.0-beta-1] at
[jira] [Updated] (HBASE-28519) HMaster crash when upgrading from HBase-2.5.8 to HBase-3.0.0-beta-1
[ https://issues.apache.org/jira/browse/HBASE-28519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28519: --- Priority: Critical (was: Major) > HMaster crash when upgrading from HBase-2.5.8 to HBase-3.0.0-beta-1 > --- > > Key: HBASE-28519 > URL: https://issues.apache.org/jira/browse/HBASE-28519 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 3.0.0-alpha-1, 3.0.0-alpha-4, 3.0.0-beta-1, 2.5.8 >Reporter: Ke Han >Priority: Critical > Attachments: all_logs.tar.gz, hbase--master-64f850a4e287.log > > > h1. Reproduce > Start up HBase-2.5.8 cluster: 4 nodes: 1 HM, 2 RS, 1 HDFS (hadoop-2.10.2). > Execute one read command: LIST, then perform a full-stop upgrade to > HBase-3.0.0-beta-1 cluster: 1 HM, 2 RS, 1 HDFS (hadoop-2.10.2). > HMaster aborts with the following exception: > {code:java} > 2024-04-13T03:47:15,969 ERROR [master/hmaster:16000:becomeActiveMaster] > master.HMaster: Failed to become active master > java.lang.IllegalStateException: Expected the service > ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED > at > org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:384) > ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5] > at > org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:324) > ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5] > at > org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1535) > ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1] > at > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1204) > ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1] > at > org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2494) > ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1] > at > org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) > 
~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1] > at > org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) > ~[hbase-common-3.0.0-beta-1.jar:3.0.0-beta-1] > at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362] > Caused by: > org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 2 > actions: RetriesExhaustedException: 2 times, servers with issues: > at > org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.makeError(BufferedMutatorOverAsyncBufferedMutator.java:107) > ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1] > at > org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.internalFlush(BufferedMutatorOverAsyncBufferedMutator.java:122) > ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1] > at > org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.close(BufferedMutatorOverAsyncBufferedMutator.java:166) > ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1] > at > org.apache.hadoop.hbase.master.TableNamespaceManager.migrateNamespaceTable(TableNamespaceManager.java:92) > ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1] > at > org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:122) > ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1] > at > org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:61) > ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1] > at > org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:252) > ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5] > at > org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1532) > ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1] > ... 
5 more > 2024-04-13T03:47:15,970 ERROR [master/hmaster:16000:becomeActiveMaster] > master.HMaster: Master server abort: loaded coprocessors are: > [org.apache.hadoop.hbase.quotas.MasterQuotasObserver] > 2024-04-13T03:47:15,970 ERROR [master/hmaster:16000:becomeActiveMaster] > master.HMaster: * ABORTING master hmaster,16000,1712980015693: Unhandled > exception. Starting shutdown. * > java.lang.IllegalStateException: Expected the service > ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED > at > org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:384) > ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5] > at > org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:324) > ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5] > at > org.apac
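For context on the top-level error in the report above: the `IllegalStateException` is the Guava `Service` lifecycle surfacing a startup failure. When the startup callback (here `ClusterSchemaServiceImpl.doStart`, which runs the namespace-table migration) throws, the service latches into FAILED, and `awaitRunning()` then raises `IllegalStateException` with the real failure attached as the cause, which is why the log shows the `RetriesExhaustedWithDetailsException` only under "Caused by:". The following is an illustrative sketch of that state machine, not the shaded Guava or HBase code; class and method names are simplified assumptions:

```java
// Illustrative sketch (not HBase/Guava code) of the behavior behind
// "Expected the service ... to be RUNNING, but the service has FAILED":
// a startup callback that throws puts the service into FAILED, and
// awaitRunning() then raises IllegalStateException carrying the cause.
public class ServiceLifecycleSketch {
    enum State { NEW, STARTING, RUNNING, FAILED }

    static class SketchService {
        private State state = State.NEW;
        private final Runnable doStart;   // e.g. the namespace-table migration
        private Throwable failureCause;

        SketchService(Runnable doStart) { this.doStart = doStart; }

        void startAsync() {
            state = State.STARTING;
            try {
                doStart.run();
                state = State.RUNNING;
            } catch (RuntimeException e) {
                state = State.FAILED;     // startup failure is latched
                failureCause = e;
            }
        }

        void awaitRunning() {
            if (state != State.RUNNING) {
                // Mirrors AbstractService.checkCurrentState: the original
                // failure is chained, hence the "Caused by:" in the log.
                throw new IllegalStateException(
                    "Expected the service to be RUNNING, but the service has "
                        + state, failureCause);
            }
        }
    }
}
```

In the crash above, `HMaster.initClusterSchemaService` plays the role of the caller of `awaitRunning()`, so the master aborts even though the root failure happened earlier, inside the migration.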
[jira] [Updated] (HBASE-28519) HMaster crash when upgrading from HBase-2.5.8 to HBase-3.0.0-beta-1
[ https://issues.apache.org/jira/browse/HBASE-28519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28519: --- Component/s: master
[jira] [Updated] (HBASE-28519) HMaster crash when upgrading from HBase-2.5.8 to HBase-3.0.0-beta-1
[ https://issues.apache.org/jira/browse/HBASE-28519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28519: --- Attachment: all_logs.tar.gz
[jira] [Updated] (HBASE-28519) HMaster crash when upgrading from HBase-2.5.8 to HBase-3.0.0-beta-1
[ https://issues.apache.org/jira/browse/HBASE-28519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28519: --- Description: h1. Reproduce Start up HBase-2.5.8 cluster: 4 nodes: 1 HM, 2 RS, 1 HDFS (hadoop-2.10.2). Execute one read command: LIST, then perform a full-stop upgrade to HBase-3.0.0-beta-1 cluster: 1 HM, 2 RS, 1 HDFS (hadoop-2.10.2). HMaster aborts with the same IllegalStateException and stack trace shown in the first report above.
[jira] [Updated] (HBASE-28519) HMaster crash when upgrading from HBase-2.5.8 to HBase-3.0.0-beta-1
[ https://issues.apache.org/jira/browse/HBASE-28519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28519: --- Attachment: hbase--master-64f850a4e287.log
[jira] [Created] (HBASE-28519) HMaster crash when upgrading from HBase-2.5.8 to HBase-3.0.0-beta-1
Ke Han created HBASE-28519: -- Summary: HMaster crash when upgrading from HBase-2.5.8 to HBase-3.0.0-beta-1 Key: HBASE-28519 URL: https://issues.apache.org/jira/browse/HBASE-28519 Project: HBase Issue Type: Bug Affects Versions: 2.5.8, 3.0.0-beta-1, 3.0.0-alpha-4, 3.0.0-alpha-1 Reporter: Ke Han h1. Reproduce Start up HBase-2.5.8 cluster: 4 nodes: 1 HM, 2 RS, 1 HDFS (hadoop-2.10.2). Execute one read command: LIST, then perform a full-stop upgrade to HBase-3.0.0-beta-1 cluster: 1 HM, 2 RS, 1 HDFS (hadoop-2.10.2). HMaster will abort with the same IllegalStateException and stack trace shown in the first report above.
[jira] [Updated] (HBASE-28187) NPE when flushing a non-existing column family
[ https://issues.apache.org/jira/browse/HBASE-28187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28187: --- Description: Flushing a column family that doesn't exist in the table causes an NPE ERROR in both the shell and the HMaster logs. h1. Reproduce Start up an HBase 2.5.5 cluster; executing the following commands with the hbase shell on the HMaster node leads to the NPE. (Can be reproduced deterministically) {code:java} create 'table', {NAME => 'cf1', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL'}, {NAME => 'cf2', VERSIONS => 4, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL'} incr 'table', 'row1', 'cf1:cell', 2 flush 'table', 'cf3'{code} The shell outputs {code:java} hbase:006:0> create 'table', {NAME => 'cf1', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL'}, {NAME => 'cf2', VERSIONS => 4, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL'} Created table table Took 2.1238 seconds => Hbase::Table - table hbase:007:0> hbase:008:0> incr 'table', 'row1', 'cf1:cell', 2 COUNTER VALUE = 2 Took 0.0131 seconds hbase:009:0> hbase:010:0> flush 'table', 'cf3' ERROR: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:479) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102) at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) Caused by: org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable: java.lang.NullPointerException at org.apache.hadoop.hbase.procedure.flush.RegionServerFlushTableProcedureManager$FlushTableSubprocedurePool.waitForOutstandingTasks(RegionServerFlushTableProcedureManager.java:274) at org.apache.hadoop.hbase.procedure.flush.FlushTableSubprocedure.flushRegions(FlushTableSubprocedure.java:115) at org.apache.hadoop.hbase.procedure.flush.FlushTableSubprocedure.acquireBarrier(FlushTableSubprocedure.java:126) at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:160) at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:46) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) For usage try 'help "flush"' Took 12.1713 seconds {code} According to the _flush (flush.rb)_ command specification, users can flush a specific column family. {code:java} Flush all regions in passed table or pass a region row to flush an individual region or a region server name whose format is 'host,port,startcode', to flush all its regions. You can also flush a single column family for all regions within a table, or for an specific region only. For example: hbase> flush 'TABLENAME' hbase> flush 'TABLENAME','FAMILYNAME' {code} In the above case, *cf3* is an incorrect input (a non-existing column family). If a user tries to flush it, the expected behavior is: # HBase rejects the operation # HBase returns a prompt saying the column family doesn't exist, e.g. {+}ERROR: Unknown CF...{+} h1. Root Cause There is a missing check for whether the target column family to flush exists.
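The missing guard described under Root Cause could, in sketch form, look like the following. This is a hypothetical helper, not the actual HBase fix; the class name, method name, and parameters are assumptions for illustration:

```java
import java.util.Set;

// Hypothetical sketch of the missing check: reject a flush request for a
// column family the table does not have, so the caller gets a clear
// "Unknown CF" error instead of a NullPointerException surfacing deep
// inside the flush procedure pool.
public class FlushFamilyCheck {
    static void checkFamilyExists(Set<String> tableFamilies, String requested) {
        // A null family means "flush the whole table", which is always valid.
        if (requested != null && !tableFamilies.contains(requested)) {
            throw new IllegalArgumentException(
                "Unknown CF '" + requested + "'; table has families " + tableFamilies);
        }
    }
}
```

With the table from the reproduce steps (families cf1 and cf2), a call like `checkFamilyExists(Set.of("cf1", "cf2"), "cf3")` would fail fast, matching the expected "ERROR: Unknown CF..." prompt rather than the NPE.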
[jira] [Updated] (HBASE-28187) NPE when flushing a non-existing column family
[ https://issues.apache.org/jira/browse/HBASE-28187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28187: --- Affects Version/s: 2.5.5 2.4.17 > NPE when flushing a non-existing column family > -- > > Key: HBASE-28187 > URL: https://issues.apache.org/jira/browse/HBASE-28187 > Project: HBase > Issue Type: Bug >Affects Versions: 2.4.17, 2.5.5 >Reporter: Ke Han >Priority: Major > > Flushing a column family that doesn't exist in the table causes an NPE ERROR in > both the shell and the HMaster logs. > h1. Reproduce > Start up an HBase 2.5.5 cluster; executing the following commands with hbase > shell on the HMaster node leads to an NPE. (Can be reproduced deterministically) > {code:java} > create 'table', {NAME => 'cf1', VERSIONS => 2, COMPRESSION => 'GZ', > BLOOMFILTER => 'ROWCOL'}, {NAME => 'cf2', VERSIONS => 4, COMPRESSION => > 'NONE', BLOOMFILTER => 'ROWCOL'} > incr 'table', 'row1', 'cf1:cell', 2 > flush 'table', 'cf3'{code} > The shell outputs > {code:java} > hbase:006:0> create 'table', {NAME => 'cf1', VERSIONS => 2, COMPRESSION => > 'GZ', BLOOMFILTER => 'ROWCOL'}, {NAME => 'cf2', VERSIONS => 4, COMPRESSION => > 'NONE', BLOOMFILTER => 'ROWCOL'} > Created table table > Took 2.1238 seconds > > => Hbase::Table - table > hbase:007:0> > hbase:008:0> incr 'table', 'row1', 'cf1:cell', 2 > COUNTER VALUE = 2 > Took 0.0131 seconds > > hbase:009:0> > hbase:010:0> flush 'table', 'cf3' > ERROR: java.io.IOException: java.lang.NullPointerException > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:479) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) > at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102) > at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) > Caused by: > org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable: > java.lang.NullPointerException > at > 
org.apache.hadoop.hbase.procedure.flush.RegionServerFlushTableProcedureManager$FlushTableSubprocedurePool.waitForOutstandingTasks(RegionServerFlushTableProcedureManager.java:274) > at > org.apache.hadoop.hbase.procedure.flush.FlushTableSubprocedure.flushRegions(FlushTableSubprocedure.java:115) > at > org.apache.hadoop.hbase.procedure.flush.FlushTableSubprocedure.acquireBarrier(FlushTableSubprocedure.java:126) > at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:160) > at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:46) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:750) > For usage try 'help "flush"' > Took 12.1713 seconds > {code} > > According to the _flush (flush.rb)_ command specification, the user can flush a > specific column family. > {code:java} > Flush all regions in passed table or pass a region row to > flush an individual region or a region server name whose format > is 'host,port,startcode', to flush all its regions. > You can also flush a single column family for all regions within a table, > or for an specific region only. > For example: > hbase> flush 'TABLENAME' > hbase> flush 'TABLENAME','FAMILYNAME' {code} > In the above case, *cf3* is an incorrect input (a non-existing column family). If > the user tries to flush it, the expected output is: > # HBase rejects this operation > # returns a prompt saying the column family doesn't exist > {_}"{_}{_}{+}ERROR: Unknown CF...{+}".{_} > h1. Root Cause > There's a missing check for whether the target column family to flush > exists. -- This message was sent by Atlassian Jira (v8.20.10#820010)
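The missing validation described in the Root Cause section can be illustrated with a minimal, self-contained sketch. This is not actual HBase code: the class and method names below are hypothetical, and real HBase code would consult the table's TableDescriptor. The idea is simply to reject an unknown column family up front, so the caller gets a clear "Unknown CF" error instead of an NPE deep inside the flush subprocedure pool.

```java
import java.util.Set;

// Hypothetical sketch of the missing check: validate the requested column
// family against the table's known families before submitting the flush.
public class FlushFamilyCheck {
    static void checkFamily(Set<String> knownFamilies, String family) {
        // A null family means "flush the whole table", which is always valid.
        if (family != null && !knownFamilies.contains(family)) {
            throw new IllegalArgumentException("Unknown CF " + family);
        }
    }

    public static void main(String[] args) {
        Set<String> families = Set.of("cf1", "cf2"); // families from the repro
        checkFamily(families, "cf1"); // valid family, no error
        try {
            checkFamily(families, "cf3"); // the bad input from the report
        } catch (IllegalArgumentException e) {
            // Clear rejection instead of a NullPointerException
            System.out.println("ERROR: " + e.getMessage());
        }
    }
}
```

With a guard like this on the request path, `flush 'table', 'cf3'` would fail fast with a descriptive message rather than propagating a null store into the region server's flush tasks.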
[jira] [Updated] (HBASE-28187) NPE when flushing a non-existing column family
[ https://issues.apache.org/jira/browse/HBASE-28187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28187: --- Summary: NPE when flushing a non-existing column family (was: NPE when flush a non-existing column family) > NPE when flushing a non-existing column family > -- > > Key: HBASE-28187 > URL: https://issues.apache.org/jira/browse/HBASE-28187 > Project: HBase > Issue Type: Bug >Reporter: Ke Han >Priority: Major > > Flushing a column family that doesn't exist in the table causes an NPE ERROR in > both the shell and the HMaster logs. > h1. Reproduce > Start up an HBase 2.5.5 cluster; executing the following commands with hbase > shell on the HMaster node leads to an NPE. > > {code:java} > create 'table7', {NAME => 'cf1', VERSIONS => 2, COMPRESSION => 'GZ', > BLOOMFILTER => 'ROWCOL'}, {NAME => 'cf2', VERSIONS => 4, COMPRESSION => > 'NONE', BLOOMFILTER => 'ROWCOL'} > incr 'table7', 'row1', 'cf1:cell', 2 > flush 'table7', 'cf3'{code} > > > The shell outputs > > {code:java} > hbase:006:0> create 'table', {NAME => 'cf1', VERSIONS => 2, COMPRESSION => > 'GZ', BLOOMFILTER => 'ROWCOL'}, {NAME => 'cf2', VERSIONS => 4, COMPRESSION => > 'NONE', BLOOMFILTER => 'ROWCOL'} > Created table table > Took 2.1238 seconds > > => Hbase::Table - table > hbase:007:0> > hbase:008:0> incr 'table', 'row1', 'cf1:cell', 2 > COUNTER VALUE = 2 > Took 0.0131 seconds > > hbase:009:0> > hbase:010:0> flush 'table', 'cf3' > ERROR: java.io.IOException: java.lang.NullPointerException > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:479) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) > at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102) > at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) > Caused by: > org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable: > java.lang.NullPointerException > at > 
org.apache.hadoop.hbase.procedure.flush.RegionServerFlushTableProcedureManager$FlushTableSubprocedurePool.waitForOutstandingTasks(RegionServerFlushTableProcedureManager.java:274) > at > org.apache.hadoop.hbase.procedure.flush.FlushTableSubprocedure.flushRegions(FlushTableSubprocedure.java:115) > at > org.apache.hadoop.hbase.procedure.flush.FlushTableSubprocedure.acquireBarrier(FlushTableSubprocedure.java:126) > at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:160) > at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:46) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:750) > For usage try 'help "flush"' > Took 12.1713 seconds > > > hbase:011:0> {code} > > > According to the flush command specification, the user can flush a specific > column family. > > {code:java} > Flush all regions in passed table or pass a region row to > flush an individual region or a region server name whose format > is 'host,port,startcode', to flush all its regions. > You can also flush a single column family for all regions within a table, > or for an specific region only. > For example: > hbase> flush 'TABLENAME' > hbase> flush 'TABLENAME','FAMILYNAME' {code} > > In the above case, *cf3* is an incorrect input (a non-existing column family). If > the user tries to flush it, the expected output is: > # HBase rejects this operation > # returns a prompt saying the column family doesn't exist {_}"{_}_ERROR: > Unknown CF..."._ > > It can be reproduced deterministically with the above commands. > h1. Root Cause > There's a missing check for whether the target column family to flush > exists. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28187) NPE when flush a non-existing column family
Ke Han created HBASE-28187: -- Summary: NPE when flush a non-existing column family Key: HBASE-28187 URL: https://issues.apache.org/jira/browse/HBASE-28187 Project: HBase Issue Type: Bug Reporter: Ke Han Flushing a column family that doesn't exist in the table causes an NPE ERROR in both the shell and the HMaster logs. h1. Reproduce Start up an HBase 2.5.5 cluster; executing the following commands with hbase shell on the HMaster node leads to an NPE. {code:java} create 'table7', {NAME => 'cf1', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL'}, {NAME => 'cf2', VERSIONS => 4, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL'} incr 'table7', 'row1', 'cf1:cell', 2 flush 'table7', 'cf3'{code} The shell outputs {code:java} hbase:006:0> create 'table', {NAME => 'cf1', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL'}, {NAME => 'cf2', VERSIONS => 4, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL'} Created table table Took 2.1238 seconds => Hbase::Table - table hbase:007:0> hbase:008:0> incr 'table', 'row1', 'cf1:cell', 2 COUNTER VALUE = 2 Took 0.0131 seconds hbase:009:0> hbase:010:0> flush 'table', 'cf3' ERROR: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:479) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102) at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) Caused by: org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable: java.lang.NullPointerException at org.apache.hadoop.hbase.procedure.flush.RegionServerFlushTableProcedureManager$FlushTableSubprocedurePool.waitForOutstandingTasks(RegionServerFlushTableProcedureManager.java:274) at org.apache.hadoop.hbase.procedure.flush.FlushTableSubprocedure.flushRegions(FlushTableSubprocedure.java:115) at org.apache.hadoop.hbase.procedure.flush.FlushTableSubprocedure.acquireBarrier(FlushTableSubprocedure.java:126) at 
org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:160) at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:46) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) For usage try 'help "flush"' Took 12.1713 seconds hbase:011:0> {code} According to the flush command specification, the user can flush a specific column family. {code:java} Flush all regions in passed table or pass a region row to flush an individual region or a region server name whose format is 'host,port,startcode', to flush all its regions. You can also flush a single column family for all regions within a table, or for an specific region only. For example: hbase> flush 'TABLENAME' hbase> flush 'TABLENAME','FAMILYNAME' {code} In the above case, *cf3* is an incorrect input (a non-existing column family). If the user tries to flush it, the expected output is: # HBase rejects this operation # returns a prompt saying the column family doesn't exist {_}"{_}_ERROR: Unknown CF..."._ It can be reproduced deterministically with the above commands. h1. Root Cause There's a missing check for whether the target column family to flush exists. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HBASE-28167) HMaster crashes due to NPE at AsyncFSWAL.closeWriter
[ https://issues.apache.org/jira/browse/HBASE-28167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1272#comment-1272 ] Ke Han commented on HBASE-28167: Thanks for the reply! [~subrat.mishra] Unfortunately I didn't record the NN and DN logs properly in the previous buggy run. I'll try to reproduce it again with hdfs logging to provide more information. > HMaster crashes due to NPE at AsyncFSWAL.closeWriter > > > Key: HBASE-28167 > URL: https://issues.apache.org/jira/browse/HBASE-28167 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 2.4.17 >Reporter: Ke Han >Priority: Major > Attachments: hbase--master-a82083cf5d18.log, persistent.tar.gz > > > I am testing the upgrade process of HMaster, when starting up the new version > HMaster 2.4.17, it crashed immediately with the following exception. > {code:java} > 2023-10-17 21:03:35,892 ERROR [master/hmaster:16000:becomeActiveMaster] > master.HMaster: * ABORTING master hmaster,16000,1697576576301: Unhandled > exception. Starting shutdown. * > org.apache.hadoop.hbase.FailedCloseWALAfterInitializedErrorException: Failed > close after init wal failed. 
> at > org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:167) > at > org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:62) > at org.apache.hadoop.hbase.wal.WALFactory.getWAL(WALFactory.java:295) > at > org.apache.hadoop.hbase.master.region.MasterRegion.createWAL(MasterRegion.java:200) > at > org.apache.hadoop.hbase.master.region.MasterRegion.bootstrap(MasterRegion.java:220) > at > org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:348) > at > org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104) > at > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:855) > at > org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2193) > at > org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:528) > at java.lang.Thread.run(Thread.java:750) > Caused by: java.io.IOException: java.lang.NullPointerException > at > org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.shutdown(AbstractFSWAL.java:979) > at > org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.close(AbstractFSWAL.java:1006) > at > org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:165) > ... 
10 more > Caused by: java.lang.NullPointerException > at > java.util.concurrent.ConcurrentHashMap.putVal(ConcurrentHashMap.java:1011) > at > java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1006) > at > org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.closeWriter(AsyncFSWAL.java:743) > at > org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.doShutdown(AsyncFSWAL.java:800) > at > org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL$2.call(AbstractFSWAL.java:951) > at > org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL$2.call(AbstractFSWAL.java:946) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > ... 1 more {code} > h1. Reproduce > It happens non-deterministically, around 2 out of 1802 tests. It might > require an exception when HMaster interacts with the HDFS cluster since I > noticed the following warning before the NPE exception > {code:java} > 2023-10-17 21:03:35,857 WARN [master/hmaster:16000:becomeActiveMaster] > asyncfs.FanOutOneBlockAsyncDFSOutputHelper: create fan-out dfs output > /hbase/MasterData/WALs/hmaster,16000,1697576576301/hmaster%2C16000%2C1697576576301.1697576615700 > failed, retry = 0 > org.apache.hadoop.ipc.RemoteException(java.io.IOException): File > /hbase/MasterData/WALs/hmaster,16000,1697576576301/hmaster%2C16000%2C1697576576301.1697576615700 > could only be replicated to 0 nodes instead of minReplication (=1). There > are 1 datanode(s) running and no node(s) are excluded in this operation. 
> at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1832) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:265) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2586) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServ
[jira] [Updated] (HBASE-28167) HMaster crashes due to NPE at AsyncFSWAL.closeWriter
[ https://issues.apache.org/jira/browse/HBASE-28167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28167: --- Description: I am testing the upgrade process of HMaster, when starting up the new version HMaster 2.4.17, it crashed immediately with the following exception. {code:java} 2023-10-17 21:03:35,892 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: * ABORTING master hmaster,16000,1697576576301: Unhandled exception. Starting shutdown. * org.apache.hadoop.hbase.FailedCloseWALAfterInitializedErrorException: Failed close after init wal failed. at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:167) at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:62) at org.apache.hadoop.hbase.wal.WALFactory.getWAL(WALFactory.java:295) at org.apache.hadoop.hbase.master.region.MasterRegion.createWAL(MasterRegion.java:200) at org.apache.hadoop.hbase.master.region.MasterRegion.bootstrap(MasterRegion.java:220) at org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:348) at org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104) at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:855) at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2193) at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:528) at java.lang.Thread.run(Thread.java:750) Caused by: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.shutdown(AbstractFSWAL.java:979) at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.close(AbstractFSWAL.java:1006) at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:165) ... 
10 more Caused by: java.lang.NullPointerException at java.util.concurrent.ConcurrentHashMap.putVal(ConcurrentHashMap.java:1011) at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1006) at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.closeWriter(AsyncFSWAL.java:743) at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.doShutdown(AsyncFSWAL.java:800) at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL$2.call(AbstractFSWAL.java:951) at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL$2.call(AbstractFSWAL.java:946) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ... 1 more {code} h1. Reproduce It happens non-deterministically, around 2 out of 1802 tests. It might require an exception when HMaster interacts with the HDFS cluster since I noticed the following warning before the NPE exception {code:java} 2023-10-17 21:03:35,857 WARN [master/hmaster:16000:becomeActiveMaster] asyncfs.FanOutOneBlockAsyncDFSOutputHelper: create fan-out dfs output /hbase/MasterData/WALs/hmaster,16000,1697576576301/hmaster%2C16000%2C1697576576301.1697576615700 failed, retry = 0 org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /hbase/MasterData/WALs/hmaster,16000,1697576576301/hmaster%2C16000%2C1697576576301.1697576615700 could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation. 
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1832) at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:265) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2586) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:889) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:517) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:498) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1038) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1003) {code} This remote exception likely leaves the writer NULL, which in turn causes the NPE that crashes the HMaster. h1. Root Cause When invoking *inflightWALClosures.put
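The NPE mechanism can be demonstrated in isolation: ConcurrentHashMap rejects null keys and null values by specification, so putting a writer that was never successfully created blows up at put(). Below is a minimal sketch of that behavior; the names are hypothetical stand-ins, not the actual AsyncFSWAL code.

```java
import java.util.concurrent.ConcurrentHashMap;

public class NullWriterPutDemo {
    // Returns true if putting a null value throws NPE, mirroring what
    // happens in closeWriter when the map is given a writer that is null
    // because its creation failed. (Names here are illustrative only.)
    static boolean putNullValueThrows() {
        ConcurrentHashMap<String, Object> inflightClosures = new ConcurrentHashMap<>();
        Object writer = null; // stands in for a writer whose creation failed
        try {
            inflightClosures.put("wal-1", writer);
            return false;
        } catch (NullPointerException expected) {
            // ConcurrentHashMap.putVal throws before any insertion happens
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("NPE on null value: " + putNullValueThrows());
    }
}
```

This suggests the fix direction is a null check before the put (or never reaching closeWriter with a null writer), rather than anything inside the map itself.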
[jira] [Updated] (HBASE-28167) HMaster crashes due to NPE at AsyncFSWAL.closeWriter
[ https://issues.apache.org/jira/browse/HBASE-28167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28167: --- Summary: HMaster crashes due to NPE at AsyncFSWAL.closeWriter (was: HMaster crash due to NPE) > HMaster crashes due to NPE at AsyncFSWAL.closeWriter > > > Key: HBASE-28167 > URL: https://issues.apache.org/jira/browse/HBASE-28167 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 2.4.17 >Reporter: Ke Han >Priority: Major > Attachments: hbase--master-a82083cf5d18.log, persistent.tar.gz > > > I am testing the upgrade process of HMaster, when starting up the new version > HMaster 2.4.17, it crashed immediately with the following exception. > {code:java} > 2023-10-17 21:03:35,892 ERROR [master/hmaster:16000:becomeActiveMaster] > master.HMaster: * ABORTING master hmaster,16000,1697576576301: Unhandled > exception. Starting shutdown. * > org.apache.hadoop.hbase.FailedCloseWALAfterInitializedErrorException: Failed > close after init wal failed. 
> at > org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:167) > at > org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:62) > at org.apache.hadoop.hbase.wal.WALFactory.getWAL(WALFactory.java:295) > at > org.apache.hadoop.hbase.master.region.MasterRegion.createWAL(MasterRegion.java:200) > at > org.apache.hadoop.hbase.master.region.MasterRegion.bootstrap(MasterRegion.java:220) > at > org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:348) > at > org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104) > at > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:855) > at > org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2193) > at > org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:528) > at java.lang.Thread.run(Thread.java:750) > Caused by: java.io.IOException: java.lang.NullPointerException > at > org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.shutdown(AbstractFSWAL.java:979) > at > org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.close(AbstractFSWAL.java:1006) > at > org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:165) > ... 
10 more > Caused by: java.lang.NullPointerException > at > java.util.concurrent.ConcurrentHashMap.putVal(ConcurrentHashMap.java:1011) > at > java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1006) > at > org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.closeWriter(AsyncFSWAL.java:743) > at > org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.doShutdown(AsyncFSWAL.java:800) > at > org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL$2.call(AbstractFSWAL.java:951) > at > org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL$2.call(AbstractFSWAL.java:946) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > ... 1 more {code} > h1. Reproduce > It happens non-deterministically, in around 2 out of 1802 test runs. It might > require an exception when HMaster interacts with the HDFS cluster, since I > noticed the following warning before the NPE exception > > {code:java} > 2023-10-17 21:03:35,857 WARN [master/hmaster:16000:becomeActiveMaster] > asyncfs.FanOutOneBlockAsyncDFSOutputHelper: create fan-out dfs output > /hbase/MasterData/WALs/hmaster,16000,1697576576301/hmaster%2C16000%2C1697576576301.1697576615700 > failed, retry = 0 > org.apache.hadoop.ipc.RemoteException(java.io.IOException): File > /hbase/MasterData/WALs/hmaster,16000,1697576576301/hmaster%2C16000%2C1697576576301.1697576615700 > could only be replicated to 0 nodes instead of minReplication (=1). There > are 1 datanode(s) running and no node(s) are excluded in this operation. 
> at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1832) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:265) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2586) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:889) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenode
[jira] [Created] (HBASE-28167) HMaster crash due to NPE
Ke Han created HBASE-28167: -- Summary: HMaster crash due to NPE Key: HBASE-28167 URL: https://issues.apache.org/jira/browse/HBASE-28167 Project: HBase Issue Type: Bug Components: master Affects Versions: 2.4.17 Reporter: Ke Han Attachments: hbase--master-a82083cf5d18.log, persistent.tar.gz I am testing the upgrade process of HMaster. When starting up the new version HMaster 2.4.17, it crashed immediately with the following exception. {code:java} 2023-10-17 21:03:35,892 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: * ABORTING master hmaster,16000,1697576576301: Unhandled exception. Starting shutdown. * org.apache.hadoop.hbase.FailedCloseWALAfterInitializedErrorException: Failed close after init wal failed. at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:167) at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:62) at org.apache.hadoop.hbase.wal.WALFactory.getWAL(WALFactory.java:295) at org.apache.hadoop.hbase.master.region.MasterRegion.createWAL(MasterRegion.java:200) at org.apache.hadoop.hbase.master.region.MasterRegion.bootstrap(MasterRegion.java:220) at org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:348) at org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104) at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:855) at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2193) at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:528) at java.lang.Thread.run(Thread.java:750) Caused by: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.shutdown(AbstractFSWAL.java:979) at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.close(AbstractFSWAL.java:1006) at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:165) ... 
10 more Caused by: java.lang.NullPointerException at java.util.concurrent.ConcurrentHashMap.putVal(ConcurrentHashMap.java:1011) at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1006) at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.closeWriter(AsyncFSWAL.java:743) at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.doShutdown(AsyncFSWAL.java:800) at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL$2.call(AbstractFSWAL.java:951) at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL$2.call(AbstractFSWAL.java:946) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ... 1 more {code} h1. Reproduce It happens non-deterministically, in around 2 out of 1802 test runs. It might require an exception when HMaster interacts with the HDFS cluster, since I noticed the following warning before the NPE exception {code:java} 2023-10-17 21:03:35,857 WARN [master/hmaster:16000:becomeActiveMaster] asyncfs.FanOutOneBlockAsyncDFSOutputHelper: create fan-out dfs output /hbase/MasterData/WALs/hmaster,16000,1697576576301/hmaster%2C16000%2C1697576576301.1697576615700 failed, retry = 0 org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /hbase/MasterData/WALs/hmaster,16000,1697576576301/hmaster%2C16000%2C1697576576301.1697576615700 could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation. 
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1832) at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:265) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2586) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:889) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:517) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:498) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1038)
[jira] [Updated] (HBASE-28159) Unable to get table state error when table is being initialized
[ https://issues.apache.org/jira/browse/HBASE-28159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28159: --- Description: When executing commands to create a table, I noticed the following ERROR in HMaster {code:java} 2023-10-17 06:41:47,118 ERROR [master/hmaster:16000.Chore.1] master.TableStateManager: Unable to get table uuidf68fb89ec7f4435597d69fb7b099d8e7 state org.apache.hadoop.hbase.TableNotFoundException: No state found for uuidf68fb89ec7f4435597d69fb7b099d8e7 at org.apache.hadoop.hbase.master.TableStateManager.getTableState(TableStateManager.java:155) at org.apache.hadoop.hbase.master.TableStateManager.isTableState(TableStateManager.java:92) at org.apache.hadoop.hbase.master.assignment.AssignmentManager.isTableDisabled(AssignmentManager.java:419) at org.apache.hadoop.hbase.master.assignment.AssignmentManager.getRegionStatesCount(AssignmentManager.java:2341) at org.apache.hadoop.hbase.master.HMaster.getClusterMetricsWithoutCoprocessor(HMaster.java:2616) at org.apache.hadoop.hbase.master.HMaster.getClusterMetricsWithoutCoprocessor(HMaster.java:2537) at org.apache.hadoop.hbase.master.balancer.ClusterStatusChore.chore(ClusterStatusChore.java:47) at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:158) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:107) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at 
java.lang.Thread.run(Thread.java:750){code} h1. Reproduce Due to the thread interleaving, the following command sequence might need to be run multiple times to reproduce on a 1 HM, 2 RS, HDFS 2.10.2 cluster {code:java} create 'uuid49bb410e0a0c40ffb070d17787b4cad7', {NAME => 'uuid66e57e5195e04956a78f789b2a25ec01', VERSIONS => 1, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'}, {NAME => 'uuid119181eed72a43ccb66fabe37f84d2c0', VERSIONS => 1, COMPRESSION => 'GZ', BLOOMFILTER => 'NONE', IN_MEMORY => 'true'}, {NAME => 'uuidc2d4931eaf4c429db0e55514fb12e767', VERSIONS => 3, COMPRESSION => 'NONE', BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}, {NAME => 'uuidc9802bbfbe434411ae68bb8388d499b6', VERSIONS => 3, COMPRESSION => 'NONE', BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}, {NAME => 'uuidc85e117d0ca144719fc53d30b189a343', VERSIONS => 3, COMPRESSION => 'NONE', BLOOMFILTER => 'NONE', IN_MEMORY => 'false'} create 'uuid094dd5bf47eb47d69148b63e73ce0e7c', {NAME => 'uuid76ccbd96fbdc418b95ed9971ff423b2d', VERSIONS => 1, COMPRESSION => 'GZ', BLOOMFILTER => 'ROW', IN_MEMORY => 'true'}, {NAME => 'uuid36835d3faff04838bd02d6226557d7c8', VERSIONS => 1, COMPRESSION => 'GZ', BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}, {NAME => 'uuid37752598d1bb405eb39a3e17c04d7e60', VERSIONS => 1, COMPRESSION => 'NONE', BLOOMFILTER => 'NONE', IN_MEMORY => 'false'} create 'uuidf68fb89ec7f4435597d69fb7b099d8e7', {NAME => 'uuidb235288b1d304fe1a62adb63968d9eee', VERSIONS => 1, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'}, {NAME => 'uuidf348f8849e724b3fa231fc2bb459be2d', VERSIONS => 1, COMPRESSION => 'NONE', BLOOMFILTER => 'NONE', IN_MEMORY => 'true'}, {NAME => 'uuid81341a87083e49d7a0d8aff7b1ccf16a', VERSIONS => 3, COMPRESSION => 'GZ', BLOOMFILTER => 'ROW', IN_MEMORY => 'false'}, {NAME => 'uuid24db0d3c67c347d3a4c18af90facec2d', VERSIONS => 1, COMPRESSION => 'NONE', BLOOMFILTER => 'ROW', IN_MEMORY => 'true'}, {NAME => 'uuid7ecf10315f444cfd9c5698695f9054d9', VERSIONS => 1, 
COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'} enable 'uuid094dd5bf47eb47d69148b63e73ce0e7c' create_namespace 'uuidc1066f82d7834f698d335dd04fa7ad3e' alter 'uuid094dd5bf47eb47d69148b63e73ce0e7c', {NAME => 'enaJvIGYBk', BLOOMFILTER => 'ROWCOL', IN_MEMORY => false} disable 'uuidf68fb89ec7f4435597d69fb7b099d8e7' {code} I have attached the full logs. h1. Root Cause The ERROR message is thrown because of the thread interleaving between (1) T1: creating the table and (2) T2: the Chore thread calculating TABLE_TO_REGIONS_COUNT. Here's how it happens in detail: # The user issues a create-table request, which puts the table name into tableDescriptors. # The Chore thread tries to calculate TABLE_TO_REGIONS_COUNT by iterating all tables from {*}getTableDescriptors().getAll(){*}. This also in
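The interleaving described above boils down to a check-then-act window between two stores: the table name becomes visible via the descriptors before its state record exists, so a concurrent reader that discovered the table through the descriptors finds no state for it. A minimal, self-contained sketch of the window (the maps and method names here are hypothetical stand-ins, not the actual TableStateManager code):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TableStateRaceSketch {
    // Hypothetical stand-ins for the descriptor cache and the table-state store.
    static final Map<String, String> descriptors = new ConcurrentHashMap<>();
    static final Map<String, String> states = new ConcurrentHashMap<>();

    // T1: the create-table path registers the descriptor first, the state later.
    static void createTable(String name) {
        descriptors.put(name, "schema");
        // <-- window: a chore iterating descriptors here already sees the
        //     table, but a state lookup finds nothing and throws
        //     (the TableNotFoundException in the log above).
        states.put(name, "ENABLED");
    }

    // T2: the chore path; returns null exactly inside the window above.
    static String getState(String name) {
        return states.get(name);
    }

    public static void main(String[] args) {
        descriptors.put("t1", "schema"); // simulate a snapshot taken mid-creation
        System.out.println("state during window: " + getState("t1")); // null
    }
}
```

A fix along these lines would either register the state before (or atomically with) the descriptor, or have the chore tolerate a missing state for a table that is still being created.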
[jira] [Created] (HBASE-28159) Unable to get table state error when table is being initialized
Ke Han created HBASE-28159: -- Summary: Unable to get table state error when table is being initialized Key: HBASE-28159 URL: https://issues.apache.org/jira/browse/HBASE-28159 Project: HBase Issue Type: Bug Components: master Affects Versions: 2.4.17 Reporter: Ke Han Attachments: hbase--master-37bbb9b6f05a.log, persistent.tar.gz When executing commands to create a table, I noticed the following ERROR in HMaster {code:java} 2023-10-17 06:41:47,118 ERROR [master/hmaster:16000.Chore.1] master.TableStateManager: Unable to get table uuidf68fb89ec7f4435597d69fb7b099d8e7 state org.apache.hadoop.hbase.TableNotFoundException: No state found for uuidf68fb89ec7f4435597d69fb7b099d8e7 at org.apache.hadoop.hbase.master.TableStateManager.getTableState(TableStateManager.java:155) at org.apache.hadoop.hbase.master.TableStateManager.isTableState(TableStateManager.java:92) at org.apache.hadoop.hbase.master.assignment.AssignmentManager.isTableDisabled(AssignmentManager.java:419) at org.apache.hadoop.hbase.master.assignment.AssignmentManager.getRegionStatesCount(AssignmentManager.java:2341) at org.apache.hadoop.hbase.master.HMaster.getClusterMetricsWithoutCoprocessor(HMaster.java:2616) at org.apache.hadoop.hbase.master.HMaster.getClusterMetricsWithoutCoprocessor(HMaster.java:2537) at org.apache.hadoop.hbase.master.balancer.ClusterStatusChore.chore(ClusterStatusChore.java:47) at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:158) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:107) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750){code} h1. Reproduce Due to the thread interleaving, the following command sequence might need to be run multiple times to reproduce on a 1 HM, 2 RS, HDFS 2.10.2 cluster {code:java} create 'uuid49bb410e0a0c40ffb070d17787b4cad7', {NAME => 'uuid66e57e5195e04956a78f789b2a25ec01', VERSIONS => 1, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'}, {NAME => 'uuid119181eed72a43ccb66fabe37f84d2c0', VERSIONS => 1, COMPRESSION => 'GZ', BLOOMFILTER => 'NONE', IN_MEMORY => 'true'}, {NAME => 'uuidc2d4931eaf4c429db0e55514fb12e767', VERSIONS => 3, COMPRESSION => 'NONE', BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}, {NAME => 'uuidc9802bbfbe434411ae68bb8388d499b6', VERSIONS => 3, COMPRESSION => 'NONE', BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}, {NAME => 'uuidc85e117d0ca144719fc53d30b189a343', VERSIONS => 3, COMPRESSION => 'NONE', BLOOMFILTER => 'NONE', IN_MEMORY => 'false'} create 'uuid094dd5bf47eb47d69148b63e73ce0e7c', {NAME => 'uuid76ccbd96fbdc418b95ed9971ff423b2d', VERSIONS => 1, COMPRESSION => 'GZ', BLOOMFILTER => 'ROW', IN_MEMORY => 'true'}, {NAME => 'uuid36835d3faff04838bd02d6226557d7c8', VERSIONS => 1, COMPRESSION => 'GZ', BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}, {NAME => 'uuid37752598d1bb405eb39a3e17c04d7e60', VERSIONS => 1, COMPRESSION => 'NONE', BLOOMFILTER => 'NONE', IN_MEMORY => 'false'} create 'uuidf68fb89ec7f4435597d69fb7b099d8e7', {NAME => 'uuidb235288b1d304fe1a62adb63968d9eee', VERSIONS => 1, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'}, {NAME => 'uuidf348f8849e724b3fa231fc2bb459be2d', VERSIONS => 1, COMPRESSION => 'NONE', BLOOMFILTER => 'NONE', IN_MEMORY => 'true'}, {NAME => 'uuid81341a87083e49d7a0d8aff7b1ccf16a', VERSIONS => 3, COMPRESSION => 'GZ', BLOOMFILTER => 'ROW', IN_MEMORY => 'false'}, {NAME => 
'uuid24db0d3c67c347d3a4c18af90facec2d', VERSIONS => 1, COMPRESSION => 'NONE', BLOOMFILTER => 'ROW', IN_MEMORY => 'true'}, {NAME => 'uuid7ecf10315f444cfd9c5698695f9054d9', VERSIONS => 1, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'} enable 'uuid094dd5bf47eb47d69148b63e73ce0e7c' create_namespace 'uuidc1066f82d7834f698d335dd04fa7ad3e' alter 'uuid094dd5bf47eb47d69148b63e73ce0e7c', {NAME => 'enaJvIGYBk', BLOOMFILTER => 'ROWCOL', IN_MEMORY => false} disable 'uuidf68fb89ec7f4435597d69fb7b099d8e7' {code} I have attached the full logs. h1. Root Cause The ERROR message is thrown because of the thread interleaving between (1) T1: creating the table and (2) T2: Chore thread calculating TABLE_TO_REGIONS_COUNT. Here'
[jira] [Updated] (HBASE-28125) list_quota_table_sizes returns a table whose quota is not set
[ https://issues.apache.org/jira/browse/HBASE-28125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28125: --- Description: When using an HBase 2.4.17 cluster, I noticed that list_quota_table_sizes can sometimes return a table whose quota is not set. h1. Reproduce {_}This bug cannot be reproduced deterministically{_}. It manifests with a probability of about 1.1% (it occurred two times out of 178 repeated executions of the commands). Start up an HBase 2.4.17 cluster (HDFS 2.10.2, 1 HMaster, 2 RS) and execute the following commands, {code:java} create_namespace 'uuidef8e6005b9e74092927a4b335424f7c5' create 'uuid80d6aa3495094cd7b4018d3ab3fe9db8', {NAME => 'uuid07f904a09baf414d903b2818d3091f28', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'}, {NAME => 'uuida60aaa69834e4f0596b5b3b3c12b2cb8', VERSIONS => 4, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'}, {NAME => 'uuid085c0991aa1c4d4ea145767e7e7bf60c', VERSIONS => 4, COMPRESSION => 'GZ', BLOOMFILTER => 'NONE', IN_MEMORY => 'true'}, {NAME => 'uuid1d9a2bc405c64708b9e471ae14794741', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'ROW', IN_MEMORY => 'false'} create 'uuid5cafa12ce5034015bb597428b294a40d', {NAME => 'uuid7d9efb39ac94472b90dc60ed3723cdf9', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'NONE', IN_MEMORY => 'true'}, {NAME => 'uuidde273134b6434fc584990554cfa64b10', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'} clone_table_schema 'uuid80d6aa3495094cd7b4018d3ab3fe9db8', 'uuidb8e2393af3314726890b70ef5871a9d0' drop 'uuid5cafa12ce5034015bb597428b294a40d' truncate 'uuidb8e2393af3314726890b70ef5871a9d0' compaction_state 'uuid80d6aa3495094cd7b4018d3ab3fe9db8' truncate_preserve 'uuidb8e2393af3314726890b70ef5871a9d0' drop 'uuidb8e2393af3314726890b70ef5871a9d0' truncate 'uuid80d6aa3495094cd7b4018d3ab3fe9db8' alter 'uuid80d6aa3495094cd7b4018d3ab3fe9db8', {NAME => 
'uuida60aaa69834e4f0596b5b3b3c12b2cb8', METHOD => 'delete'} disable 'uuid80d6aa3495094cd7b4018d3ab3fe9db8' incr 'uuid80d6aa3495094cd7b4018d3ab3fe9db8', 'uuid863efa8e4f1f44888af0ed139effba33', 'uuid085c0991aa1c4d4ea145767e7e7bf60c:NONE', 3 drop 'uuid80d6aa3495094cd7b4018d3ab3fe9db8' wal_roll 'hregion2,16020' wal_roll 'hregion1,16020' create 'uuid4323f716aea24b5fa001f0722cdc66f9', {NAME => 'uuidd64032ff2e7340fb8832a16430fc14c1', VERSIONS => 3, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'}, {NAME => 'uuidc4be1958501543ac86661793a4c144cb', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROW', IN_MEMORY => 'true'}, {NAME => 'uuid0c7961e7f67a464387f9f5ce428f08d1', VERSIONS => 4, COMPRESSION => 'NONE', BLOOMFILTER => 'ROW', IN_MEMORY => 'false'}, {NAME => 'uuida9f7a62b76fa4560834cc5789d5abf3a', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'ROW', IN_MEMORY => 'false'} major_compact 'uuid4323f716aea24b5fa001f0722cdc66f9', 'uuid0c7961e7f67a464387f9f5ce428f08d1' incr 'uuid4323f716aea24b5fa001f0722cdc66f9', 'uuidffb29730a2fb4f5f8c18f0c1bc254402', 'uuida9f7a62b76fa4560834cc5789d5abf3a:cc', 3 alter 'uuid4323f716aea24b5fa001f0722cdc66f9', {NAME => 'uuida9f7a62b76fa4560834cc5789d5abf3a', METHOD => 'delete'} update_config 'hregion1,16020' {code} Then execute the read command in the hbase shell {code:java} list_quota_table_sizes TABLE SIZE uuid4323f716aea24b5fa001f0722cdc66f9 5133 1 row(s) Took 0.0278 seconds {code} uuid4323f716aea24b5fa001f0722cdc66f9 is not set in the quota table, but it still appears in the list_quota_table_sizes results, with a strange value: 5133. h1. Thoughts The root cause might be related to how the *quota* table reacts when a new table is created. I am still investigating the root cause (injecting logs into MasterQuotaManager to understand why it also records this table). was: When using HBase cluster 2.4.17, I noticed that the list_quota_table_sizes sometimes could return a table whose quota is not set. h1. 
Reproduce {_}This bug cannot be reproduced deterministically{_}. The probability for its manifestation is 1.1% (Keep executing the commands, and it occurs two times out of 178 repeatedly executions). Start up HBase 2.4.17 cluster (2.10.2 HDFS, 1 HMaster, 2 RS) Executing the following commands, {code:java} create_namespace 'uuidef8e6005b9e74092927a4b335424f7c5' create 'uuid80d6aa3495094cd7b4018d3ab3fe9db8', {NAME => 'uuid07f904a09baf414d903b2818d3091f28', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'}, {NAME => 'uuida60aaa69834e4f0596b5b3b3c12b2cb8', VERSIONS => 4, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'}, {NAME => 'uuid085c0991aa1c4d4ea145767e7e7bf60c', VERSIONS => 4, COMPRESSION => 'GZ', BLOOMFILTER => 'NONE', IN_MEMORY => 'true'}, {NAME => 'uuid1d9a2bc405c64708b9e471ae14794741', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'ROW', IN_MEMORY => 'f
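The expectation described above is that list_quota_table_sizes only reports tables that actually have a quota configured. A minimal, hypothetical sketch of that invariant as a filter (illustrative only, not the real MasterQuotaManager logic):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class QuotaSizeFilterSketch {
    // Hypothetical filter: keep only size reports for tables that actually
    // have a quota configured, dropping entries like the unexpected
    // 5133-byte report for a table with no quota set.
    static Map<String, Long> filterToQuotaTables(Map<String, Long> reportedSizes,
                                                 Set<String> tablesWithQuota) {
        Map<String, Long> filtered = new HashMap<>();
        for (Map.Entry<String, Long> e : reportedSizes.entrySet()) {
            if (tablesWithQuota.contains(e.getKey())) {
                filtered.put(e.getKey(), e.getValue());
            }
        }
        return filtered;
    }
}
```

Whether the right fix is to filter at read time like this, or to stop recording sizes for quota-less tables in the first place, depends on the root cause still under investigation.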
[jira] [Updated] (HBASE-28125) list_quota_table_sizes returns a table whose quota is not set
[ https://issues.apache.org/jira/browse/HBASE-28125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28125: --- Description: When using an HBase 2.4.17 cluster, I noticed that list_quota_table_sizes can sometimes return a table whose quota is not set. h1. Reproduce {_}This bug cannot be reproduced deterministically{_}. It manifests with a probability of about 1.1% (it occurred two times out of 178 repeated executions of the commands). Start up an HBase 2.4.17 cluster (HDFS 2.10.2, 1 HMaster, 2 RS) and execute the following commands, {code:java} create_namespace 'uuidef8e6005b9e74092927a4b335424f7c5' create 'uuid80d6aa3495094cd7b4018d3ab3fe9db8', {NAME => 'uuid07f904a09baf414d903b2818d3091f28', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'}, {NAME => 'uuida60aaa69834e4f0596b5b3b3c12b2cb8', VERSIONS => 4, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'}, {NAME => 'uuid085c0991aa1c4d4ea145767e7e7bf60c', VERSIONS => 4, COMPRESSION => 'GZ', BLOOMFILTER => 'NONE', IN_MEMORY => 'true'}, {NAME => 'uuid1d9a2bc405c64708b9e471ae14794741', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'ROW', IN_MEMORY => 'false'} create 'uuid5cafa12ce5034015bb597428b294a40d', {NAME => 'uuid7d9efb39ac94472b90dc60ed3723cdf9', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'NONE', IN_MEMORY => 'true'}, {NAME => 'uuidde273134b6434fc584990554cfa64b10', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'} clone_table_schema 'uuid80d6aa3495094cd7b4018d3ab3fe9db8', 'uuidb8e2393af3314726890b70ef5871a9d0' drop 'uuid5cafa12ce5034015bb597428b294a40d' truncate 'uuidb8e2393af3314726890b70ef5871a9d0' compaction_state 'uuid80d6aa3495094cd7b4018d3ab3fe9db8' truncate_preserve 'uuidb8e2393af3314726890b70ef5871a9d0' drop 'uuidb8e2393af3314726890b70ef5871a9d0' truncate 'uuid80d6aa3495094cd7b4018d3ab3fe9db8' alter 'uuid80d6aa3495094cd7b4018d3ab3fe9db8', {NAME => 
'uuida60aaa69834e4f0596b5b3b3c12b2cb8', METHOD => 'delete'} disable 'uuid80d6aa3495094cd7b4018d3ab3fe9db8' incr 'uuid80d6aa3495094cd7b4018d3ab3fe9db8', 'uuid863efa8e4f1f44888af0ed139effba33', 'uuid085c0991aa1c4d4ea145767e7e7bf60c:NONE', 3 drop 'uuid80d6aa3495094cd7b4018d3ab3fe9db8' wal_roll 'hregion2,16020' wal_roll 'hregion1,16020' create 'uuid4323f716aea24b5fa001f0722cdc66f9', {NAME => 'uuidd64032ff2e7340fb8832a16430fc14c1', VERSIONS => 3, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'}, {NAME => 'uuidc4be1958501543ac86661793a4c144cb', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROW', IN_MEMORY => 'true'}, {NAME => 'uuid0c7961e7f67a464387f9f5ce428f08d1', VERSIONS => 4, COMPRESSION => 'NONE', BLOOMFILTER => 'ROW', IN_MEMORY => 'false'}, {NAME => 'uuida9f7a62b76fa4560834cc5789d5abf3a', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'ROW', IN_MEMORY => 'false'} major_compact 'uuid4323f716aea24b5fa001f0722cdc66f9', 'uuid0c7961e7f67a464387f9f5ce428f08d1' incr 'uuid4323f716aea24b5fa001f0722cdc66f9', 'uuidffb29730a2fb4f5f8c18f0c1bc254402', 'uuida9f7a62b76fa4560834cc5789d5abf3a:cc', 3 alter 'uuid4323f716aea24b5fa001f0722cdc66f9', {NAME => 'uuida9f7a62b76fa4560834cc5789d5abf3a', METHOD => 'delete'} update_config 'hregion1,16020' {code} Then execute the read command in the hbase shell: {code:java} list_quota_table_sizes TABLE SIZE uuid4323f716aea24b5fa001f0722cdc66f9 5133 1 row(s) Took 0.0278 seconds {code} uuid4323f716aea24b5fa001f0722cdc66f9 has no quota set in the quota table, but it still appears in the list_quota_table_sizes results, and with an unexpected value: 5133. h1. Thoughts The root cause might be related to how the *quota* table reacts when a new table is created. I am still investigating the root cause (injecting logs into MasterQuotaManager to understand why it also records this table). Is this normal behavior for list_quota_table_sizes? If this is abnormal behavior, I can try to fix it.
[jira] [Created] (HBASE-28125) list_quota_table_sizes returns a table whose quota is not set
Ke Han created HBASE-28125: -- Summary: list_quota_table_sizes returns a table whose quota is not set Key: HBASE-28125 URL: https://issues.apache.org/jira/browse/HBASE-28125 Project: HBase Issue Type: Bug Components: Quotas Affects Versions: 2.4.17 Reporter: Ke Han When using HBase cluster 2.4.17, I noticed that list_quota_table_sizes can sometimes return a table whose quota is not set. h1. Reproduce {_}This bug cannot be reproduced deterministically{_}. It manifested two times out of 178 repeated executions (about 1.1%). Start up an HBase 2.4.17 cluster (HDFS 2.10.2, 1 HMaster, 2 RS), execute the following commands, and then perform the read: {code:java} create_namespace 'uuidef8e6005b9e74092927a4b335424f7c5' create 'uuid80d6aa3495094cd7b4018d3ab3fe9db8', {NAME => 'uuid07f904a09baf414d903b2818d3091f28', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'}, {NAME => 'uuida60aaa69834e4f0596b5b3b3c12b2cb8', VERSIONS => 4, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'}, {NAME => 'uuid085c0991aa1c4d4ea145767e7e7bf60c', VERSIONS => 4, COMPRESSION => 'GZ', BLOOMFILTER => 'NONE', IN_MEMORY => 'true'}, {NAME => 'uuid1d9a2bc405c64708b9e471ae14794741', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'ROW', IN_MEMORY => 'false'} create 'uuid5cafa12ce5034015bb597428b294a40d', {NAME => 'uuid7d9efb39ac94472b90dc60ed3723cdf9', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'NONE', IN_MEMORY => 'true'}, {NAME => 'uuidde273134b6434fc584990554cfa64b10', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'} clone_table_schema 'uuid80d6aa3495094cd7b4018d3ab3fe9db8', 'uuidb8e2393af3314726890b70ef5871a9d0' drop 'uuid5cafa12ce5034015bb597428b294a40d' truncate 'uuidb8e2393af3314726890b70ef5871a9d0' compaction_state 'uuid80d6aa3495094cd7b4018d3ab3fe9db8' truncate_preserve 'uuidb8e2393af3314726890b70ef5871a9d0' drop 
'uuidb8e2393af3314726890b70ef5871a9d0' truncate 'uuid80d6aa3495094cd7b4018d3ab3fe9db8' alter 'uuid80d6aa3495094cd7b4018d3ab3fe9db8', {NAME => 'uuida60aaa69834e4f0596b5b3b3c12b2cb8', METHOD => 'delete'} disable 'uuid80d6aa3495094cd7b4018d3ab3fe9db8' incr 'uuid80d6aa3495094cd7b4018d3ab3fe9db8', 'uuid863efa8e4f1f44888af0ed139effba33', 'uuid085c0991aa1c4d4ea145767e7e7bf60c:NONE', 3 drop 'uuid80d6aa3495094cd7b4018d3ab3fe9db8' wal_roll 'hregion2,16020' wal_roll 'hregion1,16020' create 'uuid4323f716aea24b5fa001f0722cdc66f9', {NAME => 'uuidd64032ff2e7340fb8832a16430fc14c1', VERSIONS => 3, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'}, {NAME => 'uuidc4be1958501543ac86661793a4c144cb', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROW', IN_MEMORY => 'true'}, {NAME => 'uuid0c7961e7f67a464387f9f5ce428f08d1', VERSIONS => 4, COMPRESSION => 'NONE', BLOOMFILTER => 'ROW', IN_MEMORY => 'false'}, {NAME => 'uuida9f7a62b76fa4560834cc5789d5abf3a', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'ROW', IN_MEMORY => 'false'} major_compact 'uuid4323f716aea24b5fa001f0722cdc66f9', 'uuid0c7961e7f67a464387f9f5ce428f08d1' incr 'uuid4323f716aea24b5fa001f0722cdc66f9', 'uuidffb29730a2fb4f5f8c18f0c1bc254402', 'uuida9f7a62b76fa4560834cc5789d5abf3a:cc', 3 alter 'uuid4323f716aea24b5fa001f0722cdc66f9', {NAME => 'uuida9f7a62b76fa4560834cc5789d5abf3a', METHOD => 'delete'} update_config 'hregion1,16020' {code} Then execute the read command in the hbase shell: {code:java} list_quota_table_sizes TABLE SIZE uuid4323f716aea24b5fa001f0722cdc66f9 5133 1 row(s) Took 0.0278 seconds {code} uuid4323f716aea24b5fa001f0722cdc66f9 has no quota set in the quota table, but it still appears in the list_quota_table_sizes results, and with an unexpected value: 5133. h1. Thoughts The root cause might be related to how the *quota* table reacts when a new table is created. I am still investigating the root cause (injecting logs into MasterQuotaManager to understand why it also records this table). 
Is this normal behavior for list_quota_table_sizes? If this is abnormal behavior, I can try to fix it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
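The symptom above can be stated as a filtering question: the size report includes a table even though no quota was ever configured for it. The following is a minimal, hypothetical sketch of the filtering behavior the reporter seems to expect; the class, method, and data names are made up for illustration and are not HBase APIs.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: reportedSizes stands in for the size snapshots
// the master tracks, tablesWithQuota for the tables that actually have a
// quota defined. Filtering the former by the latter would hide entries like
// the one seen in the bug report.
public class QuotaSizeFilter {
    static Map<String, Long> filterToQuotaTables(Map<String, Long> reportedSizes,
                                                 Set<String> tablesWithQuota) {
        Map<String, Long> out = new HashMap<>();
        for (Map.Entry<String, Long> e : reportedSizes.entrySet()) {
            if (tablesWithQuota.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Long> sizes = new HashMap<>();
        // The table from the report: a size is tracked, but no quota is set.
        sizes.put("uuid4323f716aea24b5fa001f0722cdc66f9", 5133L);
        Set<String> quotaTables = new HashSet<>(); // no quotas configured
        System.out.println(filterToQuotaTables(sizes, quotaTables).size()); // 0
    }
}
```

Whether this filtering should happen in the shell command or in the master's quota bookkeeping is exactly the open question of the report.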
[jira] [Resolved] (HBASE-28105) NPE in QuotaCache if Table is dropped from cluster
[ https://issues.apache.org/jira/browse/HBASE-28105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han resolved HBASE-28105. Resolution: Fixed PR merged > NPE in QuotaCache if Table is dropped from cluster > -- > > Key: HBASE-28105 > URL: https://issues.apache.org/jira/browse/HBASE-28105 > Project: HBase > Issue Type: Bug > Components: Quotas >Affects Versions: 2.4.17, 2.5.5 >Reporter: Ke Han >Priority: Major > Attachments: 0001-avoid-NPE.patch, > hbase--regionserver-a0320910ca45.log > > > When running HBase-2.4.17, I met a NPE in regionserver log. > h1. Reproduce > Config HBase cluster: 1 HMaster, 2 RS, 2.10.2 Hadoop. > Execute the following commands in the HMaster node using hbase shell, > {code:java} > create 'uuidd9efa97f93a442b686adae6d9f7bb2e9', {NAME => > 'uuid099cbece77834a83a52bb0611c3ea080', VERSIONS => 3, COMPRESSION => 'NONE', > BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'}, {NAME => > 'uuidbc1bea73952749329d7f025aab382c4e', VERSIONS => 2, COMPRESSION => 'GZ', > BLOOMFILTER => 'ROW', IN_MEMORY => 'false'}, {NAME => > 'uuidff292310d9dc450697af2bb25d9f3e98', VERSIONS => 2, COMPRESSION => 'GZ', > BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}, {NAME => > 'uuid449de028da6b4d35be0f187ebec6c3be', VERSIONS => 2, COMPRESSION => 'GZ', > BLOOMFILTER => 'ROW', IN_MEMORY => 'false'}, {NAME => > 'uuidc0840c98f9d348a18f2d454c7a503b65', VERSIONS => 2, COMPRESSION => 'GZ', > BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'} > create_namespace 'uuidec797633f5dd4ab9b96276135aeda9e2' > create 'uuiddeb610fded9744889840ecd03dd18739', {NAME => > 'uuid30a0f625ad454605908b60c932957ff0', VERSIONS => 1, COMPRESSION => 'GZ', > BLOOMFILTER => 'ROW', IN_MEMORY => 'true'} > incr 'uuidd9efa97f93a442b686adae6d9f7bb2e9', > 'uuid46ddc3d3557e413e915e2393ae72c082', > 'uuidbc1bea73952749329d7f025aab382c4e:JZycbUSpbDQmwgXinp', 1 > flush 'uuidd9efa97f93a442b686adae6d9f7bb2e9', > 'uuid449de028da6b4d35be0f187ebec6c3be' > drop 'uuiddeb610fded9744889840ecd03dd18739' > put 
'uuidd9efa97f93a442b686adae6d9f7bb2e9', > 'uuidf4704cae4d1e4661bd7664d26eb6b31b', > 'uuidbc1bea73952749329d7f025aab382c4e:JZycbUSpbDQmwgXinp', > 'XlPpFGvSYfcEXWXgwARytlSeiaSuHJFqpirMmLduqGnpdXLlHJWBumraXiifQSvHqNHmTcyzLQIvuQrkujPghfdtRkhOkgKEJHsAuAiMMeWZjdTHNZqhkOdJBOzsRYUXKOCNKeSxEDWgnKgsFDHMtxdnKKudBuceOgYmCrdaPXMclKkZKCIEiFDcdoAEJGKXYVfOjb' > disable 'uuidd9efa97f93a442b686adae6d9f7bb2e9' > drop 'uuidd9efa97f93a442b686adae6d9f7bb2e9' > create 'uuid9d05a5cb34e64910ac90675186e7d0d4', {NAME => > 'uuid1ce512a5997b4efea3bdead2e7f723c3', VERSIONS => 2, COMPRESSION => 'NONE', > BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'}, {NAME => > 'uuid0b1baaa4275e46b2a3a1d11d6540fc30', VERSIONS => 2, COMPRESSION => 'NONE', > BLOOMFILTER => 'NONE', IN_MEMORY => 'true'} > put 'uuid9d05a5cb34e64910ac90675186e7d0d4', > 'uuid552e42ade4c14099a1d8643bea1616d4', > 'uuid1ce512a5997b4efea3bdead2e7f723c3:l', 1 > drop 'uuid9d05a5cb34e64910ac90675186e7d0d4'{code} > The exception will be thrown in either RS1 or RS2 > {code:java} > 2023-09-19 20:29:28,268 INFO [RS_OPEN_REGION-regionserver/hregion2:16020-2] > handler.AssignRegionHandler: Opened > uuid9d05a5cb34e64910ac90675186e7d0d4,,1695155367072.f59a0693a9469f9e1f131bf2aac1486d. 
> 2023-09-19 20:29:29,205 ERROR [regionserver/hregion2:16020.Chore.1] > hbase.ScheduledChore: Caught error > java.lang.NullPointerException > at > org.apache.hadoop.hbase.quotas.QuotaCache$QuotaRefresherChore.updateQuotaFactors(QuotaCache.java:378) > at > org.apache.hadoop.hbase.quotas.QuotaCache$QuotaRefresherChore.chore(QuotaCache.java:224) > at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:158) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > at > org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:107) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:750){code} > h1. Root Cause > The NPE is thrown at function: updateQuotaFactors() > {code:java} > private void updateQuotaFactors() { > // Update machine quota factor > ClusterMetrics clusterMetrics; > try { > clusterMetrics = rsServices.getConnection().getAdmin() > .getClust
[jira] [Updated] (HBASE-28109) NPE for the region state: Failed to become active master (HMaster)
[ https://issues.apache.org/jira/browse/HBASE-28109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28109: --- Description: When starting up an HBase cluster (2.4.17), I hit an NPE that prevents the HMaster from starting up; I have to restart the HMaster. My cluster contains 1 HMaster, 2 RS (HBase-2.4.17) and 1 Hadoop node (2.10.2). {code:java} 2023-09-18 14:17:35,931 INFO [PEWorker-1] procedure2.ProcedureExecutor: Rolled back pid=1, state=ROLLEDBACK, exception=org.apache.hadoop.hbase.exceptions.TimeoutIOException via ProcedureExecutor:org.apache.hadoop.hbase.exceptions.TimeoutIOException: Operation timed out after 1.0010 sec; InitMetaProcedure table=hbase:meta exec-time=1.4660 sec 2023-09-18 14:17:35,931 INFO [master/hmaster:16000:becomeActiveMaster] master.HMaster: Wait for region servers to report in: status=null, state=RUNNING, startTime=1695046655931, completionTime=-1 2023-09-18 14:17:35,932 INFO [master/hmaster:16000:becomeActiveMaster] master.ServerManager: Waiting on regionserver count=2; waited=0ms, expecting min=1 server(s), max=NO_LIMIT server(s), timeout=4500ms, lastChange=0ms 2023-09-18 14:17:37,438 INFO [master/hmaster:16000:becomeActiveMaster] master.ServerManager: Waiting on regionserver count=2; waited=1505ms, expecting min=1 server(s), max=NO_LIMIT server(s), timeout=4500ms, lastChange=1505ms 2023-09-18 14:17:38,941 INFO [master/hmaster:16000:becomeActiveMaster] master.ServerManager: Waiting on regionserver count=2; waited=3009ms, expecting min=1 server(s), max=NO_LIMIT server(s), timeout=4500ms, lastChange=3009ms 2023-09-18 14:17:40,445 INFO [master/hmaster:16000:becomeActiveMaster] master.ServerManager: Finished waiting on RegionServer count=2; waited=4513ms, expected min=1 server(s), max=NO_LIMIT server(s), master is running 2023-09-18 14:17:40,452 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Failed to become active master java.lang.NullPointerException at 
org.apache.hadoop.hbase.master.HMaster.isRegionOnline(HMaster.java:1229) at org.apache.hadoop.hbase.master.HMaster.waitForMetaOnline(HMaster.java:1218) at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:968) at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2193) at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:528) at java.lang.Thread.run(Thread.java:750) 2023-09-18 14:17:40,453 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Master server abort: loaded coprocessors are: [org.apache.hadoop.hbase.quotas.MasterQuotasObserver] {code} h1. Root Cause From the stack trace, the rs variable is NULL and it's directly used without checking. {code:java} // hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java /** * @return True if region is online and scannable else false if an error or shutdown (Otherwise we * just block in here holding up all forward-progess). */ private boolean isRegionOnline(RegionInfo ri) { RetryCounter rc = null; while (!isStopped()) { // NPE line RegionState rs = this.assignmentManager.getRegionStates().getRegionState(ri); if (rs.isOpened()) { if (this.getServerManager().isServerOnline(rs.getServerName())) { return true; } } // Region{code} I am not sure what causes the *rs* to be null, but maybe we can add a check to make sure this NPE is captured and properly handled. Restart the HMaster and this exception will disappear. I have attached the full log from HMaster for this case. I ran into this exception when using HBase 2.4.17, but I think it might also happen in the latest branch since the code of isRegionOnline is the same. h1. Fix This bug happens rarely. I think we can add a simple check to know whether rs is null and then decide whether to keep waiting or directly shut down the HMaster. I assume that if the HMaster waits for more time, it will get correct responses from the regionservers. I have a simple PR to fix it. 
https://github.com/apache/hbase/pull/5432
[jira] [Created] (HBASE-28109) NPE for the region state: Failed to become active master (HMaster)
Ke Han created HBASE-28109: -- Summary: NPE for the region state: Failed to become active master (HMaster) Key: HBASE-28109 URL: https://issues.apache.org/jira/browse/HBASE-28109 Project: HBase Issue Type: Bug Affects Versions: 2.4.17 Reporter: Ke Han Attachments: hbase--master-ee4a85363fe2.log When starting up HBase cluster (2.4.17), I met NPE and it prevents HMaster from starting up. I have to restart the HMaster. My cluster contains 1 HMaster, 2 RS (HBase-2.4.17) and 1 Hadoop node (2.10.2). {code:java} 2023-09-18 14:17:35,931 INFO [PEWorker-1] procedure2.ProcedureExecutor: Rolled back pid=1, state=ROLLEDBACK, exception=org.apache.hadoop.hbase.exceptions.TimeoutIOException via ProcedureExecutor:org.apache.hadoop.hbase.exceptions.TimeoutIOException: Operation timed out after 1.0010 sec; InitMetaProcedure table=hbase:meta exec-time=1.4660 sec 2023-09-18 14:17:35,931 INFO [master/hmaster:16000:becomeActiveMaster] master.HMaster: Wait for region servers to report in: status=null, state=RUNNING, startTime=1695046655931, completionTime=-1 2023-09-18 14:17:35,932 INFO [master/hmaster:16000:becomeActiveMaster] master.ServerManager: Waiting on regionserver count=2; waited=0ms, expecting min=1 server(s), max=NO_LIMIT server(s), timeout=4500ms, lastChange=0ms 2023-09-18 14:17:37,438 INFO [master/hmaster:16000:becomeActiveMaster] master.ServerManager: Waiting on regionserver count=2; waited=1505ms, expecting min=1 server(s), max=NO_LIMIT server(s), timeout=4500ms, lastChange=1505ms 2023-09-18 14:17:38,941 INFO [master/hmaster:16000:becomeActiveMaster] master.ServerManager: Waiting on regionserver count=2; waited=3009ms, expecting min=1 server(s), max=NO_LIMIT server(s), timeout=4500ms, lastChange=3009ms 2023-09-18 14:17:40,445 INFO [master/hmaster:16000:becomeActiveMaster] master.ServerManager: Finished waiting on RegionServer count=2; waited=4513ms, expected min=1 server(s), max=NO_LIMIT server(s), master is running 2023-09-18 14:17:40,452 ERROR 
[master/hmaster:16000:becomeActiveMaster] master.HMaster: Failed to become active master java.lang.NullPointerException at org.apache.hadoop.hbase.master.HMaster.isRegionOnline(HMaster.java:1229) at org.apache.hadoop.hbase.master.HMaster.waitForMetaOnline(HMaster.java:1218) at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:968) at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2193) at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:528) at java.lang.Thread.run(Thread.java:750) 2023-09-18 14:17:40,453 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Master server abort: loaded coprocessors are: [org.apache.hadoop.hbase.quotas.MasterQuotasObserver] {code} h1. Root Cause From the stack trace, the rs variable is NULL and it's directly used without checking. {code:java} // hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java /** * @return True if region is online and scannable else false if an error or shutdown (Otherwise we * just block in here holding up all forward-progess). */ private boolean isRegionOnline(RegionInfo ri) { RetryCounter rc = null; while (!isStopped()) { // NPE line RegionState rs = this.assignmentManager.getRegionStates().getRegionState(ri); if (rs.isOpened()) { if (this.getServerManager().isServerOnline(rs.getServerName())) { return true; } } // Region{code} I am not sure what causes the rs to be null, but maybe we can add a check to make sure this NPE is captured and properly handled. Restart the HMaster and this exception will disappear. I have attached the full log from HMaster for this case. I ran into this exception when using HBase 2.4.17, but I think it might also happen in the latest branch since the code of isRegionOnline is the same. h1. Fix This bug happens rarely. I think we can add a simple check to know whether rs is null and then decide whether to keep waiting or directly shut down the HMaster. 
I assume that if the HMaster waits for more time, it will get correct responses from the regionservers. I have a simple PR to fix it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
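The fix proposed above (treat a null RegionState as "not ready yet" and keep waiting instead of dereferencing it) can be sketched as follows. This is a self-contained illustrative sketch, not the real HMaster code: RegionStateStub is a made-up stand-in for HBase's RegionState, and the iterator stands in for repeated lookups against the assignment manager.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch of the proposed guard, assuming a null RegionState just means the
// region is not assigned yet and is worth retrying.
public class RegionOnlineCheck {
    static class RegionStateStub {
        final boolean opened;
        RegionStateStub(boolean opened) { this.opened = opened; }
        boolean isOpened() { return opened; }
    }

    // Returns true once a non-null, opened state is observed; gives up after
    // maxRetries instead of throwing an NPE when the state is still null.
    static boolean isRegionOnline(Iterator<RegionStateStub> states, int maxRetries) {
        for (int i = 0; i < maxRetries && states.hasNext(); i++) {
            RegionStateStub rs = states.next();
            if (rs == null) {
                continue; // previously: calling rs.isOpened() here threw the NPE
            }
            if (rs.isOpened()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // First two lookups race ahead of assignment and see null; the third succeeds.
        List<RegionStateStub> seq = Arrays.asList(null, null, new RegionStateStub(true));
        System.out.println(isRegionOnline(seq.iterator(), 10)); // true
    }
}
```

The design question the reporter raises remains: whether to keep retrying on null (as above) or abort the master startup after a bounded wait.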
[jira] [Commented] (HBASE-28105) NPE in QuotaCache if Table is dropped from cluster
[ https://issues.apache.org/jira/browse/HBASE-28105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767613#comment-17767613 ] Ke Han commented on HBASE-28105: [~bbeaudreault] Thanks for the reply! I have submitted the PR: [https://github.com/apache/hbase/pull/5426/files.|https://github.com/apache/hbase/pull/5426/files] > NPE in QuotaCache if Table is dropped from cluster > -- > > Key: HBASE-28105 > URL: https://issues.apache.org/jira/browse/HBASE-28105 > Project: HBase > Issue Type: Bug > Components: Quotas >Affects Versions: 2.4.17, 2.5.5 >Reporter: Ke Han >Priority: Major > Attachments: 0001-avoid-NPE.patch, > hbase--regionserver-a0320910ca45.log > > > When running HBase-2.4.17, I met a NPE in regionserver log. > h1. Reproduce > Config HBase cluster: 1 HMaster, 2 RS, 2.10.2 Hadoop. > Execute the following commands in the HMaster node using hbase shell, > {code:java} > create 'uuidd9efa97f93a442b686adae6d9f7bb2e9', {NAME => > 'uuid099cbece77834a83a52bb0611c3ea080', VERSIONS => 3, COMPRESSION => 'NONE', > BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'}, {NAME => > 'uuidbc1bea73952749329d7f025aab382c4e', VERSIONS => 2, COMPRESSION => 'GZ', > BLOOMFILTER => 'ROW', IN_MEMORY => 'false'}, {NAME => > 'uuidff292310d9dc450697af2bb25d9f3e98', VERSIONS => 2, COMPRESSION => 'GZ', > BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}, {NAME => > 'uuid449de028da6b4d35be0f187ebec6c3be', VERSIONS => 2, COMPRESSION => 'GZ', > BLOOMFILTER => 'ROW', IN_MEMORY => 'false'}, {NAME => > 'uuidc0840c98f9d348a18f2d454c7a503b65', VERSIONS => 2, COMPRESSION => 'GZ', > BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'} > create_namespace 'uuidec797633f5dd4ab9b96276135aeda9e2' > create 'uuiddeb610fded9744889840ecd03dd18739', {NAME => > 'uuid30a0f625ad454605908b60c932957ff0', VERSIONS => 1, COMPRESSION => 'GZ', > BLOOMFILTER => 'ROW', IN_MEMORY => 'true'} > incr 'uuidd9efa97f93a442b686adae6d9f7bb2e9', > 'uuid46ddc3d3557e413e915e2393ae72c082', > 
'uuidbc1bea73952749329d7f025aab382c4e:JZycbUSpbDQmwgXinp', 1 > flush 'uuidd9efa97f93a442b686adae6d9f7bb2e9', > 'uuid449de028da6b4d35be0f187ebec6c3be' > drop 'uuiddeb610fded9744889840ecd03dd18739' > put 'uuidd9efa97f93a442b686adae6d9f7bb2e9', > 'uuidf4704cae4d1e4661bd7664d26eb6b31b', > 'uuidbc1bea73952749329d7f025aab382c4e:JZycbUSpbDQmwgXinp', > 'XlPpFGvSYfcEXWXgwARytlSeiaSuHJFqpirMmLduqGnpdXLlHJWBumraXiifQSvHqNHmTcyzLQIvuQrkujPghfdtRkhOkgKEJHsAuAiMMeWZjdTHNZqhkOdJBOzsRYUXKOCNKeSxEDWgnKgsFDHMtxdnKKudBuceOgYmCrdaPXMclKkZKCIEiFDcdoAEJGKXYVfOjb' > disable 'uuidd9efa97f93a442b686adae6d9f7bb2e9' > drop 'uuidd9efa97f93a442b686adae6d9f7bb2e9' > create 'uuid9d05a5cb34e64910ac90675186e7d0d4', {NAME => > 'uuid1ce512a5997b4efea3bdead2e7f723c3', VERSIONS => 2, COMPRESSION => 'NONE', > BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'}, {NAME => > 'uuid0b1baaa4275e46b2a3a1d11d6540fc30', VERSIONS => 2, COMPRESSION => 'NONE', > BLOOMFILTER => 'NONE', IN_MEMORY => 'true'} > put 'uuid9d05a5cb34e64910ac90675186e7d0d4', > 'uuid552e42ade4c14099a1d8643bea1616d4', > 'uuid1ce512a5997b4efea3bdead2e7f723c3:l', 1 > drop 'uuid9d05a5cb34e64910ac90675186e7d0d4'{code} > The exception will be thrown in either RS1 or RS2 > {code:java} > 2023-09-19 20:29:28,268 INFO [RS_OPEN_REGION-regionserver/hregion2:16020-2] > handler.AssignRegionHandler: Opened > uuid9d05a5cb34e64910ac90675186e7d0d4,,1695155367072.f59a0693a9469f9e1f131bf2aac1486d. 
> 2023-09-19 20:29:29,205 ERROR [regionserver/hregion2:16020.Chore.1] > hbase.ScheduledChore: Caught error > java.lang.NullPointerException > at > org.apache.hadoop.hbase.quotas.QuotaCache$QuotaRefresherChore.updateQuotaFactors(QuotaCache.java:378) > at > org.apache.hadoop.hbase.quotas.QuotaCache$QuotaRefresherChore.chore(QuotaCache.java:224) > at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:158) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > at > org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:107) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:750){code} > h1. Root Cause > The NPE is thrown at function: updateQuotaFactors() > {code:java} > private
[jira] [Updated] (HBASE-28105) NPE is thrown in QuotaCache.java when running HBase-2.4.17
[ https://issues.apache.org/jira/browse/HBASE-28105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28105: --- Description: When running HBase-2.4.17, I met an NPE in the regionserver log. h1. Reproduce Configure the HBase cluster: 1 HMaster, 2 RS, 2.10.2 Hadoop. Execute the following commands on the HMaster node using the hbase shell: {code:java} create 'uuidd9efa97f93a442b686adae6d9f7bb2e9', {NAME => 'uuid099cbece77834a83a52bb0611c3ea080', VERSIONS => 3, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'}, {NAME => 'uuidbc1bea73952749329d7f025aab382c4e', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROW', IN_MEMORY => 'false'}, {NAME => 'uuidff292310d9dc450697af2bb25d9f3e98', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}, {NAME => 'uuid449de028da6b4d35be0f187ebec6c3be', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROW', IN_MEMORY => 'false'}, {NAME => 'uuidc0840c98f9d348a18f2d454c7a503b65', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'} create_namespace 'uuidec797633f5dd4ab9b96276135aeda9e2' create 'uuiddeb610fded9744889840ecd03dd18739', {NAME => 'uuid30a0f625ad454605908b60c932957ff0', VERSIONS => 1, COMPRESSION => 'GZ', BLOOMFILTER => 'ROW', IN_MEMORY => 'true'} incr 'uuidd9efa97f93a442b686adae6d9f7bb2e9', 'uuid46ddc3d3557e413e915e2393ae72c082', 'uuidbc1bea73952749329d7f025aab382c4e:JZycbUSpbDQmwgXinp', 1 flush 'uuidd9efa97f93a442b686adae6d9f7bb2e9', 'uuid449de028da6b4d35be0f187ebec6c3be' drop 'uuiddeb610fded9744889840ecd03dd18739' put 'uuidd9efa97f93a442b686adae6d9f7bb2e9', 'uuidf4704cae4d1e4661bd7664d26eb6b31b', 'uuidbc1bea73952749329d7f025aab382c4e:JZycbUSpbDQmwgXinp', 'XlPpFGvSYfcEXWXgwARytlSeiaSuHJFqpirMmLduqGnpdXLlHJWBumraXiifQSvHqNHmTcyzLQIvuQrkujPghfdtRkhOkgKEJHsAuAiMMeWZjdTHNZqhkOdJBOzsRYUXKOCNKeSxEDWgnKgsFDHMtxdnKKudBuceOgYmCrdaPXMclKkZKCIEiFDcdoAEJGKXYVfOjb' disable 'uuidd9efa97f93a442b686adae6d9f7bb2e9' drop 
'uuidd9efa97f93a442b686adae6d9f7bb2e9' create 'uuid9d05a5cb34e64910ac90675186e7d0d4', {NAME => 'uuid1ce512a5997b4efea3bdead2e7f723c3', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'}, {NAME => 'uuid0b1baaa4275e46b2a3a1d11d6540fc30', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'NONE', IN_MEMORY => 'true'} put 'uuid9d05a5cb34e64910ac90675186e7d0d4', 'uuid552e42ade4c14099a1d8643bea1616d4', 'uuid1ce512a5997b4efea3bdead2e7f723c3:l', 1 drop 'uuid9d05a5cb34e64910ac90675186e7d0d4'{code} The exception will be thrown in either RS1 or RS2 {code:java} 2023-09-19 20:29:28,268 INFO [RS_OPEN_REGION-regionserver/hregion2:16020-2] handler.AssignRegionHandler: Opened uuid9d05a5cb34e64910ac90675186e7d0d4,,1695155367072.f59a0693a9469f9e1f131bf2aac1486d. 2023-09-19 20:29:29,205 ERROR [regionserver/hregion2:16020.Chore.1] hbase.ScheduledChore: Caught error java.lang.NullPointerException at org.apache.hadoop.hbase.quotas.QuotaCache$QuotaRefresherChore.updateQuotaFactors(QuotaCache.java:378) at org.apache.hadoop.hbase.quotas.QuotaCache$QuotaRefresherChore.chore(QuotaCache.java:224) at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:158) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:107) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750){code} h1. 
Root Cause The NPE is thrown at function: updateQuotaFactors() {code:java} private void updateQuotaFactors() { // Update machine quota factor ClusterMetrics clusterMetrics; try { clusterMetrics = rsServices.getConnection().getAdmin() .getClusterMetrics(EnumSet.of(Option.SERVERS_NAME, Option.TABLE_TO_REGIONS_COUNT)); } catch (IOException e) { LOG.warn("Failed to get cluster metrics needed for updating quotas", e); return; } int rsSize = clusterMetrics.getServersName().size(); if (rsSize != 0) { // TODO if use rs group, the cluster limit should be shared by the rs group machineQuotaFactor = 1.0 / rsSize; } Map tableRegionStatesCount = clusterMetrics.getTableRegionStatesCount(); // Update table machine quota factors for (TableName tableName : tableQuotaCache.keySet()) { double factor = 1; try { // BUGGY LINE long regionSize = tableRegi
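The failure pattern above is a plain `Map.get` on a table that may already have been dropped between the metrics snapshot and the quota refresh: `tableRegionStatesCount.get(tableName)` returns null for a vanished table, and dereferencing that null throws the NPE inside the chore. Since the buggy line is truncated in the report, here is a minimal standalone sketch of the null-guarded lookup (class and method names are hypothetical stand-ins, not the actual QuotaCache code):

```java
import java.util.HashMap;
import java.util.Map;

public class NullGetDemo {
    // Hypothetical stand-in for the per-table region-count metrics entry.
    static final class RegionStatesCount {
        final int openRegions;
        RegionStatesCount(int openRegions) { this.openRegions = openRegions; }
        int getOpenRegions() { return openRegions; }
    }

    // Guarded lookup: falls back to a neutral factor of 1.0 when the table is
    // absent from the metrics map, instead of dereferencing a null map value.
    static double tableFactor(Map<String, RegionStatesCount> counts,
                              String table, int clusterRegions) {
        RegionStatesCount c = counts.get(table);
        if (c == null || c.getOpenRegions() == 0) {
            return 1.0; // table already dropped, or no regions open yet
        }
        return (double) c.getOpenRegions() / clusterRegions;
    }

    public static void main(String[] args) {
        Map<String, RegionStatesCount> counts = new HashMap<>();
        counts.put("t1", new RegionStatesCount(5));
        // "dropped" is deliberately absent, mimicking a table deleted between
        // the cluster-metrics snapshot and the quota-cache refresh.
        System.out.println(tableFactor(counts, "t1", 10));      // 0.5
        System.out.println(tableFactor(counts, "dropped", 10)); // 1.0
    }
}
```

The attached 0001-avoid-NPE.patch presumably takes a similar approach: skip or default the factor for tables missing from the metrics map rather than letting the chore die.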
[jira] [Updated] (HBASE-28105) NPE is thrown in QuotaCache.java when running HBase-2.4.17
[ https://issues.apache.org/jira/browse/HBASE-28105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28105: --- Flags: (was: Patch)
[jira] [Updated] (HBASE-28105) NPE is thrown in QuotaCache.java when running HBase-2.4.17
[ https://issues.apache.org/jira/browse/HBASE-28105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28105: --- Flags: Patch
[jira] [Updated] (HBASE-28105) NPE is thrown in QuotaCache.java when running HBase-2.4.17
[ https://issues.apache.org/jira/browse/HBASE-28105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Han updated HBASE-28105: --- Attachment: hbase--regionserver-a0320910ca45.log
[jira] [Updated] (HBASE-28105) NPE is thrown in QuotaCache.java when running HBase-2.4.17
[ https://issues.apache.org/jira/browse/HBASE-28105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ke Han updated HBASE-28105:
---
Description:

When running HBase-2.4.17, I met an NPE in the regionserver log.

h1. Reproduce

Configure the HBase cluster: 1 HMaster, 2 RS, Hadoop 2.10.2. Execute the following commands on the HMaster node using the hbase shell:
{code:java}
create 'uuidd9efa97f93a442b686adae6d9f7bb2e9', {NAME => 'uuid099cbece77834a83a52bb0611c3ea080', VERSIONS => 3, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'}, {NAME => 'uuidbc1bea73952749329d7f025aab382c4e', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROW', IN_MEMORY => 'false'}, {NAME => 'uuidff292310d9dc450697af2bb25d9f3e98', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}, {NAME => 'uuid449de028da6b4d35be0f187ebec6c3be', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROW', IN_MEMORY => 'false'}, {NAME => 'uuidc0840c98f9d348a18f2d454c7a503b65', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'}
create_namespace 'uuidec797633f5dd4ab9b96276135aeda9e2'
create 'uuiddeb610fded9744889840ecd03dd18739', {NAME => 'uuid30a0f625ad454605908b60c932957ff0', VERSIONS => 1, COMPRESSION => 'GZ', BLOOMFILTER => 'ROW', IN_MEMORY => 'true'}
incr 'uuidd9efa97f93a442b686adae6d9f7bb2e9', 'uuid46ddc3d3557e413e915e2393ae72c082', 'uuidbc1bea73952749329d7f025aab382c4e:JZycbUSpbDQmwgXinp', 1
flush 'uuidd9efa97f93a442b686adae6d9f7bb2e9', 'uuid449de028da6b4d35be0f187ebec6c3be'
drop 'uuiddeb610fded9744889840ecd03dd18739'
put 'uuidd9efa97f93a442b686adae6d9f7bb2e9', 'uuidf4704cae4d1e4661bd7664d26eb6b31b', 'uuidbc1bea73952749329d7f025aab382c4e:JZycbUSpbDQmwgXinp', 'XlPpFGvSYfcEXWXgwARytlSeiaSuHJFqpirMmLduqGnpdXLlHJWBumraXiifQSvHqNHmTcyzLQIvuQrkujPghfdtRkhOkgKEJHsAuAiMMeWZjdTHNZqhkOdJBOzsRYUXKOCNKeSxEDWgnKgsFDHMtxdnKKudBuceOgYmCrdaPXMclKkZKCIEiFDcdoAEJGKXYVfOjb'
disable 'uuidd9efa97f93a442b686adae6d9f7bb2e9'
drop 'uuidd9efa97f93a442b686adae6d9f7bb2e9'
create 'uuid9d05a5cb34e64910ac90675186e7d0d4', {NAME => 'uuid1ce512a5997b4efea3bdead2e7f723c3', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'}, {NAME => 'uuid0b1baaa4275e46b2a3a1d11d6540fc30', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'NONE', IN_MEMORY => 'true'}
put 'uuid9d05a5cb34e64910ac90675186e7d0d4', 'uuid552e42ade4c14099a1d8643bea1616d4', 'uuid1ce512a5997b4efea3bdead2e7f723c3:l', 1
drop 'uuid9d05a5cb34e64910ac90675186e7d0d4'{code}
The exception will be thrown in either RS1 or RS2:
{code:java}
2023-09-19 20:29:28,268 INFO [RS_OPEN_REGION-regionserver/hregion2:16020-2] handler.AssignRegionHandler: Opened uuid9d05a5cb34e64910ac90675186e7d0d4,,1695155367072.f59a0693a9469f9e1f131bf2aac1486d.
2023-09-19 20:29:29,205 ERROR [regionserver/hregion2:16020.Chore.1] hbase.ScheduledChore: Caught error
java.lang.NullPointerException
        at org.apache.hadoop.hbase.quotas.QuotaCache$QuotaRefresherChore.updateQuotaFactors(QuotaCache.java:378)
        at org.apache.hadoop.hbase.quotas.QuotaCache$QuotaRefresherChore.chore(QuotaCache.java:224)
        at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:158)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:107)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750){code}
h1. Root Cause

The NPE is thrown in updateQuotaFactors():
{code:java}
private void updateQuotaFactors() {
  // Update machine quota factor
  ClusterMetrics clusterMetrics;
  try {
    clusterMetrics = rsServices.getConnection().getAdmin()
      .getClusterMetrics(EnumSet.of(Option.SERVERS_NAME, Option.TABLE_TO_REGIONS_COUNT));
  } catch (IOException e) {
    LOG.warn("Failed to get cluster metrics needed for updating quotas", e);
    return;
  }
  int rsSize = clusterMetrics.getServersName().size();
  if (rsSize != 0) {
    // TODO if use rs group, the cluster limit should be shared by the rs group
    machineQuotaFactor = 1.0 / rsSize;
  }
  Map<TableName, RegionStatesCount> tableRegionStatesCount =
    clusterMetrics.getTableRegionStatesCount();
  // Update table machine quota factors
  for (TableName tableName : tableQuotaCache.keySet()) {
    double factor = 1;
    try {
      long regionSize = tableRegionStatesCount.get(ta
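The failure mode above can be reproduced outside HBase with a minimal, self-contained sketch (class name and map contents are hypothetical, not taken from the HBase code): the quota cache still iterates over a dropped table's name, the lookup in the freshly fetched metrics map returns null, and auto-unboxing the null result throws the NPE.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the QuotaCache lookup: tableQuotaCache still holds
// an entry for a dropped table, but the fresh metrics map no longer lists it.
public class QuotaFactorNpeDemo {
    public static void main(String[] args) {
        Map<String, Long> tableRegionStatesCount = new HashMap<>();
        tableRegionStatesCount.put("kept_table", 3L); // table still present

        // The dropped table's name is still a key in the quota cache, so the
        // refresher chore looks it up in the metrics map anyway.
        String droppedTable = "uuid9d05a5cb34e64910ac90675186e7d0d4";

        boolean npeThrown = false;
        try {
            // Mirrors tableRegionStatesCount.get(tableName): get() returns
            // null for the absent key, and unboxing null into long throws.
            long regionSize = tableRegionStatesCount.get(droppedTable);
            System.out.println(regionSize);
        } catch (NullPointerException e) {
            npeThrown = true;
        }
        System.out.println("NPE triggered: " + npeThrown);
    }
}
```

Running it prints `NPE triggered: true`, matching the stack trace's crash point inside the chore.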
[jira] [Created] (HBASE-28105) NPE is thrown in QuotaCache.java when running HBase-2.4.17
Ke Han created HBASE-28105:
--
Summary: NPE is thrown in QuotaCache.java when running HBase-2.4.17
Key: HBASE-28105
URL: https://issues.apache.org/jira/browse/HBASE-28105
Project: HBase
Issue Type: Bug
Components: Quotas
Affects Versions: 2.5.5, 2.4.17
Reporter: Ke Han
Attachments: 0001-avoid-NPE.patch

When running HBase-2.4.17, I met an NPE in the regionserver log.

h1. Reproduce

Configure the HBase cluster: 1 HMaster, 2 RS, Hadoop 2.10.2. Execute the following commands on the HMaster node using the hbase shell:
{code:java}
create 'uuidd9efa97f93a442b686adae6d9f7bb2e9', {NAME => 'uuid099cbece77834a83a52bb0611c3ea080', VERSIONS => 3, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'}, {NAME => 'uuidbc1bea73952749329d7f025aab382c4e', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROW', IN_MEMORY => 'false'}, {NAME => 'uuidff292310d9dc450697af2bb25d9f3e98', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}, {NAME => 'uuid449de028da6b4d35be0f187ebec6c3be', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROW', IN_MEMORY => 'false'}, {NAME => 'uuidc0840c98f9d348a18f2d454c7a503b65', VERSIONS => 2, COMPRESSION => 'GZ', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'}
create_namespace 'uuidec797633f5dd4ab9b96276135aeda9e2'
create 'uuiddeb610fded9744889840ecd03dd18739', {NAME => 'uuid30a0f625ad454605908b60c932957ff0', VERSIONS => 1, COMPRESSION => 'GZ', BLOOMFILTER => 'ROW', IN_MEMORY => 'true'}
incr 'uuidd9efa97f93a442b686adae6d9f7bb2e9', 'uuid46ddc3d3557e413e915e2393ae72c082', 'uuidbc1bea73952749329d7f025aab382c4e:JZycbUSpbDQmwgXinp', 1
flush 'uuidd9efa97f93a442b686adae6d9f7bb2e9', 'uuid449de028da6b4d35be0f187ebec6c3be'
drop 'uuiddeb610fded9744889840ecd03dd18739'
put 'uuidd9efa97f93a442b686adae6d9f7bb2e9', 'uuidf4704cae4d1e4661bd7664d26eb6b31b', 'uuidbc1bea73952749329d7f025aab382c4e:JZycbUSpbDQmwgXinp', 'XlPpFGvSYfcEXWXgwARytlSeiaSuHJFqpirMmLduqGnpdXLlHJWBumraXiifQSvHqNHmTcyzLQIvuQrkujPghfdtRkhOkgKEJHsAuAiMMeWZjdTHNZqhkOdJBOzsRYUXKOCNKeSxEDWgnKgsFDHMtxdnKKudBuceOgYmCrdaPXMclKkZKCIEiFDcdoAEJGKXYVfOjb'
disable 'uuidd9efa97f93a442b686adae6d9f7bb2e9'
drop 'uuidd9efa97f93a442b686adae6d9f7bb2e9'
create 'uuid9d05a5cb34e64910ac90675186e7d0d4', {NAME => 'uuid1ce512a5997b4efea3bdead2e7f723c3', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'}, {NAME => 'uuid0b1baaa4275e46b2a3a1d11d6540fc30', VERSIONS => 2, COMPRESSION => 'NONE', BLOOMFILTER => 'NONE', IN_MEMORY => 'true'}
put 'uuid9d05a5cb34e64910ac90675186e7d0d4', 'uuid552e42ade4c14099a1d8643bea1616d4', 'uuid1ce512a5997b4efea3bdead2e7f723c3:l', 1
drop 'uuid9d05a5cb34e64910ac90675186e7d0d4'{code}
Then the exception will be thrown in either RS1 or RS2:
{code:java}
2023-09-19 20:29:28,268 INFO [RS_OPEN_REGION-regionserver/hregion2:16020-2] handler.AssignRegionHandler: Opened uuid9d05a5cb34e64910ac90675186e7d0d4,,1695155367072.f59a0693a9469f9e1f131bf2aac1486d.
2023-09-19 20:29:29,205 ERROR [regionserver/hregion2:16020.Chore.1] hbase.ScheduledChore: Caught error
java.lang.NullPointerException
        at org.apache.hadoop.hbase.quotas.QuotaCache$QuotaRefresherChore.updateQuotaFactors(QuotaCache.java:378)
        at org.apache.hadoop.hbase.quotas.QuotaCache$QuotaRefresherChore.chore(QuotaCache.java:224)
        at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:158)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:107)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750){code}
h1. Root Cause

The NPE is thrown in updateQuotaFactors():
{code:java}
private void updateQuotaFactors() {
  // Update machine quota factor
  ClusterMetrics clusterMetrics;
  try {
    clusterMetrics = rsServices.getConnection().getAdmin()
      .getClusterMetrics(EnumSet.of(Option.SERVERS_NAME, Option.TABLE_TO_REGIONS_COUNT));
  } catch (IOException e) {
    LOG.warn("Failed to get cluster metrics needed for updating quotas", e);
    return;
  }
  int rsSize = clusterMetrics.getServersName().size();
  if (rsSize != 0) {
    // TODO if use rs group, the cluster limit should be shared by the rs group
    machineQuotaFactor = 1.0 / rsSize;
  }
  Map tableRegionStatesCount = clu
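The attached 0001-avoid-NPE.patch is not reproduced in this thread, but the general shape of a defensive fix can be sketched: guard the map lookup before dereferencing it, and fall back to a neutral factor when the table has disappeared between the cache refresh and the metrics fetch. This is a hedged illustration; the helper name, fallback value, and use of plain String keys are assumptions, not the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a null-guarded table-factor computation (hypothetical, not the
// HBase patch): a dropped table yields a null lookup, which is handled
// explicitly instead of being unboxed and crashing the refresher chore.
public class QuotaFactorGuardDemo {
    static double tableFactor(Map<String, Long> tableRegionStatesCount,
                              String tableName, double machineQuotaFactor) {
        Long openRegions = tableRegionStatesCount.get(tableName);
        if (openRegions == null || openRegions == 0L) {
            // Table was dropped or has no open regions: skip the division
            // rather than throw NullPointerException / divide by zero.
            return 1.0;
        }
        return machineQuotaFactor / openRegions;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new HashMap<>();
        counts.put("kept_table", 4L);
        System.out.println(tableFactor(counts, "kept_table", 0.5));    // 0.125
        System.out.println(tableFactor(counts, "dropped_table", 0.5)); // 1.0
    }
}
```

With this guard the chore keeps running for surviving tables even when one of the cached table names no longer appears in the cluster metrics.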