Ke Han created HBASE-28590:
------------------------------

Summary: NPE after upgrade from 2.5.8 to 3.0.0
Key: HBASE-28590
URL: https://issues.apache.org/jira/browse/HBASE-28590
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 3.0.0
Reporter: Ke Han
Attachments: commands.txt, hbase--master-fc906f1808de.log, persistent.tar.gz

When upgrading an HBase cluster from 2.5.8 to 3.0.0 (commit: 516c89e8597fb6), I encountered the following NPE in the master log.

{code:java}
2024-05-11T02:17:47,293 ERROR [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] ipc.RpcServer: Unexpected throwable object
java.lang.NullPointerException: null
	at org.apache.hadoop.hbase.master.MasterRpcServices.reportFileArchival(MasterRpcServices.java:2578) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
	at org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:16463) ~[hbase-protocol-shaded-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:443) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
	at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
	at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
2024-05-11T02:17:47,326 ERROR [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] ipc.RpcServer: Unexpected throwable object
java.lang.NullPointerException: null
	at org.apache.hadoop.hbase.master.MasterRpcServices.reportFileArchival(MasterRpcServices.java:2578) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
	at org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:16463) ~[hbase-protocol-shaded-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:443) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
	at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
	at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
2024-05-11T02:17:47,337 ERROR [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] ipc.RpcServer: Unexpected throwable object
java.lang.NullPointerException: null
	at org.apache.hadoop.hbase.master.MasterRpcServices.reportFileArchival(MasterRpcServices.java:2578) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
	at org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:16463) ~[hbase-protocol-shaded-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:443) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
	at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
	at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
{code}

h1. Reproduce

This bug cannot be reproduced deterministically, but it happens fairly often: the following steps trigger it in about 10% of runs.

1. Start a 2.5.8 cluster with the default configuration (1 HM, 2 RS, 1 HDFS).
2. Execute the commands in commands.txt.
3. Stop the 2.5.8 cluster and upgrade to a 3.0.0 cluster with the default configuration (commit: 516c89e8597fb6, 1 HM, 2 RS, 1 HDFS).

The NPE then appears in the master log.

I have attached (1) the commands to reproduce it, (2) the master log, and (3) the full error logs of all nodes.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)