[
https://issues.apache.org/jira/browse/HBASE-19805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16327873#comment-16327873
]
Sergey Soldatov commented on HBASE-19805:
-----------------------------------------
[~stack] Almost. This actually happens when a split is already in progress.
We are closing the parent (so its location is set to null), and if we check
whether the region is splittable at that moment, we hit this problem. I'm not
sure yet why we allow the same region to be split multiple times. Scheduled
splits from my log:
{noformat}
parent=af7ddfb3943627b825ddfe3fedb27590,
daughterA=fde4b311dd76909e05cd57f2d19a8ebc,
daughterB=4381a4dd9da46c6e7ce91ab6419fb708
parent=af7ddfb3943627b825ddfe3fedb27590,
daughterA=38d5fffbe693017be0d2fcc97eec3e3e,
daughterB=e76bfc1dddfca511c00dfd3477dc003d
parent=af7ddfb3943627b825ddfe3fedb27590,
daughterA=ef89483bc4117a31536e1c25def4f64e,
daughterB=ccf1c2f5f43f478af438b6c3f2ca7ef5
{noformat}
Only the first one actually takes place.
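To illustrate the race, here is a minimal sketch of the kind of guard that would turn the NPE into a descriptive error. It is not HBase code: `SplitGuardSketch`, its in-memory location map, and the choice of a plain `IOException` are all assumptions made for illustration, and the actual fix may look quite different.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the master's region-location bookkeeping; not HBase code. */
public class SplitGuardSketch {
  // Encoded region name -> hosting server. A missing entry models a null location,
  // which is what we see while the parent is being closed for a split.
  private final Map<String, String> regionLocations = new ConcurrentHashMap<>();

  public void assign(String region, String server) {
    regionLocations.put(region, server);
  }

  /** Models the parent being closed by a split that is already in progress. */
  public void close(String region) {
    regionLocations.remove(region);
  }

  /**
   * Returns the region's location, or fails with a descriptive IOException
   * instead of handing a null location to the code that builds the RPC stub.
   */
  public String checkSplittable(String region) throws IOException {
    String location = regionLocations.get(region);
    if (location == null) {
      throw new IOException("Region " + region
          + " has no location; it may be closing for a split already in progress");
    }
    return location;
  }

  public static void main(String[] args) {
    SplitGuardSketch master = new SplitGuardSketch();
    String parent = "af7ddfb3943627b825ddfe3fedb27590";
    master.assign(parent, "rs1");
    try {
      // First split request: the parent is still online.
      System.out.println("First request, location: " + master.checkSplittable(parent));
      // The first split starts closing the parent.
      master.close(parent);
      // Second split request for the same parent arrives during the close.
      master.checkSplittable(parent);
    } catch (IOException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

With a check like this, the second and third requests for the same parent would be rejected with a clear message rather than surfacing as "Unexpected throwable object" in the RPC handler.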
> NPE in HMaster while issuing a sequence of table splits
> -------------------------------------------------------
>
> Key: HBASE-19805
> URL: https://issues.apache.org/jira/browse/HBASE-19805
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 2.0.0-beta-1
> Reporter: Josh Elser
> Assignee: Sergey Soldatov
> Priority: Critical
> Fix For: 2.0.0-beta-2
>
>
> I wrote a toy program to test the client tarball in HBASE-19735. After the
> first few region splits, I see the following error in the Master log.
> {noformat}
> 2018-01-16 14:07:52,797 INFO [RpcServer.default.FPBQ.Fifo.handler=28,queue=1,port=16000] master.HMaster: Client=jelser//192.168.1.23 split myTestTable,1,1516129669054.8313b755f74092118f9dd30a4190ee23.
> 2018-01-16 14:07:52,797 ERROR [RpcServer.default.FPBQ.Fifo.handler=28,queue=1,port=16000] ipc.RpcServer: Unexpected throwable object
> java.lang.NullPointerException
>   at org.apache.hadoop.hbase.client.ConnectionUtils.getStubKey(ConnectionUtils.java:229)
>   at org.apache.hadoop.hbase.client.ConnectionImplementation.getAdmin(ConnectionImplementation.java:1175)
>   at org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.getAdmin(ConnectionUtils.java:149)
>   at org.apache.hadoop.hbase.master.assignment.Util.getRegionInfoResponse(Util.java:59)
>   at org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.checkSplittable(SplitTableRegionProcedure.java:146)
>   at org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.<init>(SplitTableRegionProcedure.java:103)
>   at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createSplitProcedure(AssignmentManager.java:761)
>   at org.apache.hadoop.hbase.master.HMaster$2.run(HMaster.java:1626)
>   at org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:134)
>   at org.apache.hadoop.hbase.master.HMaster.splitRegion(HMaster.java:1618)
>   at org.apache.hadoop.hbase.master.MasterRpcServices.splitRegion(MasterRpcServices.java:778)
>   at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:404)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> {noformat}
> {code}
> public static void main(String[] args) throws Exception {
>   Configuration conf = HBaseConfiguration.create();
>   try (Connection conn = ConnectionFactory.createConnection(conf);
>       Admin admin = conn.getAdmin()) {
>     final TableName tn = TableName.valueOf("myTestTable");
>     // Start from a clean table.
>     if (admin.tableExists(tn)) {
>       admin.disableTable(tn);
>       admin.deleteTable(tn);
>     }
>     final TableDescriptor desc = TableDescriptorBuilder.newBuilder(tn)
>         .addColumnFamily(ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes("f1")).build())
>         .build();
>     admin.createTable(desc);
>     // Split points are the hex strings "1" through "10".
>     List<String> splitPoints = new ArrayList<>(16);
>     for (int i = 1; i <= 16; i++) {
>       splitPoints.add(Integer.toString(i, 16));
>     }
>
>     System.out.println("Splits: " + splitPoints);
>     int numRegions = admin.getRegions(tn).size();
>     for (String splitPoint : splitPoints) {
>       System.out.println("Splitting on " + splitPoint);
>       admin.split(tn, Bytes.toBytes(splitPoint));
>       Thread.sleep(200);
>       int newRegionSize = admin.getRegions(tn).size();
>       // numRegions is never updated, so this wait only blocks for the first split.
>       while (numRegions == newRegionSize) {
>         Thread.sleep(50);
>         newRegionSize = admin.getRegions(tn).size();
>       }
>     }
>   }
> }
> {code}
> At a quick glance, it looks like {{Util.getRegionInfoResponse}} is to blame.
> {code}
> static GetRegionInfoResponse getRegionInfoResponse(final MasterProcedureEnv env,
>     final ServerName regionLocation, final RegionInfo hri, boolean includeBestSplitRow)
>     throws IOException {
>   // TODO: There is no timeout on this controller. Set one!
>   HBaseRpcController controller = env.getMasterServices().getClusterConnection().
>       getRpcControllerFactory().newController();
>   final AdminService.BlockingInterface admin =
>       env.getMasterServices().getClusterConnection().getAdmin(regionLocation);
> {code}
> We don't validate that we have a non-null {{ServerName regionLocation}}.