[ 
https://issues.apache.org/jira/browse/HBASE-19805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16367941#comment-16367941
 ] 

Sergey Soldatov commented on HBASE-19805:
-----------------------------------------

[~stack] Sorry, I missed the previous notification. Last time I ended up with 
some weird scenarios where everything gets stuck without any visible problems, 
and I didn't have a chance to dig into it due to $dayjob. I will try to get 
back to this over the weekend. 

> NPE in HMaster while issuing a sequence of table splits
> -------------------------------------------------------
>
>                 Key: HBASE-19805
>                 URL: https://issues.apache.org/jira/browse/HBASE-19805
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 2.0.0-beta-1
>            Reporter: Josh Elser
>            Assignee: Sergey Soldatov
>            Priority: Critical
>             Fix For: 2.0.0
>
>
> I wrote a toy program to test the client tarball in HBASE-19735. After the 
> first few region splits, I see the following error in the Master log. 
> {noformat}
> 2018-01-16 14:07:52,797 INFO  [RpcServer.default.FPBQ.Fifo.handler=28,queue=1,port=16000] master.HMaster: Client=jelser//192.168.1.23 split myTestTable,1,1516129669054.8313b755f74092118f9dd30a4190ee23.
> 2018-01-16 14:07:52,797 ERROR [RpcServer.default.FPBQ.Fifo.handler=28,queue=1,port=16000] ipc.RpcServer: Unexpected throwable object
> java.lang.NullPointerException
>       at org.apache.hadoop.hbase.client.ConnectionUtils.getStubKey(ConnectionUtils.java:229)
>       at org.apache.hadoop.hbase.client.ConnectionImplementation.getAdmin(ConnectionImplementation.java:1175)
>       at org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.getAdmin(ConnectionUtils.java:149)
>       at org.apache.hadoop.hbase.master.assignment.Util.getRegionInfoResponse(Util.java:59)
>       at org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.checkSplittable(SplitTableRegionProcedure.java:146)
>       at org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.<init>(SplitTableRegionProcedure.java:103)
>       at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createSplitProcedure(AssignmentManager.java:761)
>       at org.apache.hadoop.hbase.master.HMaster$2.run(HMaster.java:1626)
>       at org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:134)
>       at org.apache.hadoop.hbase.master.HMaster.splitRegion(HMaster.java:1618)
>       at org.apache.hadoop.hbase.master.MasterRpcServices.splitRegion(MasterRpcServices.java:778)
>       at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:404)
>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>       at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>       at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> {noformat}
> {code}
>   public static void main(String[] args) throws Exception {
>     Configuration conf = HBaseConfiguration.create();
>     try (Connection conn = ConnectionFactory.createConnection(conf);
>         Admin admin = conn.getAdmin()) {
>       final TableName tn = TableName.valueOf("myTestTable");
>       if (admin.tableExists(tn)) {
>         admin.disableTable(tn);
>         admin.deleteTable(tn);
>       }
>       final TableDescriptor desc = TableDescriptorBuilder.newBuilder(tn)
>           .addColumnFamily(ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes("f1")).build())
>           .build();
>       admin.createTable(desc);
>       List<String> splitPoints = new ArrayList<>(16);
>       for (int i = 1; i <= 16; i++) {
>         splitPoints.add(Integer.toString(i, 16));
>       }
>       
>       System.out.println("Splits: " + splitPoints);
>       int numRegions = admin.getRegions(tn).size();
>       for (String splitPoint : splitPoints) {
>         System.out.println("Splitting on " + splitPoint);
>         admin.split(tn, Bytes.toBytes(splitPoint));
>         Thread.sleep(200);
>         int newRegionSize = admin.getRegions(tn).size();
>         while (numRegions == newRegionSize) {
>           Thread.sleep(50);
>           newRegionSize = admin.getRegions(tn).size();
>         }
>       }
>     }
>   }
> {code}
> At a quick glance, it looks like {{Util.getRegionInfoResponse}} is to blame.
> {code}
>   static GetRegionInfoResponse getRegionInfoResponse(final MasterProcedureEnv env,
>       final ServerName regionLocation, final RegionInfo hri, boolean includeBestSplitRow)
>   throws IOException {
>     // TODO: There is no timeout on this controller. Set one!
>     HBaseRpcController controller = env.getMasterServices().getClusterConnection().
>         getRpcControllerFactory().newController();
>     final AdminService.BlockingInterface admin =
>         env.getMasterServices().getClusterConnection().getAdmin(regionLocation);
> {code}
> We don't validate that we have a non-null {{ServerName regionLocation}}.
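
For illustration only, the missing validation described above might look 
roughly like the sketch below. This is not the actual patch. To keep it 
runnable without HBase on the classpath, {{ServerName}} and {{getAdmin}} here 
are hypothetical stand-ins for the real classes, and the choice of a plain 
{{IOException}} and its message are assumptions.

```java
import java.io.IOException;

public class RegionLocationGuard {

  /** Hypothetical stand-in for org.apache.hadoop.hbase.ServerName. */
  public static final class ServerName {
    final String hostAndPort;

    public ServerName(String hostAndPort) {
      this.hostAndPort = hostAndPort;
    }
  }

  /**
   * Hypothetical stand-in for ClusterConnection.getAdmin(ServerName).
   * Like the call in the stack trace, it dereferences the server name,
   * so a null argument fails with a NullPointerException.
   */
  public static String getAdmin(ServerName regionLocation) {
    return "admin@" + regionLocation.hostAndPort;
  }

  /**
   * Sketch of a guarded lookup: reject a null location up front with a
   * checked exception instead of letting the NPE escape to the RPC handler.
   */
  public static String getRegionInfoResponse(ServerName regionLocation, String regionName)
      throws IOException {
    if (regionLocation == null) {
      // Assumption: a region with no known location cannot be queried right now.
      throw new IOException("No location known for region " + regionName);
    }
    return getAdmin(regionLocation);
  }

  public static void main(String[] args) {
    try {
      System.out.println(getRegionInfoResponse(new ServerName("rs1:16020"), "regionA"));
      System.out.println(getRegionInfoResponse(null, "regionB"));
    } catch (IOException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

With a check like this the caller sees a checked exception it can handle or 
report back to the client, rather than an "Unexpected throwable object" in 
the Master log.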



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
