Andrew Purtell created HBASE-20895:
--------------------------------------

             Summary: NPE in RpcServer#readAndProcess
                 Key: HBASE-20895
                 URL: https://issues.apache.org/jira/browse/HBASE-20895
             Project: HBase
          Issue Type: Bug
          Components: rpc
    Affects Versions: 1.3.2
            Reporter: Andrew Purtell
            Assignee: Monani Mihir
             Fix For: 1.5.0, 1.3.3, 1.4.6


{noformat}
2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - RpcServer.listener,port=60020: Caught exception while reading:
java.lang.NullPointerException
        at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
        at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
        at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
        at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{noformat}

This looks like it could be a use-after-close problem if there is concurrent 
access to a Connection.

In process() we might store a null back to the 'data' field.

Meanwhile, in readAndProcess() we can be blocked on a channel read. If a null 
is written back to 'data' in the meantime, then after the read returns we go on 
to use 'data' and hit an NPE.

{quote} 
            count = channelRead(channel, data);
1761 --->   if (count >= 0 && *data.remaining()* == 0) { // count==0 if dataLength == 0
              process();
            }
{quote} 

Whether an NPE happens depends on the timing of the store back to 'data' in 
another thread relative to the use of 'data' in this thread, and on whether the 
JVM has optimized away a reload of 'data' (it is not declared volatile).
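
To make the suspected interleaving concrete, here is a minimal, single-threaded 
sketch. It is not the RpcServer code: Conn is a hypothetical stand-in for 
RpcServer.Connection, and the duringRead hook is an invented test device that 
plays the part of the other thread acting while this thread is blocked in the 
channel read. Only the shape of readAndProcess() follows the snippet quoted 
above.

```java
import java.nio.ByteBuffer;

class UseAfterNullSketch {
    // Hypothetical stand-in for RpcServer.Connection (assumption, not the real class).
    static class Conn {
        ByteBuffer data = ByteBuffer.allocate(4); // not volatile, as in the report
        Runnable duringRead = () -> {};            // invented hook: simulates another thread

        // Pretends the channel filled the buffer, then lets the "other thread" run
        // before the read returns to the caller.
        int channelRead(ByteBuffer buf) {
            buf.put(new byte[buf.remaining()]);
            duringRead.run();
            return buf.position();
        }

        // Mirrors the report: process() may store a null back to 'data'.
        void process() {
            data = null;
        }

        boolean readAndProcess() {
            int count = channelRead(data);
            // The field is read again here; if it was nulled meanwhile, this throws.
            if (count >= 0 && data.remaining() == 0) {
                process();
                return true;
            }
            return false;
        }
    }

    static boolean throwsNpe(Conn c) {
        try {
            c.readAndProcess();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Conn uncontended = new Conn();
        System.out.println("uncontended NPE: " + throwsNpe(uncontended));

        Conn contended = new Conn();
        contended.duringRead = contended::process; // "other thread" nulls 'data' mid-read
        System.out.println("contended NPE: " + throwsNpe(contended));
    }
}
```

In a real multi-threaded run the outcome would be timing dependent, as noted 
above. The hook only forces the unlucky ordering every time.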

We should do a null check here just to be defensive. We should also look at 
whether the concurrent access to the Connection is intended.
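
One possible shape for the defensive check, as a sketch only and not a proposed 
patch: read the field into a local once after the channel read returns, and 
null-check that local before dereferencing it. As before, Conn and the 
duringRead hook are hypothetical stand-ins for illustration, not RpcServer 
code.

```java
import java.nio.ByteBuffer;

class NullCheckSketch {
    // Hypothetical stand-in for RpcServer.Connection (assumption, not the real class).
    static class Conn {
        ByteBuffer data = ByteBuffer.allocate(4);
        Runnable duringRead = () -> {}; // invented hook: simulates another thread

        int channelRead(ByteBuffer buf) {
            buf.put(new byte[buf.remaining()]); // pretend the channel filled the buffer
            duringRead.run();                   // "other thread" acts before we return
            return buf.position();
        }

        void process() {
            data = null;
        }

        boolean readAndProcess() {
            ByteBuffer buf = data;        // snapshot before the read
            if (buf == null) {
                return false;             // connection already cleaned up
            }
            int count = channelRead(buf);
            ByteBuffer current = data;    // single read of the field after the read
            // Check and use the same local, so a later store of null cannot
            // slip in between the null check and the dereference.
            if (count >= 0 && current != null && current.remaining() == 0) {
                process();
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Conn c = new Conn();
        c.duringRead = () -> c.data = null; // "other thread" nulls 'data' mid-read
        System.out.println("processed: " + c.readAndProcess());
    }
}
```

This only avoids the NullPointerException. It does not answer the second 
question of whether concurrent access to the Connection is intended in the 
first place.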



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
