[ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20895:
-----------------------------------
    Attachment: HBASE-20895-branch-1.patch

> NPE in RpcServer#readAndProcess
> -------------------------------
>
>                 Key: HBASE-20895
>                 URL: https://issues.apache.org/jira/browse/HBASE-20895
>             Project: HBase
>          Issue Type: Bug
>          Components: rpc
>    Affects Versions: 1.3.2
>            Reporter: Andrew Purtell
>            Assignee: Monani Mihir
>            Priority: Major
>             Fix For: 1.5.0, 1.3.3, 1.4.7
>
>         Attachments: HBASE-20895-branch-1.patch, HBASE-20895-branch-1.patch
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
>         at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
>         at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
>         at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
>         at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This looks like it could be a use-after-close problem if there is concurrent 
> access to a Connection.
> In process() we might store a null back to the 'data' field.
> Meanwhile, in readAndProcess() we have a case where we might be blocked on a 
> channel read, and after coming back from the read we go on to use 'data' 
> after a null has been written back, leading to an NPE.
> {quote}
> count = channelRead(channel, data);
> 1761 ---> if (count >= 0 && *data.remaining()* == 0) { process(); }
> {quote}
> Whether an NPE happens or not depends on the timing of the store back to 
> 'data' in another thread relative to the use of 'data' in this thread, and on 
> whether or not the JVM has optimized away a reload of 'data' (it's not 
> declared volatile).
> We should do a null check here just to be defensive. We should also look at 
> whether concurrent access to the Connection is happening and intended. The 
> above is just a theory. We should also look at other execution sequences that 
> could lead to 'data' being null in this location. At a glance I didn't find 
> one, but the store to 'data' happens behind conditionals, so it is possible.
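
The defensive null check suggested above could look roughly like the following. This is a minimal sketch and not the attached patch: the Connection class here is a hypothetical stand-in for RpcServer.Connection, channelRead() is simplified to fill the buffer, the clearDuringRead flag exists only to simulate another thread storing null into 'data' during the read, and returning -1 for a null buffer is an assumption about how the caller should treat it. The idea is to read the field once into a local and use only the local afterwards.

```java
import java.nio.ByteBuffer;

public class NullSafeReadSketch {

    /** Hypothetical stand-in for RpcServer.Connection; not the real HBase class. */
    static class Connection {
        private ByteBuffer data = ByteBuffer.allocate(4);
        boolean clearDuringRead = false; // hook: simulate another thread's null store
        int processed = 0;

        /** Simplified stand-in for channelRead(channel, data): fills the buffer. */
        private int channelRead(ByteBuffer buf) {
            int n = buf.remaining();
            buf.position(buf.limit());
            if (clearDuringRead) {
                data = null; // a null is written back while we were "blocked" in the read
            }
            return n;
        }

        private void process(ByteBuffer buf) {
            processed++;
            data = null; // as in the report, process() may store null back to 'data'
        }

        int readAndProcess() {
            ByteBuffer buf = data; // read the field exactly once
            if (buf == null) {
                return -1; // assumption: signal the caller to treat the connection as closed
            }
            int count = channelRead(buf);
            // Use the local, not the field, so a null store in between cannot cause an NPE here.
            if (count >= 0 && buf.remaining() == 0) {
                process(buf);
            }
            return count;
        }
    }

    public static void main(String[] args) {
        Connection c = new Connection();
        c.clearDuringRead = true;
        System.out.println(c.readAndProcess()); // 4: read completes despite the null store
        System.out.println(c.readAndProcess()); // -1: 'data' is null, no NPE
    }
}
```

A local snapshot only avoids the NPE at this one location. It does not answer whether concurrent access to the Connection is intended, which still needs to be checked separately.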



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
