[ 
https://issues.apache.org/jira/browse/HBASE-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4890:
-------------------------

    Attachment: 4890.txt

The NPE is happening in j-d's artificial case because we're doing a bulk open 
of 3k regions and it's taking a little while to complete; i.e. longer than the 
rpc timeout.  There is no error though, because this is a client running in the 
master and it's connecting to a single regionserver and doing meta scans in the 
meantime etc., updating last activity on the connection... so we're not running 
into a socket timeout, which it looks like is the expectation here: that there 
MUST be an exception outstanding if a Call has been running for > rpctimeout.
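
To make the failure mode concrete, here is a minimal sketch of a waiter that 
does not assume an exception is outstanding once the timeout has passed. All 
names here (RpcWaitSketch, PendingCall, await) are hypothetical; this is not 
the HBase client code and not the attached patch, just an illustration of 
"timed out, but nobody recorded an error".

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

// Hypothetical illustration only; not the HBase client classes.
final class RpcWaitSketch {

    // Stand-in for a pending RPC: it ends with either a value or an error.
    static final class PendingCall {
        private Object value;
        private IOException error;
        private boolean done;

        synchronized void complete(Object v) { value = v; done = true; notifyAll(); }
        synchronized void fail(IOException e) { error = e; done = true; notifyAll(); }
    }

    // Waits up to timeoutMs for the call to finish.
    static Object await(PendingCall call, long timeoutMs)
            throws IOException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        synchronized (call) {
            while (!call.done) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    break; // deadline passed without a response
                }
                call.wait(remaining);
            }
            if (call.error != null) {
                throw call.error; // the connection reported a failure
            }
            if (!call.done) {
                // The case discussed above: the call overran the timeout,
                // yet no error was recorded because the connection itself
                // stayed active. Raise an explicit timeout instead of
                // letting a null result propagate to the caller.
                throw new SocketTimeoutException(
                    "call exceeded " + timeoutMs + " ms with no response or error");
            }
            return call.value;
        }
    }
}
```

The point of the explicit check is that a caller which dereferences the result 
never sees a null that merely means "still running".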

Cosmin sees the exact stacktrace that Jon originally uploaded, so we'll try 
this patch on his cluster. (Cosmin also speculates this NPE happens only in 
the extreme, e.g. under ycsb or when opening 3k regions; he is seeing it only 
when he runs an extreme load test on his cluster.)
                
> fix possible NPE in HConnectionManager
> --------------------------------------
>
>                 Key: HBASE-4890
>                 URL: https://issues.apache.org/jira/browse/HBASE-4890
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: Jonathan Hsieh
>            Priority: Blocker
>             Fix For: 0.92.1
>
>         Attachments: 4890.txt, splits.txt
>
>
> I was running YCSB against a 0.92 branch and encountered this error message:
> {code}
> 11/11/29 08:47:16 WARN client.HConnectionManager$HConnectionImplementation: Failed all from region=usertable,user3917479014967760871,1322555655231.f78d161e5724495a9723bcd972f97f41., hostname=c0316.hal.cloudera.com, port=57020
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.NullPointerException
>         at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1501)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1353)
>         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:898)
>         at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:775)
>         at org.apache.hadoop.hbase.client.HTable.put(HTable.java:750)
>         at com.yahoo.ycsb.db.HBaseClient.update(Unknown Source)
>         at com.yahoo.ycsb.DBWrapper.update(Unknown Source)
>         at com.yahoo.ycsb.workloads.CoreWorkload.doTransactionUpdate(Unknown Source)
>         at com.yahoo.ycsb.workloads.CoreWorkload.doTransaction(Unknown Source)
>         at com.yahoo.ycsb.ClientThread.run(Unknown Source)
> Caused by: java.lang.RuntimeException: java.lang.NullPointerException
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithoutRetries(HConnectionManager.java:1315)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1327)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1325)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:158)
>         at $Proxy4.multi(Unknown Source)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1330)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1328)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithoutRetries(HConnectionManager.java:1309)
>         ... 7 more
> {code}
> It looks like the NPE is caused by server being null in the MultiResponse 
> call() method.
> {code}
>      public MultiResponse call() throws IOException {
>          return getRegionServerWithoutRetries(
>              new ServerCallable<MultiResponse>(connection, tableName, null) {
>                public MultiResponse call() throws IOException {
>                  return server.multi(multi);
>                }
>                @Override
>                public void connect(boolean reload) throws IOException {
>                  server =
>                    connection.getHRegionConnection(loc.getHostname(), loc.getPort());
>                }
>              }
>          );
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
