Thanks for the analysis. 

Do you mind opening a Jira?



On Jan 10, 2012, at 7:51 AM, Yves Langisch <[email protected]> wrote:

> Still happens with HBase 0.90.5/Hadoop 1.0.0, but I think I have some more 
> insight into this issue. Here is an up-to-date stack trace:
> 
> java.lang.NullPointerException
>        at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:986)
>        at org.apache.hadoop.hbase.regionserver.HRegionServer.lockRow(HRegionServer.java:2008)
>        at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
> Caused by: java.lang.NullPointerException
>        at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
>        at org.apache.hadoop.hbase.regionserver.HRegionServer.addRowLock(HRegionServer.java:2018)
>        at org.apache.hadoop.hbase.regionserver.HRegionServer.lockRow(HRegionServer.java:2004)
>        ... 5 more
> 
> After checking the source code, I noticed that the value about to be put 
> into the ConcurrentHashMap can be null when the waitForLock flag is true or 
> the rowLockWaitDuration has expired (HRegion#internalObtainRowLock, 
> line 2111ff). I think the latter is what happens in our case, as we have 
> heavy load hitting the server.
> 
> IMHO this case should be handled somehow and must not lead to an NPE.
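
To make the failure mode above concrete, here is a minimal, self-contained sketch. It is not the HRegionServer code: the class RowLockSketch, its map, and both method names are made up for illustration. The only fact it relies on is that ConcurrentHashMap.put rejects a null value with a NullPointerException, which matches the "Caused by" part of the trace. The guarded variant shows one possible way to handle the case, assuming a failed lock acquisition is reported to the caller as a null lock id.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class RowLockSketch {
    // Hypothetical stand-in for the server-side lock table (lock name -> lock id).
    static final Map<String, Integer> ROW_LOCKS = new ConcurrentHashMap<>();

    // Unguarded registration: a null lock id goes straight to
    // ConcurrentHashMap.put, which throws NullPointerException for null values.
    public static void addRowLockUnguarded(String lockName, Integer lockId) {
        ROW_LOCKS.put(lockName, lockId);
    }

    // Guarded registration: a null lock id (assumed here to mean "lock not
    // obtained") is turned into an IOException with a readable message.
    public static void addRowLockGuarded(String lockName, Integer lockId)
            throws IOException {
        if (lockId == null) {
            throw new IOException("Could not obtain row lock for " + lockName);
        }
        ROW_LOCKS.put(lockName, lockId);
    }

    public static void main(String[] args) {
        try {
            addRowLockUnguarded("row-1", null);
        } catch (NullPointerException e) {
            System.out.println("unguarded: NullPointerException");
        }
        try {
            addRowLockGuarded("row-1", null);
        } catch (IOException e) {
            System.out.println("guarded: " + e.getMessage());
        }
    }
}
```

With a check like this, the client would presumably see a descriptive IOException instead of a bare NPE; whether an IOException is the right thing to throw here is a design question for the Jira.
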
> 
> -
> Yves
> 
> On Dec 30, 2011, at 12:12 PM, Yves Langisch wrote:
> 
>> Still happens, but before I add any debugging information I'll try 
>> deploying the new version, 0.90.5.
>> 
>> -
>> Yves
>> 
>> On Dec 18, 2011, at 12:08 AM, Stack wrote:
>> 
>>> On Fri, Dec 16, 2011 at 8:20 AM, Yves Langisch <[email protected]> wrote:
>>>> I'm using the async hbase client (1.0) and there is no way to choose a 
>>>> lockId on my own:
>>>> 
>>>> <snippet>
>>>> return database.client().lockRow(
>>>>         new RowLockRequest(TableManager.ID_TABLE_NAME, MAXID_ROW)).join();
>>>> 
>>>> </snippet>
>>>> 
>>>> Any ideas what else could be wrong here?
>>>> 
>>> 
>>> Looking at the code on regionserver side,
>>> http://svn.apache.org/viewvc/hbase/tags/0.90.4/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java?view=markup,
>>> down around line 1994, it's unlikely the region is null since we should
>>> throw NotServingRegionException if we can't find the region (and we check
>>> for a null region name a few lines up), so maybe it's something in the way
>>> we do obtainRowLock on line 1995?
>>> 
>>> Any chance of your instrumenting the regionserver?  Adding a bit of
>>> debugging and deploying the debugging regionserver?
>>> 
>>> My guess is we haven't seen this before because not many use rowlocks
>>> (rowlocks, if long-lived and with lots of contending clients, could freeze
>>> you out of the server; each client blocked waiting on a rowlock to clear
>>> occupies a handler, of which there are a bounded number).
>>> 
>>> St.Ack
>>> 
>> 
> 
