[ https://issues.apache.org/jira/browse/HBASE-10882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956398#comment-13956398 ]

Liang Xie commented on HBASE-10882:
-----------------------------------

It would be better to ask on the mailing list in the future :)
Why do you have so many table pool instances?
bq. "The most ridiculous thing is that NO ONE OWNED THE LOCK! I searched the 
jstack output carefully but could not find any thread that claimed to own the lock."
The thread dump doesn't show the holder of a java.util.concurrent Lock, but you 
could see it if the code used synchronized. :)

> Bulkload process hangs on regions randomly and finally throws 
> RegionTooBusyException
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-10882
>                 URL: https://issues.apache.org/jira/browse/HBASE-10882
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.94.10
>         Environment: rhel 5.6, jdk1.7.0_45, hadoop-2.2.0-cdh5.0.0
>            Reporter: Victor Xu
>         Attachments: jstack_5105.log
>
>
> I came across this problem in the early morning several days ago. It happened 
> when I used the hadoop completebulkload command to bulk load some HDFS files 
> into an HBase table. Several regions hung, and after three retries they all 
> threw RegionTooBusyExceptions. Fortunately, I caught the jstack info of one of 
> the affected regions' HRegionServer process just in time.
> I found that the bulkload process was waiting for a write lock:
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.tryLock(ReentrantReadWriteLock.java:1115)
> The lock id is 0x00000004054ecbf0.
> In the meantime, many other Get/Scan operations were also waiting for the 
> same lock id. And, of course, they were waiting for the read lock:
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(ReentrantReadWriteLock.java:873)
> The most ridiculous thing is that NO ONE OWNED THE LOCK! I searched the jstack 
> output carefully but could not find any thread that claimed to own the lock.
> When I restarted the bulk load process, it failed on different regions but 
> with the same RegionTooBusyExceptions.
> I guess the region was doing some compactions at that time and owned the lock, 
> but I couldn't find any compaction info in the HBase logs.
> Finally, after several days' hard work, the only temporary solution I found 
> was TRIGGERING A MAJOR COMPACTION BEFORE THE BULKLOAD. So which process owned 
> the lock? Has anyone come across the same problem before?
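For what it's worth, the general shape of the failure described above can be reproduced with a bare ReentrantReadWriteLock: while a long-running reader holds the read lock, a writer that only does a timed tryLock (as the bulkload path does) gives up after its timeout, which is roughly how a RegionTooBusyException surfaces. The sketch below shows just that pattern; it is not HBase's actual HRegion code, and the 5-second timeout and exception message are stand-ins.

{code:java}
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Minimal sketch of the contention pattern, not HBase's HRegion code.
// A long-held read lock makes the writer's timed tryLock fail, which is
// roughly the situation that surfaces as a RegionTooBusyException.
public class RegionBusySketch {

    private static final ReentrantReadWriteLock LOCK = new ReentrantReadWriteLock();

    public static void main(String[] args) throws Exception {
        // Stand-in for a long-running operation that holds the read lock.
        Thread reader = new Thread(new Runnable() {
            public void run() {
                LOCK.readLock().lock();
                try {
                    Thread.sleep(30000);    // hold the read lock for a long time
                } catch (InterruptedException ignored) {
                } finally {
                    LOCK.readLock().unlock();
                }
            }
        }, "long-reader");
        reader.start();
        Thread.sleep(100);                  // let the reader acquire the lock first

        // Stand-in for the bulkload path: it needs the write lock but only
        // waits a bounded time before giving up.
        if (!LOCK.writeLock().tryLock(5, TimeUnit.SECONDS)) {
            throw new RuntimeException("could not take the write lock: region too busy");
        }
        try {
            System.out.println("got the write lock, bulkload would proceed");
        } finally {
            LOCK.writeLock().unlock();
        }
    }
}
{code}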

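And for anyone scripting the workaround mentioned above, requesting a major compaction from the client API before kicking off completebulkload could look roughly like the sketch below (0.94-era HBaseAdmin; "my_table" is a placeholder, and majorCompact is asynchronous, so you still need to wait for the compaction to finish before loading):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

// Rough sketch of the workaround: request a major compaction of the
// target table before running completebulkload. "my_table" is a placeholder.
public class CompactBeforeBulkload {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
            // Asynchronous request; give the compaction time to finish
            // (or poll cluster status) before starting the bulk load.
            admin.majorCompact("my_table");
        } finally {
            admin.close();
        }
    }
}
{code}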


--
This message was sent by Atlassian JIRA
(v6.2#6252)
