[jira] [Created] (HBASE-10882) Bulkload process hangs on regions randomly and finally throws RegionTooBusyException

Victor Xu (JIRA) Mon, 31 Mar 2014 19:58:28 -0700

Victor Xu created HBASE-10882:
---------------------------------

             Summary: Bulkload process hangs on regions randomly and finally 
throws RegionTooBusyException
                 Key: HBASE-10882
                 URL: https://issues.apache.org/jira/browse/HBASE-10882
             Project: HBase
          Issue Type: Bug
          Components: regionserver
    Affects Versions: 0.94.10
         Environment: rhel 5.6, jdk1.7.0_45, hadoop-2.2.0-cdh5.0.0
            Reporter: Victor Xu

I ran into this problem in the early morning several days ago. It happened
when I used the hadoop completebulkload command to bulk load some HDFS files
into an HBase table. Several regions hung, and after three retries they all
threw RegionTooBusyException. Fortunately, I captured the jstack output of the
HRegionServer process hosting one of the affected regions just in time.
I found that the bulkload process was waiting for a write lock:
at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.tryLock(ReentrantReadWriteLock.java:1115)
The lock id is 0x00000004054ecbf0.
In the meantime, many other Get/Scan operations were also waiting for the same
lock id. And, of course, they were waiting for the read lock:
at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(ReentrantReadWriteLock.java:873)
The strangest part is that NO ONE OWNED THE LOCK! I searched the jstack
output carefully, but could not find any thread that claimed to own the lock.
When I restarted the bulk load process, it failed on different regions but
with the same RegionTooBusyException.
My guess is that the region was running a compaction at that time and held the
lock, but I couldn’t find any compaction entries in the HBase logs.
Finally, after several days of hard work, I found only one temporary
workaround: TRIGGERING A MAJOR COMPACTION BEFORE THE BULKLOAD.
So which thread owned the lock? Has anyone come across the same problem before?
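
One way to reason about the "no one owned the lock" symptom, offered only as an
assumption and not as a diagnosis of this issue: a ReentrantReadWriteLock
records an owner thread only for the write lock, so a thread that holds the
read lock for a long time does not appear as the lock's owner. A minimal Java
sketch of that situation follows. The class name, thread names, and timeouts
are hypothetical and are not taken from HBase code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; it only illustrates java.util.concurrent behavior.
class ReadLockHolderDemo {

    /** Returns {writerAcquired, lateReaderAcquired, writeLockHeld}. */
    static boolean[] run() {
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        final CountDownLatch readHeld = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);
        final boolean[] result = new boolean[3];

        // Stands in for any operation that keeps the read lock for a long time.
        Thread holder = new Thread(() -> {
            lock.readLock().lock();
            try {
                readHeld.countDown();
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.readLock().unlock();
            }
        }, "long-running-reader");

        // Stands in for the bulk load: a timed attempt on the write lock.
        Thread writer = new Thread(() -> {
            try {
                if (lock.writeLock().tryLock(1000, TimeUnit.MILLISECONDS)) {
                    result[0] = true;
                    lock.writeLock().unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "bulkload-like-writer");

        try {
            holder.start();
            readHeld.await();
            writer.start();
            // Wait until the writer is parked in the lock's wait queue.
            while (writer.isAlive()
                    && writer.getState() != Thread.State.TIMED_WAITING) {
                Thread.sleep(5);
            }
            // A later timed read attempt queues behind the waiting writer.
            boolean gotRead = lock.readLock().tryLock(100, TimeUnit.MILLISECONDS);
            result[1] = gotRead;
            if (gotRead) {
                lock.readLock().unlock();
            }
            // No thread holds the write lock, although everyone is blocked.
            result[2] = lock.isWriteLocked();
            writer.join();
            release.countDown();
            holder.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("writer acquired write lock: " + r[0]);
        System.out.println("late reader acquired read lock: " + r[1]);
        System.out.println("write lock held by anyone: " + r[2]);
    }
}
```

In this sketch the writer and the late reader both time out while
isWriteLocked() stays false, which resembles the thread dump described above.
Whether a long-held read lock is what actually happened on the region server
is not established by this report.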

--
This message was sent by Atlassian JIRA
(v6.2#6252)
