[
https://issues.apache.org/jira/browse/HBASE-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065868#comment-13065868
]
ramkrishna.s.vasudevan commented on HBASE-4101:
-----------------------------------------------
Before submitting the patch, I would like to share my analysis of why the Date
object is the problem.
MemStoreFlusher.flushOneForGlobalPressure() and
MemStoreFlusher.reclaimMemStoreMemory() are the
two methods where the problem occurs.
As part of flushOneForGlobalPressure(), flushRegions() gets called. Here the lock
{noformat}
lock.lock();
{noformat}
is obtained.
Then the flow goes into server.compactSplitThread.requestCompaction(),
where the region is added to the CompactionQueue.
In PriorityCompactionQueue.addToRegionsInQueue(), we try to log the
newRequest.
Logging it internally invokes toString() on the Date object, because the
request's own toString() concatenates the date into the log message:
{noformat}
public String toString() {
  return "regionName=" + r.getRegionNameAsString() +
    ", priority=" + p + ", date=" + date;
}
{noformat}
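Since the log line only needs some timestamp, one conceivable way to keep Date.toString() out of this path is to store and print a primitive time value instead. The following is only a sketch under that assumption, not the patch itself; the class name CompactionRequestSketch and its fields are hypothetical and are not HBase code.

```java
// Hypothetical stand-in for a compaction request; not the HBase class.
class CompactionRequestSketch {
    private final String regionName;
    private final int priority;
    // A primitive timestamp: concatenating it never touches Date,
    // TimeZone or ResourceBundle.
    private final long timeInNanos;

    CompactionRequestSketch(String regionName, int priority, long timeInNanos) {
        this.regionName = regionName;
        this.priority = priority;
        this.timeInNanos = timeInNanos;
    }

    @Override
    public String toString() {
        // Only String, int and long are concatenated here.
        return "regionName=" + regionName +
            ", priority=" + priority + ", time=" + timeInNanos;
    }

    public static void main(String[] args) {
        System.out.println(new CompactionRequestSketch("r1", 3, System.nanoTime()));
    }
}
```

The trade-off is a less readable log entry, since the value is no longer a formatted date.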
Internally, the Date object uses ResourceBundle, where the endLoading() method
{noformat}
Thread me = Thread.currentThread();
assert (underConstruction.get(constKey) == me);
underConstruction.remove(constKey);
synchronized (me) {
  me.notifyAll();
}
{noformat}
synchronizes on the current thread (here, the MemStoreFlusher), so it needs
the MemStoreFlusher's monitor.
Now, in parallel, MemStoreFlusher.reclaimMemStoreMemory() is called,
which is itself synchronized.
So the other thread has obtained the MemStoreFlusher monitor and waits to
obtain the lock at
{noformat}
if (isAboveHighWaterMark()) {
  lock.lock();
{noformat}
Meanwhile, the ResourceBundle code, running in the thread that already holds
that lock, waits to get the MemStoreFlusher monitor. Each thread waits for
what the other holds, so this leads to a deadlock.
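The lock-order inversion described above can be reproduced in isolation. This is a minimal sketch, not HBase code: FlusherSketch, reclaim() and demo() are hypothetical names, the two latches exist only to force the interleaving from the analysis, and tryLock() with a timeout stands in for lock.lock() so that the program terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a flusher that is itself a Thread.
class FlusherSketch extends Thread {
    private final ReentrantLock lock = new ReentrantLock();
    private final CountDownLatch lockHeld = new CountDownLatch(1);
    private final CountDownLatch monitorHeld = new CountDownLatch(1);

    // Flush path: explicit lock first, then the thread's own monitor.
    @Override
    public void run() {
        lock.lock();
        try {
            lockHeld.countDown();
            monitorHeld.await();                // wait until reclaim() is running
            Thread me = Thread.currentThread(); // same object as 'this'
            synchronized (me) {                 // blocked while reclaim() runs
                me.notifyAll();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            lock.unlock();
        }
    }

    // Reclaim path: monitor first (synchronized), then the explicit lock.
    // Returns whether the lock could be obtained within the timeout.
    public synchronized boolean reclaim(long timeoutMillis)
            throws InterruptedException {
        monitorHeld.countDown();
        lockHeld.await();                       // the flush path now holds 'lock'
        boolean acquired = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        if (acquired) {
            lock.unlock();
        }
        return acquired;
    }

    // Runs both paths and reports whether reclaim() got the lock.
    static boolean demo() {
        FlusherSketch flusher = new FlusherSketch();
        flusher.start();
        try {
            boolean acquired = flusher.reclaim(200);
            flusher.join();
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("reclaim() obtained lock: " + demo());
    }
}
```

In this sketch reclaim() cannot obtain the lock, because the flush path keeps it while blocked on the monitor that reclaim() holds. With a plain lock.lock() in its place, neither thread would ever proceed.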
> Regionserver Deadlock
> ---------------------
>
> Key: HBASE-4101
> URL: https://issues.apache.org/jira/browse/HBASE-4101
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.90.3
> Environment: CentOS 5.5, CDH3 u0 Hadoop, HBase 0.90.3
> Reporter: Matt Davies
> Priority: Blocker
> Fix For: 0.90.4
>
> Attachments: jstack.txt
>
>
> We periodically see a situation where the regionserver process exists in the
> process list, zookeeper thread sends the keepalive so the master won't remove
> it from the active list, yet the regionserver will not serve data.
> Hadoop(cdh3u0), HBase 0.90.3 (Apache version), under load from an internal
> testing tool.
> Attached is the full JStack