[
https://issues.apache.org/jira/browse/HBASE-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406932#comment-15406932
]
Enis Soztutar commented on HBASE-16138:
---------------------------------------
I was exemplifying a case where if we do "lazy registration" by having the
handler threads that is doing the mutations to wait for an RPC, we are
introducing the RPC-scheduling dependency which might be the source of the dead
lock. If all of the handlers are blocked on an outgoing RPC, and that outgoing
RPC is to the same server, then we cannot make progress.
I did not check the patch (and other previous changes), but if we are
introducing a blocking RPC in region open as well, then whether there maybe
deadlock or not will depend on whether the region is opened only in master or
can be opened in region servers as well. Region open thread pool is also a
fixed sized pool, so if the region open request for the replication table
region gets queued up after other region opens we will end up with deadlock
state. If replication table can only live in master this may be avoided I
think.
> Cannot open regions after non-graceful shutdown due to deadlock with
> Replication Table
> --------------------------------------------------------------------------------------
>
> Key: HBASE-16138
> URL: https://issues.apache.org/jira/browse/HBASE-16138
> Project: HBase
> Issue Type: Sub-task
> Components: Replication
> Reporter: Joseph
> Assignee: Joseph
> Priority: Critical
> Attachments: HBASE-16138.patch
>
>
> If we shutdown an entire HBase cluster and attempt to start it back up, we
> have to run the WAL pre-log roll that occurs before opening up a region. Yet
> this pre-log roll must record the new WAL inside of ReplicationQueues. This
> method call ends up blocking on
> TableBasedReplicationQueues.getOrBlockOnReplicationTable(), because the
> Replication Table is not up yet. And we cannot assign the Replication Table
> because we cannot open any regions. This ends up deadlocking the entire
> cluster whenever we lose Replication Table availability.
> There are a few options that we can do, but none of them seem very good:
> 1. Depend on Zookeeper-based Replication until the Replication Table becomes
> available
> 2. Have a separate WAL for System Tables that does not perform any
> replication (see discussion at HBASE-14623)
> Or just have a seperate WAL for non-replicated vs replicated
> regions
> 3. Record the WAL log in the ReplicationQueue asynchronously (don't block
> opening a region on this event), which could lead to inconsistent Replication
> state
> The stacktrace:
>
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.recordLog(ReplicationSourceManager.java:376)
>
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.preLogRoll(ReplicationSourceManager.java:348)
>
> org.apache.hadoop.hbase.replication.regionserver.Replication.preLogRoll(Replication.java:370)
>
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.tellListenersAboutPreLogRoll(FSHLog.java:637)
>
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:701)
>
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:600)
>
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.<init>(FSHLog.java:533)
>
> org.apache.hadoop.hbase.wal.DefaultWALProvider.getWAL(DefaultWALProvider.java:132)
>
> org.apache.hadoop.hbase.wal.RegionGroupingProvider.getWAL(RegionGroupingProvider.java:186)
>
> org.apache.hadoop.hbase.wal.RegionGroupingProvider.getWAL(RegionGroupingProvider.java:197)
> org.apache.hadoop.hbase.wal.WALFactory.getWAL(WALFactory.java:240)
>
> org.apache.hadoop.hbase.regionserver.HRegionServer.getWAL(HRegionServer.java:1883)
>
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:363)
>
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:129)
>
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> java.lang.Thread.run(Thread.java:745)
> Does anyone have any suggestions/ideas/feedback?
> Attached a review board at: https://reviews.apache.org/r/50546/
> It is still pretty rough, would just like some feedback on it.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)