[ 
https://issues.apache.org/jira/browse/HBASE-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15380283#comment-15380283
 ] 

Joseph edited comment on HBASE-16138 at 7/15/16 11:10 PM:
----------------------------------------------------------

After revisiting this problem with [~ashu210890] and [~mantonov], I realized 
that step #2 (Do not register WAL queues for Replication Table regions) in the 
original plan will be much harder than expected, because we do not pass region 
info down into WAL creation. So we cannot selectively skip registering WALs for 
non-replicated tables inside Replication.preLogRoll(). There is also the issue 
that we would have to register a WAL after its creation if a replicated region 
is later assigned to it. I think there are 4 viable solutions at this point. 
1. Lazy registration of WALs. Instead of relying on attaching Replication as a 
listener on the WAL pre-log roll, we would only register a WAL in the 
Replication Table when a replicated region attempts to write to that WAL. 
2. Separate WAL for the Replication Table. Like Meta, which has its own WAL, 
the Replication Table should never be replicated, so it differs from all other 
tables.
3. Separate WAL for System Tables. Most System Tables should not be replicated. 
And if a user wants to use Table Based Replication, they cannot replicate 
System Tables.
4. Have a separate WAL for replicated regions. We would have to create a new 
WAL GroupingStrategy based on replication scope. This might also lead to some 
complexities when we add and remove replication on certain column families.
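
To make option 1 more concrete, here is a rough sketch of what lazy 
registration could look like. This is not HBase code: LazyWalRegistrar, 
QueueStore, and beforeAppend are hypothetical names made up for illustration, 
and WALs are identified by plain strings. The only point is that registration 
moves from the pre-log roll to the first write by a replicated region.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical sketch: register a WAL for replication on first replicated write. */
final class LazyWalRegistrar {

    /** Stand-in for the replication queue storage; not an actual HBase interface. */
    interface QueueStore {
        void addLog(String walName);
    }

    private final QueueStore store;
    // WAL names that have already been recorded in the queue store.
    private final ConcurrentMap<String, Boolean> registered = new ConcurrentHashMap<>();

    LazyWalRegistrar(QueueStore store) {
        this.store = store;
    }

    /**
     * Assumed to be called on the write path, before an edit is appended,
     * instead of from a pre-log-roll listener.
     */
    void beforeAppend(String walName, boolean regionIsReplicated) {
        if (!regionIsReplicated) {
            return; // non-replicated regions never touch the queue store
        }
        // computeIfAbsent runs the registration at most once per WAL name.
        // If addLog throws, no mapping is recorded, so a later write retries.
        registered.computeIfAbsent(walName, name -> {
            store.addLog(name);
            return Boolean.TRUE;
        });
    }

    boolean isRegistered(String walName) {
        return registered.containsKey(walName);
    }
}
```

With something like this, opening a region and creating its WAL would not 
touch the Replication Table at all. The blocking call would only happen on the 
first replicated write, by which point the Replication Table regions (which 
are not replicated) can already have been opened.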

Does anyone have any thoughts or suggestions?



was (Author: vegetable26):
After revisiting this problem with [~ashu210890] and [~mantonov], I realized 
that step #2 (Do not register WAL queues for Replication Table regions) in the 
original plan will be much harder than expected, because we do not pass region 
info down into WAL creation. I think there are 4 viable solutions at this 
point. 
1. Lazy registration of WALs. Instead of relying on attaching Replication as a 
listener on the WAL pre-log roll, we would only register a WAL in the 
Replication Table when a replicated region attempts to write to that WAL. 
2. Separate WAL for the Replication Table. Like Meta, which has its own WAL, 
the Replication Table should never be replicated, so it differs from all other 
tables.
3. Separate WAL for System Tables. Most System Tables should not be replicated. 
And if a user wants to use Table Based Replication, they cannot replicate 
System Tables.
4. Have a separate WAL for replicated regions. We would have to create a new 
WAL GroupingStrategy based on replication scope. This might also lead to some 
complexities when we add and remove replication on certain column families.

Does anyone have any thoughts or suggestions?


> Cannot open regions after non-graceful shutdown due to deadlock with 
> Replication Table
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-16138
>                 URL: https://issues.apache.org/jira/browse/HBASE-16138
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Replication
>            Reporter: Joseph
>            Assignee: Joseph
>            Priority: Critical
>
> If we shut down an entire HBase cluster and attempt to start it back up, we 
> have to run the WAL pre-log roll that occurs before opening up a region. Yet 
> this pre-log roll must record the new WAL inside of ReplicationQueues. This 
> method call ends up blocking on 
> TableBasedReplicationQueues.getOrBlockOnReplicationTable(), because the 
> Replication Table is not up yet. And we cannot assign the Replication Table 
> because we cannot open any regions. This ends up deadlocking the entire 
> cluster whenever we lose Replication Table availability. 
> There are a few options available, but none of them seems very good:
> 1. Depend on ZooKeeper-based Replication until the Replication Table becomes 
> available
> 2. Have a separate WAL for System Tables that does not perform any 
> replication (see discussion at HBASE-14623), 
> or just have a separate WAL for non-replicated vs replicated regions
> 3. Record the WAL log in the ReplicationQueue asynchronously (don't block 
> opening a region on this event), which could lead to inconsistent Replication 
> state
> The stacktrace:
>         org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.recordLog(ReplicationSourceManager.java:376)
>         org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.preLogRoll(ReplicationSourceManager.java:348)
>         org.apache.hadoop.hbase.replication.regionserver.Replication.preLogRoll(Replication.java:370)
>         org.apache.hadoop.hbase.regionserver.wal.FSHLog.tellListenersAboutPreLogRoll(FSHLog.java:637)
>         org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:701)
>         org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:600)
>         org.apache.hadoop.hbase.regionserver.wal.FSHLog.<init>(FSHLog.java:533)
>         org.apache.hadoop.hbase.wal.DefaultWALProvider.getWAL(DefaultWALProvider.java:132)
>         org.apache.hadoop.hbase.wal.RegionGroupingProvider.getWAL(RegionGroupingProvider.java:186)
>         org.apache.hadoop.hbase.wal.RegionGroupingProvider.getWAL(RegionGroupingProvider.java:197)
>         org.apache.hadoop.hbase.wal.WALFactory.getWAL(WALFactory.java:240)
>         org.apache.hadoop.hbase.regionserver.HRegionServer.getWAL(HRegionServer.java:1883)
>         org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:363)
>         org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:129)
>         org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
>         java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         java.lang.Thread.run(Thread.java:745)
> Does anyone have any suggestions/ideas/feedback?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
