[jira] [Commented] (HBASE-14004) [Replication] Inconsistency between Memstore and WAL may result in data in remote cluster that is not in the origin

Duo Zhang (JIRA) Tue, 01 Nov 2016 05:40:14 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-14004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15625326#comment-15625326
 ]


Duo Zhang commented on HBASE-14004:
-----------------------------------

I think we need to pick this up.

With AsyncFSWAL, it is not safe to use DFSInputStream to read a WAL file 
directly until EOF while it is still open. The data we read may disappear 
later. FSHLog also has this problem, but it is much safer... See this document 
for more details:

https://docs.google.com/document/d/11AyWtGhItQs6vsLRIx32PwTxmBY3libXwGXI25obVEY/edit#

The problem only happens when the WAL file is still open. AFAIK, if an RS is 
alive, its WAL is always replicated by that RS itself. So I think we can 
expose an API that tells the ReplicationSource the safe length to read of an 
open WAL file. For a ReplicationSource that replicates the WAL of another RS, 
we can make sure that the RS is dead and that all its WALs are closed (we can 
also ensure this by calling recoverLease). Then it is safe to read the file 
until EOF with DFSInputStream.
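
To make the proposal concrete, here is a rough sketch of what such a safe-length API could look like. Every name in it (SafeLengthTracker, onSynced, onClosed, readLimit) is made up for illustration and is not taken from the HBase code base; it only assumes that the WAL writer knows how many bytes it has successfully synced.

```java
import java.util.OptionalLong;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: none of these names come from the HBase code base.
final class SafeLengthTracker {
    // WAL path -> bytes known to be successfully synced.
    // An entry exists only while the file is open for writing on this RS.
    private final ConcurrentMap<String, Long> openWals = new ConcurrentHashMap<>();

    /** Called by the WAL writer after a successful sync. */
    void onSynced(String walPath, long syncedLength) {
        // Never move the safe length backwards.
        openWals.merge(walPath, syncedLength, Math::max);
    }

    /** Called by the WAL writer once the file has been closed. */
    void onClosed(String walPath) {
        openWals.remove(walPath);
    }

    /** Empty means the file is not open for writing here, so EOF is a safe bound. */
    OptionalLong safeLength(String walPath) {
        Long len = openWals.get(walPath);
        return len == null ? OptionalLong.empty() : OptionalLong.of(len);
    }

    /** Upper bound for a replication reader, given the length visible on the file system. */
    long readLimit(String walPath, long visibleLength) {
        OptionalLong safe = safeLength(walPath);
        return safe.isPresent() ? Math.min(safe.getAsLong(), visibleLength) : visibleLength;
    }
}
```

In this sketch the ReplicationSource would ask readLimit before each read: for an open WAL it stops at the last synced offset, and for a closed WAL (including one recovered from a dead RS) it reads up to the visible length, i.e. EOF.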

Any concerns? If not, let's start working!

Thanks.

> [Replication] Inconsistency between Memstore and WAL may result in data in 
> remote cluster that is not in the origin
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-14004
>                 URL: https://issues.apache.org/jira/browse/HBASE-14004
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: He Liangliang
>            Priority: Critical
>              Labels: replication, wal
>
> Looks like the current write path can cause inconsistency between 
> memstore/hfile and WAL, which causes the slave cluster to have more data than 
> the master cluster.
> The simplified write path looks like:
> 1. insert record into Memstore
> 2. write record to WAL
> 3. sync WAL
> 4. rollback Memstore if 3 fails
> It's possible that the HDFS sync RPC call fails, but the data has already 
> been (perhaps partially) transported to the DNs, where it finally gets 
> persisted. As a result, the handler will roll back the Memstore and the later 
> flushed HFile will also skip this record.
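
The four-step write path quoted above can be illustrated with a toy simulation. This is only a sketch: WritePathDemo and its methods are hypothetical and do not mirror HBase internals, and the failed sync is modelled as an exception thrown after the record has already been appended to the in-memory stand-in for the WAL.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Toy model only: these names are hypothetical and do not mirror HBase internals.
final class WritePathDemo {
    final List<String> memstore = new ArrayList<>();
    final List<String> wal = new ArrayList<>(); // stands in for what the DNs persist

    /** Simulated sync: the bytes have reached the DNs, but the RPC may still report failure. */
    private void syncWal(boolean rpcFails) throws IOException {
        if (rpcFails) {
            throw new IOException("sync RPC failed");
        }
    }

    /** Returns true if the write was acknowledged to the client. */
    boolean put(String record, boolean syncRpcFails) {
        memstore.add(record);           // 1. insert record into Memstore
        wal.add(record);                // 2. write record to WAL
        try {
            syncWal(syncRpcFails);      // 3. sync WAL
            return true;
        } catch (IOException e) {
            memstore.remove(record);    // 4. rollback Memstore if 3 fails
            return false;
        }
    }
}
```

After one successful put and one put whose sync fails, the memstore holds only the first record while the WAL holds both, which is the divergence that replication can then ship to the remote cluster.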



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
