[
https://issues.apache.org/jira/browse/HBASE-14004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15050376#comment-15050376
]
Heng Chen commented on HBASE-14004:
-----------------------------------
{quote}
2. WAL logger tries to re-write the buffered entries to a new WAL but new WAL
creation failed due to the same network failure, and returns fail to client
3. region got reassigned due to balance or hbase shell command
{quote}
This is when we create a new WAL and write the buffered entries into it.
If this procedure fails, we will tell the client that the 'unacked hflushed'
mutations failed. If the region is reassigned (NOT an RS crash) at this point,
the memstore will be flushed into an HFile, right? With the current logic, when
we split the WAL we check lastFlushedSeqId, so duplicate entries will be
skipped. See the HBASE-14949 comments. [~carp84]
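
The lastFlushedSeqId check mentioned above can be illustrated with a minimal
Python sketch. It is not HBase code: `WalEntry` and `entries_to_replay` are
hypothetical names, and the only assumption is that replay keeps edits whose
sequence id is greater than the last flushed one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WalEntry:
    # Hypothetical, simplified stand-in for a WAL edit.
    seq_id: int
    row: str
    value: str


def entries_to_replay(wal_entries, last_flushed_seq_id):
    """Keep only edits newer than what was already flushed to an HFile."""
    return [e for e in wal_entries if e.seq_id > last_flushed_seq_id]


wal = [WalEntry(1, "r1", "a"), WalEntry(2, "r2", "b"), WalEntry(3, "r3", "c")]

# Edits 1 and 2 are already in an HFile, so only edit 3 is replayed.
replayed = entries_to_replay(wal, last_flushed_seq_id=2)
print([e.seq_id for e in replayed])  # prints [3]
```

Under this assumption, an entry that was both flushed and left in the WAL is
skipped during replay rather than applied twice.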
> [Replication] Inconsistency between Memstore and WAL may result in data in
> remote cluster that is not in the origin
> -------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-14004
> URL: https://issues.apache.org/jira/browse/HBASE-14004
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: He Liangliang
> Priority: Critical
> Labels: replication, wal
>
> It looks like the current write path can cause an inconsistency between
> memstore/hfile and WAL, which can leave the slave cluster with more data than
> the master cluster.
> The simplified write path looks like:
> 1. insert record into Memstore
> 2. write record to WAL
> 3. sync WAL
> 4. rollback Memstore if 3 fails
> It's possible that the HDFS sync RPC call fails, but the data has already
> been (perhaps partially) transported to the DNs and eventually gets persisted.
> As a result, the handler will roll back the Memstore, and the HFile flushed
> later will also skip this record.
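
The write path and failure case described in the issue can be reproduced with
a toy simulation. This is a sketch under stated assumptions, not the HBase
implementation: `FlakyWal`, `SyncFailed`, and `put` are hypothetical names, and
the "sync" simply persists the data and then reports a failure to mimic a lost
RPC response.

```python
class SyncFailed(Exception):
    """Raised when the (simulated) sync RPC reports a failure."""


class FlakyWal:
    """Toy WAL whose sync may fail even though the data was persisted."""

    def __init__(self, fail_after_persist=False):
        self.pending = []
        self.persisted = []
        self.fail_after_persist = fail_after_persist

    def append(self, record):
        self.pending.append(record)

    def sync(self):
        # The data reaches the DataNodes and is persisted...
        self.persisted.extend(self.pending)
        self.pending.clear()
        # ...but the caller still sees the RPC as failed.
        if self.fail_after_persist:
            raise SyncFailed("sync RPC failed")


def put(memstore, wal, record):
    memstore.append(record)      # 1. insert record into Memstore
    wal.append(record)           # 2. write record to WAL
    try:
        wal.sync()               # 3. sync WAL
        return True
    except SyncFailed:
        memstore.remove(record)  # 4. rollback Memstore if 3 fails
        return False


memstore = []
wal = FlakyWal(fail_after_persist=True)
ok = put(memstore, wal, "row1=v1")
print(ok, memstore, wal.persisted)  # prints False [] ['row1=v1']
```

The record is gone from the memstore but still present in the WAL, which is
the state that lets replication ship data the origin cluster never serves.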
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)