[ 
https://issues.apache.org/jira/browse/HBASE-14004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15044355#comment-15044355
 ] 

Yu Li commented on HBASE-14004:
-------------------------------

bq. if we add a hsync logic, will there be degradation of performance
IMHO, a 100% no-data-loss guarantee means sacrificing some performance. This is 
a trade-off, and users should be able to make their own choice. However, HBase 
has no real fsync support yet, despite considerable effort in issues such as 
HBASE-5954 ([~lhofhansl] and [~stack], please correct me if I've stated 
anything wrong here, thanks). I may be drifting off topic, but I feel all of 
these things are related, and that we are indeed trying to fix a few 
fundamental issues here, just as stack mentioned.

bq. Should we still allow users disable hsync just like before?
I think yes, we should leave an option here: some users may care more about 
performance and be willing to accept the relatively low risk of all three DNs 
crashing.

bq. If so, what is the default configure?
I guess this depends on the final performance numbers of the new design and 
implementation.

bq. If user disable hsync, what should we do for ReplicationSource?
I think we should go back to the old logic when the user chooses to disable 
hsync.

BTW, there has been quite a lot of discussion here. Maybe a doc summarizing the 
basic design, the conclusions reached so far, and the remaining open questions 
would help with understanding and further discussion? :-)

> [Replication] Inconsistency between Memstore and WAL may result in data in 
> remote cluster that is not in the origin
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-14004
>                 URL: https://issues.apache.org/jira/browse/HBASE-14004
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: He Liangliang
>            Priority: Critical
>              Labels: replication, wal
>
> Looks like the current write path can cause inconsistency between 
> memstore/hfile and WAL, which causes the slave cluster to have more data than 
> the master cluster.
> The simplified write path looks like:
> 1. insert record into Memstore
> 2. write record to WAL
> 3. sync WAL
> 4. rollback Memstore if 3 fails
> It's possible that the HDFS sync RPC call fails, but the data has already 
> been (perhaps partially) transported to the DNs and finally gets persisted. 
> As a result, the handler will roll back the Memstore and the later flushed 
> HFile will also skip this record.
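
To make the four-step write path in the description concrete, here is a toy 
Java sketch of the failure mode. It is not HBase code: the class 
WalMemstoreSketch, the ToyWal type and its failSyncAfterPersist flag are all 
hypothetical names made up for illustration. The flag simulates a sync whose 
RPC fails after the edit has already reached the DNs.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class WalMemstoreSketch {

    /** Toy WAL (hypothetical): sync() can report failure after the data was persisted. */
    static class ToyWal {
        final List<String> persisted = new ArrayList<>();   // what ended up on the "DNs"
        private final List<String> pending = new ArrayList<>();
        private final boolean failSyncAfterPersist;         // assumption: simulates a lost ack

        ToyWal(boolean failSyncAfterPersist) {
            this.failSyncAfterPersist = failSyncAfterPersist;
        }

        void append(String record) {
            pending.add(record);
        }

        void sync() throws IOException {
            // The data reaches the DNs first...
            persisted.addAll(pending);
            pending.clear();
            // ...and only then does the RPC fail from the caller's point of view.
            if (failSyncAfterPersist) {
                throw new IOException("sync RPC failed");
            }
        }
    }

    /** The simplified write path from the issue description. Returns true on success. */
    static boolean write(List<String> memstore, ToyWal wal, String record) {
        memstore.add(record);            // 1. insert record into Memstore
        wal.append(record);              // 2. write record to WAL
        try {
            wal.sync();                  // 3. sync WAL
            return true;
        } catch (IOException e) {
            memstore.remove(record);     // 4. rollback Memstore if 3 fails
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> memstore = new ArrayList<>();
        ToyWal wal = new ToyWal(true);
        boolean ok = write(memstore, wal, "row1");
        System.out.println("write succeeded: " + ok);
        System.out.println("memstore: " + memstore);
        System.out.println("WAL contents: " + wal.persisted);
    }
}
```

In this sketch the write is reported as failed and the Memstore is rolled back, 
yet the record is still in the toy WAL. Anything that reads the WAL, such as 
replication, would still see an edit that the origin cluster no longer has.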



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
