[ 
https://issues.apache.org/jira/browse/HBASE-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639905#comment-16639905
 ] 

Lei Chen edited comment on HBASE-16423 at 10/5/18 2:50 PM:
-----------------------------------------------------------

I'm facing the false-positive inconsistency problem you described here as well.
Having the thread sleep and compare again some time later looks like a good
way to reduce noise, but it is not guaranteed to report real inconsistencies
correctly. As long as ingestion is running, it is possible that, by the time
of the re-comparison, the target row on the source and on the replica has
matched and then diverged again. A more sophisticated method may be required
if the user needs 100% confidence.
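
To make the "matched and diverged again" case concrete, here is a small, purely illustrative simulation. The timelines, timestamps, and helper names (`value_at`, `rows_match`) are assumptions invented for this sketch and have nothing to do with HBase's actual code. The row is written on the source at t=0 and t=2, and each write reaches the peer one second later.

```python
import bisect
from typing import List, Optional, Tuple

# (timestamp, value) pairs, sorted by timestamp. Hypothetical data.
Timeline = List[Tuple[float, str]]

def value_at(timeline: Timeline, t: float) -> Optional[str]:
    """Return the latest value written at or before time t, or None."""
    times = [ts for ts, _ in timeline]
    i = bisect.bisect_right(times, t) - 1
    return timeline[i][1] if i >= 0 else None

# Source receives writes at t=0 and t=2; the peer applies them 1s later.
source: Timeline = [(0.0, "v1"), (2.0, "v2")]
peer: Timeline = [(1.0, "v1"), (3.0, "v2")]

def rows_match(t: float) -> bool:
    """Compare the row on both clusters as seen at time t."""
    return value_at(source, t) == value_at(peer, t)

first_compare = rows_match(0.5)   # peer has not yet received v1
in_between = rows_match(1.5)      # both sides hold v1
re_compare = rows_match(2.5)      # source already has v2, peer still v1
```

In this sketch both the first comparison (t=0.5) and the re-comparison two seconds later (t=2.5) see a mismatch, even though replication is healthy and the row matched in between, so a single sleep-and-retry would still flag it.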


was (Author: leochen4891):
I'm facing the false positive inconsistency problem you described here.
Having the thread sleep and compare again some time later looks like a good way 
to reduce noises, but may not be a guaranteed way to report inconsistency. As 
long as the ingestion is running, it is possible at the time of re-comparing, 
the target row of source and replication have matched and diverged again. A 
more sophisticated method may be required if user needs 100% confidence.

> Add re-compare option to VerifyReplication to avoid occasional inconsistent 
> rows
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-16423
>                 URL: https://issues.apache.org/jira/browse/HBASE-16423
>             Project: HBase
>          Issue Type: Improvement
>          Components: Replication
>    Affects Versions: 2.0.0
>            Reporter: Jianwei Cui
>            Assignee: Jianwei Cui
>            Priority: Minor
>             Fix For: 1.4.0, 2.0.0
>
>         Attachments: HBASE-16423-branch-1-v1.patch, HBASE-16423-v1.patch, 
> HBASE-16423-v2.patch, HBASE-16423-v3.patch
>
>
> Because replication provides only eventual consistency, VerifyReplication may 
> report inconsistent rows if data is being written to the source or peer 
> clusters during scanning. These occasionally inconsistent rows will have the 
> same data if we do the comparison again after a short period. It is not easy 
> to find the truly inconsistent rows if VerifyReplication reports a large 
> number of such occasional inconsistencies. To avoid this, we can add an 
> option that makes VerifyReplication read the inconsistent rows again after 
> sleeping a few seconds and re-compare them during scanning. This behavior 
> is in keeping with the eventual consistency of HBase's replication. 
> Suggestions and discussion are welcome.
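
The proposal in the description can be sketched roughly as follows. This is a minimal illustration of the sleep-and-re-compare idea, not the code from the attached patches. The function name, its parameters, and the dict-backed "clusters" are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, List, Optional

Row = Optional[Dict[str, bytes]]

def verify_with_recompare(
    keys: List[str],
    read_source: Callable[[str], Row],
    read_peer: Callable[[str], Row],
    sleep_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Return the keys that still differ after one delayed re-read.

    Hypothetical helper: read_source/read_peer stand in for whatever
    fetches a row from each cluster.
    """
    still_bad: List[str] = []
    for key in keys:
        if read_source(key) == read_peer(key):
            continue
        # First comparison failed: wait, then re-read both sides.
        sleep(sleep_seconds)
        if read_source(key) != read_peer(key):
            still_bad.append(key)
    return still_bad

# Usage with in-memory stand-ins for the two clusters (made-up data).
source = {"r1": {"cf:q": b"v2"}, "r2": {"cf:q": b"x"}}
peer = {"r1": {"cf:q": b"v1"}, "r2": {"cf:q": b"y"}}
pending = {"r1": {"cf:q": b"v2"}}  # edit still in the replication queue

def fake_sleep(_seconds: float) -> None:
    # Simulate replication catching up while the verifier sleeps.
    peer.update(pending)

result = verify_with_recompare(
    ["r1", "r2"], source.get, peer.get, sleep=fake_sleep
)
```

Here `r1` differs only because its edit is still in flight, so it passes on the re-read, while `r2` differs on both reads and is the only key returned.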



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
