[
https://issues.apache.org/jira/browse/HBASE-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639905#comment-16639905
]
Lei Chen edited comment on HBASE-16423 at 10/5/18 2:50 PM:
-----------------------------------------------------------
I'm facing the false-positive inconsistency problem you described here as well.
Having the thread sleep and compare again some time later looks like a good
way to reduce noise, but it may not be a guaranteed way to report
inconsistencies. As long as ingestion is running, it is possible that, by the
time of the re-comparison, the target row on the source and on the replica has
matched and then diverged again. A more sophisticated method may be required
if the user needs 100% confidence.
was (Author: leochen4891):
I'm facing the false-positive inconsistency problem you described here.
Having the thread sleep and compare again some time later looks like a good way
to reduce noise, but it may not be a guaranteed way to report inconsistencies.
As long as ingestion is running, it is possible that, by the time of the
re-comparison, the target row on the source and on the replica has matched and
then diverged again. A more sophisticated method may be required if the user
needs 100% confidence.
> Add re-compare option to VerifyReplication to avoid occasional inconsistent
> rows
> --------------------------------------------------------------------------------
>
> Key: HBASE-16423
> URL: https://issues.apache.org/jira/browse/HBASE-16423
> Project: HBase
> Issue Type: Improvement
> Components: Replication
> Affects Versions: 2.0.0
> Reporter: Jianwei Cui
> Assignee: Jianwei Cui
> Priority: Minor
> Fix For: 1.4.0, 2.0.0
>
> Attachments: HBASE-16423-branch-1-v1.patch, HBASE-16423-v1.patch,
> HBASE-16423-v2.patch, HBASE-16423-v3.patch
>
>
> Because replication provides only eventual consistency, VerifyReplication may
> report inconsistent rows if data is being written to the source or peer
> cluster during scanning. These occasionally inconsistent rows will have the
> same data if we do the comparison again after a short period. It is not easy
> to find the truly inconsistent rows if VerifyReplication reports a large
> number of such occasional inconsistencies. To avoid this, we can add an
> option that makes VerifyReplication read the inconsistent rows again after
> sleeping a few seconds and re-compare them during scanning. This behavior
> is in line with the eventual consistency of HBase's replication. Suggestions
> and discussion are welcome.
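
The sleep-and-re-compare idea described above can be sketched roughly as
follows. This is only an illustration, not the code from the attached patches:
the class name, method name, and parameters are hypothetical, and the two
Supplier arguments stand in for whatever re-reads the row from the source and
peer clusters.

```java
import java.util.Arrays;
import java.util.function.Supplier;

/** Hypothetical sketch of sleep-and-re-compare; not the VerifyReplication code. */
final class RecompareSketch {

    /**
     * Returns true if the row matches on the first read or on any re-read.
     * Before each of the (at most) maxRecompares re-reads the thread sleeps
     * for sleepMillis so that replication has a chance to catch up.
     */
    static boolean matchesWithRecompare(Supplier<byte[]> readSource,
                                        Supplier<byte[]> readPeer,
                                        int maxRecompares,
                                        long sleepMillis) {
        if (Arrays.equals(readSource.get(), readPeer.get())) {
            return true;
        }
        for (int attempt = 0; attempt < maxRecompares; attempt++) {
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and treat the row as unresolved.
                Thread.currentThread().interrupt();
                return false;
            }
            // Read both sides again; a row that was only lagging should now match.
            if (Arrays.equals(readSource.get(), readPeer.get())) {
                return true;
            }
        }
        // Still different after all retries: report the row as inconsistent.
        return false;
    }

    public static void main(String[] args) {
        byte[] value = {1, 2, 3};
        int[] peerReads = {0};
        Supplier<byte[]> source = () -> value;
        // Simulated peer that returns a stale value for its first two reads.
        Supplier<byte[]> peer = () -> peerReads[0]++ < 2 ? new byte[] {1, 2} : value;
        System.out.println(matchesWithRecompare(source, peer, 3, 10L));
    }
}
```

A row that keeps changing on the source can still be reported as inconsistent
by such a loop, so the retries reduce noise rather than prove consistency.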
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)