[ 
https://issues.apache.org/jira/browse/HBASE-26913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh Shah updated HBASE-26913:
---------------------------------
    Description: 
In our production clusters, we have seen cases where data is present in the 
source cluster but not in the sink cluster, and one case where data is present 
in the sink cluster but not in the source cluster. 

We have internal tools that take an incremental backup every day on both the 
source and sink clusters and compare the hashes of the data in the two 
backups. We have seen many cases where the hashes don't match, which means the 
data is not consistent between source and sink for that day. The Mean Time To 
Detect (MTTD) for these inconsistencies is at least 2 days, and tracking them 
down requires a lot of manual debugging.

We need a tool that reduces MTTD and requires less manual debugging.

I have attached a design doc. Huge thanks to [~bharathv] for coming up with 
this design at my workplace.


> Replication Observability Framework
> -----------------------------------
>
>                 Key: HBASE-26913
>                 URL: https://issues.apache.org/jira/browse/HBASE-26913
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver, Replication
>            Reporter: Rushabh Shah
>            Assignee: Rushabh Shah
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
