[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963364#comment-15963364
 ] 

Maddineni Sukumar commented on HBASE-16466:
-------------------------------------------

Hi [~zghaobac], this tool does not take snapshots. It simply uses existing 
snapshots, passed in as input parameters, to compare the state of the peers 
as of the time those snapshots were taken. 
For example, if you want to compare data up to 11 AM, take a snapshot on both 
clusters after 11 AM and then run VerifyReplication with both snapshots. You 
can run this as many times as you want and get the same result, because 
snapshots do not change as live data changes. 
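
To illustrate why repeated runs over the same snapshots give the same result, 
here is a minimal Python sketch. It is not HBase code: the tables are plain 
dicts, and take_snapshot, verify, and the counter names are hypothetical 
stand-ins chosen for this example only.

```python
import copy


def take_snapshot(table):
    """Return a frozen, point-in-time copy of a table (row key -> row)."""
    return copy.deepcopy(table)


def verify(source_snapshot, peer_snapshot):
    """Compare two snapshots row by row and count matches and mismatches."""
    counters = {"good": 0, "only_in_source": 0, "only_in_peer": 0, "different": 0}
    for key in sorted(set(source_snapshot) | set(peer_snapshot)):
        if key not in peer_snapshot:
            counters["only_in_source"] += 1
        elif key not in source_snapshot:
            counters["only_in_peer"] += 1
        elif source_snapshot[key] == peer_snapshot[key]:
            counters["good"] += 1
        else:
            counters["different"] += 1
    return counters


# Hypothetical live tables on the source and peer clusters.
source_live = {"row1": {"cf:a": "1"}, "row2": {"cf:a": "2"}, "row3": {"cf:a": "3"}}
peer_live = {"row1": {"cf:a": "1"}, "row2": {"cf:a": "stale"}}

# Snapshots are taken once, after the cut-off time of interest.
source_snap = take_snapshot(source_live)
peer_snap = take_snapshot(peer_live)

first_run = verify(source_snap, peer_snap)

# Live data keeps changing after the snapshots were taken...
source_live["row4"] = {"cf:a": "4"}
peer_live["row2"]["cf:a"] = "2"

# ...but a second run over the same snapshots sees the same frozen data.
second_run = verify(source_snap, peer_snap)
```

Both runs read only the frozen copies, so later writes to the live tables do 
not affect the comparison. 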

This is useful for reducing load on HBase when running against large tables, 
and also for running the same job multiple times to debug data mismatches 
caused by replication or something else. 


> HBase snapshots support in VerifyReplication tool to reduce load on live 
> HBase cluster with large tables
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16466
>                 URL: https://issues.apache.org/jira/browse/HBASE-16466
>             Project: HBase
>          Issue Type: Improvement
>          Components: hbase
>    Affects Versions: 0.98.21
>            Reporter: Sukumar Maddineni
>            Assignee: Maddineni Sukumar
>
> As of now, the VerifyReplication tool runs using normal HBase scanners. If 
> you want to run VerifyReplication multiple times on a live production 
> cluster with large tables, it creates extra load on the HBase layer. If we 
> implement snapshot-based support, we can read data from snapshots on both 
> the source and the target, which reduces load on HBase.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)