[ https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15986079#comment-15986079 ]
Maddineni Sukumar commented on HBASE-16466:
-------------------------------------------

Ted,
Here are the perf numbers I have so far. I will get numbers on load impact next.

I ran the tests below on an 8-node cluster using a Phoenix table with SALT_BUCKETS 16.

Rows    NORMAL      WITH_SNAPSHOTS
-------------------------------------------------------
1m      1m16s       36s
10m     6m15s       1m13s
500m    5h20m30s    8m40s

With snapshots, the VerifyReplication job on the 500m-row table completes in about 8 minutes, instead of more than 5 hours with the normal table scan approach.

> HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16466
>                 URL: https://issues.apache.org/jira/browse/HBASE-16466
>             Project: HBase
>          Issue Type: Improvement
>          Components: hbase
>    Affects Versions: 0.98.21
>            Reporter: Sukumar Maddineni
>            Assignee: Maddineni Sukumar
>             Fix For: 1.3.1
>
>         Attachments: HBASE-16466.branch-1.3.001.patch
>
>
> As of now the VerifyReplication tool runs using normal HBase scanners. If
> you want to run VerifyReplication multiple times on a live production
> cluster with large tables, it creates extra load on the HBase layer. If we
> implement snapshot-based support, then on both source and target we can read
> data from snapshots, which reduces load on HBase.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
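
For reference, the speedup implied by the timings in the comment can be worked out with a few lines of Python. This is only a sketch of the arithmetic: the helper names (parse_duration, speedups) are made up for illustration, and the three timing pairs are copied verbatim from the table in the comment.

```python
import re


def parse_duration(text: str) -> int:
    """Convert a duration such as '5h20m30s' into whole seconds."""
    units = {"h": 3600, "m": 60, "s": 1}
    parts = re.findall(r"(\d+)([hms])", text)
    # Reject anything that is not a plain sequence of <number><unit> pairs.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(int(n) * units[u] for n, u in parts)


# (rows, normal scan, snapshot-based) exactly as reported in the comment.
RESULTS = [
    ("1m", "1m16s", "36s"),
    ("10m", "6m15s", "1m13s"),
    ("500m", "5h20m30s", "8m40s"),
]


def speedups(results):
    """Return (rows, normal_seconds, snapshot_seconds, ratio) per test."""
    out = []
    for rows, normal, snapshot in results:
        n = parse_duration(normal)
        s = parse_duration(snapshot)
        out.append((rows, n, s, n / s))
    return out


if __name__ == "__main__":
    for rows, n, s, ratio in speedups(RESULTS):
        print(f"{rows:>5} rows: {n:>6}s normal, {s:>4}s snapshot, {ratio:.1f}x faster")
```

On these numbers the ratio grows with table size, from roughly 2x on the smallest table to roughly 37x on the largest. The comment reports a single run per size, so the ratios are indicative only.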