[ https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15986079#comment-15986079 ]
Maddineni Sukumar edited comment on HBASE-16466 at 4/27/17 6:41 AM:
--------------------------------------------------------------------
Ted, here are the perf numbers I have so far. I will follow up with numbers on load impact.
I ran the tests below on an 8-node cluster using a Phoenix table with
SALT_BUCKETS=16.
Rows           Normal approach      Snapshots approach
---------------------------------------------------------------------
1 million      1 min 16 sec         36 sec
10 million     6 min 15 sec         1 min 13 sec
500 million    5 hours 20 min       8 min 40 sec
With snapshots, the VerifyReplication job on the 500 million row table completes
in about 8 minutes 40 seconds, instead of 5 hours 20 minutes with the normal
table scan approach.
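
For reference, the speedup implied by these numbers can be worked out with a small sketch like the one below. The helper names (to_seconds, speedup) are illustrative only and are not part of HBase or the patch; the durations are copied from the results in this comment.

```python
import re

# Seconds per unit, keyed by the unit spellings used in the results.
UNIT_SECONDS = {"hour": 3600, "min": 60, "sec": 1}


def to_seconds(text):
    """Convert a duration such as '5hours20mins' or '1min16sec' to seconds."""
    total = 0
    for amount, unit in re.findall(r"(\d+)\s*(hour|min|sec)s?", text):
        total += int(amount) * UNIT_SECONDS[unit]
    return total


def speedup(normal, snapshot):
    """Return how many times faster the snapshot run was than the normal scan."""
    return to_seconds(normal) / to_seconds(snapshot)


# (rows, normal scan duration, snapshot-based duration), as reported above.
RESULTS = [
    ("1 million", "1min16sec", "36sec"),
    ("10 million", "6min15sec", "1min13sec"),
    ("500 million", "5hours20mins", "8mins40secs"),
]

for rows, normal, snapshot in RESULTS:
    print(f"{rows}: {speedup(normal, snapshot):.1f}x faster")
# prints:
# 1 million: 2.1x faster
# 10 million: 5.1x faster
# 500 million: 36.9x faster
```

The ratio grows with table size, which suggests the benefit of the snapshot approach is largest on the big tables this issue targets.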
> HBase snapshots support in VerifyReplication tool to reduce load on live
> HBase cluster with large tables
> --------------------------------------------------------------------------------------------------------
>
> Key: HBASE-16466
> URL: https://issues.apache.org/jira/browse/HBASE-16466
> Project: HBase
> Issue Type: Improvement
> Components: hbase
> Affects Versions: 0.98.21
> Reporter: Sukumar Maddineni
> Assignee: Maddineni Sukumar
> Fix For: 1.3.1
>
> Attachments: HBASE-16466.branch-1.3.001.patch
>
>
> Currently the VerifyReplication tool runs using normal HBase scanners. If
> you run VerifyReplication multiple times on a live production cluster
> with large tables, it creates extra load on the HBase layer. If we
> implement snapshot-based support, then on both the source and the target we
> can read data from snapshots, which reduces load on HBase.