[
https://issues.apache.org/jira/browse/HBASE-11715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098079#comment-14098079
]
Gomathivinayagam Muthuvinayagam commented on HBASE-11715:
---------------------------------------------------------
I am interested in working on this task. With a Merkle tree, we would need to
constantly run a background service, and it would require an additional amount
of data. Can you provide more details so that I can assign it to myself and
work on this?
Thank you.
> HBase should provide a tool to compare 2 remote tables.
> -------------------------------------------------------
>
> Key: HBASE-11715
> URL: https://issues.apache.org/jira/browse/HBASE-11715
> Project: HBase
> Issue Type: New Feature
> Components: util
> Reporter: Jean-Marc Spaggiari
>
> As discussed on the mailing list, when a table is copied to another cluster
> and needs to be validated against the first one, only VerifyReplication can
> be used. However, this can take a very long time since the data needs to be
> copied again.
> We should provide an easier and faster way to compare the tables.
> One option is to calculate hashes per range. The user can define the number
> of buckets, then we split the table into this number of buckets and calculate
> a hash for each (like the partitioner already does). We can also optionally
> calculate an overall CRC to further reduce the chance of hash collisions.
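
The per-range hashing idea in the description above could look roughly like the
following. This is only a minimal sketch under stated assumptions: plain Python
dicts of row key to value stand in for the two tables, the split keys are chosen
by the caller, and the names bucket_hashes, differing_buckets and overall_crc
are hypothetical illustrations, not part of any HBase API.

```python
import hashlib
import zlib
from bisect import bisect_right


def bucket_hashes(table, split_keys):
    """Return one hex digest per bucket.

    table: dict mapping row key (bytes) to value (bytes); a stand-in for a
           real table scan, which is an assumption of this sketch.
    split_keys: sorted list of N-1 boundary keys, giving N buckets. A row
           whose key equals a boundary falls into the bucket that starts
           at that boundary.
    """
    digests = [hashlib.sha256() for _ in range(len(split_keys) + 1)]
    # Rows are visited in key order so both sides hash in the same sequence.
    for key in sorted(table):
        value = table[key]
        digest = digests[bisect_right(split_keys, key)]
        # Length prefixes keep (key, value) boundaries unambiguous.
        digest.update(len(key).to_bytes(4, "big"))
        digest.update(key)
        digest.update(len(value).to_bytes(4, "big"))
        digest.update(value)
    return [d.hexdigest() for d in digests]


def differing_buckets(hashes_a, hashes_b):
    """Indexes of buckets whose digests do not match."""
    return [i for i, (a, b) in enumerate(zip(hashes_a, hashes_b)) if a != b]


def overall_crc(table):
    """Running CRC32 over all rows in key order, as an extra whole-table check."""
    crc = 0
    for key in sorted(table):
        crc = zlib.crc32(key, crc)
        crc = zlib.crc32(table[key], crc)
    return crc


if __name__ == "__main__":
    source = {b"a": b"1", b"m": b"2", b"z": b"3"}
    copy = {b"a": b"1", b"m": b"2", b"z": b"changed"}
    splits = [b"m"]  # two buckets: keys < "m" and keys >= "m"
    print(differing_buckets(bucket_hashes(source, splits),
                            bucket_hashes(copy, splits)))
```

With this approach only the digests would need to cross between clusters, and
only the buckets reported as different would need a row-by-row comparison.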
--
This message was sent by Atlassian JIRA
(v6.2#6252)