[ https://issues.apache.org/jira/browse/HBASE-11715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148210#comment-15148210 ]
cuixin commented on HBASE-11715:
--------------------------------
We cannot assume that the tables' values are the same just because their Bloom filters are the same.
And what about the case where the user has not enabled Bloom filters?
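
To illustrate the first point, here is a minimal sketch. It assumes the Bloom filter is built over row keys only (as with the ROW bloom type), and it uses a toy filter, not HBase's actual implementation; bloom_bits, table_a and table_b are hypothetical names used only for this example.

```python
import hashlib


def bloom_bits(keys, num_bits=64, num_hashes=3):
    """Toy Bloom filter: returns the bit set as an int. Only keys are hashed."""
    bits = 0
    for key in keys:
        for seed in range(num_hashes):
            digest = hashlib.sha256(bytes([seed]) + key).digest()
            bits |= 1 << (int.from_bytes(digest[:8], "big") % num_bits)
    return bits


# Two tables with identical row keys but one differing value.
table_a = {b"row1": b"v1", b"row2": b"v2"}
table_b = {b"row1": b"v1", b"row2": b"CHANGED"}

# Iterating a dict yields its keys, so the values never reach the filter.
same_bloom = bloom_bits(table_a) == bloom_bits(table_b)
same_data = table_a == table_b
```

Here the filters match while the data does not, so equal Bloom filters alone say nothing about the cell values.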
> HBase should provide a tool to compare 2 remote tables.
> -------------------------------------------------------
>
> Key: HBASE-11715
> URL: https://issues.apache.org/jira/browse/HBASE-11715
> Project: HBase
> Issue Type: New Feature
> Components: util
> Reporter: Jean-Marc Spaggiari
>
> As discussed on the mailing list, when a table is copied to another cluster
> and needs to be validated against the first one, only VerifyReplication can be
> used. However, this can take a very long time since the data needs to be copied again.
> We should provide an easier and faster way to compare the tables.
> One option is to calculate hashes per range. The user can define a number of
> buckets, then we split the table into that many buckets and calculate a
> hash for each (as the partitioner already does). We can also optionally
> calculate an overall CRC to further reduce the chance of hash collisions.
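
The per-range hashing idea in the description could be sketched roughly as follows. This is only an illustration under stated assumptions, not the proposed tool: tables are modeled as in-memory dicts rather than remote scans, MD5 and CRC32 are arbitrary choices, and bucket_hashes, differing_buckets and split_keys are hypothetical names.

```python
import bisect
import hashlib
import zlib


def bucket_hashes(rows, split_keys):
    """Hash each key range of a table.

    rows: dict of row key (bytes) -> value (bytes); stands in for a table scan.
    split_keys: sorted list of bytes; N split keys define N + 1 buckets.
    Returns (list of per-bucket hex digests, overall CRC32).
    """
    digests = [hashlib.md5() for _ in range(len(split_keys) + 1)]
    crc = 0
    for key in sorted(rows):
        value = rows[key]
        # Length-prefix key and value so (b"ab", b"c") and (b"a", b"bc") differ.
        record = (len(key).to_bytes(4, "big") + key
                  + len(value).to_bytes(4, "big") + value)
        # bisect_right: a row equal to a split key starts the next bucket.
        digests[bisect.bisect_right(split_keys, key)].update(record)
        crc = zlib.crc32(record, crc)
    return [d.hexdigest() for d in digests], crc


def differing_buckets(source, target, split_keys):
    """Return (indexes of buckets whose hashes differ, whether the CRCs match)."""
    src_hashes, src_crc = bucket_hashes(source, split_keys)
    dst_hashes, dst_crc = bucket_hashes(target, split_keys)
    diff = [i for i, (a, b) in enumerate(zip(src_hashes, dst_hashes)) if a != b]
    return diff, src_crc == dst_crc


source = {b"row1": b"a", b"row5": b"b", b"row9": b"c"}
target = dict(source)
target[b"row5"] = b"B"          # simulate one corrupted value on the copy
split_keys = [b"row3", b"row7"]  # three buckets
result = differing_buckets(source, target, split_keys)
```

Only the digests and the CRC would need to cross the network, and a mismatching bucket narrows the re-check to a single key range.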
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)