[ 
https://issues.apache.org/jira/browse/HBASE-11715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148210#comment-15148210
 ] 

cuixin commented on HBASE-11715:
--------------------------------

We cannot assume the two tables hold the same values just because their Bloom filters are the same. 
And what about the case where the user has not enabled Bloom filters?
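
To illustrate the first point, here is a toy sketch (not HBase's Bloom filter implementation; the filter size, hash count and row names are made up for the example) showing that two different keys can produce exactly the same Bloom filter bits, so equal filters do not prove equal contents:

```python
import hashlib

M_BITS = 8     # deliberately tiny filter so collisions are easy to find
K_HASHES = 2   # number of bit positions set per key

def bit_positions(key):
    """Derive K_HASHES bit positions for a key from one SHA-256 digest."""
    digest = hashlib.sha256(key.encode()).digest()
    return frozenset(digest[i] % M_BITS for i in range(K_HASHES))

def bloom(keys):
    """Build the filter as an integer bit mask over all given keys."""
    bits = 0
    for key in keys:
        for pos in bit_positions(key):
            bits |= 1 << pos
    return bits

def find_collision(limit=1000):
    """Return two distinct keys whose single-key filters are identical."""
    seen = {}
    for i in range(limit):
        key = f"row{i}"
        bits = bloom([key])
        if bits in seen:
            return seen[bits], key
        seen[bits] = key
    return None

a, b = find_collision()
print(a, b, bloom([a]) == bloom([b]))
```

With 8 bits and 2 positions per key there are at most 36 possible single-key bit patterns, so a collision is guaranteed well within the 1000 candidate keys. A real filter is far larger, but the same limitation applies in principle, and a filter built over row keys says nothing about the cell values at all.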

> HBase should provide a tool to compare 2 remote tables.
> -------------------------------------------------------
>
>                 Key: HBASE-11715
>                 URL: https://issues.apache.org/jira/browse/HBASE-11715
>             Project: HBase
>          Issue Type: New Feature
>          Components: util
>            Reporter: Jean-Marc Spaggiari
>
> As discussed on the mailing list, when a table is copied to another cluster 
> and needs to be validated against the first one, only VerifyReplication can be 
> used. However, this can take a very long time since the data needs to be copied again.
> We should provide an easier and faster way to compare the tables. 
> One option is to calculate hashes per range. The user defines the number of 
> buckets, then we split the table into that many buckets and calculate a 
> hash for each (as the partitioner already does). We can also optionally 
> calculate an overall CRC to further reduce the chance of hash collisions. 
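
The per-range hashing idea in the description could be sketched roughly as follows. This is only an illustration under assumptions: plain Python dicts stand in for table scans, and the names `bucket_hashes`, `differing_buckets` and the split keys are hypothetical, not part of any HBase API.

```python
import bisect
import hashlib

def bucket_hashes(rows, split_keys):
    """Return one SHA-256 hex digest per key range.

    rows: dict of row key (bytes) -> value (bytes), standing in for a scan.
    split_keys: sorted boundary keys; N boundaries define N + 1 buckets.
    """
    hashers = [hashlib.sha256() for _ in range(len(split_keys) + 1)]
    for key in sorted(rows):
        # Keys equal to a boundary fall into the bucket to its right.
        h = hashers[bisect.bisect_right(split_keys, key)]
        value = rows[key]
        # Length-prefix each field so field boundaries affect the hash.
        h.update(len(key).to_bytes(4, "big"))
        h.update(key)
        h.update(len(value).to_bytes(4, "big"))
        h.update(value)
    return [h.hexdigest() for h in hashers]

def differing_buckets(source_rows, target_rows, split_keys):
    """Indexes of buckets whose digests differ between the two tables."""
    a = bucket_hashes(source_rows, split_keys)
    b = bucket_hashes(target_rows, split_keys)
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

source = {b"row1": b"a", b"row2": b"b", b"row5": b"c", b"row9": b"d"}
target = dict(source)
target[b"row5"] = b"X"            # simulate one diverged cell
splits = [b"row3", b"row7"]       # three buckets

print(differing_buckets(source, target, splits))  # prints [1]
```

Only the digests would need to cross the network, and only the buckets that differ would need a row-by-row comparison afterwards.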



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)