[ 
https://issues.apache.org/jira/browse/HBASE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16504870#comment-16504870
 ] 

Mike Drob commented on HBASE-20649:
-----------------------------------

I'm worried about the scalability of this approach.

Since we're reading headers and not content, the time spent per file should be 
constant regardless of file size, right? Do you have a rough estimate of how 
long this takes? There is a big difference between a tool that takes minutes 
to run and one that we expect to take hours.

> Validate HFiles do not have PREFIX_TREE DataBlockEncoding
> ---------------------------------------------------------
>
>                 Key: HBASE-20649
>                 URL: https://issues.apache.org/jira/browse/HBASE-20649
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Peter Somogyi
>            Assignee: Peter Somogyi
>            Priority: Minor
>         Attachments: HBASE-20649.master.001.patch, 
> HBASE-20649.master.002.patch
>
>
> HBASE-20592 adds a tool to check that column families on the cluster do not 
> have PREFIX_TREE encoding.
> Since it is possible that the DataBlockEncoding was already changed but the 
> HFiles have not been rewritten yet, we need a tool that can verify the 
> content of HFiles in the cluster.


