[ https://issues.apache.org/jira/browse/HBASE-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188273#comment-17188273 ]

Wellington Chevreuil edited comment on HBASE-24920 at 9/1/20, 8:49 AM:
-----------------------------------------------------------------------

{quote}completely re-implement needed pieces in hbase operator tool so that 
there are no dependencies on hbase version (making this tool hard to maintain).
{quote}
We've been doing this already in some parts of the operator tools code (take a 
look at HBCKMetaTableAccessor and HBCKFsUtils, which are the operator tools 
duplicates of hbase MetaTableAccessor and FsUtils). If the effort of 
porting/reimplementing these private classes is not too complex, I would 
suggest going with this option.


> A tool to rewrite corrupted HFiles
> ----------------------------------
>
>                 Key: HBASE-24920
>                 URL: https://issues.apache.org/jira/browse/HBASE-24920
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: hbase-operator-tools
>            Reporter: Andrey Elenskiy
>            Priority: Major
>
> Typically I have been dealing with corrupted HFiles (due to loss of hdfs 
> blocks) by just removing them. However, it always seemed wasteful to throw 
> away the entire HFile (which can be hundreds of gigabytes) just because one 
> hdfs block (128MB) is missing.
> I think there's a possibility for a tool that can rewrite an HFile by 
> skipping corrupted blocks.
> There can be multiple types of issues with hdfs blocks, but any of them can 
> be treated as if the block doesn't exist:
> 1. All the replicas can be lost
> 2. The block can be corrupted due to some bug in hdfs (I've recently run into 
> HDFS-15186 by experimenting with EC).
> At its simplest, the tool could be a local, map-only mapreduce job with a 
> custom HFile reader input that can seek to the next DATABLK to skip 
> corrupted hdfs blocks.
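The seek-to-next-DATABLK idea above could be sketched roughly as follows. This is a minimal illustration, not the actual hbase-operator-tools API: it assumes the bytes after a corrupt region are available in a buffer, and simply scans forward for the 8-byte HFile data-block magic ("DATABLK*") to find the next block boundary at which reading could resume. A real implementation would stream from HDFS and then re-validate the block header and checksums before trusting the offset.

```java
import java.util.Arrays;

// Hypothetical sketch of "seek to next DATABLK": scan forward in a byte
// range for the HFile data-block magic to locate the next readable block.
// Class and method names are illustrative only.
public class DataBlockScanner {

    // 8-byte magic that prefixes each HFile data block.
    static final byte[] DATABLOCK_MAGIC = "DATABLK*".getBytes();

    // Returns the offset of the next DATABLK* magic at or after 'from',
    // or -1 if no further data block magic is found in the buffer.
    static long seekToNextDataBlock(byte[] buf, int from) {
        for (int i = from; i <= buf.length - DATABLOCK_MAGIC.length; i++) {
            byte[] candidate = Arrays.copyOfRange(buf, i, i + DATABLOCK_MAGIC.length);
            if (Arrays.equals(candidate, DATABLOCK_MAGIC)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Simulate a stream where the first readable block starts at offset 20.
        byte[] stream = new byte[64];
        System.arraycopy("DATABLK*".getBytes(), 0, stream, 20, 8);
        System.out.println(seekToNextDataBlock(stream, 0));  // prints 20
        System.out.println(seekToNextDataBlock(stream, 21)); // prints -1
    }
}
```

A mapper built around this would emit the KeyValues from each recoverable block and drop the unreadable span, producing a new, smaller but valid HFile.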



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
