[ 
https://issues.apache.org/jira/browse/HBASE-7?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12694563#action_12694563
 ] 

stack commented on HBASE-7:
---------------------------

Other things:

+ Clean up dross in the filesystem.  An option on hbfsck would go through the 
filesystem and remove anything not referenced.  For example, the powerset 
instance of hbase has been migrated multiple times and there is a bunch of junk 
under /hbase.  I could go through manually deleting stuff but I'm afraid I'd 
remove something needed.  The tool could also look at regions and make sure 
they have a reference in .META.  If not, then they have been left over somehow 
and can be cleaned up ("OK to remove region XYZ?").
+ Region names on the filesystem are encoded.  An info file that recorded the 
actual name and other attributes of the region would help hbfsck put things 
back together again -- it would also help when debugging a hosed cluster.
+ (Stretch goal) If a data file is corrupt, rewrite it with as much of the 
original data as it is possible to save (skip the bad section).  Make smart 
decisions about the edit sequence id, etc., inferring it from neighbors if it 
is unreadable.
+ (Stretch goal) Have an --info mode where hbfsck dumps out stats on the 
content of the filesystem.
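
To make the first item above concrete, here is a minimal sketch of the 
"what is not referenced?" check, in Python for brevity.  It assumes the tool 
has already gathered two lists of encoded region names, one from the 
directories under /hbase and one from .META.; how those lists get read is out 
of scope here, and Anomaly and find_anomalies are hypothetical names, not 
anything in hbase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anomaly:
    kind: str    # "orphan_on_fs" or "missing_on_fs" (hypothetical labels)
    region: str  # encoded region name


def find_anomalies(fs_regions, meta_regions):
    """Compare region dirs found on disk with regions listed in .META.

    Both arguments are iterables of encoded region names; gathering them
    is assumed to have happened elsewhere.
    """
    fs, meta = set(fs_regions), set(meta_regions)
    # On disk but unknown to .META.: candidates for cleanup.
    anomalies = [Anomaly("orphan_on_fs", r) for r in sorted(fs - meta)]
    # Listed in .META. but absent on disk: something is broken.
    anomalies += [Anomaly("missing_on_fs", r) for r in sorted(meta - fs)]
    return anomalies
```

This only reports; nothing is deleted.  Each "orphan_on_fs" entry would be 
what feeds an "OK to remove region XYZ?" question.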

First cut, I'd imagine, the hbfsck would run through the filesystem looking at 
content of zk and .META. and try to report anomalies.  Next step after that 
would be to effect repair.  Part of the hbfsck dev. would be figuring out what 
you need in the filesystem to do things like repair a broken .META. or to 
figure out what content is dangling/unreferenced.

See how your tool is evolving.  As you write it, keep in mind that you might 
later want to host it inside a MR job -- especially if you ever need to read a 
fat hbase instance with lots of regions.  Also consider, as in fsck, that you 
might have a 'quick' mode (Just a thought).

It should probably run like fsck, where it asks yes/no when it's effecting a 
repair (user should be able to pass a flag which says 'yes' to all questions).
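
A rough sketch of that prompt, again in Python.  The name confirm, the 
assume_yes parameter and the "[y/N]" wording are all made up for illustration; 
the real tool would pick its own flag name.

```python
def confirm(question, assume_yes=False, ask=input):
    """Return True if the repair should go ahead.

    assume_yes models a hypothetical 'yes to all' flag; ask is injectable
    so the prompt can be driven from a script or a test.
    """
    if assume_yes:
        return True
    answer = ask(f"{question} [y/N] ").strip().lower()
    # Anything other than an explicit yes is treated as no.
    return answer in ("y", "yes")
```

Defaulting to 'no' on an empty or unrecognized answer keeps a stray Enter from 
removing anything.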

Documentation and clean integration with the shell, I'd imagine, would be key 
components of any hbfsck tool since it'll only be needed rarely, but when it 
is, the user in distress will want to be clear on how it all works.

Hope this helps.

> [hbase] Provide a HBase checker and repair tool similar to fsck
> ---------------------------------------------------------------
>
>                 Key: HBASE-7
>                 URL: https://issues.apache.org/jira/browse/HBASE-7
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: util
>            Reporter: Jim Kellerman
>             Fix For: 0.20.0
>
>         Attachments: HBASE-7.patch, patch.txt
>
>
> We need a tool to verify (and repair) HBase much like fsck

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
