[ 
https://issues.apache.org/jira/browse/HADOOP-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688435#action_12688435
 ] 

Jakob Homan commented on HADOOP-5467:
-------------------------------------

bq. +1 it would be very helpful. Also, any thoughts on HADOOP-3717
It's worth looking at.  Repair is a bit more complicated than listing, but it 
could certainly be done.

bq. From the code, it appears that the tool parses the image file one record at 
a time instead of reading the entire file into memory. This is good, because 
this tool can run on a machine that has much less memory than the NameNode.
Yes, this was a design goal.  I've tested it against the biggest fsimage files 
I could find and, though it took a few minutes to chug through them, the tool 
had no problems.  Process memory usage is negligible.

bq. Is it possible to intelligently skip bad records? This will be useful in 
the case when a system administrator is trying to fix a broken fsimage.
It's something, like Lohit's comment, that we could look at.  While debugging I 
noticed that the typical failure mode is for the tool to limp along until it 
reaches the long that stores the number of blocks.  In a corrupted file, this 
is read in as some gigantic number.  Because the actual block records are just 
three longs each, the tool then happily reads longs, thinking it's reading 
block info, until it encounters EOF.  A useful heuristic may be to consider any 
number of blocks above, say, 10k to be erroneous and either bail or start 
looking for the beginning of another record.  Worth pursuing.
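
A rough sketch of that heuristic, for discussion only.  This is not the code in 
the attached patch: the class, method, and constant names are made up, the 10k 
cutoff is just the figure floated above, and the layout (one long for the 
count, then three longs per block) simply follows the description above.  It 
only covers the "bail" option, not resynchronizing on the next record.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical sketch; class, method, and constant names are not from the patch.
final class BlockCountCheck {
    // Assumed cutoff, taken from the "say 10k" figure above.
    static final long MAX_PLAUSIBLE_BLOCKS = 10_000L;

    // Reads the block count and rejects implausible values before any
    // block record is consumed.
    static long readBlockCount(DataInputStream in) throws IOException {
        long numBlocks = in.readLong();
        if (numBlocks < 0 || numBlocks > MAX_PLAUSIBLE_BLOCKS) {
            throw new IOException("Implausible block count: " + numBlocks);
        }
        return numBlocks;
    }

    // Reads one file's block list: a count followed by three longs per block.
    static long[][] readBlocks(DataInputStream in) throws IOException {
        int numBlocks = (int) readBlockCount(in);
        long[][] blocks = new long[numBlocks][3];
        for (long[] block : blocks) {
            for (int i = 0; i < block.length; i++) {
                block[i] = in.readLong();
            }
        }
        return blocks;
    }

    // Convenience wrapper for in-memory data.
    static long[][] parse(byte[] data) throws IOException {
        return readBlocks(new DataInputStream(new ByteArrayInputStream(data)));
    }

    public static void main(String[] args) throws IOException {
        // A well-formed list: count of 2, then two blocks of three longs.
        byte[] good = ByteBuffer.allocate(7 * Long.BYTES)
                .putLong(2).putLong(1).putLong(2).putLong(3)
                .putLong(4).putLong(5).putLong(6).array();
        System.out.println("blocks read: " + parse(good).length);

        // A corrupted count is rejected instead of being read until EOF.
        byte[] corrupt = ByteBuffer.allocate(Long.BYTES)
                .putLong(Long.MAX_VALUE).array();
        try {
            parse(corrupt);
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Checking the count before the read loop means a corrupted value fails fast 
rather than consuming the rest of the file as block data.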

bq. Over time, it is possible that this could serve as a tool to manually 
repair the fsimage.
Definitely.  The tool is written in such a way that it would be reasonable to 
write processors that spit out new versions (making it an fsimage converter) or 
that repair the records as they go.  It's designed with an eye towards 
extensibility.
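
To illustrate the kind of processor arrangement I mean, here is a minimal 
sketch.  Everything in it is hypothetical: the interface, the class names, and 
the (path, block count) record shape are invented for the example and are not 
the API in the patch.  The idea is that the reader walks the image once and 
hands each record to a processor, and processors can be chained so that one 
filters or repairs records before another writes them out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from the patch.
final class ProcessorSketch {
    // Callbacks the image reader would invoke as it walks the file.
    interface ImageProcessor {
        void start();
        void visitFile(String path, long numBlocks);
        void finish();
    }

    // Writes each record as an XML element (no escaping, for brevity).
    static final class XmlProcessor implements ImageProcessor {
        private final StringBuilder out = new StringBuilder();
        public void start() { out.append("<fsimage>\n"); }
        public void visitFile(String path, long numBlocks) {
            out.append("  <file path=\"").append(path)
               .append("\" blocks=\"").append(numBlocks).append("\"/>\n");
        }
        public void finish() { out.append("</fsimage>\n"); }
        String result() { return out.toString(); }
    }

    // Drops records with an implausible block count, forwards the rest.
    static final class RepairingProcessor implements ImageProcessor {
        private final ImageProcessor next;
        private final long maxBlocks;
        private int dropped;
        RepairingProcessor(ImageProcessor next, long maxBlocks) {
            this.next = next;
            this.maxBlocks = maxBlocks;
        }
        public void start() { next.start(); }
        public void visitFile(String path, long numBlocks) {
            if (numBlocks < 0 || numBlocks > maxBlocks) {
                dropped++;
            } else {
                next.visitFile(path, numBlocks);
            }
        }
        public void finish() { next.finish(); }
        int dropped() { return dropped; }
    }

    // Stand-in for the image reader: feeds records to a processor in order.
    static void walk(Map<String, Long> records, ImageProcessor p) {
        p.start();
        for (Map.Entry<String, Long> e : records.entrySet()) {
            p.visitFile(e.getKey(), e.getValue());
        }
        p.finish();
    }

    public static void main(String[] args) {
        Map<String, Long> records = new LinkedHashMap<>();
        records.put("/a", 3L);
        records.put("/bad", Long.MAX_VALUE);
        XmlProcessor xml = new XmlProcessor();
        RepairingProcessor repair = new RepairingProcessor(xml, 10_000L);
        walk(records, repair);
        System.out.print(xml.result());
        System.out.println("dropped: " + repair.dropped());
    }
}
```

Keeping the reader separate from the processors is what would let a viewer, a 
converter, and a repair tool share the same parsing code.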

> Create an offline fsimage image viewer
> --------------------------------------
>
>                 Key: HADOOP-5467
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5467
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: util
>            Reporter: Jakob Homan
>            Assignee: Jakob Homan
>         Attachments: fsimage.xml, HADOOP-5467.patch, HADOOP-5467.patch
>
>
> It would be useful to have a tool to examine/dump the contents of the fsimage 
> file to human-readable form.  This would allow analysis of the namespace 
> (file usage, block sizes, etc) without impacting the operation of the 
> namenode.  XML would be reasonable output format, as it can be easily viewed, 
> compressed and manipulated via either XSLT or XQuery.  
> I've started work on this and will have an initial version soon.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
