[
https://issues.apache.org/jira/browse/HADOOP-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688435#action_12688435
]
Jakob Homan commented on HADOOP-5467:
-------------------------------------
bq. +1 it would be very helpful. Also, any thoughts on HADOOP-3717
It's worth looking at. Repair is a bit more complicated than listing, but it
could certainly be done.
bq. From the code, it appears that the tool parses the image file one record at
a time instead of reading the entire file into memory. This is good, because
this tool can run on a machine that has much less memory than the NameNode.
Yes, this was a design goal. I've tested it against the biggest fsimage files
I could find and, though it took a few minutes to chug through them, the tool
had no problems. Process memory usage is negligible.
bq. Is it possible to intelligently skip bad records? This will be useful in
the case when a system administrator is trying to fix a broken fsimage.
Like Lohit's suggestion, it's something we could look at. While debugging I
noticed that the typical failure scenario is that the tool limps along until it
reaches the long that stores the number of blocks. In a corrupted file, this is
read in as some gigantic number. Because the actual block records are just
three longs, the tool then happily reads longs, thinking it's reading block
info, until it encounters EOF. A useful heuristic may be to consider any block
count above, say, 10k erroneous and either bail or start looking for the
beginning of another record. Worth pursuing.
bq. Over time, it is possible that this could serve as a tool to manually
repair the fsimage.
Definitely. The tool is written in such a way that it would be reasonable to
write processors that spit out new versions (making it an fsimage converter) or
repair the records as they go. It's designed with an eye towards extensibility.
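To make that concrete, here is a rough sketch of what such a processor hook might look like (the interface and method names are hypothetical; the actual API in the patch may differ):
{code}
// Hypothetical processor abstraction; not the actual interface in the patch.
interface ImageProcessor {
  void start();                                        // before parsing begins
  void visitInode(String path, int replication, long mtime);
  void visitBlock(long blockId, long numBytes, long genStamp);
  void finish();                                       // after the last record
}
{code}
An XML-emitting implementation gives the viewer behaviour described in the issue; an implementation that re-serializes (and optionally corrects) the records as they are visited would give a converter or repair tool.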
> Create an offline fsimage image viewer
> --------------------------------------
>
> Key: HADOOP-5467
> URL: https://issues.apache.org/jira/browse/HADOOP-5467
> Project: Hadoop Core
> Issue Type: New Feature
> Components: util
> Reporter: Jakob Homan
> Assignee: Jakob Homan
> Attachments: fsimage.xml, HADOOP-5467.patch, HADOOP-5467.patch
>
>
> It would be useful to have a tool to examine/dump the contents of the fsimage
> file in human-readable form. This would allow analysis of the namespace
> (file usage, block sizes, etc.) without impacting the operation of the
> namenode. XML would be a reasonable output format, as it can be easily viewed,
> compressed and manipulated via either XSLT or XQuery.
> I've started work on this and will have an initial version soon.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.