[
https://issues.apache.org/jira/browse/HADOOP-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jakob Homan updated HADOOP-5467:
--------------------------------
Attachment: fsimageV19
fsimageV18
HADOOP-5467.patch
Done with addressing Konstantin's comments, tweaking, and adding more tests.
Ready for review.
Passes all tests.
{noformat}
[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 11 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
{noformat}
I've done testing with large fsimage files here at Yahoo! and have had no
problems. The tool chugs through large fsimage files very quickly.
The only external changes since the last patch are better command-line
processing and that all of the image processors now output to a file by
default. As a result, the Console processor became the Indented processor.
The attached fsimage files are used by the unit tests to verify the tool can
process fsimages generated by previous versions of Hadoop. They correspond to
Hadoop versions 18 and 19. They should be dropped into
{noformat}src/test/org/apache/hadoop/hdfs/tools/offlineImageViewer/{noformat}
Konstantin's comments (thanks for the thorough review!):
bq. 1. offlineimageviewer should be a part of hdfs shell command group rather
than hadoop.
done.
bq. 2. I would shorten it to just imageviewer.
I really think it's important to emphasize the offline nature of the tool. How
about offlineimageviewer? It is a bit longer, but much more accurate. That
said, for the bin/hdfs command I went with oiv. I think there's precedent
here in abbreviating commands, such as distcp and fsck.
bq. 3. When I call hadoop offlineimageviewer it first prints an error: "Error
parsing options: i"
bq. 4. option "-o" does not work together with "-p XML". Please check other
combination too.
I've fixed the command-line processing. Not sure about the -o/-p, but it
shouldn't be an issue now. The tool also responds better when given no input
at all.
bq. 5. OfflineImageViewer.java warnings in line 135 about accessing static
methods in non static way.
Fixed. The command line parser has a rather odd implementation of the builder
pattern.
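For readers unfamiliar with this kind of warning, here is a minimal sketch of how it can arise. It assumes a builder whose methods are all static yet return an instance so that calls can be chained; StaticOptionBuilderSketch and BuilderUsageSketch are hypothetical stand-ins, not the parser or the code in the patch.

```java
// Hypothetical stand-in for a command-line option builder whose methods are
// all static but return an instance so that calls can be chained.
class StaticOptionBuilderSketch {
    private static final StaticOptionBuilderSketch INSTANCE = new StaticOptionBuilderSketch();
    private static String description = "";
    private static boolean argRequired = false;

    static StaticOptionBuilderSketch withDescription(String text) {
        description = text;
        return INSTANCE;
    }

    static StaticOptionBuilderSketch hasArg() {
        argRequired = true;
        return INSTANCE;
    }

    // Builds a printable option summary and resets the shared static state.
    static String create(String name) {
        String built = "-" + name + (argRequired ? " <arg>" : "") + ": " + description;
        description = "";
        argRequired = false;
        return built;
    }
}

class BuilderUsageSketch {
    // Chained style: hasArg() and create() are static methods invoked through
    // the instance returned by the previous call. This compiles, but tools
    // such as Eclipse flag it as static access in a non-static way.
    static String chained() {
        return StaticOptionBuilderSketch.withDescription("input file").hasArg().create("i");
    }

    // One way to avoid the warning: qualify every call with the class name.
    static String qualified() {
        StaticOptionBuilderSketch.withDescription("input file");
        StaticOptionBuilderSketch.hasArg();
        return StaticOptionBuilderSketch.create("i");
    }
}
```

Both styles build the same option; the second only changes how the static methods are referenced.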
bq. 6. FSImageProcessorV16to19 should be renamed to something more version
independent.
Fixed. Renamed to FSImageLoaderCurrent. The FSImageLoader is a more
descriptive term in general.
bq. If you go to v -20 you will probably modify this class rather than
implement a new one.
Correct. Originally I had planned on having a separate class for each new
version in order to minimize the maintenance needed between releases and to
guarantee that each version could absolutely, correctly read its version.
However, since each increment of the layout version has generally only been an
addition of a field, it's just not worth it to have a complete, separate class.
Naming the class FSImageLoaderCurrent fixes this issue.
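To illustrate the reasoning, here is a minimal sketch of a single loader that covers a range of layout versions and reads an extra field only when the version includes it. The class name, the field names, and the version at which the extra field appears are all assumptions made for illustration; they are not the real fsimage layout or the code in the patch.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical single "current" loader. Layout versions are negative and
// decrease as the format evolves, so -19 is newer than -16.
class CurrentLoaderSketch {
    static final int OLDEST_SUPPORTED = -16;
    static final int NEWEST_SUPPORTED = -19;

    static boolean canProcessVersion(int version) {
        return version <= OLDEST_SUPPORTED && version >= NEWEST_SUPPORTED;
    }

    // Reads a toy image: a version, one field every version has, and one
    // field that (by assumption) was added in version -18.
    static List<String> load(ByteBuffer image) {
        int version = image.getInt();
        if (!canProcessVersion(version)) {
            throw new IllegalArgumentException("Unsupported layout version: " + version);
        }
        List<String> fields = new ArrayList<>();
        fields.add("layoutVersion=" + version);
        fields.add("numFiles=" + image.getLong());
        if (version <= -18) {
            // Invented field, present only in the newer toy layouts.
            fields.add("extraField=" + image.getLong());
        }
        return fields;
    }

    // Writes a toy image for the given version so the loader can be tried out.
    static ByteBuffer sampleImage(int version) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 8 + 8);
        buf.putInt(version).putLong(42L);
        if (version <= -18) {
            buf.putLong(7L);
        }
        buf.flip();
        return buf;
    }
}
```

Under these assumptions, supporting a new layout version means widening the supported range and adding one more version check, rather than writing a whole new class.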
bq. I'd probably rename the interface FSImageProcessor into
FSImageProcessorInterface and then FSImageProcessorV16to19 can be renamed to
FSImageProcessor.
I really hate naming things FooInterface, since it doesn't, in the end, matter
if it's an interface, concrete or abstract class. Using Loader instead of
Processor relieves the word Processor of the multiple meanings it was
previously shouldering in the tool.
bq. 7. if ( p.canProcessVersion(version) ) should not have spaces after "("
and before ")".
Fixed.
bq. 8. TextWriterFSImageProcessor should probably be TextWriterFSImageVisitor.
Correct. Fixed.
bq. 9. FSImageElement should be declared in FSImageVisitor.
Done.
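As a rough picture of items 8 and 9, the sketch below shows a visitor type that declares its own element enum, plus an implementation that produces indented text. All class, method, and element names here are hypothetical and chosen for illustration; the actual FSImageVisitor API in the patch may differ.

```java
// Hypothetical visitor base type; the element enum is declared inside it,
// in the spirit of review comment 9.
abstract class ImageVisitorSketch {
    enum ImageElement { FS_IMAGE, INODE, INODE_PATH, REPLICATION }

    // Called when the loader enters an element that contains other elements.
    abstract void visitEnclosingElement(ImageElement element);

    // Called when the loader leaves the most recently entered element.
    abstract void leaveEnclosingElement();

    // Called for a simple element that carries a value.
    abstract void visit(ImageElement element, String value);
}

// Hypothetical implementation that renders the visited elements as
// indented text, two spaces per nesting level.
class IndentedVisitorSketch extends ImageVisitorSketch {
    private final StringBuilder out = new StringBuilder();
    private int depth = 0;

    private void indent() {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
    }

    @Override
    void visitEnclosingElement(ImageElement element) {
        indent();
        out.append(element).append('\n');
        depth++;
    }

    @Override
    void leaveEnclosingElement() {
        if (depth == 0) {
            throw new IllegalStateException("No enclosing element to leave");
        }
        depth--;
    }

    @Override
    void visit(ImageElement element, String value) {
        indent();
        out.append(element).append(" = ").append(value).append('\n');
    }

    String result() {
        return out.toString();
    }
}
```

In this arrangement the loader only walks the image and calls the visitor, so an XML or other output format would be another subclass rather than a change to the loader.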
bq. 10. We do not want to use deprecated UTF8 class more than it is used
already, so it is better to use FSImage.readBytes(), etc. instead of
reimplementing them in FSImageProcessor.
Per our offline discussion, I made FSImage.readString() and FSImage.readBytes()
public until we move this into the fsimage package. Added a comment in FSImage
reminding us to undo that once the move is completed.
> Create an offline fsimage image viewer
> --------------------------------------
>
> Key: HADOOP-5467
> URL: https://issues.apache.org/jira/browse/HADOOP-5467
> Project: Hadoop Core
> Issue Type: New Feature
> Components: dfs
> Reporter: Jakob Homan
> Assignee: Jakob Homan
> Attachments: fsimage.xml, fsimageV18, fsimageV19, HADOOP-5467.patch,
> HADOOP-5467.patch, HADOOP-5467.patch, HADOOP-5467.patch
>
>
> It would be useful to have a tool to examine/dump the contents of the fsimage
> file to human-readable form. This would allow analysis of the namespace
> (file usage, block sizes, etc) without impacting the operation of the
> namenode. XML would be a reasonable output format, as it can be easily viewed,
> compressed and manipulated via either XSLT or XQuery.
> I've started work on this and will have an initial version soon.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.