[
https://issues.apache.org/jira/browse/HDFS-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904434#comment-13904434
]
Haohui Mai commented on HDFS-5952:
----------------------------------
Is it okay to use the XML-based tool for debugging? Otherwise you'll end up
duplicating the code in {{PBImageXmlWriter}} that parses the fsimage.
Note that the XML / delimited formats are intended to capture all internal
details of the fsimage. I understand that the delimited format is more compact
than the XML one. However, the delimited format does not include a schema, so
it could become problematic when the fsimage format changes. Unfortunately we
change the fsimage format quite often. :-(
If you really want to output in delimited format, I think it might be easier to
take the output of {{PBImageXmlWriter}} and use SAX to convert the XML into
the delimited format. Since SAX streams the document instead of loading it
into memory, it should work fairly efficiently.
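A minimal sketch of such a SAX conversion, written in Python with the standard
xml.sax module for brevity rather than in Java. The element names used here
(INodeSection, inode, id, type, name, replication) are assumptions about the
XML output and should be checked against a real dump. InodeToDelimited and
xml_to_delimited are hypothetical names, not part of Hadoop. The sketch emits
each inode's own name only; rebuilding full paths would need the directory
section as well.

```python
import xml.sax


class InodeToDelimited(xml.sax.ContentHandler):
    """Writes one delimited row per <inode> found inside <INodeSection>."""

    def __init__(self, out, fields, delimiter="\t"):
        super().__init__()
        self.out = out
        self.fields = fields
        self.delimiter = delimiter
        self.in_section = False  # True while inside <INodeSection> (assumed name)
        self.row = None          # values of the inode being read, else None
        self.depth = 0           # nesting depth below the current <inode>
        self.field = None        # direct child element currently being captured
        self.text = []

    def startElement(self, name, attrs):
        if self.row is None:
            if name == "INodeSection":
                self.in_section = True
            elif name == "inode" and self.in_section:
                self.row, self.depth = {}, 0
            return
        self.depth += 1
        # Capture direct children only, so a nested <id> (for example inside
        # a block entry) cannot overwrite the inode's own <id>.
        if self.depth == 1 and name in self.fields:
            self.field, self.text = name, []

    def characters(self, content):
        if self.field is not None:
            self.text.append(content)

    def endElement(self, name):
        if self.row is None:
            if name == "INodeSection":
                self.in_section = False
            return
        if self.depth == 0:
            # Closing </inode>: emit the row, leaving missing fields empty.
            values = [self.row.get(f, "") for f in self.fields]
            self.out.write(self.delimiter.join(values) + "\n")
            self.row = None
            return
        if self.depth == 1 and self.field is not None:
            self.row[self.field] = "".join(self.text)
            self.field = None
        self.depth -= 1


def xml_to_delimited(source, out,
                     fields=("id", "type", "name", "replication"),
                     delimiter="\t"):
    """Streams the XML from source (path or file object) into out."""
    xml.sax.parse(source, InodeToDelimited(out, fields, delimiter))
```

For example, xml_to_delimited("fsimage.xml", sys.stdout) would print one
tab-separated line per inode, where fsimage.xml is a placeholder for the XML
produced by the XML-based tool. Because the column list is explicit, a change
in the fsimage format shows up as an empty column rather than as silently
shifted fields.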
> Create a tool to run data analysis on the PB format fsimage
> -----------------------------------------------------------
>
> Key: HDFS-5952
> URL: https://issues.apache.org/jira/browse/HDFS-5952
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: tools
> Affects Versions: 3.0.0
> Reporter: Akira AJISAKA
> Assignee: Akira AJISAKA
>
> The delimited processor in OfflineImageViewer has not been supported since
> HDFS-5698 was merged.
> The motivation for the delimited processor is to run data analysis on the
> fsimage, so there might be more value in creating a tool for Hive or Pig that
> reads the PB format fsimage directly.