[ 
https://issues.apache.org/jira/browse/HDFS-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904434#comment-13904434
 ] 

Haohui Mai commented on HDFS-5952:
----------------------------------

Is it okay to use the XML-based tool for debugging? Otherwise you'll end up 
duplicating the code in {{PBImageXmlWriter}} to parse the fsimage.

Note that the XML and delimited formats are intended to capture all internal 
details of the fsimage. I understand that the delimited format is more compact 
than the XML one. However, the delimited format does not include a schema, so 
it could become problematic when the format of the fsimage changes. 
Unfortunately, we change the fsimage format quite often. :-(

If you really want to output in delimited format, I think it might be easier to 
take the output of {{PBImageXmlWriter}} and use SAX to convert the XML into 
the delimited format. SAX streams the document instead of loading it into 
memory, so it should work fairly efficiently.
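
A minimal sketch of that SAX approach, to make the idea concrete. The element names ({{inode}}, {{id}}, {{type}}, {{name}}, {{replication}}), the column order, and the class name {{XmlToDelimited}} are assumptions for illustration only; the real {{PBImageXmlWriter}} output should be checked before relying on them.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical converter: streams XML and emits one tab-delimited row per <inode>.
class XmlToDelimited {
    // Assumed column order; these element names are guesses, not a confirmed schema.
    static final String[] COLUMNS = {"id", "type", "name", "replication"};

    static class InodeHandler extends DefaultHandler {
        final StringBuilder out = new StringBuilder();
        private final StringBuilder text = new StringBuilder();
        private final Map<String, String> fields = new HashMap<>();
        private int depth = 0; // 0 = outside <inode>, 1 = <inode>, 2 = direct child

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            if (depth > 0) {
                depth++;
            } else if ("inode".equals(qName)) {
                depth = 1;
                fields.clear();
            }
            text.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times per element, so accumulate.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            if (depth == 2) {
                // Only direct children of <inode> become columns; nested data is skipped.
                fields.put(qName, text.toString().trim());
            } else if (depth == 1) {
                for (int i = 0; i < COLUMNS.length; i++) {
                    if (i > 0) out.append('\t');
                    out.append(fields.getOrDefault(COLUMNS[i], ""));
                }
                out.append('\n');
            }
            if (depth > 0) depth--;
            text.setLength(0);
        }
    }

    static String convert(InputStream in) {
        try {
            InodeHandler handler = new InodeHandler();
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
            return handler.out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("failed to parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input, not real tool output.
        String xml = "<fsimage><INodeSection>"
            + "<inode><id>16386</id><type>FILE</type><name>a.txt</name>"
            + "<replication>3</replication>"
            + "<blocks><block><id>1073741825</id></block></blocks></inode>"
            + "<inode><id>16385</id><type>DIRECTORY</type><name>d</name></inode>"
            + "</INodeSection></fsimage>";
        System.out.print(convert(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))));
    }
}
```

With the made-up sample in {{main}}, this prints one row per inode, with an empty last column for the directory because it has no {{replication}} element. For a real image the handler would also have to join the directory section to rebuild full paths, which this sketch does not attempt.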

> Create a tool to run data analysis on the PB format fsimage
> -----------------------------------------------------------
>
>                 Key: HDFS-5952
>                 URL: https://issues.apache.org/jira/browse/HDFS-5952
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: tools
>    Affects Versions: 3.0.0
>            Reporter: Akira AJISAKA
>            Assignee: Akira AJISAKA
>
> The delimited processor in OfflineImageViewer has not been supported since 
> HDFS-5698 was merged.
> The motivation for the delimited processor is to run data analysis on the 
> fsimage, so it may be more valuable to create a tool for Hive or Pig that 
> reads the PB format fsimage directly.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
