[ 
https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13988321#comment-13988321
 ] 

Suresh Srinivas commented on HDFS-6293:
---------------------------------------

Here is the summary of a quick call I had with [~nroberts], [~kihwal], and 
[~wheat9].

Requirements for the tool:
- It should print consistent file system information. This rules out simply 
running ls -R against the standby (let's assume the standby supports reads), 
since a directory could appear twice due to renames.
- The tool should print hierarchical namespace information so that a process 
without a lot of memory can consume it.
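
To illustrate the second requirement, here is a minimal sketch of a low-memory 
consumer. It assumes a hypothetical text format (nothing was settled on the 
call) in which entries are listed depth-first, one name per line, indented two 
spaces per level. The function name and the format are mine, not part of any 
existing tool. The consumer keeps only the current chain of ancestors, so its 
memory use grows with tree depth, not with the number of entries.

```python
from typing import Iterable, Iterator, List

INDENT = "  "  # assumed: two spaces per nesting level


def walk_indented(lines: Iterable[str]) -> Iterator[str]:
    """Yield full paths from a depth-indented, depth-first listing.

    Only the stack of ancestor names is held in memory.
    """
    stack: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        name = line.lstrip(" ")
        depth = (len(line) - len(name)) // len(INDENT)
        del stack[depth:]  # drop entries that are not ancestors of this one
        stack.append(name)
        yield "/" + "/".join(stack)
```

Because the input is any iterable of lines, an open file object can be passed 
directly and is read one line at a time.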

Here is the proposal:
- Add a flag (off by default) to print the hierarchical namespace to a 
configurable directory after each checkpoint completes.
- This information will be printed only on the standby namenode.
- The last N such namespace information files will be retained, where N is 
configurable.
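
The retention rule could look roughly like the sketch below. It assumes the 
files carry a hypothetical name of the form namespace_<zero-padded txid>, so 
that sorting by name orders them from oldest to newest. The prefix, the 
function name, and the naming scheme are illustrative assumptions only.

```python
import os
from typing import List


def prune_old_dumps(directory: str, keep: int,
                    prefix: str = "namespace_") -> List[str]:
    """Delete all but the newest `keep` dump files; return the deleted names.

    Assumes names sort lexicographically in creation order (for example a
    zero-padded transaction ID after the prefix). Other files are untouched.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    names = sorted(n for n in os.listdir(directory) if n.startswith(prefix))
    # Everything except the last `keep` entries is considered stale.
    stale = names[:max(len(names) - keep, 0)]
    for name in stale:
        os.remove(os.path.join(directory, name))
    return stale
```

Running this once after each dump is written keeps the directory bounded at N 
files.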

We did consider printing this information as protobuf, but printing large 
hierarchical information that way is not straightforward and would take time 
to implement. In the interest of time, we will print it as JSON or text (let 
me know what you think).

In the future, we can make the output format of the tool configurable, possibly 
adding protobuf. The tool could also grow to include other namespace-related 
stats. [~kihwal] and [~wheat9], let me know if I got this right.

> Issues with OIV processing PB-based fsimages
> --------------------------------------------
>
>                 Key: HDFS-6293
>                 URL: https://issues.apache.org/jira/browse/HDFS-6293
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.4.0
>            Reporter: Kihwal Lee
>            Priority: Blocker
>         Attachments: Heap Histogram.html
>
>
> There are issues with OIV when processing fsimages in protobuf. 
> Due to the internal layout changes introduced by the protobuf-based fsimage, 
> OIV consumes an excessive amount of memory.  We have tested with an fsimage 
> with about 140M files/directories. The peak heap usage when processing this image 
> in pre-protobuf (i.e. pre-2.4.0) format was about 350MB.  After converting 
> the image to the protobuf format on 2.4.0, OIV would OOM even with 80GB of 
> heap (max new size was 1GB).  It should be possible to process any image with 
> the default heap size of 1.5GB.
> Another issue is the complete change of format/content in OIV's XML output.  
> I also noticed that the secret manager section has no tokens while there were 
> unexpired tokens in the original image (pre-2.4.0).  I did not check whether 
> they were also missing in the new pb fsimage.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
