[ 
https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13988538#comment-13988538
 ] 

Suresh Srinivas commented on HDFS-6293:
---------------------------------------

bq. I created two subtasks for PB and an HTTP interface.
Please make them related jiras.

bq.  have looked at the PB fsimage also, even with JSON we'd need to do similar 
things to avoid having a giant array
It depends on how you write it. In fact, I am okay with just writing it in the 
format the old fsimage used: the complete directory path followed by the file 
information. I am also okay with the first cut being a plain text file, since 
this is holding up rolling out the release onto a cluster and the valuable 
testing that would come out of it.
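
To illustrate the kind of flat listing meant here (complete path followed by 
file information, one entry per line), a minimal sketch follows. The Entry 
record, its fields, and the tab-separated layout are assumptions made for 
illustration only; they are not the old fsimage format or the OIV output 
schema. The point is only that entries can be streamed out one at a time 
without building a giant in-memory array.

```python
import io
from typing import Iterable, Iterator, NamedTuple, TextIO


class Entry(NamedTuple):
    # Hypothetical record; field names are illustrative, not the OIV schema.
    path: str         # complete path, e.g. "/user/alice/data.txt"
    is_dir: bool
    replication: int  # 0 for directories in this sketch
    length: int       # bytes; 0 for directories in this sketch


def write_listing(entries: Iterable[Entry], out: TextIO) -> int:
    """Write one tab-separated line per entry and return the line count.

    Entries are consumed one at a time, so nothing is accumulated in memory.
    """
    count = 0
    for e in entries:
        kind = "d" if e.is_dir else "f"
        out.write(f"{e.path}\t{kind}\t{e.replication}\t{e.length}\n")
        count += 1
    return count


def sample_entries() -> Iterator[Entry]:
    # A generator stands in for whatever walks the namespace.
    yield Entry("/user", True, 0, 0)
    yield Entry("/user/alice", True, 0, 0)
    yield Entry("/user/alice/data.txt", False, 3, 1024)


buf = io.StringIO()
written = write_listing(sample_entries(), buf)
listing = buf.getvalue()
```

Because each line carries the full path, a reporting job can process the 
listing line by line without reconstructing the directory tree.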

bq. we may write out the fsimage, copy it over, and then fail while writing out 
the second listing. If edit logs get cleaned up in the meantime, we might have 
a gap between the listing and the start of the edit logs.
I think the use case you are talking about is certainly different. All that I 
have seen is that people want to be able to process the namespace for reporting 
purposes. There is no guarantee that a listing must be created for every 
fsimage.

bq. I'm not sure how to interpret this. I just feel that Marcelo or myself 
could have shared this feedback more immediately over the higher-bandwidth 
medium of a phone, and we clearly had an interest in this JIRA since Marcelo's 
been commenting since the beginning. I'm not sure why you'd be offended that I 
asked that the rest of us be included in future phone calls.
You are overreacting. This is no different from the conversations you have 
with your colleagues or customers. What is important is that the information 
relevant to others in the community is shared, so that they can participate in 
the discussion and comment, which you have had an opportunity to do.



> Issues with OIV processing PB-based fsimages
> --------------------------------------------
>
>                 Key: HDFS-6293
>                 URL: https://issues.apache.org/jira/browse/HDFS-6293
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.4.0
>            Reporter: Kihwal Lee
>            Priority: Blocker
>         Attachments: Heap Histogram.html
>
>
> There are issues with OIV when processing fsimages in protobuf. 
> Due to the internal layout changes introduced by the protobuf-based fsimage, 
> OIV consumes an excessive amount of memory.  We have tested with an fsimage with 
> about 140M files/directories. The peak heap usage when processing this image 
> in pre-protobuf (i.e. pre-2.4.0) format was about 350MB.  After converting 
> the image to the protobuf format on 2.4.0, OIV would OOM even with 80GB of 
> heap (max new size was 1GB).  It should be possible to process any image with 
> the default heap size of 1.5GB.
> Another issue is the complete change of format/content in OIV's XML output.  
> I also noticed that the secret manager section has no tokens while there were 
> unexpired tokens in the original image (pre-2.4.0).  I did not check whether 
> they were also missing in the new pb fsimage.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
