[ 
https://issues.apache.org/jira/browse/HDFS-5722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877059#comment-13877059
 ] 

Haohui Mai commented on HDFS-5722:
----------------------------------

HDFS-5793 does require random access. For now I have picked the solution
suggested by [~tlipcon]: each section is compressed individually, and the code
appends an uncompressed footer that records the offset and length of each section.

This seems to me a reasonable compromise: the code cannot randomly access
data within a section, but being able to navigate quickly to individual
sections covers the common use cases.
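
To make the layout concrete, here is a rough sketch in Python of the idea
(per-section compression plus an uncompressed footer of offsets and lengths).
It is only an illustration, not the actual FSImage code: the JSON footer, the
trailing 4-byte footer length, the use of zlib, and the names write_image and
read_section are all assumptions made for the example.

```python
import json
import struct
import zlib


def write_image(sections):
    """Compress each section on its own and append an uncompressed footer.

    `sections` maps a section name to its raw bytes. The footer encoding
    (JSON) and the trailing length field are assumptions for this sketch.
    """
    body = bytearray()
    index = []
    for name, payload in sections.items():
        compressed = zlib.compress(payload)
        # Record where this section's compressed bytes start and how long they are.
        index.append({"name": name, "offset": len(body), "length": len(compressed)})
        body += compressed
    footer = json.dumps(index).encode("utf-8")
    # A fixed-size trailer lets a reader locate the footer from the end of the file.
    return bytes(body) + footer + struct.pack(">I", len(footer))


def read_section(image, name):
    """Decompress a single section without touching the others."""
    (footer_len,) = struct.unpack(">I", image[-4:])
    footer = json.loads(image[-4 - footer_len:-4].decode("utf-8"))
    for entry in footer:
        if entry["name"] == name:
            start = entry["offset"]
            return zlib.decompress(image[start:start + entry["length"]])
    raise KeyError(name)
```

A reader that only needs one section reads the footer, jumps to that section's
offset, and decompresses just those bytes. Random access inside a section is
still not possible, which is the compromise described above.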

> Implement compression in the HTTP server of SNN / SBN instead of FSImage
> ------------------------------------------------------------------------
>
>                 Key: HDFS-5722
>                 URL: https://issues.apache.org/jira/browse/HDFS-5722
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Haohui Mai
>
> The current FSImage format supports compression: a field in the 
> header specifies the compression codec used to compress the data in the 
> image. The main motivation was to reduce the number of bytes 
> transferred between SNN / SBN / NN.
> The main disadvantage, however, is that it requires the client to access the 
> FSImage in strictly sequential order. This might not fit well with the new 
> design of FSImage. For example, serializing the data in protobuf allows the 
> client to quickly skip data that it does not understand. The compression 
> built into the format, however, complicates the calculation of offsets and 
> lengths. Recovering from a corrupted, compressed FSImage is also non-trivial, 
> as off-the-shelf tools like bzip2recover are inapplicable.
> This jira proposes to move the compression from the FSImage format to 
> the transport layer, namely the HTTP server of SNN / SBN. This design 
> simplifies the FSImage format, opens up the opportunity to quickly 
> navigate through the FSImage, and eases the process of recovery. It also 
> retains the benefit of reducing the number of bytes transferred across 
> the wire, since compression is applied at the transport layer.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
