[
https://issues.apache.org/jira/browse/HDFS-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17393356#comment-17393356
]
Stephen O'Donnell commented on HDFS-16147:
------------------------------------------
On further review, most of what I wrote above is wrong!
When saving the image, there is a single output stream, but each section is
written into it as its own separate compressed stream, e.g.:
{code}
OVERALL_STREAM
COMPRESSED_INODE_SECTION
COMPRESSED_DIR_SECTION
...
{code}
You can see this in the commitSection() method, where finish() is called on the
compressed stream at the end of each section.
So when we load a section (not in parallel), the loader jumps to the start of
that compressed section and reads it in full.
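To make the layout concrete, here is a minimal sketch of the idea using plain
java.util.zip rather than Hadoop's codec classes. The class and method names
(SectionedImageSketch, writeSection, readSection) are hypothetical and are not
taken from the fsimage code. It only illustrates one shared output stream, a
separately finished compressed stream per section, and a load that jumps to a
recorded offset.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-in for the image writer and loader; not Hadoop code.
final class SectionedImageSketch {
    // One shared output stream for the whole image.
    private final ByteArrayOutputStream image = new ByteArrayOutputStream();

    /** Writes one section as its own gzip stream; returns {offset, length}. */
    public int[] writeSection(String payload) throws IOException {
        int offset = image.size();
        GZIPOutputStream section = new GZIPOutputStream(image);
        section.write(payload.getBytes(StandardCharsets.UTF_8));
        // finish() ends this compressed stream but leaves `image` open.
        section.finish();
        return new int[] {offset, image.size() - offset};
    }

    /** Jumps to a section's offset and decompresses that section in full. */
    public String readSection(int[] entry) throws IOException {
        byte[] bytes = image.toByteArray();
        // Bound the input to the section so the reader stops at its end.
        try (GZIPInputStream in = new GZIPInputStream(
                new ByteArrayInputStream(bytes, entry[0], entry[1]))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        SectionedImageSketch img = new SectionedImageSketch();
        int[] inode = img.writeSection("inode-data");
        int[] dir = img.writeSection("dir-data");
        // Sections can be read in any order because each one is self-contained.
        System.out.println(img.readSection(dir));
        System.out.println(img.readSection(inode));
    }
}
```

Because every section starts a fresh compressed stream, a reader only needs the
section's offset and length to decode it, independent of the other sections.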
This means it is still unclear how a compressed image saved with sub-sections
can be loaded without parallel loading. Perhaps a compressed stream can read
embedded compressed streams within itself. I am not sure, but I would like to
understand how this works.
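One quick way to probe that hypothesis outside of Hadoop is the small
experiment below. It uses java.util.zip's gzip classes, not the codec
configured for the fsimage, and the names (ConcatenatedStreamsSketch,
writeParts, readAll) are made up for illustration. It writes several
independently finished gzip streams back to back and then reads them with a
single decompressor starting at the first byte. It only shows that gzip in the
JDK behaves this way, not that the fsimage loader depends on it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical experiment; not Hadoop code.
final class ConcatenatedStreamsSketch {
    /** Compresses each part as its own gzip stream, written back to back. */
    public static byte[] writeParts(String... parts) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String part : parts) {
            GZIPOutputStream gz = new GZIPOutputStream(out);
            gz.write(part.getBytes(StandardCharsets.UTF_8));
            // End this compressed stream without closing the shared output.
            gz.finish();
        }
        return out.toByteArray();
    }

    /** Reads the whole buffer with one decompressor, starting at byte 0. */
    public static String readAll(byte[] data) throws IOException {
        try (GZIPInputStream in =
                new GZIPInputStream(new ByteArrayInputStream(data))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = writeParts("sub1,", "sub2,", "sub3");
        System.out.println(readAll(data));
    }
}
```

If the codec used for the image treats back-to-back compressed streams the same
way, a sequential load that starts at the beginning of a section would read
through all of its sub-sections. That would need to be checked against the
actual codec.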
> load fsimage with parallelization and compression
> -------------------------------------------------
>
> Key: HDFS-16147
> URL: https://issues.apache.org/jira/browse/HDFS-16147
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namanode
> Affects Versions: 3.3.0
> Reporter: liuyongpan
> Priority: Minor
> Attachments: HDFS-16147.001.patch, HDFS-16147.002.patch,
> subsection.svg
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)