[ 
https://issues.apache.org/jira/browse/HDFS-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13898657#comment-13898657
 ] 

Haohui Mai commented on HDFS-5847:
----------------------------------

A few comments:

{code}
-    void serializeINodeDirectorySection(OutputStream out) throws IOException {
+    void serializeINodeDirectorySection(OutputStream out,
+        final List<INodeReference> refList) throws IOException {
{code}

It might be worthwhile to put the refList inside the SaverContext. It can avoid 
passing the arguments everywhere.


{code}
+              b.addRefChildren(refId++);
{code}

It seems to me that a more robust approach is to call 
{{b.addRefChildren(refList.size());}} instead.

{code}
+message INodeReferenceSection {
+  message INodeReference {
+    // id of the referred inode
+    optional uint64 referredId = 1;
+    // local name recorded in WithName
+    optional bytes name = 2;
+    // recorded in DstReference
+    optional uint32 dstSnapshotId = 3;
+    // recorded in WithName
+    optional uint32 lastSnapshotId = 4;
+  }
+  // repeated INodeReference...
+}
{code}

Do you think it is a good idea to record the number of {{INodeReference}} in 
the section? That way the loader can preallocate an array to improve the 
loading performance.

It seems that {{loadINodeReference}} is only used in the snapshot. It might be 
better to put the function into {{FSImageFormatPBSnapshot.Loader}}.


> Consolidate INodeReference into a separate section
> --------------------------------------------------
>
>                 Key: HDFS-5847
>                 URL: https://issues.apache.org/jira/browse/HDFS-5847
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0
>            Reporter: Haohui Mai
>            Assignee: Jing Zhao
>         Attachments: HDFS-5847.000.patch
>
>
> Currently each INodeDirectorySection.Entry contains variable numbers of 
> INodeReference entries. The INodeReference entries are inlined, therefore it 
> is difficult to quickly navigate through a INodeDirectorySection.Entry. 
> Skipping through a INodeDirectorySection.Entry without parsing is essential 
> to parse these entries in parallel.
> This jira proposes to consolidate INodeReferences into a section and give 
> each of them an ID. The INodeDirectorySection.Entry can store the list of the 
> IDs as a repeated field. That way we can leverage the existing code in 
> protobuf to quickly skip through a INodeDirectorySection.Entry.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Reply via email to