[ 
https://issues.apache.org/jira/browse/HDFS-11881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11881:
--------------------------------------
    Attachment: 2_ArrayList_SnapshotDiffReport.png
                1_ChunkedArrayList_SnapshotDiffReport.png

[~jojochuang] / [~yzhangal],
  Wrote a test that puts 500K files in the snapshot diff report and ran the 
snapshot diff shell command 100+ times to see how the heap gets fragmented 
and how frequent the FullGCs are. Attached heap graphs for both the ArrayList 
and ChunkedArrayList based implementations of SnapshotDiffReport. The 
ArrayList version needs quite frequent long GCs to clear up the heap and make 
room for the new report, whereas the ChunkedArrayList based SnapshotDiffReport 
needed far fewer FullGCs for the same test. If we scale this test up to a 10G+ 
SnapshotDiffReport, the heap usage and FullGC requirement of the ArrayList 
based approach will be an order of magnitude higher compared to 
ChunkedArrayList.
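
To make the idea concrete, here is a minimal illustrative sketch (a hypothetical class, not the actual org.apache.hadoop.util.ChunkedArrayList implementation; the chunk size is assumed) of why a chunked layout avoids the large contiguous allocations that fragment the heap:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of a chunked list: elements live in many small
// fixed-size chunks, so growth only ever allocates one more small chunk.
// A plain ArrayList instead grows one contiguous backing array, and the
// resize briefly keeps both the old and new arrays live on the heap.
public class ChunkedListSketch<T> {
    private static final int CHUNK_SIZE = 1024;  // assumed chunk capacity
    private final List<List<T>> chunks = new ArrayList<>();
    private int size = 0;

    public void add(T element) {
        List<T> last = chunks.isEmpty() ? null : chunks.get(chunks.size() - 1);
        if (last == null || last.size() == CHUNK_SIZE) {
            // Allocate a new small chunk rather than resizing a big array.
            last = new ArrayList<>(CHUNK_SIZE);
            chunks.add(last);
        }
        last.add(element);
        size++;
    }

    public int size() {
        return size;
    }
}
```

With millions of diff report entries, this keeps every single allocation small and uniform, which is what the attached heap graphs reflect.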

  Tried a similar ChunkedArrayList approach for DirDiff, but soon realized 
that DirDiff uses far more of the diff list's functionality, such as add by 
index, remove by index, and set by index. These index based operations are 
currently not supported by ChunkedArrayList. So, will take up this bigger 
task in a separate jira. 
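
For reference, the reason index based operations are harder on a chunked layout can be sketched as follows (hypothetical helper code, not Hadoop's implementation): a lookup must first walk the chunks to locate the one owning the index, and a remove or insert by index would additionally shift elements across every following chunk.

```java
import java.util.List;

// Illustrative helper: get-by-index on a chunked layout is O(numChunks)
// to locate the owning chunk, unless cumulative chunk offsets are tracked
// separately. Index-based remove/set/insert are costlier still.
public final class ChunkIndexing {
    public static <T> T get(List<List<T>> chunks, int index) {
        for (List<T> chunk : chunks) {
            if (index < chunk.size()) {
                return chunk.get(index);
            }
            index -= chunk.size();  // skip past this chunk
        }
        throw new IndexOutOfBoundsException();
    }
}
```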

   Can you please review patch v01 in the context of the FileDiff 
improvements alone, for the SnapshotDiffReport use case? 



> NameNode consumes a lot of memory for snapshot diff report generation
> ---------------------------------------------------------------------
>
>                 Key: HDFS-11881
>                 URL: https://issues.apache.org/jira/browse/HDFS-11881
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs, snapshots
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>         Attachments: 1_ChunkedArrayList_SnapshotDiffReport.png, 
> 2_ArrayList_SnapshotDiffReport.png, HDFS-11881.01.patch
>
>
> *Problem:*
> HDFS supports a snapshot diff tool which can generate a [detailed report | 
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html#Get_Snapshots_Difference_Report]
>  of modified, created, deleted and renamed files between any 2 snapshots.
> {noformat}
> hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot>
> {noformat}
> However, if the diff list between 2 snapshots happens to be huge, in the 
> order of millions of entries, then the NameNode can consume a lot of memory 
> while generating the report. In a few cases, we are seeing the NameNode get 
> into a long GC lasting a few minutes to make room for this burst in memory 
> requirement during snapshot diff report generation.
> *RootCause:*
> * The NameNode tries to generate the diff report with all diff entries at 
> once, which puts undue pressure on the heap 
> * Each diff report entry holds, at a minimum, the diff type (enum), a source 
> path byte array, and a destination path byte array. Take the file deletion 
> use case: for deletions, only the source or destination path is present in 
> the diff report entry. Assuming these deleted file paths take 128 bytes on 
> average, 4 million file deletions captured in the diff report will thus 
> need 512MB of memory 
> * The snapshot diff report uses a plain java ArrayList, which grows its 
> single contiguous backing array every time usage crosses the capacity 
> threshold (and during the resize both the old and new arrays are live). So, 
> a 512MB memory requirement might internally be asking for a much larger 
> contiguous memory chunk
> *Proposal:*
> * Make the NameNode snapshot diff report service follow the batch model 
> (like the directory listing service). Clients (the hdfs snapshotDiff 
> command) will then receive the diff report in small batches, and need to 
> iterate several times to get the full list.
> * Additionally, the snapshot diff report service in the NameNode can use 
> the ChunkedArrayList data structure instead of the current ArrayList, so as 
> to avoid the curse of fragmentation and the large contiguous memory 
> requirement.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
