[
https://issues.apache.org/jira/browse/HDFS-11881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wei-Chiu Chuang updated HDFS-11881:
-----------------------------------
Fix Version/s: 2.8.2, 3.0.0-alpha4, 2.9.0
> NameNode consumes a lot of memory for snapshot diff report generation
> ---------------------------------------------------------------------
>
> Key: HDFS-11881
> URL: https://issues.apache.org/jira/browse/HDFS-11881
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs, snapshots
> Affects Versions: 3.0.0-alpha1
> Reporter: Manoj Govindassamy
> Assignee: Manoj Govindassamy
> Fix For: 2.9.0, 3.0.0-alpha4, 2.8.2
>
> Attachments: 1_ChunkedArrayList_SnapshotDiffReport.png,
> 2_ArrayList_SnapshotDiffReport.png, HDFS-11881.01.patch
>
>
> *Problem:*
> HDFS supports a snapshot diff tool which can generate a [detailed report |
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html#Get_Snapshots_Difference_Report]
> of modified, created, deleted and renamed files between any 2 snapshots.
> {noformat}
> hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot>
> {noformat}
> However, if the diff list between two snapshots is huge, on the order of
> millions of entries, the NameNode can consume a lot of memory while
> generating the report. In a few cases, we have seen the NameNode enter a
> long GC pause lasting a few minutes to make room for this burst in memory
> demand during snapshot diff report generation.
> *Root Cause:*
> * The NameNode tries to generate the diff report with all diff entries at
> once, which puts undue pressure on the heap.
> * Each diff report entry holds, at a minimum, the diff type (enum), the
> source path byte array, and the destination path byte array. Take the file
> deletion use case: each entry would carry only a source or a destination
> path. Assuming these deleted files average 128 bytes per path, 4 million
> file deletions captured in a diff report would need 512 MB of memory for
> the paths alone.
> * The snapshot diff report uses a plain Java ArrayList, which reallocates
> and copies its contiguous backing array every time it fills up (in OpenJDK,
> growing the capacity by about 50% each time), with the old and new arrays
> both live during the copy. So a 512 MB memory requirement might internally
> ask for a much larger contiguous memory chunk.
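
The arithmetic in the root cause list can be checked with a small sketch. It is only a rough estimate under stated assumptions: the class and method names are hypothetical, the payload figure counts path bytes only (no object headers or references), and the growth loop assumes OpenJDK's ArrayList policy of starting at 10 slots and adding half the current capacity on each resize.

```java
public class DiffReportMemoryEstimate {

    /** Bytes needed for the path payload alone (ignores object headers and references). */
    static long pathBytes(long entries, int avgPathBytes) {
        return entries * avgPathBytes;
    }

    /**
     * Simulates the assumed OpenJDK ArrayList growth policy
     * (newCapacity = oldCapacity + oldCapacity / 2, starting from 10)
     * and returns the backing array capacity once it can hold `entries` elements.
     */
    static long capacityAfterGrowth(long entries) {
        long capacity = 10;
        while (capacity < entries) {
            capacity += capacity >> 1;
        }
        return capacity;
    }

    public static void main(String[] args) {
        long entries = 4_000_000L;
        System.out.println("path payload bytes  = " + pathBytes(entries, 128));
        System.out.println("backing array slots = " + capacityAfterGrowth(entries));
    }
}
```

The slot count is the size of one contiguous reference array; at the moment of the last resize the previous, smaller array is still reachable as well, which is where the extra burst comes from.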
> *Proposal:*
> * Make the NameNode snapshot diff report service follow the batch model
> (like the directory listing service). Clients (the hdfs snapshotDiff
> command) will then receive the diff report in small batches and need to
> iterate several times to get the full list.
> * Additionally, the snapshot diff report service in the NameNode can use the
> ChunkedArrayList data structure instead of the current ArrayList, to avoid
> fragmentation and the need for one large contiguous memory chunk.
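
The two ideas in the proposal can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: ChunkedList is a simplified stand-in written for this example (it is not Hadoop's ChunkedArrayList), and nextBatch/countAll are made-up names, not the API of the attached patch. For brevity the server side here still holds the whole report and only slices it; a real batch implementation would presumably resume the diff computation from a cursor instead.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchedDiffSketch {

    /** Append-only list that stores elements in fixed-size chunks instead of one large array. */
    static final class ChunkedList<T> {
        private final int chunkSize;
        private final List<List<T>> chunks = new ArrayList<>();
        private int size;

        ChunkedList(int chunkSize) {
            if (chunkSize <= 0) {
                throw new IllegalArgumentException("chunkSize must be positive");
            }
            this.chunkSize = chunkSize;
        }

        void add(T element) {
            // Start a new chunk when there is none yet or the last one is full.
            if (chunks.isEmpty() || chunks.get(chunks.size() - 1).size() == chunkSize) {
                chunks.add(new ArrayList<>(chunkSize));
            }
            chunks.get(chunks.size() - 1).add(element);
            size++;
        }

        T get(int index) {
            if (index < 0 || index >= size) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            return chunks.get(index / chunkSize).get(index % chunkSize);
        }

        int size() { return size; }

        int chunkCount() { return chunks.size(); }
    }

    /** Returns at most batchSize entries starting at startIndex; an empty list means no more entries. */
    static <T> List<T> nextBatch(ChunkedList<T> report, int startIndex, int batchSize) {
        List<T> batch = new ArrayList<>();
        int end = Math.min(report.size(), startIndex + batchSize);
        for (int i = startIndex; i < end; i++) {
            batch.add(report.get(i));
        }
        return batch;
    }

    /** Client-side loop: keep requesting batches until an empty one comes back. */
    static <T> int countAll(ChunkedList<T> report, int batchSize) {
        int received = 0;
        while (true) {
            List<T> batch = nextBatch(report, received, batchSize);
            if (batch.isEmpty()) {
                return received;
            }
            received += batch.size();
        }
    }

    public static void main(String[] args) {
        ChunkedList<String> report = new ChunkedList<>(4);
        for (int i = 0; i < 10; i++) {
            report.add("- /dir/file" + i);
        }
        System.out.println("chunks used      = " + report.chunkCount());
        System.out.println("entries received = " + countAll(report, 3));
    }
}
```

With chunking, appending never copies existing entries, and no single allocation exceeds one chunk; with batching, neither side needs to hold a response larger than the batch size.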