[
https://issues.apache.org/jira/browse/HDFS-11881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16047367#comment-16047367
]
Hadoop QA commented on HDFS-11881:
----------------------------------
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 38s{color} | {color:orange} hadoop-hdfs-project: The patch generated 4 new + 79 unchanged - 0 fixed = 83 total (was 79) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 11s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 99m 56s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}134m 30s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| Timed out junit tests | org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-11881 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12872786/HDFS-11881.01.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 26f4e5d8ef81 3.13.0-108-generic #155-Ubuntu SMP Wed Jan 11 16:58:52 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b3d3ede |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/19887/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/19887/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/19887/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/19887/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org |
This message was automatically generated.
> NameNode consumes a lot of memory for snapshot diff report generation
> ---------------------------------------------------------------------
>
> Key: HDFS-11881
> URL: https://issues.apache.org/jira/browse/HDFS-11881
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs, snapshots
> Affects Versions: 3.0.0-alpha1
> Reporter: Manoj Govindassamy
> Assignee: Manoj Govindassamy
> Attachments: HDFS-11881.01.patch
>
>
> *Problem:*
> HDFS supports a snapshot diff tool which can generate a [detailed report|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html#Get_Snapshots_Difference_Report]
> of modified, created, deleted and renamed files between any two snapshots.
> {noformat}
> hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot>
> {noformat}
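> For example, with a hypothetical snapshottable directory /data that has two
> snapshots named s1 and s2 (names used here purely for illustration), the
> report would be requested as:
> {noformat}
> hdfs snapshotDiff /data s1 s2
> {noformat}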
> However, if the diff list between two snapshots is huge, on the order of
> millions of entries, the NameNode can consume a lot of memory while generating
> the report. In a few cases, we have seen the NameNode go into long GC pauses
> lasting several minutes to make room for this burst in memory demand during
> snapshot diff report generation.
> *Root cause:*
> * The NameNode tries to generate the diff report with all diff entries at
> once, which puts undue pressure on its heap.
> * Each diff report entry holds, at a minimum, the diff type (an enum), a
> source path byte array, and a destination path byte array. Consider the file
> deletion case: a deletion entry carries only a source or a destination path.
> Assuming each deleted file's path averages 128 bytes, 4 million file deletions
> captured in the diff report would need roughly 512 MB of memory (see the
> sketch after this list).
> * The snapshot diff report uses a plain Java ArrayList, which must reallocate
> and copy its backing contiguous array each time it outgrows its current
> capacity. So a 512 MB requirement can internally translate into a request for
> an even larger contiguous memory chunk.
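> As a back-of-envelope check of the figure above (a rough sketch under the
> stated assumptions of 4 million deletion entries and a 128-byte average path;
> object headers and the enum field would add more on top):
> {code:java}
> public class SnapshotDiffMemoryEstimate {
>   public static void main(String[] args) {
>     long deletedFiles = 4_000_000L; // assumed number of deletion entries
>     long avgPathBytes = 128L;       // assumed average path length in bytes
>     long pathBytes = deletedFiles * avgPathBytes;
>     // 512,000,000 bytes, i.e. roughly 512 MB just for the path byte arrays
>     System.out.println("path bytes ~= " + (pathBytes / 1_000_000) + " MB");
>   }
> }
> {code}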
> *Proposal:*
> * Make the NameNode snapshot diff report service follow a batch model (like
> the directory listing service). Clients (the hdfs snapshotDiff command) would
> then receive the diff report in small batches and iterate several times to
> retrieve the full list (see the sketch after this list).
> * Additionally, the snapshot diff report service in the NameNode can use the
> ChunkedArrayList data structure instead of the current ArrayList, avoiding
> the large contiguous memory requirement and the fragmentation it causes.
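> A minimal, illustrative sketch of how the two ideas fit together, assuming a
> hypothetical batch size and placeholder paths (ChunkedArrayList is the real
> org.apache.hadoop.util.ChunkedArrayList; the actual batched RPC shape and
> names are up to the patch):
> {code:java}
> import java.util.ArrayList;
> import java.util.Iterator;
> import java.util.List;
> import org.apache.hadoop.util.ChunkedArrayList;
>
> public class SnapshotDiffBatchSketch {
>   private static final int BATCH_SIZE = 1000; // hypothetical batch size
>
>   public static void main(String[] args) {
>     // NameNode side: accumulate diff entries in chunked storage so that no
>     // single contiguous backing array is ever required.
>     List<byte[]> diffPaths = new ChunkedArrayList<>();
>     for (int i = 0; i < 100_000; i++) {       // real reports may hold millions
>       diffPaths.add(("/dir/file-" + i).getBytes());
>     }
>
>     // Client side: consume the report one small batch at a time instead of
>     // receiving the whole list in a single response.
>     Iterator<byte[]> it = diffPaths.iterator();
>     int batches = 0;
>     while (it.hasNext()) {
>       List<byte[]> batch = new ArrayList<>(BATCH_SIZE);
>       while (it.hasNext() && batch.size() < BATCH_SIZE) {
>         batch.add(it.next());
>       }
>       batches++;                              // process(batch) would go here
>     }
>     System.out.println("entries=" + diffPaths.size() + ", batches=" + batches);
>   }
> }
> {code}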
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)