[jira] [Commented] (HDFS-6660) Use int instead of object reference to DatanodeStorageInfo in BlockInfo triplets,
[ https://issues.apache.org/jira/browse/HDFS-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15805983#comment-15805983 ]

Hadoop QA commented on HDFS-6660:
---------------------------------

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s{color} | {color:red} HDFS-6660 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |

|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-6660 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12666461/HDFS-6660.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/18080/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Use int instead of object reference to DatanodeStorageInfo in BlockInfo triplets,
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-6660
>                 URL: https://issues.apache.org/jira/browse/HDFS-6660
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: 2.4.1
>            Reporter: Amir Langer
>            Assignee: Amir Langer
>              Labels: performance
>         Attachments: HDFS-6660.patch
>
>
> Map an int index to every DatanodeStorageInfo and use it instead of object
> reference in the BlockInfo triplets data structure.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HDFS-6660) Use int instead of object reference to DatanodeStorageInfo in BlockInfo triplets,
[ https://issues.apache.org/jira/browse/HDFS-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525026#comment-14525026 ]

Hadoop QA commented on HDFS-6660:
---------------------------------

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12666461/HDFS-6660.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10696/console |

This message was automatically generated.
[jira] [Commented] (HDFS-6660) Use int instead of object reference to DatanodeStorageInfo in BlockInfo triplets,
[ https://issues.apache.org/jira/browse/HDFS-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524976#comment-14524976 ]

Hadoop QA commented on HDFS-6660:
---------------------------------

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12666461/HDFS-6660.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10681/console |

This message was automatically generated.
[jira] [Commented] (HDFS-6660) Use int instead of object reference to DatanodeStorageInfo in BlockInfo triplets,
[ https://issues.apache.org/jira/browse/HDFS-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128110#comment-14128110 ]

Amir Langer commented on HDFS-6660:
-----------------------------------

Re: [~daryn]
Yes, it is an intermediate step, but I disagree with some of your comments:

bq. I believe this is an intermediate step, but by itself it appears to make things much worse. The storage object ref is being replaced with a boxed int for indirection into a map.

I challenge it being "much" worse. All it is, is an int hash lookup into a map instead of a direct reference. One single lookup! Inside the Hadoop Namenode, this difference is negligible.
It is also absolutely necessary for the success of the whole idea. If we can't agree that index references rather than pointers are OK to work with, then we might as well leave the whole thing.

bq. The boxed int ref is just as large as the direct storage ref it's replacing. The map consumes a lot more memory too.

The next sub-task turns this into a primitive int. Boxed integers are singletons, and if you profile the Namenode, they are already there in memory without this patch. Therefore, using them is effectively free, so no memory is added here.
The map has a "fixed" size (the number of Datanode storages) and does not depend on the number of blocks. In the next sub-task it allows us to remove a reference in every block, so ultimately, as the number of blocks increases, the size of the map becomes negligible as well.

bq. I think this patch should be based on one of the other subtasks.

We can agree that ultimately, all those sub-tasks serve the same purpose.
I will create a single patch of the whole thing and put it in the umbrella Jira HDFS-6658.
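The index-based indirection discussed in this comment can be sketched roughly as follows. This is a minimal illustration and not the code in HDFS-6660.patch: `StorageInfo` is a hypothetical stand-in for DatanodeStorageInfo, and `StorageIndexRegistry`, `register`, and `resolve` are names assumed for the example. It shows a map keyed by an int index whose size tracks the number of storages rather than the number of blocks.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for DatanodeStorageInfo; only the ID matters here. */
final class StorageInfo {
    final String storageId;

    StorageInfo(String storageId) {
        this.storageId = storageId;
    }
}

/** Hypothetical registry: assigns an int index to each storage and resolves it back. */
final class StorageIndexRegistry {
    private final Map<Integer, StorageInfo> byIndex = new HashMap<>();
    private int nextIndex = 0;

    /** Registers a storage and returns the index a block would keep instead of a reference. */
    int register(StorageInfo storage) {
        int index = nextIndex++;
        byIndex.put(index, storage); // the int key is boxed to an Integer here
        return index;
    }

    /** One hash lookup per resolution; returns null for an unknown index. */
    StorageInfo resolve(int index) {
        return byIndex.get(index);
    }

    /** Grows with the number of storages, not with the number of blocks. */
    int size() {
        return byIndex.size();
    }
}

final class StorageIndexDemo {
    public static void main(String[] args) {
        StorageIndexRegistry registry = new StorageIndexRegistry();
        int idx = registry.register(new StorageInfo("DS-example-1"));
        // A block would hold idx where it previously held a StorageInfo reference.
        System.out.println(registry.resolve(idx).storageId);
    }
}
```

Under these assumptions, every block that refers to the same storage shares one registry entry, which is the property the comment relies on when it says the map does not depend on the number of blocks.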
[jira] [Commented] (HDFS-6660) Use int instead of object reference to DatanodeStorageInfo in BlockInfo triplets,
[ https://issues.apache.org/jira/browse/HDFS-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127142#comment-14127142 ]

Daryn Sharp commented on HDFS-6660:
-----------------------------------

I believe this is an intermediate step, but by itself it appears to make things much worse. The storage object ref is being replaced with a boxed int for indirection into a map. The boxed int ref is just as large as the direct storage ref it's replacing. The map consumes a lot more memory too. I think this patch should be based on one of the other subtasks.

Other than that, does the registration and add block logic need to be shifted between classes? If it's an issue of being "cleaner", can you please make the change on another jira to make this patch more succinct?

(For future patches, please do not delete the prior patch. It's very common to have multiple revisions of a patch on a jira.)
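The point about boxed ints can be illustrated with a small sketch. It is not taken from the patch: `BoxedIndexDemo` and `sharesInstance` are hypothetical names, and the `Object[]` array only mimics the idea of a triplets array. What it shows is that an array slot holding an `Integer` still holds a reference, and that `Integer.valueOf` is only documented to return cached instances for values from -128 to 127; outside that range, sharing depends on the JVM and its settings.

```java
final class BoxedIndexDemo {
    /** True when two independent boxings of the same value yield the same object. */
    static boolean sharesInstance(int value) {
        Integer a = Integer.valueOf(value);
        Integer b = Integer.valueOf(value);
        return a == b; // identity comparison, deliberately not equals()
    }

    public static void main(String[] args) {
        Object[] triplets = new Object[3];
        // The slot still stores a reference; it now points at an Integer
        // rather than at a storage object.
        triplets[0] = Integer.valueOf(5);

        // 5 is inside the range that Integer.valueOf always caches.
        System.out.println(sharesInstance(5));
        // 100000 is outside that range; the result depends on the JVM configuration.
        System.out.println(sharesInstance(100000));
    }
}
```

Whether boxed indices add memory in practice therefore depends on how many distinct index values exist and how they are boxed, which is why replacing them with a primitive int, as the follow-up sub-task is said to do, sidesteps the question.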
[jira] [Commented] (HDFS-6660) Use int instead of object reference to DatanodeStorageInfo in BlockInfo triplets,
[ https://issues.apache.org/jira/browse/HDFS-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121528#comment-14121528 ]

Hadoop QA commented on HDFS-6660:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12666461/HDFS-6660.patch
  against trunk revision 8f1a668.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 10 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

    {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings.

    {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings.

    {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

                  org.apache.hadoop.hdfs.TestDecommission
                  org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
                  org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives
                  org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup

    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7896//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/7896//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/7896//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7896//console

This message is automatically generated.
[jira] [Commented] (HDFS-6660) Use int instead of object reference to DatanodeStorageInfo in BlockInfo triplets,
[ https://issues.apache.org/jira/browse/HDFS-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092874#comment-14092874 ]

Daryn Sharp commented on HDFS-6660:
-----------------------------------

As a safety precaution, is it possible to add at least an assert to help catch if the indices get munged?

> Use int instead of object reference to DatanodeStorageInfo in BlockInfo triplets,
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-6660
>                 URL: https://issues.apache.org/jira/browse/HDFS-6660
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: 2.4.1
>            Reporter: Amir Langer
>            Assignee: Amir Langer
>         Attachments: 0002-add-an-integer-id-to-all-storages-in-DatanodeManager.patch
>
>
> Map an int index to every DatanodeStorageInfo and use it instead of object
> reference in the BlockInfo triplets data structure.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
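One way the safety check requested here might look is sketched below. This is an assumption about the shape of such a check, not something taken from the attached patch: `IndexedStorage` and `CheckedStorageTable` are hypothetical names. Each entry remembers the index it was assigned, so a lookup can verify that the stored index and the requested index still agree.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical storage record that remembers the index it was assigned. */
final class IndexedStorage {
    final int index;
    final String storageId;

    IndexedStorage(int index, String storageId) {
        this.index = index;
        this.storageId = storageId;
    }
}

/** Hypothetical index table that sanity-checks every lookup. */
final class CheckedStorageTable {
    private final List<IndexedStorage> table = new ArrayList<>();

    /** Appends a storage and returns its index. */
    int add(String storageId) {
        int index = table.size();
        table.add(new IndexedStorage(index, storageId));
        return index;
    }

    /** Resolves an index, failing loudly if it is out of range or inconsistent. */
    IndexedStorage resolve(int index) {
        if (index < 0 || index >= table.size()) {
            throw new IllegalStateException("storage index out of range: " + index);
        }
        IndexedStorage storage = table.get(index);
        // Evaluated only when the JVM runs with assertions enabled (-ea).
        assert storage.index == index
            : "index mismatch: slot " + index + " holds storage indexed " + storage.index;
        return storage;
    }
}
```

Because Java assertions are disabled unless the JVM is started with `-ea`, the range check is written as an ordinary exception here, while the consistency check stays an `assert` so that it costs nothing in production runs.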