[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15966240#comment-15966240 ]

Hadoop QA commented on HDFS-9011:
---------------------------------

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 14s | HDFS-9011 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |

|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-9011 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12754275/HDFS-9011.002.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/19061/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Support splitting BlockReport of a storage into multiple RPC
> ------------------------------------------------------------
>
>          Key: HDFS-9011
>          URL: https://issues.apache.org/jira/browse/HDFS-9011
>      Project: Hadoop HDFS
>   Issue Type: Improvement
>     Reporter: Jing Zhao
>     Assignee: Jing Zhao
>  Attachments: HDFS-9011.000.patch, HDFS-9011.001.patch, HDFS-9011.002.patch
>
> Currently, if a DataNode has too many blocks (more than 1 million by default), it
> sends multiple RPCs to the NameNode for the block report, each RPC containing the
> report for a single storage. However, in practice we have seen that even a single
> storage can sometimes contain a large number of blocks, so that its report exceeds
> the maximum RPC data length. It may be helpful to support sending multiple RPCs
> for the block report of a single storage.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15806187#comment-15806187 ]

Hadoop QA commented on HDFS-9011:
---------------------------------

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 4s | HDFS-9011 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |

|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-9011 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12754275/HDFS-9011.002.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/18091/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059359#comment-15059359 ]

Ajith S commented on HDFS-9011:
-------------------------------

would cause this https://issues.apache.org/jira/browse/HDFS-8610
[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14998041#comment-14998041 ]

nijel commented on HDFS-9011:
-----------------------------

It looks like a similar discussion happened in https://issues.apache.org/jira/browse/HDFS-8574.
[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955565#comment-14955565 ]

Colin Patrick McCabe commented on HDFS-9011:
--------------------------------------------

It would be simpler for the admin to create two (or more) storages on the same drive, and it wouldn't involve any code modification by us.
[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14954355#comment-14954355 ]

Tsz Wo Nicholas Sze commented on HDFS-9011:
-------------------------------------------

Here is a new idea -- we may partition the block ID space so that datanodes can send multiple small full block reports, one for each partition. The partitions need not be fixed.

- When a full block report is larger than a threshold, the report is split into two reports, one for blocks with odd IDs and one for blocks with even IDs. If these reports are still too large, split them into four reports with ID suffixes 00, 01, 10 and 11. The process continues until the reports are smaller than the threshold. The Datanode sends each partitioned report with its suffix.
- Since the block ID space is partitioned, the Namenode can process each partitioned report without knowing about the remaining partitioned reports.
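Editorial sketch of the suffix-partitioning idea described in the comment above. This is only an illustration under stated assumptions: the class `SuffixPartitioner`, its method names, and the use of plain `Long` block IDs are hypothetical and do not come from any HDFS-9011 patch. It splits on successive low-order bits of the ID until every partition is at or below the threshold.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration only; not part of HDFS. */
final class SuffixPartitioner {
  private SuffixPartitioner() {}

  /**
   * Splits block IDs by their low-order bits until every partition holds at
   * most {@code threshold} IDs. Map keys are binary ID suffixes such as "01";
   * the empty key means the report was not split at all.
   */
  static Map<String, List<Long>> partition(List<Long> blockIds, int threshold) {
    Map<String, List<Long>> result = new TreeMap<>();
    split(blockIds, "", threshold, result);
    return result;
  }

  private static void split(List<Long> ids, String suffix, int threshold,
      Map<String, List<Long>> out) {
    int bits = suffix.length();
    // Stop when the partition is small enough or no ID bits remain.
    if (ids.size() <= threshold || bits >= Long.SIZE) {
      out.put(suffix, ids);
      return;
    }
    List<Long> zeros = new ArrayList<>();
    List<Long> ones = new ArrayList<>();
    for (long id : ids) {
      // Look at the next bit above the ones already fixed by the suffix.
      if (((id >>> bits) & 1L) == 0L) {
        zeros.add(id);
      } else {
        ones.add(id);
      }
    }
    // Empty partitions are kept: "no blocks with this suffix" is information
    // the receiver would need in order to diff that partition.
    split(zeros, "0" + suffix, threshold, out);
    split(ones, "1" + suffix, threshold, out);
  }
}
```

For example, IDs 0 through 7 with a threshold of 2 end up in the four partitions 00, 01, 10 and 11, each holding two IDs. Because an ID belongs to exactly one suffix, each partition can be diffed on its own, which is the property the comment relies on.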
[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14744469#comment-14744469 ]

Colin Patrick McCabe commented on HDFS-9011:
--------------------------------------------

You can just raise the maximum RPC size via {{ipc.maximum.data.length}}, as added in HADOOP-9676, right? It is true that processing such a large report will take a long time on the NameNode, but this patch does not address that problem either.

I am very skeptical about adding more complexity to the full block report path unless it can really address the main problem: the length of time for which the NameNode holds the lock when processing a long storage report.
[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14744491#comment-14744491 ]

Jing Zhao commented on HDFS-9011:
---------------------------------

Thanks for the comments, Colin. I'm also leaning towards not fixing this for now, especially considering that the reportDiff issue is non-trivial.
[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14735775#comment-14735775 ]

Jing Zhao commented on HDFS-9011:
---------------------------------

Thanks for the review, Nicholas and Yi!

bq. for each partial report rpc, NN calls reportDiff(..) but reportDiff(..) assumes full block report.

Yeah, this is a big issue here. The current reportDiff assumes the block report contains all the blocks in the storage, and thus removes all the blocks after the delimiter block. We can record the last block in the previous block report for the same storage as a cookie, but we cannot guarantee that no block change happens between the two block report RPCs. For example, the cookie block may be deleted between the two reports. Thus it looks very hard to continue the reportDiff process across two FBR (full block report) RPCs, unless we link all the blocks for each storage in a specific order.
[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14736017#comment-14736017 ]

Tsz Wo Nicholas Sze commented on HDFS-9011:
-------------------------------------------

Another possible solution is to accumulate partial reports in the NN. It seems fine since the DN is supposed to send all its partial reports at once.

The NN can store the partial reports in the block report lease temporarily. The lease expiry time for partial reports can be very short, say 3 minutes. When the NN receives a partial report, it stores it in the lease and renews the lease. When the NN receives the last partial report, it processes the full report. When the lease expires, the NN removes the accumulated partial reports and rejects future partial reports with the same ID.
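Editorial sketch of the accumulate-in-lease proposal from the comment above. Everything here is an assumption made for illustration: the class `PartialReportLease`, the `receive` signature, block IDs as plain `Long` values, and time passed in explicitly as milliseconds. It is not how the NameNode's block report lease is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; names and structure are not taken from HDFS. */
final class PartialReportLease {
  /** Lease length for partial reports ("say 3 minutes" in the proposal). */
  static final long LEASE_MS = 3 * 60 * 1000L;

  private static final class Pending {
    final List<Long> blocks = new ArrayList<>();
    long expiresAtMs;
  }

  private final Map<Long, Pending> pending = new HashMap<>();
  // Expired report IDs. A real implementation would need to bound this set.
  private final Set<Long> rejected = new HashSet<>();

  /**
   * Stores one partial report and renews its lease.
   *
   * @return the accumulated full report when {@code isLast} is true,
   *         otherwise null
   * @throws IllegalStateException if the lease for this report ID expired
   */
  List<Long> receive(long reportId, List<Long> part, boolean isLast, long nowMs) {
    expire(nowMs);
    if (rejected.contains(reportId)) {
      throw new IllegalStateException("lease expired for report " + reportId);
    }
    Pending p = pending.computeIfAbsent(reportId, id -> new Pending());
    p.blocks.addAll(part);
    p.expiresAtMs = nowMs + LEASE_MS; // renew on every partial report
    if (!isLast) {
      return null;
    }
    pending.remove(reportId);
    return p.blocks; // caller can now process this as one full report
  }

  /** Drops accumulated parts whose lease has run out and remembers their IDs. */
  private void expire(long nowMs) {
    Iterator<Map.Entry<Long, Pending>> it = pending.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<Long, Pending> e = it.next();
      if (e.getValue().expiresAtMs <= nowMs) {
        rejected.add(e.getKey());
        it.remove();
      }
    }
  }
}
```

The point of the design is that the diff against the NameNode's state runs once, on the reassembled report, so the existing full-report assumption still holds. The cost is that the NameNode buffers partial reports in memory until the last one arrives or the lease expires.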
[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14733188#comment-14733188 ]

Yi Liu commented on HDFS-9011:
------------------------------

Thanks [~jingzhao] for working on this. Some comments in addition to Nicholas' comments.

*1.* In BlockPoolSlice
{code}
+  private void saveReplicas(List<BlockListAsLongs> persistList) {
+    if (persistList == null || persistList.isEmpty()) {
       return;
     }
     File tmpFile = new File(currentDir, REPLICA_CACHE_FILE + ".tmp");
@@ -787,7 +787,9 @@ private void saveReplicas(BlockListAsLongs blocksListToPersist) {
     FileOutputStream out = null;
     try {
       out = new FileOutputStream(tmpFile);
-      blocksListToPersist.writeTo(out);
+      for (BlockListAsLongs blockLists : persistList) {
+        blockLists.writeTo(out);
+      }
{code}
Now we write a *list* of {{BlockListAsLongs}} to {{REPLICA_CACHE_FILE}}, so we should also change the logic of {{readReplicasFromCache}}:
{code}
BlockListAsLongs blocksList = BlockListAsLongs.readFrom(inputStream);
{code}
It currently reads only the first {{BlockListAsLongs}}.
Also, in {{saveReplicas}}, if a BlockListAsLongs contains zero blocks, it's better not to persist it; otherwise a NullPointerException is thrown while reading replicas from the cache file.

*2.* We should also change the description of {{dfs.blockreport.split.threshold}} in hdfs-default.xml.

Nits: some lines are longer than 80 characters in the patch.
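Editorial sketch of the reader-side change requested in comment #1 above: when several lists are written back to back into one cache file, the reader has to loop until the stream is exhausted instead of reading a single list, and the writer can skip empty lists. The record format used here (an `int` count followed by that many `long` IDs) and the class `MultiListCache` are stand-ins chosen for illustration. They are not the real `BlockListAsLongs` serialization or the `BlockPoolSlice` code.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a cache file holding several block lists. */
final class MultiListCache {
  private MultiListCache() {}

  /** Writes each non-empty list as: int count, then count long IDs. */
  static void save(List<long[]> lists, OutputStream rawOut) throws IOException {
    DataOutputStream out = new DataOutputStream(rawOut);
    for (long[] list : lists) {
      if (list.length == 0) {
        continue; // do not persist empty lists
      }
      out.writeInt(list.length);
      for (long id : list) {
        out.writeLong(id);
      }
    }
    out.flush();
  }

  /** Reads every list in the stream, not just the first one. */
  static List<long[]> load(InputStream rawIn) throws IOException {
    DataInputStream in = new DataInputStream(rawIn);
    List<long[]> lists = new ArrayList<>();
    while (true) {
      int count;
      try {
        count = in.readInt();
      } catch (EOFException endOfFile) {
        // Simplification: a truncated count is treated like a clean end.
        break;
      }
      long[] list = new long[count];
      for (int i = 0; i < count; i++) {
        list[i] = in.readLong(); // EOF inside a record propagates as an error
      }
      lists.add(list);
    }
    return lists;
  }
}
```

A round trip through `save` and `load` with one empty list in the input returns only the non-empty lists, which mirrors the behaviour the review comment asks for.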
[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14733192#comment-14733192 ]

Yi Liu commented on HDFS-9011:
------------------------------

Please also add a datanode restart to {{TestSplitBlockReport}}, so that it covers the case in my comment #1.
[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14730826#comment-14730826 ]

Tsz Wo Nicholas Sze commented on HDFS-9011:
-------------------------------------------

Patch looks good in general. Just some questions:

Should we enforce block report index order, i.e. context.getCurRpc() == indexInLastBlockReport + 1?

Also, do we need to handle out-of-order block report indexes? One of the RPCs may be dropped and re-sent later, so the block report RPCs may arrive out of order.
[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14731799#comment-14731799 ]

Tsz Wo Nicholas Sze commented on HDFS-9011:
-------------------------------------------

It seems there is a bug: for each partial report rpc, NN calls reportDiff(..) but reportDiff(..) assumes full block report. I think the diff is incorrect for a partial report. In particular, the toRemove set may contain some blocks reported by other rpcs.
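Editorial sketch of why a full-report diff goes wrong on a partial report, as described in the comment above. This is a deliberately simplified set-based model with hypothetical names (`PartialDiffDemo`, `toRemove`). It is not the actual reportDiff(..) code, which walks the storage's block list, but it shows the same failure mode.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical, simplified model of a full-report diff; not HDFS code. */
final class PartialDiffDemo {
  private PartialDiffDemo() {}

  /**
   * Full-report semantics: every block the NameNode knows for the storage
   * that is absent from the report is considered gone and is scheduled
   * for removal.
   */
  static Set<Long> toRemove(Set<Long> knownOnNameNode, Set<Long> reported) {
    Set<Long> stale = new HashSet<>(knownOnNameNode);
    stale.removeAll(reported);
    return stale;
  }
}
```

Suppose the NameNode knows blocks 1 to 4 on a storage and the DataNode splits its report into {1, 2} and {3, 4}. Applying `toRemove` to the first RPC alone yields {3, 4}, even though those blocks exist and are about to be reported by the second RPC. Applying it to the union of both parts yields the empty set, which is the correct result.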
[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14731635#comment-14731635 ]

Hadoop QA commented on HDFS-9011:
---------------------------------

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| -1 | pre-patch | 18m 34s | Findbugs (version ) appears to be broken on trunk. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | tests included | 0m 0s | The patch appears to include 10 new or modified test files. |
| +1 | javac | 10m 1s | There were no new javac warning messages. |
| +1 | javadoc | 12m 18s | There were no new javadoc warning messages. |
| +1 | release audit | 0m 25s | The applied patch does not increase the total number of release audit warnings. |
| +1 | checkstyle | 1m 0s | There were no new checkstyle issues. |
| -1 | whitespace | 0m 4s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | install | 1m 41s | mvn install still works. |
| +1 | eclipse:eclipse | 0m 37s | The patch built with eclipse:eclipse. |
| +1 | findbugs | 2m 59s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| +1 | native | 3m 42s | Pre-build of native portion |
| -1 | hdfs tests | 97m 37s | Tests failed in hadoop-hdfs. |
| | | 149m 4s | |

|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.tools.TestDFSHAAdminMiniCluster |
| | hadoop.hdfs.tools.TestDFSAdminWithHA |
| | hadoop.hdfs.TestQuota |
| | hadoop.hdfs.server.datanode.TestDataNodeMetrics |
| | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
| | hadoop.hdfs.server.namenode.ha.TestEditLogsDuringFailover |
| Timed out tests | org.apache.hadoop.hdfs.TestDFSStartupVersions |
| | org.apache.hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12754275/HDFS-9011.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / e1feaf6 |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12318/artifact/patchprocess/whitespace.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12318/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12318/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12318/console |

This message was automatically generated.
[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14730456#comment-14730456 ]

Hadoop QA commented on HDFS-9011:
---------------------------------

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | pre-patch | 17m 53s | Pre-patch trunk compilation is healthy. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | tests included | 0m 1s | The patch appears to include 9 new or modified test files. |
| +1 | javac | 7m 49s | There were no new javac warning messages. |
| +1 | javadoc | 9m 59s | There were no new javadoc warning messages. |
| +1 | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| +1 | checkstyle | 0m 45s | There were no new checkstyle issues. |
| -1 | whitespace | 0m 4s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | install | 1m 27s | mvn install still works. |
| +1 | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. |
| +1 | findbugs | 2m 28s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| +1 | native | 3m 12s | Pre-build of native portion |
| -1 | hdfs tests | 212m 19s | Tests failed in hadoop-hdfs. |
| | | 256m 57s | |

|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestDecommission |
| | hadoop.hdfs.server.blockmanagement.TestNameNodePrunesMissingStorages |
| | hadoop.hdfs.server.blockmanagement.TestNodeCount |
| | hadoop.hdfs.server.blockmanagement.TestBlockManager |
| | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistLockedMemory |
| | hadoop.hdfs.server.mover.TestStorageMover |
| | hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks |
| | hadoop.hdfs.server.datanode.TestDnRespectsBlockReportSplitThreshold |
| | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
| | hadoop.hdfs.server.balancer.TestBalancer |
| | hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport |
| | hadoop.hdfs.server.namenode.TestStartup |
| | hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics |
| | hadoop.hdfs.server.datanode.TestBlockReplacement |
| | hadoop.hdfs.web.TestWebHDFSOAuth2 |
| | hadoop.hdfs.server.namenode.TestProcessCorruptBlocks |
| | hadoop.hdfs.server.namenode.TestCacheDirectives |
| | hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage |
| | hadoop.hdfs.server.datanode.TestDataNodeMetrics |
| | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
| | hadoop.hdfs.server.blockmanagement.TestBlockReportRateLimiting |
| | hadoop.hdfs.TestBlockStoragePolicy |
| | hadoop.hdfs.server.namenode.TestListCorruptFileBlocks |
| Timed out tests | org.apache.hadoop.hdfs.server.namenode.TestFsck |
| | org.apache.hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12754124/HDFS-9011.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / c83d13c |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12301/artifact/patchprocess/whitespace.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12301/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12301/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12301/console |

This message was automatically generated.
[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14729883#comment-14729883 ]

Hadoop QA commented on HDFS-9011:
---------------------------------

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | pre-patch | 17m 43s | Pre-patch trunk compilation is healthy. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | tests included | 0m 0s | The patch appears to include 8 new or modified test files. |
| +1 | javac | 7m 55s | There were no new javac warning messages. |
| +1 | javadoc | 10m 8s | There were no new javadoc warning messages. |
| +1 | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| -1 | checkstyle | 1m 23s | The applied patch generated 8 new checkstyle issues (total was 424, now 426). |
| -1 | whitespace | 0m 3s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | install | 1m 31s | mvn install still works. |
| +1 | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. |
| +1 | findbugs | 2m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| +1 | native | 3m 15s | Pre-build of native portion |
| -1 | hdfs tests | 168m 7s | Tests failed in hadoop-hdfs. |
| | | 213m 34s | |

|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestDFSFinalize |
| | hadoop.hdfs.server.datanode.TestDnRespectsBlockReportSplitThreshold |
| | hadoop.hdfs.web.TestWebHDFSOAuth2 |
| | hadoop.hdfs.TestParallelShortCircuitReadUnCached |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12753882/HDFS-9011.000.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 53c38cc |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12290/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12290/artifact/patchprocess/whitespace.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12290/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12290/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12290/console |

This message was automatically generated.