[jira] [Resolved] (HDFS-10171) Balancer should log config values
[ https://issues.apache.org/jira/browse/HDFS-10171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge resolved HDFS-10171. --- Resolution: Duplicate Fix Version/s: 2.8.0 {noformat} 2016-03-15 22:42:31,618 [Thread-0] INFO balancer.Balancer (Balancer.java:getLong(231)) - dfs.balancer.movedWinWidth = 2000 (default=540) 2016-03-15 22:42:31,618 [Thread-0] INFO balancer.Balancer (Balancer.java:getInt(240)) - dfs.balancer.moverThreads = 1000 (default=1000) 2016-03-15 22:42:31,618 [Thread-0] INFO balancer.Balancer (Balancer.java:getInt(240)) - dfs.balancer.dispatcherThreads = 200 (default=200) 2016-03-15 22:42:31,618 [Thread-0] INFO balancer.Balancer (Balancer.java:getInt(240)) - dfs.datanode.balance.max.concurrent.moves = 5 (default=5) 2016-03-15 22:42:31,618 [Thread-0] INFO balancer.Balancer (Balancer.java:getLong(231)) - dfs.balancer.getBlocks.size = 2147483648 {noformat} > Balancer should log config values > - > > Key: HDFS-10171 > URL: https://issues.apache.org/jira/browse/HDFS-10171 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.7.2 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Labels: supportability > Fix For: 2.8.0 > > > To improve supportability, Balancer should log config values. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
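The log lines in the {noformat} block above all follow the same pattern: read a config value, fall back to a default, and print "key = value (default=...)". A minimal standalone sketch of that pattern, assuming a plain Properties store rather than Hadoop's Configuration class (this is illustrative, not the actual Balancer code):

```java
import java.util.Properties;

// Hedged sketch of the supportability logging pattern shown above. It mirrors
// Balancer#getLong in spirit only; class and method names are illustrative.
public class ConfLogger {
    private final Properties conf;

    public ConfLogger(Properties conf) {
        this.conf = conf;
    }

    // Log the effective value next to its default so a support engineer can
    // spot a misconfigured key at a glance.
    public long getLong(String key, long defaultValue) {
        String raw = conf.getProperty(key);
        long value = (raw == null) ? defaultValue : Long.parseLong(raw.trim());
        System.out.println("INFO balancer.Balancer - " + key + " = " + value
            + " (default=" + defaultValue + ")");
        return value;
    }
}
```

Logging both the effective and the default value is what makes the output useful for support: a line where the two differ immediately shows which keys were overridden.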
[jira] [Updated] (HDFS-9262) Support reconfiguring dfs.datanode.lazywriter.interval.sec without DN restart
[ https://issues.apache.org/jira/browse/HDFS-9262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HDFS-9262: Attachment: HDFS-9262-HDFS-9000.005.patch V005 is rebased on trunk. > Support reconfiguring dfs.datanode.lazywriter.interval.sec without DN restart > - > > Key: HDFS-9262 > URL: https://issues.apache.org/jira/browse/HDFS-9262 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: 2.7.0 >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-9262-HDFS-9000.002.patch, > HDFS-9262-HDFS-9000.003.patch, HDFS-9262-HDFS-9000.004.patch, > HDFS-9262-HDFS-9000.005.patch, HDFS-9262.001.patch > > > This is to reconfigure > {code} > dfs.datanode.lazywriter.interval.sec > {code} > without restarting DN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9857) Erasure Coding: Rename replication-based names in BlockManager to more generic [part-1]
[ https://issues.apache.org/jira/browse/HDFS-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196786#comment-15196786 ] Zhe Zhang commented on HDFS-9857: - Thanks Rakesh. +1 pending Jenkins. Nice work here! > Erasure Coding: Rename replication-based names in BlockManager to more > generic [part-1] > --- > > Key: HDFS-9857 > URL: https://issues.apache.org/jira/browse/HDFS-9857 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-9857-001.patch, HDFS-9857-02.patch > > > The idea of this jira is to rename the following entities in BlockManager as, > - {{UnderReplicatedBlocks}} to {{LowRedundancyBlocks}} > - {{neededReplications}} to {{neededReconstruction}} > - {{replicationQueuesInitializer}} to {{reconstructionQueuesInitializer}} > Thanks [~zhz], [~andrew.wang] for the useful > [discussions|https://issues.apache.org/jira/browse/HDFS-7955?focusedCommentId=15149406=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15149406] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10171) Balancer should log config values
[ https://issues.apache.org/jira/browse/HDFS-10171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10171: -- Description: To improve supportability, Balancer should log config values. (was: To improve supportability, Balancer should log config values and iteration termination reasons. * In {{Dispatcher$Dispatcher}}, log all parameters. * In {{Dispatcher$dispatchBlocks}}, log termination reasons.) Summary: Balancer should log config values (was: Balancer should log config values and iteration termination reasons) > Balancer should log config values > - > > Key: HDFS-10171 > URL: https://issues.apache.org/jira/browse/HDFS-10171 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.7.2 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Labels: supportability > > To improve supportability, Balancer should log config values. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-10171) Balancer should log config values and iteration termination reasons
John Zhuge created HDFS-10171: - Summary: Balancer should log config values and iteration termination reasons Key: HDFS-10171 URL: https://issues.apache.org/jira/browse/HDFS-10171 Project: Hadoop HDFS Issue Type: Improvement Components: balancer & mover Affects Versions: 2.7.2 Reporter: John Zhuge Assignee: John Zhuge Priority: Minor To improve supportability, Balancer should log config values and iteration termination reasons. * In {{Dispatcher$Dispatcher}}, log all parameters. * In {{Dispatcher$dispatchBlocks}}, log termination reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9904) testCheckpointCancellationDuringUpload occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-9904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196761#comment-15196761 ] Lin Yiqun commented on HDFS-9904: - Thanks [~kihwal] for the commit! > testCheckpointCancellationDuringUpload occasionally fails > -- > > Key: HDFS-9904 > URL: https://issues.apache.org/jira/browse/HDFS-9904 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.3 >Reporter: Kihwal Lee >Assignee: Lin Yiqun > Fix For: 2.7.3 > > Attachments: HDFS-9904.001.patch, HDFS-9904.002.patch > > > The failure was at the end of the test case where the txid of the standby > (former active) is checked. Since the checkpoint/uploading was canceled, it > is not supposed to have the new checkpoint. Looking at the test log, that was > still the case, but the standby then did a checkpoint on its own and bumped up > the txid, right before the check was performed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9857) Erasure Coding: Rename replication-based names in BlockManager to more generic [part-1]
[ https://issues.apache.org/jira/browse/HDFS-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196759#comment-15196759 ] Rakesh R commented on HDFS-9857: Thanks [~zhz] for the reviews. I've attached a patch addressing the comments. The changes compared to the previous patch: # Fixed the 1st review comment. # Renamed the variable {{underReplicatedBlocksCount}} to {{lowRedundancyBlocksCount}} {code} - private volatile long underReplicatedBlocksCount = 0L; + private volatile long lowRedundancyBlocksCount = 0L; {code} # Renamed {{neededReplications}} to {{neededReconstruction}} and {{under-replicated}} to {{low redundancy}} in logs/comments # Renamed the BlockManager method {{#processOverReplicatedBlocksOnReCommission}} to {{#processExtraRedundancyBlocksOnReCommission}} # A few changes in TestReplicationPolicy.java - {{ChooseUnderReplicatedBlocks}} to {{ChooseLowRedundancyBlocks}} and updated the comments. bq. fileReplication should be renamed. We can take care of it when we rename getExpectedReplicaNum Since the current patch already contains many changes, how about addressing a couple of other items, including this one, in another sub-task? 
> Erasure Coding: Rename replication-based names in BlockManager to more > generic [part-1] > --- > > Key: HDFS-9857 > URL: https://issues.apache.org/jira/browse/HDFS-9857 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-9857-001.patch, HDFS-9857-02.patch > > > The idea of this jira is to rename the following entities in BlockManager as, > - {{UnderReplicatedBlocks}} to {{LowRedundancyBlocks}} > - {{neededReplications}} to {{neededReconstruction}} > - {{replicationQueuesInitializer}} to {{reconstructionQueuesInitializer}} > Thanks [~zhz], [~andrew.wang] for the useful > [discussions|https://issues.apache.org/jira/browse/HDFS-7955?focusedCommentId=15149406=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15149406] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9857) Erasure Coding: Rename replication-based names in BlockManager to more generic [part-1]
[ https://issues.apache.org/jira/browse/HDFS-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-9857: --- Attachment: HDFS-9857-02.patch > Erasure Coding: Rename replication-based names in BlockManager to more > generic [part-1] > --- > > Key: HDFS-9857 > URL: https://issues.apache.org/jira/browse/HDFS-9857 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-9857-001.patch, HDFS-9857-02.patch > > > The idea of this jira is to rename the following entities in BlockManager as, > - {{UnderReplicatedBlocks}} to {{LowRedundancyBlocks}} > - {{neededReplications}} to {{neededReconstruction}} > - {{replicationQueuesInitializer}} to {{reconstructionQueuesInitializer}} > Thanks [~zhz], [~andrew.wang] for the useful > [discussions|https://issues.apache.org/jira/browse/HDFS-7955?focusedCommentId=15149406=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15149406] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9960) OzoneHandler : Add localstorage support for keys
[ https://issues.apache.org/jira/browse/HDFS-9960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196745#comment-15196745 ] Chris Nauroth commented on HDFS-9960: - Hi [~anu]. It looks like these Checkstyle and Findbugs warnings are potentially relevant. {{Hashtable}} is generally not used in favor of {{HashMap}} or {{ConcurrentHashMap}}, because {{Hashtable}} uses some coarse-grained locking that doesn't perform as well as the others. Use of the platform default encoding is discouraged, because it can cause unpredictable behavior when code starts running on a system with an unexpected default encoding. We generally try to stick to UTF-8 everywhere. Could you please take a look? > OzoneHandler : Add localstorage support for keys > > > Key: HDFS-9960 > URL: https://issues.apache.org/jira/browse/HDFS-9960 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Anu Engineer >Assignee: Anu Engineer > Fix For: HDFS-7240 > > Attachments: HDFS-9960-HDFS-7240.001.patch > > > Adds local storage handler support for keys. This allows all REST api's to be > exercised via MiniDFScluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
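The two review points above can be illustrated with a small standalone sketch (not code from the patch under review): prefer {{ConcurrentHashMap}} over the legacy {{Hashtable}}, and always name the charset instead of relying on the platform default.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only; class and method names here are not from the HDFS-9960
// patch.
public class EncodingAndMaps {
    // ConcurrentHashMap uses fine-grained (per-bin) locking, while Hashtable
    // synchronizes every method on a single monitor.
    static final Map<String, byte[]> CACHE = new ConcurrentHashMap<>();

    // Explicit UTF-8: identical bytes on every platform, unlike
    // String#getBytes() with no argument.
    static byte[] encode(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Using {{StandardCharsets.UTF_8}} also avoids the checked {{UnsupportedEncodingException}} that the string-based overload {{getBytes("UTF-8")}} forces callers to handle.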
[jira] [Updated] (HDFS-9961) Ozone: Add buckets commands to CLI
[ https://issues.apache.org/jira/browse/HDFS-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-9961: Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) +1 for the patch. Checkstyle warnings are not actionable, and test failures are not related. I have committed this to the HDFS-7240 feature branch. [~anu], thank you. > Ozone: Add buckets commands to CLI > -- > > Key: HDFS-9961 > URL: https://issues.apache.org/jira/browse/HDFS-9961 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Anu Engineer >Assignee: Anu Engineer > Fix For: HDFS-7240 > > Attachments: HDFS-9961-HDFS-7240.001.patch > > > Add command for buckets to ozone CLI -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9952) Expose FSNamesystem lock wait time as metrics
[ https://issues.apache.org/jira/browse/HDFS-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-9952: Attachment: HDFS-9952-01.patch Attaching the patch. > Expose FSNamesystem lock wait time as metrics > - > > Key: HDFS-9952 > URL: https://issues.apache.org/jira/browse/HDFS-9952 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Vinayakumar B >Assignee: Vinayakumar B > Attachments: HDFS-9952-01.patch > > > Expose FSNameSystem's readlock() and writeLock() wait time as metrics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
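One way to expose lock wait time, as the summary describes, is to time the gap between requesting and acquiring the lock. A minimal sketch under that assumption (names like {{LockMetrics}} are illustrative, not from the HDFS-9952 patch):

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hedged sketch: accumulate the nanoseconds threads spend blocked waiting for
// the write lock, so the total can be published as a metric.
public class LockMetrics {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final AtomicLong writeLockWaitNanos = new AtomicLong();

    public void writeLock() {
        long start = System.nanoTime();
        lock.writeLock().lock();
        // Time between the request and the acquisition is the wait time.
        writeLockWaitNanos.addAndGet(System.nanoTime() - start);
    }

    public void writeUnlock() {
        lock.writeLock().unlock();
    }

    // Value a metrics system would scrape.
    public long totalWriteLockWaitNanos() {
        return writeLockWaitNanos.get();
    }
}
```

The same wrapping would apply to {{readLock()}}; an uncontended acquisition contributes close to zero, so the metric mostly reflects contention.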
[jira] [Updated] (HDFS-9952) Expose FSNamesystem lock wait time as metrics
[ https://issues.apache.org/jira/browse/HDFS-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-9952: Status: Patch Available (was: Open) > Expose FSNamesystem lock wait time as metrics > - > > Key: HDFS-9952 > URL: https://issues.apache.org/jira/browse/HDFS-9952 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Vinayakumar B >Assignee: Vinayakumar B > Attachments: HDFS-9952-01.patch > > > Expose FSNameSystem's readlock() and writeLock() wait time as metrics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9959) add log when block removed from last live datanode
[ https://issues.apache.org/jira/browse/HDFS-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196729#comment-15196729 ] Hadoop QA commented on HDFS-9959: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 20m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 21s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc 
{color} | {color:green} 2m 13s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 59s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 2s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 22s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 26s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 0s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_74. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 94m 50s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 233m 50s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_74 Failed junit tests | hadoop.hdfs.server.namenode.ha.TestEditLogTailer | | | hadoop.hdfs.security.TestDelegationTokenForProxyUser | | | hadoop.hdfs.TestFileAppend | | | hadoop.hdfs.server.namenode.TestEditLog | | | hadoop.hdfs.TestEncryptionZones | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | | JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.server.namenode.ha.TestHAAppend | | | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | \\ \\ || Subsystem ||
[jira] [Updated] (HDFS-9847) HDFS configuration without time unit name should accept friendly time units
[ https://issues.apache.org/jira/browse/HDFS-9847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Yiqun updated HDFS-9847: Attachment: HDFS-9847.004.patch That's a good idea. Updated the latest patch to address the comments. > HDFS configuration without time unit name should accept friendly time units > --- > > Key: HDFS-9847 > URL: https://issues.apache.org/jira/browse/HDFS-9847 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 2.7.1 >Reporter: Lin Yiqun >Assignee: Lin Yiqun > Attachments: HDFS-9847.001.patch, HDFS-9847.002.patch, > HDFS-9847.003.patch, HDFS-9847.004.patch, timeduration-w-y.patch > > > In HDFS-9821, it talks about letting existing keys use friendly units, > e.g. 60s, 5m, 1d, 6w etc. But some configuration key names already contain a > time unit name, like {{dfs.blockreport.intervalMsec}}, so we can make the > other configurations, whose names carry no time unit, accept friendly time > units. The time unit {{seconds}} is frequently used in HDFS, so we can update > these configurations first. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
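The "friendly units" in the description (60s, 5m, 1d, 6w) can be sketched as a tiny suffix parser. The real feature builds on Hadoop's {{Configuration#getTimeDuration}}; this standalone class only illustrates the idea, and the assumption that a bare number means milliseconds is mine, not the patch's:

```java
// Hedged sketch of friendly time-unit parsing; not the HDFS-9847 code.
public class TimeUnitParser {
    public static long toMillis(String value) {
        String v = value.trim().toLowerCase();
        char unit = v.charAt(v.length() - 1);
        if (Character.isDigit(unit)) {
            return Long.parseLong(v); // bare number: assume milliseconds
        }
        long n = Long.parseLong(v.substring(0, v.length() - 1));
        switch (unit) {
            case 's': return n * 1_000L;
            case 'm': return n * 60_000L;
            case 'h': return n * 3_600_000L;
            case 'd': return n * 86_400_000L;
            case 'w': return n * 7 * 86_400_000L;
            default:
                throw new IllegalArgumentException("unknown time unit: " + unit);
        }
    }
}
```

With this, a value like "5m" in a seconds-based key resolves unambiguously, which is exactly the supportability gain the JIRA is after.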
[jira] [Commented] (HDFS-9940) Rename dfs.balancer.max.concurrent.moves to avoid confusion
[ https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196712#comment-15196712 ] John Zhuge commented on HDFS-9940: -- I like 2) better because all rebalance configuration is done at Balancer. We need to design a solution to support HDFS-7466 as well. > Rename dfs.balancer.max.concurrent.moves to avoid confusion > --- > > Key: HDFS-9940 > URL: https://issues.apache.org/jira/browse/HDFS-9940 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Labels: supportability > Fix For: 2.8.0 > > > It is very confusing for both Balancer and Datanode to use the same property > {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the > Balancer because the property has "datanode" in the name string. Many > customers forget to set the property for the Balancer. > Change the Balancer to use a new property > {{dfs.balancer.max.concurrent.moves}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9917) IBR accumulate more objects when SNN was down for sometime.
[ https://issues.apache.org/jira/browse/HDFS-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196709#comment-15196709 ] Brahma Reddy Battula commented on HDFS-9917: I meant to say, we can avoid the RPC to the NameNode and the unnecessary GC for these IBRs. > IBR accumulate more objects when SNN was down for sometime. > --- > > Key: HDFS-9917 > URL: https://issues.apache.org/jira/browse/HDFS-9917 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > > SNN was down for some time because of some reasons. After restarting, the SNN > became unresponsive because: > - 29 DNs were each sending ~5 million IBRs (most of them delete IBRs), > whereas each datanode had only ~2.5 million blocks. > - GC can't reclaim these objects since all of them sit in the RPC queue. > To recover (to clear these objects), we restarted all the DNs one by one. > This issue happened in 2.4.1, where block report splitting was not available. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9940) Rename dfs.balancer.max.concurrent.moves to avoid confusion
[ https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196705#comment-15196705 ] John Zhuge commented on HDFS-9940: -- [~yzhangal], great idea if we don't have to manually configure each DN for rebalancing. Since we already have {{hdfs dfsadmin -setBalancerBandwidth}}, we have two choices: 1) Add {{hdfs dfsadmin -setBalancerConcurrentMoves}} and {{hdfs dfsadmin -getBalancerConcurrentMoves}} 2) The Balancer automatically calls the Namenode API {{setBalancerBandwidth}} and a newly added {{setBalancerConcurrentMoves}} based on config values (or even command-line options), and we deprecate {{hdfs dfsadmin -setBalancerBandwidth}} and {{hdfs dfsadmin -getBalancerBandwidth}}. > Rename dfs.balancer.max.concurrent.moves to avoid confusion > --- > > Key: HDFS-9940 > URL: https://issues.apache.org/jira/browse/HDFS-9940 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Labels: supportability > Fix For: 2.8.0 > > > It is very confusing for both Balancer and Datanode to use the same property > {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the > Balancer because the property has "datanode" in the name string. Many > customers forget to set the property for the Balancer. > Change the Balancer to use a new property > {{dfs.balancer.max.concurrent.moves}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9349) Support reconfiguring fs.protected.directories without NN restart
[ https://issues.apache.org/jira/browse/HDFS-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196701#comment-15196701 ] Hadoop QA commented on HDFS-9349: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 19m 19s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 5s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 4s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 59s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | 
{color:green} mvninstall {color} | {color:green} 0m 59s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 72m 38s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 79m 41s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 201m 4s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_74 Failed junit tests | hadoop.hdfs.TestHFlush | | | hadoop.hdfs.tools.TestDFSAdmin | | | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations | | | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot | | JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.TestHFlush | | | hadoop.hdfs.tools.TestDFSAdmin | | | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12793658/HDFS-9349-HDFS-9000.005.patch | | JIRA Issue | HDFS-9349 | | Optional Tests | asflicense compile javac
[jira] [Updated] (HDFS-8901) Use ByteBuffer in striping positional read
[ https://issues.apache.org/jira/browse/HDFS-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HDFS-8901: Attachment: HDFS-8901-v5.patch Rebased. > Use ByteBuffer in striping positional read > -- > > Key: HDFS-8901 > URL: https://issues.apache.org/jira/browse/HDFS-8901 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Kai Zheng > Attachments: HDFS-8901-v2.patch, HDFS-8901-v3.patch, > HDFS-8901-v4.patch, HDFS-8901-v5.patch, initial-poc.patch > > > Native erasure coder prefers to direct ByteBuffer for performance > consideration. To prepare for it, this change uses ByteBuffer through the > codes in implementing striping position read. It will also fix avoiding > unnecessary data copying between striping read chunk buffers and decode input > buffers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9917) IBR accumulate more objects when SNN was down for sometime.
[ https://issues.apache.org/jira/browse/HDFS-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196635#comment-15196635 ] Brahma Reddy Battula commented on HDFS-9917: bq. I suggest that NN could just ignore the pending IBRs before the first full BR. Would it fix the problem? Yes, I think it's the same as clearing on reRegister() at the datanode itself. The advantage of clearing on reRegister() in the DN itself is that we avoid the unnecessary RPC to the NameNode and the unnecessary GC the NameNode would need for these IBRs. We may also need to limit how long the DN keeps accumulating IBRs and using memory. > IBR accumulate more objects when SNN was down for sometime. > --- > > Key: HDFS-9917 > URL: https://issues.apache.org/jira/browse/HDFS-9917 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > > SNN was down for some time because of some reasons. After restarting, the SNN > became unresponsive because: > - 29 DNs were each sending ~5 million IBRs (most of them delete IBRs), > whereas each datanode had only ~2.5 million blocks. > - GC can't reclaim these objects since all of them sit in the RPC queue. > To recover (to clear these objects), we restarted all the DNs one by one. > This issue happened in 2.4.1, where block report splitting was not available. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
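The "clear pending IBRs on reRegister()" idea discussed above can be sketched as follows. This is a toy model under my own assumptions (a DN's incremental reports as a plain list), not the actual DataNode code: once a full block report is going to be sent anyway, the queued deltas are redundant.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of dropping queued incremental block reports when the
// DN re-registers; names here are hypothetical.
public class IncrementalBlockReports {
    private final List<String> pending = new ArrayList<>();

    public void queue(String blockChange) {
        pending.add(blockChange);
    }

    // Called when the DN must re-register with a restarted NN. The full block
    // report that follows registration supersedes the queued deltas, so they
    // can be discarded instead of sent (and GC'd) by the NN.
    public void onReRegister() {
        pending.clear();
    }

    public int pendingCount() {
        return pending.size();
    }
}
```

Clearing on the DN side, as the comment argues, saves both the RPC and the NameNode-side garbage those stale IBR objects would create.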
[jira] [Updated] (HDFS-8905) Refactor DFSInputStream#ReaderStrategy
[ https://issues.apache.org/jira/browse/HDFS-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HDFS-8905: Attachment: HDFS-8905-v9.patch Rebased one more time. > Refactor DFSInputStream#ReaderStrategy > -- > > Key: HDFS-8905 > URL: https://issues.apache.org/jira/browse/HDFS-8905 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Reporter: Kai Zheng >Assignee: Kai Zheng > Attachments: HDFS-8905-HDFS-7285-v1.patch, HDFS-8905-v2.patch, > HDFS-8905-v3.patch, HDFS-8905-v4.patch, HDFS-8905-v5.patch, > HDFS-8905-v6.patch, HDFS-8905-v7.patch, HDFS-8905-v8.patch, HDFS-8905-v9.patch > > > DFSInputStream#ReaderStrategy family don't look very good. This refactors a > little bit to make them make more sense. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9260) Improve the performance and GC friendliness of NameNode startup and full block reports
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196630#comment-15196630 ] Vinayakumar B commented on HDFS-9260: - How about bringing this into branch-2? > Improve the performance and GC friendliness of NameNode startup and full > block reports > -- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Fix For: 3.0.0 > > Attachments: FBR processing.png, HDFS Block and Replica Management > 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, > HDFS-9260.010.patch, HDFS-9260.011.patch, HDFS-9260.012.patch, > HDFS-9260.013.patch, HDFS-9260.014.patch, HDFS-9260.015.patch, > HDFS-9260.016.patch, HDFS-9260.017.patch, HDFS-9260.018.patch, > HDFSBenchmarks.zip, HDFSBenchmarks2.zip > > > This patch changes the datastructures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC friendly handling of full > block reports. > Would like to hear peoples feedback on this change. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9940) Rename dfs.balancer.max.concurrent.moves to avoid confusion
[ https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196566#comment-15196566 ] Yongjun Zhang commented on HDFS-9940: - Hi guys, I wonder if this could be a balancer-only config, so that nothing needs to be set on the datanode side. That is, when the balancer starts, it reads this config and tells the NN about it, and the NN can then tell each datanode about it as a piggyback on the heartbeat response. This is similar to how {{final static int DNA_BALANCERBANDWIDTHUPDATE = 8; // update balancer bandwidth}} works. If what I'm proposing here works, then we can just use {{dfs.balancer.max.concurrent.moves}}, and users don't need to set anything on the DN. Thanks. > Rename dfs.balancer.max.concurrent.moves to avoid confusion > --- > > Key: HDFS-9940 > URL: https://issues.apache.org/jira/browse/HDFS-9940 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Labels: supportability > Fix For: 2.8.0 > > > It is very confusing for both Balancer and Datanode to use the same property > {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the > Balancer because the property has "datanode" in the name string. Many > customers forget to set the property for the Balancer. > Change the Balancer to use a new property > {{dfs.balancer.max.concurrent.moves}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
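Yongjun's heartbeat-piggyback proposal above can be sketched roughly as follows. This is a hypothetical illustration, not the actual Hadoop {{DatanodeProtocol}}: the class, method, and command names (including {{DNA_MAXCONCURRENTMOVESUPDATE}}) are invented for the sketch; only {{DNA_BALANCERBANDWIDTHUPDATE = 8}} mirrors a real constant.

```java
// Hypothetical sketch of the heartbeat-piggyback idea: the balancer reports
// its dfs.balancer.max.concurrent.moves value to the NN, the NN queues a
// command, and each DN picks it up from its next heartbeat response. The
// class, method, and command names are invented for illustration; only
// DNA_BALANCERBANDWIDTHUPDATE = 8 mirrors a real Hadoop constant.
public class HeartbeatPiggyback {
  /** Invented command type, analogous to DNA_BALANCERBANDWIDTHUPDATE = 8. */
  static final int DNA_MAXCONCURRENTMOVESUPDATE = 12;

  static class DatanodeCommand {
    final int action;
    final int value;
    DatanodeCommand(int action, int value) {
      this.action = action;
      this.value = value;
    }
  }

  /** Setting the NN hands out on the next heartbeat; null means "no update". */
  private volatile Integer pendingMaxConcurrentMoves = null;

  /** Balancer -> NN: record the balancer-side config value at startup. */
  void setMaxConcurrentMoves(int moves) {
    pendingMaxConcurrentMoves = moves;
  }

  /** NN -> DN: attach the pending command to a heartbeat response, if any. */
  DatanodeCommand heartbeatResponse() {
    Integer moves = pendingMaxConcurrentMoves;
    return moves == null ? null : new DatanodeCommand(DNA_MAXCONCURRENTMOVESUPDATE, moves);
  }

  public static void main(String[] args) {
    HeartbeatPiggyback nn = new HeartbeatPiggyback();
    nn.setMaxConcurrentMoves(50);                  // balancer startup
    DatanodeCommand cmd = nn.heartbeatResponse();  // next DN heartbeat
    System.out.println(cmd.action + " " + cmd.value);  // prints "12 50"
  }
}
```

A real implementation would also need to handle DNs that join after the balancer exits, which is where the design discussion would continue.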
[jira] [Updated] (HDFS-9954) Test RPC timeout fix of HADOOP-12672 against HDFS
[ https://issues.apache.org/jira/browse/HDFS-9954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-9954: --- Resolution: Invalid Status: Resolved (was: Patch Available) It turned out that creating an HDFS issue and attaching a patch does not invoke the HDFS tests; test-patch decides which tests to run based on the contents of the patch. > Test RPC timeout fix of HADOOP-12672 against HDFS > - > > Key: HDFS-9954 > URL: https://issues.apache.org/jira/browse/HDFS-9954 > Project: Hadoop HDFS > Issue Type: Test >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki > Labels: test > Attachments: HDFS-9954.006.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9857) Erasure Coding: Rename replication-based names in BlockManager to more generic [part-1]
[ https://issues.apache.org/jira/browse/HDFS-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196506#comment-15196506 ] Zhe Zhang commented on HDFS-9857: - Seems the patch needs a small rebase. I also found a nit and a follow-on task. # Should be {{blocksToReconstruct}}? {code} blocksToReplicate = neededReconstruction .chooseLowRedundancyBlocks(blocksToProcess); {code} # {{fileReplication}} should be renamed. We can take care of it when we rename {{getExpectedReplicaNum}}. {code} short fileReplication = getExpectedReplicaNum(storedBlock); {code} +1 after addressing. Thanks Rakesh for the work! > Erasure Coding: Rename replication-based names in BlockManager to more > generic [part-1] > --- > > Key: HDFS-9857 > URL: https://issues.apache.org/jira/browse/HDFS-9857 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-9857-001.patch > > > The idea of this jira is to rename the following entities in BlockManager as, > - {{UnderReplicatedBlocks}} to {{LowRedundancyBlocks}} > - {{neededReplications}} to {{neededReconstruction}} > - {{replicationQueuesInitializer}} to {{reconstructionQueuesInitializer}} > Thanks [~zhz], [~andrew.wang] for the useful > [discussions|https://issues.apache.org/jira/browse/HDFS-7955?focusedCommentId=15149406=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15149406] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9349) Support reconfiguring fs.protected.directories without NN restart
[ https://issues.apache.org/jira/browse/HDFS-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196503#comment-15196503 ] Arpit Agarwal commented on HDFS-9349: - Thank you for updating the patch [~xiaobingo]. The synchronization needs more work as the caller of {{getProtectedDirectories}} assumes the set will not be modified. Modifications to {{protectedDirectories}} will be rare so let's just make it a volatile reference and {{setProtectedDirectories}} can replace the reference atomically with a newly constructed set. Also you can use {{parseProtectedDirectories}} to construct the new set. > Support reconfiguring fs.protected.directories without NN restart > - > > Key: HDFS-9349 > URL: https://issues.apache.org/jira/browse/HDFS-9349 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-9349-HDFS-9000.003.patch, > HDFS-9349-HDFS-9000.004.patch, HDFS-9349-HDFS-9000.005.patch, > HDFS-9349.001.patch, HDFS-9349.002.patch > > > This is to reconfigure > {code} > fs.protected.directories > {code} > without restarting NN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
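The volatile-reference pattern Arpit describes above can be sketched as follows. This is a minimal standalone sketch, not the actual {{FSDirectory}} code; the method names mirror the comment but the class and config format are illustrative.

```java
// Minimal sketch of the volatile-reference pattern suggested above: readers
// get an immutable snapshot, and reconfiguration swaps in a freshly parsed
// set with a single volatile write. Method names mirror the comment but the
// class itself is illustrative, not the real FSDirectory.
import java.util.Collections;
import java.util.SortedSet;
import java.util.TreeSet;

public class ProtectedDirs {
  // Never mutated in place; replaced wholesale on reconfiguration.
  private volatile SortedSet<String> protectedDirectories =
      Collections.unmodifiableSortedSet(new TreeSet<String>());

  private static SortedSet<String> parseProtectedDirectories(String csv) {
    SortedSet<String> dirs = new TreeSet<>();
    for (String d : csv.split(",")) {
      if (!d.trim().isEmpty()) {
        dirs.add(d.trim());
      }
    }
    return Collections.unmodifiableSortedSet(dirs);
  }

  /** Callers may iterate freely; this snapshot can never change under them. */
  public SortedSet<String> getProtectedDirectories() {
    return protectedDirectories;
  }

  /** Reconfiguration: one volatile write publishes the new set atomically. */
  public void setProtectedDirectories(String csv) {
    protectedDirectories = parseProtectedDirectories(csv);
  }

  public static void main(String[] args) {
    ProtectedDirs fsd = new ProtectedDirs();
    fsd.setProtectedDirectories("/user/warehouse, /apps/hive");
    System.out.println(fsd.getProtectedDirectories());
  }
}
```

The point of the pattern is that no synchronization is needed on the read path: a caller holding the old set keeps a consistent view even while a new set is being installed.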
[jira] [Commented] (HDFS-9940) Rename dfs.balancer.max.concurrent.moves to avoid confusion
[ https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196482#comment-15196482 ] John Zhuge commented on HDFS-9940: -- The Balancer uses the config {{dfs.datanode.balance.max.concurrent.moves}} to set the field {{Dispatcher$maxConcurrentMovesPerNode}}, which sets the size of the thread pool {{moveExecutor}} local to {{Dispatcher$executePendingMove}}. * If the value is higher than {{dfs.datanode.balance.max.concurrent.moves}} on the Datanode, the Balancer may send more requests than the DN can handle; the DN will log "Not able to copy block ... because threads quota is exceeded" and return ERROR to the Balancer. Thus some Balancer threads are wasted. * If the value is smaller, the DN's full potential is not reached. I can understand the original author's decision to use the same config name. How about {{dfs.balancer.max.concurrent.moves.per.datanode}}? > Rename dfs.balancer.max.concurrent.moves to avoid confusion > --- > > Key: HDFS-9940 > URL: https://issues.apache.org/jira/browse/HDFS-9940 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Labels: supportability > Fix For: 2.8.0 > > > It is very confusing for both Balancer and Datanode to use the same property > {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the > Balancer because the property has "datanode" in the name string. Many > customers forget to set the property for the Balancer. > Change the Balancer to use a new property > {{dfs.balancer.max.concurrent.moves}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
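The sizing mismatch John describes can be illustrated with a small sketch: a semaphore stands in for the DN-side "threads quota", while the loop plays a Balancer dispatching with a larger balancer-side value of the same property. Class and method names are illustrative, not the real Dispatcher/DataXceiver code.

```java
// Small sketch of the mismatch described above: the semaphore stands in for
// the DN-side dfs.datanode.balance.max.concurrent.moves quota, and the loop
// plays the Balancer dispatching with a larger value of the same property.
// Class and method names are illustrative.
import java.util.concurrent.Semaphore;

public class ConcurrentMoves {
  private final Semaphore dnQuota;  // DN-side limit on concurrent moves

  ConcurrentMoves(int dnSideMax) {
    dnQuota = new Semaphore(dnSideMax);
  }

  /** DN behaviour: reject a move request when the threads quota is exceeded. */
  boolean tryMove() {
    if (!dnQuota.tryAcquire()) {
      // Balancer sees ERROR; that dispatcher thread's work is wasted.
      System.out.println("Not able to copy block because threads quota is exceeded");
      return false;
    }
    return true;
  }

  public static void main(String[] args) {
    int balancerSideMax = 8;                      // Balancer's config value
    ConcurrentMoves dn = new ConcurrentMoves(5);  // DN configured lower
    int rejected = 0;
    for (int i = 0; i < balancerSideMax; i++) {
      if (!dn.tryMove()) {
        rejected++;
      }
    }
    System.out.println("rejected=" + rejected);   // prints "rejected=3"
  }
}
```

Either direction of the mismatch loses something: rejected requests waste Balancer threads, and an undersized balancer-side value leaves DN capacity idle.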
[jira] [Assigned] (HDFS-8210) Ozone: Implement storage container manager
[ https://issues.apache.org/jira/browse/HDFS-8210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth reassigned HDFS-8210: --- Assignee: Chris Nauroth (was: Jitendra Nath Pandey) > Ozone: Implement storage container manager > --- > > Key: HDFS-8210 > URL: https://issues.apache.org/jira/browse/HDFS-8210 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Jitendra Nath Pandey >Assignee: Chris Nauroth > Attachments: HDFS-8210-HDFS-7240.1.patch, > HDFS-8210-HDFS-7240.2.patch, HDFS-8210-HDFS-7240.3.patch, > HDFS-8210-HDFS-7240.4.patch, HDFS-8210-HDFS-7240.5.patch > > > The storage container manager collects datanode heartbeats, manages > replication and exposes API to lookup containers. This jira implements > storage container manager by re-using the block manager implementation in > namenode. This jira provides initial implementation that works with > datanodes. The additional protocols will be added in subsequent jiras. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9956) LDAP PERFORMANCE ISSUE AND FAIL OVER
[ https://issues.apache.org/jira/browse/HDFS-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196443#comment-15196443 ] Allen Wittenauer commented on HDFS-9956: Is a naming-services caching daemon being used, or is this just a raw LDAP connection? > LDAP PERFORMANCE ISSUE AND FAIL OVER > > > Key: HDFS-9956 > URL: https://issues.apache.org/jira/browse/HDFS-9956 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: sanjay kenganahalli vamanna > > The typical LDAP group name resolution works well under typical scenarios. > However, we have seen cases where a user is mapped to many groups (in an > extreme case, a user is mapped to more than 100 groups). The way it is > implemented now makes this case very slow at resolving groups from > ActiveDirectory and causes the namenode to fail over. > Instead of failing over, we can use the > parameter (ha.zookeeper.session-timeout.ms) in the getgroups method to > time out and send the failed response back to the user so that we can prevent > namenode failover. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9349) Support reconfiguring fs.protected.directories without NN restart
[ https://issues.apache.org/jira/browse/HDFS-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HDFS-9349: Attachment: HDFS-9349-HDFS-9000.005.patch > Support reconfiguring fs.protected.directories without NN restart > - > > Key: HDFS-9349 > URL: https://issues.apache.org/jira/browse/HDFS-9349 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-9349-HDFS-9000.003.patch, > HDFS-9349-HDFS-9000.004.patch, HDFS-9349-HDFS-9000.005.patch, > HDFS-9349.001.patch, HDFS-9349.002.patch > > > This is to reconfigure > {code} > fs.protected.directories > {code} > without restarting NN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9349) Support reconfiguring fs.protected.directories without NN restart
[ https://issues.apache.org/jira/browse/HDFS-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196437#comment-15196437 ] Xiaobing Zhou commented on HDFS-9349: - V005 is rebased on trunk. > Support reconfiguring fs.protected.directories without NN restart > - > > Key: HDFS-9349 > URL: https://issues.apache.org/jira/browse/HDFS-9349 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-9349-HDFS-9000.003.patch, > HDFS-9349-HDFS-9000.004.patch, HDFS-9349.001.patch, HDFS-9349.002.patch > > > This is to reconfigure > {code} > fs.protected.directories > {code} > without restarting NN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9940) Rename dfs.balancer.max.concurrent.moves to avoid confusion
[ https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196389#comment-15196389 ] Tsz Wo Nicholas Sze commented on HDFS-9940: --- dfs.balancer.max.concurrent.moves is also confusing since "max.concurrent.moves" is per datanode. Ignoring incompatibility for a moment, what are the best names for these two properties? > Rename dfs.balancer.max.concurrent.moves to avoid confusion > --- > > Key: HDFS-9940 > URL: https://issues.apache.org/jira/browse/HDFS-9940 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Labels: supportability > Fix For: 2.8.0 > > > It is very confusing for both Balancer and Datanode to use the same property > {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the > Balancer because the property has "datanode" in the name string. Many > customers forget to set the property for the Balancer. > Change the Balancer to use a new property > {{dfs.balancer.max.concurrent.moves}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9668) Optimize the locking in FsDatasetImpl
[ https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196374#comment-15196374 ] Arpit Agarwal commented on HDFS-9668: - bq. It means one slow operation of finalizeBlock, addBlock and createRbw in a slow storage can block all the other same operations in the same DataNode, especially in HBase when many wal/flusher/compactor are configured. Detecting slow disks is a known problem for DataNodes. If this problem does not manifest in regular operation perhaps we should try to add slow disk detection instead. > Optimize the locking in FsDatasetImpl > - > > Key: HDFS-9668 > URL: https://issues.apache.org/jira/browse/HDFS-9668 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Jingcheng Du >Assignee: Jingcheng Du > Attachments: HDFS-9668-1.patch, HDFS-9668-2.patch, execution_time.png > > > During the HBase test on a tiered storage of HDFS (WAL is stored in > SSD/RAMDISK, and all other files are stored in HDD), we observe many > long-time BLOCKED threads on FsDatasetImpl in DataNode. 
The following is part > of the jstack result: > {noformat} > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48521 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread > t@93336 >java.lang.Thread.State: BLOCKED > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:) > - waiting to lock <18324c9> (a > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) owned by > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48520 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" t@93335 > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.(BlockReceiver.java:183) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235) > at java.lang.Thread.run(Thread.java:745) >Locked ownable synchronizers: > - None > > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48520 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" - Thread > t@93335 >java.lang.Thread.State: RUNNABLE > at java.io.UnixFileSystem.createFileExclusively(Native Method) > at java.io.File.createNewFile(File.java:1012) > at > org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:66) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:271) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:286) > at > 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:1140) > - locked <18324c9> (a > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.(BlockReceiver.java:183) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235) > at java.lang.Thread.run(Thread.java:745) >Locked ownable synchronizers: > - None > {noformat} > We measured the execution of some operations in FsDatasetImpl during the > test. Here following is the result. > !execution_time.png! > The operations of finalizeBlock, addBlock and createRbw on HDD in a heavy > load take a really long time. > It means one slow operation of finalizeBlock, addBlock and createRbw in a > slow storage can block all the other same operations in the same DataNode, > especially in HBase when many wal/flusher/compactor are configured. > We need a finer grained lock mechanism in a new FsDatasetImpl implementation > and users can choose the implementation by configuring > "dfs.datanode.fsdataset.factory" in DataNode. > We can implement the lock by
[jira] [Commented] (HDFS-9940) Rename dfs.balancer.max.concurrent.moves to avoid confusion
[ https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196363#comment-15196363 ] John Zhuge commented on HDFS-9940: -- +1 HDFS-7466 > Rename dfs.balancer.max.concurrent.moves to avoid confusion > --- > > Key: HDFS-9940 > URL: https://issues.apache.org/jira/browse/HDFS-9940 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Labels: supportability > Fix For: 2.8.0 > > > It is very confusing for both Balancer and Datanode to use the same property > {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the > Balancer because the property has "datanode" in the name string. Many > customers forget to set the property for the Balancer. > Change the Balancer to use a new property > {{dfs.balancer.max.concurrent.moves}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers reassigned HDFS-7285: Assignee: Zhe Zhang (was: Matt Hardy) > Erasure Coding Support inside HDFS > -- > > Key: HDFS-7285 > URL: https://issues.apache.org/jira/browse/HDFS-7285 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Weihua Jiang >Assignee: Zhe Zhang > Fix For: 3.0.0 > > Attachments: Compare-consolidated-20150824.diff, > Consolidated-20150707.patch, Consolidated-20150806.patch, > Consolidated-20150810.patch, ECAnalyzer.py, ECParser.py, > HDFS-7285-Consolidated-20150911.patch, HDFS-7285-initial-PoC.patch, > HDFS-7285-merge-consolidated-01.patch, > HDFS-7285-merge-consolidated-trunk-01.patch, > HDFS-7285-merge-consolidated.trunk.03.patch, > HDFS-7285-merge-consolidated.trunk.04.patch, > HDFS-EC-Merge-PoC-20150624.patch, HDFS-EC-merge-consolidated-01.patch, > HDFS-bistriped.patch, HDFSErasureCodingDesign-20141028.pdf, > HDFSErasureCodingDesign-20141217.pdf, HDFSErasureCodingDesign-20150204.pdf, > HDFSErasureCodingDesign-20150206.pdf, HDFSErasureCodingPhaseITestPlan.pdf, > HDFSErasureCodingSystemTestPlan-20150824.pdf, > HDFSErasureCodingSystemTestReport-20150826.pdf, fsimage-analysis-20150105.pdf > > > Erasure Coding (EC) can greatly reduce the storage overhead without sacrificing > data reliability, compared to the existing HDFS 3-replica approach. For > example, if we use a 10+4 Reed-Solomon coding, we can allow the loss of 4 blocks, > with a storage overhead of only 40%. This makes EC quite an attractive > alternative for big data storage, particularly for cold data. > Facebook had a related open source project called HDFS-RAID. It used to be > one of the contributed packages in HDFS but has been removed since Hadoop 2.0 > for maintenance reasons.
The drawbacks are: 1) it is on top of HDFS and depends > on MapReduce to do encoding and decoding tasks; 2) it can only be used for > cold files that are not intended to be appended to anymore; 3) the pure Java EC > coding implementation is extremely slow in practical use. Due to these, it > might not be a good idea to just bring HDFS-RAID back. > We (Intel and Cloudera) are working on a design to build EC into HDFS that > gets rid of any external dependencies, makes it self-contained and > independently maintained. This design lays the EC feature on top of the storage-type > support and is designed to be compatible with existing HDFS features such as caching, > snapshots, encryption, and high availability. This design will also > support different EC coding schemes, implementations and policies for > different deployment scenarios. By utilizing advanced libraries (e.g. the Intel > ISA-L library), an implementation can greatly improve the performance of EC > encoding/decoding and make the EC solution even more attractive. We will > post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Hardy reassigned HDFS-7285: Assignee: Matt Hardy (was: Zhe Zhang) > Erasure Coding Support inside HDFS > -- > > Key: HDFS-7285 > URL: https://issues.apache.org/jira/browse/HDFS-7285 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Weihua Jiang >Assignee: Matt Hardy > Fix For: 3.0.0 > > Attachments: Compare-consolidated-20150824.diff, > Consolidated-20150707.patch, Consolidated-20150806.patch, > Consolidated-20150810.patch, ECAnalyzer.py, ECParser.py, > HDFS-7285-Consolidated-20150911.patch, HDFS-7285-initial-PoC.patch, > HDFS-7285-merge-consolidated-01.patch, > HDFS-7285-merge-consolidated-trunk-01.patch, > HDFS-7285-merge-consolidated.trunk.03.patch, > HDFS-7285-merge-consolidated.trunk.04.patch, > HDFS-EC-Merge-PoC-20150624.patch, HDFS-EC-merge-consolidated-01.patch, > HDFS-bistriped.patch, HDFSErasureCodingDesign-20141028.pdf, > HDFSErasureCodingDesign-20141217.pdf, HDFSErasureCodingDesign-20150204.pdf, > HDFSErasureCodingDesign-20150206.pdf, HDFSErasureCodingPhaseITestPlan.pdf, > HDFSErasureCodingSystemTestPlan-20150824.pdf, > HDFSErasureCodingSystemTestReport-20150826.pdf, fsimage-analysis-20150105.pdf > > > Erasure Coding (EC) can greatly reduce the storage overhead without sacrificing > data reliability, compared to the existing HDFS 3-replica approach. For > example, if we use a 10+4 Reed-Solomon coding, we can allow the loss of 4 blocks, > with a storage overhead of only 40%. This makes EC quite an attractive > alternative for big data storage, particularly for cold data. > Facebook had a related open source project called HDFS-RAID. It used to be > one of the contributed packages in HDFS but has been removed since Hadoop 2.0 > for maintenance reasons.
The drawbacks are: 1) it is on top of HDFS and depends > on MapReduce to do encoding and decoding tasks; 2) it can only be used for > cold files that are not intended to be appended to anymore; 3) the pure Java EC > coding implementation is extremely slow in practical use. Due to these, it > might not be a good idea to just bring HDFS-RAID back. > We (Intel and Cloudera) are working on a design to build EC into HDFS that > gets rid of any external dependencies, makes it self-contained and > independently maintained. This design lays the EC feature on top of the storage-type > support and is designed to be compatible with existing HDFS features such as caching, > snapshots, encryption, and high availability. This design will also > support different EC coding schemes, implementations and policies for > different deployment scenarios. By utilizing advanced libraries (e.g. the Intel > ISA-L library), an implementation can greatly improve the performance of EC > encoding/decoding and make the EC solution even more attractive. We will > post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9959) add log when block removed from last live datanode
[ https://issues.apache.org/jira/browse/HDFS-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yunjiong zhao updated HDFS-9959: Attachment: HDFS-9959.1.patch Updated patch: 1. log after releasing the write lock; 2. change error to info. > add log when block removed from last live datanode > -- > > Key: HDFS-9959 > URL: https://issues.apache.org/jira/browse/HDFS-9959 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: yunjiong zhao >Assignee: yunjiong zhao >Priority: Minor > Attachments: HDFS-9959.1.patch, HDFS-9959.patch > > > Adding logs like "BLOCK* No live nodes contain block blk_1073741825_1001, last > datanode contain it is node: 127.0.0.1:65341" in BlockStateChange should help > identify which datanode should be fixed first to recover missing blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9668) Optimize the locking in FsDatasetImpl
[ https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196237#comment-15196237 ] Tsz Wo Nicholas Sze commented on HDFS-9668: --- > Currently, read operations have no special advantage over write operations. > Using a reader/writer lock changes that. ... [~cmccabe], good point. We probably should use a fair lock. Or we could add a conf similar to dfs.namenode.fslock.fair. [~jingcheng...@intel.com], thanks for working on this. The idea sounds good. One issue that concerns me is: how can we make sure that the synchronization is correct, especially outside the class? The current patch only changes FsDatasetImpl, but the FsDatasetImpl object is also synchronized on in other classes such as FsVolumeImpl. > Optimize the locking in FsDatasetImpl > - > > Key: HDFS-9668 > URL: https://issues.apache.org/jira/browse/HDFS-9668 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Jingcheng Du >Assignee: Jingcheng Du > Attachments: HDFS-9668-1.patch, HDFS-9668-2.patch, execution_time.png > > > During the HBase test on a tiered storage of HDFS (WAL is stored in > SSD/RAMDISK, and all other files are stored in HDD), we observe many > long-time BLOCKED threads on FsDatasetImpl in DataNode.
The following is part > of the jstack result: > {noformat} > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48521 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread > t@93336 >java.lang.Thread.State: BLOCKED > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:) > - waiting to lock <18324c9> (a > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) owned by > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48520 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" t@93335 > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.(BlockReceiver.java:183) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235) > at java.lang.Thread.run(Thread.java:745) >Locked ownable synchronizers: > - None > > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48520 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" - Thread > t@93335 >java.lang.Thread.State: RUNNABLE > at java.io.UnixFileSystem.createFileExclusively(Native Method) > at java.io.File.createNewFile(File.java:1012) > at > org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:66) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:271) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:286) > at > 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:1140) > - locked <18324c9> (a > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.(BlockReceiver.java:183) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235) > at java.lang.Thread.run(Thread.java:745) >Locked ownable synchronizers: > - None > {noformat} > We measured the execution of some operations in FsDatasetImpl during the > test. Here following is the result. > !execution_time.png! > The operations of finalizeBlock, addBlock and createRbw on HDD in a heavy > load take a really long time. > It means one slow operation of finalizeBlock, addBlock and createRbw in a > slow storage can block all the other same operations in the same DataNode, > especially in HBase when many wal/flusher/compactor are configured. >
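The fair reader/writer lock raised in the HDFS-9668 comments above can be sketched as follows. Only {{ReentrantReadWriteLock}} and its fairness flag are real API; the wrapper class is illustrative, and the config wiring (a key analogous to dfs.namenode.fslock.fair) is omitted.

```java
// Sketch of the fair reader/writer lock idea: fairness comes from a boolean
// (in real code read from configuration, as dfs.namenode.fslock.fair does for
// the NN). Only ReentrantReadWriteLock and its fairness flag are real API;
// the wrapper class is illustrative.
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

public class DatasetLock {
  private final ReentrantReadWriteLock lock;

  DatasetLock(boolean fair) {
    // fair=true grants the lock roughly in arrival order, so a steady stream
    // of readers cannot starve writers (and vice versa), at some throughput cost.
    lock = new ReentrantReadWriteLock(fair);
  }

  <T> T readOp(Supplier<T> op) {  // many readers may run concurrently
    lock.readLock().lock();
    try {
      return op.get();
    } finally {
      lock.readLock().unlock();
    }
  }

  <T> T writeOp(Supplier<T> op) {  // writers are exclusive
    lock.writeLock().lock();
    try {
      return op.get();
    } finally {
      lock.writeLock().unlock();
    }
  }

  public static void main(String[] args) {
    DatasetLock datasetLock = new DatasetLock(true);
    System.out.println(datasetLock.readOp(() -> "getStoredBlock under read lock"));
    System.out.println(datasetLock.writeOp(() -> "createRbw under write lock"));
  }
}
```

Nicholas's correctness concern is exactly what a wrapper like this cannot fix by itself: any code that still synchronizes on the FsDatasetImpl object directly (e.g. from FsVolumeImpl) bypasses the lock entirely.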
[jira] [Commented] (HDFS-9917) IBR accumulate more objects when SNN was down for sometime.
[ https://issues.apache.org/jira/browse/HDFS-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196158#comment-15196158 ] Tsz Wo Nicholas Sze commented on HDFS-9917: --- > Before Full BR, all pending IBRs will be flushed. ... Yes, this is the current problem. I suggest that the NN could just ignore the pending IBRs before the first full BR. Would that fix the problem? > IBR accumulate more objects when SNN was down for sometime. > --- > > Key: HDFS-9917 > URL: https://issues.apache.org/jira/browse/HDFS-9917 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > > The SNN was down for some time for various reasons. After restarting the SNN, it > became unresponsive because: > - 29 DNs were sending IBRs of ~5 million entries each (most of them delete IBRs), > whereas each datanode had only ~2.5 million blocks. > - GC can't collect these objects since all of them will be under the RPC queue. > To recover (to clear these objects), we restarted all the DNs one by > one. This issue happened in 2.4.1, where block report splitting was not > available. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10170) DiskBalancer: Force rebase diskbalancer branch
[ https://issues.apache.org/jira/browse/HDFS-10170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195995#comment-15195995 ] Arpit Agarwal commented on HDFS-10170: -- +1 for force rebase to make DiskBalancer testable on Mac. I suspect we'd have hit a similar issue on Windows. > DiskBalancer: Force rebase diskbalancer branch > -- > > Key: HDFS-10170 > URL: https://issues.apache.org/jira/browse/HDFS-10170 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Affects Versions: HDFS-1312 >Reporter: Anu Engineer >Assignee: Anu Engineer >Priority: Minor > Fix For: HDFS-1312 > > > In one of the patches we renamed DiskbalancerException.java to > DiskBalancerException.java. The only change was the small b ==> B. This > causes issues on a Mac, where the file system may not be case sensitive. > So when you clone the repo, git ends up creating DiskbalancerException.java > with a small letter 'b' and tries to rename it to the big letter. However, on a > Mac this fails, and we get java files where the class name is different from the > file name. > We can fix this issue by re-writing the git history. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
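For future case-only renames, a two-step rename through a temporary name is a commonly used workaround that behaves the same on case-sensitive and case-insensitive filesystems, avoiding the need for a history rewrite. The sketch below runs in a throwaway repo; paths mirror the jira's example.

```shell
# Throwaway demo of the two-step workaround for case-only renames, which
# behaves the same on case-sensitive and case-insensitive filesystems and
# avoids needing a history rewrite for future renames. Paths are illustrative.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email you@example.com && git config user.name you
echo 'class DiskBalancerException {}' > DiskbalancerException.java
git add . && git commit -qm 'initial commit'

# Rename through an intermediate name so git never performs a case-only
# rename directly on disk:
git mv DiskbalancerException.java DiskBalancerException.java.tmp
git mv DiskBalancerException.java.tmp DiskBalancerException.java
git commit -qm 'Rename DiskbalancerException.java to DiskBalancerException.java'
git ls-files   # shows DiskBalancerException.java on every platform
```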
[jira] [Updated] (HDFS-8457) Ozone: Refactor FsDatasetSpi to pull up HDFS-agnostic functionality into parent interface
[ https://issues.apache.org/jira/browse/HDFS-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8457: Fix Version/s: (was: HDFS-7240) > Ozone: Refactor FsDatasetSpi to pull up HDFS-agnostic functionality into > parent interface > - > > Key: HDFS-8457 > URL: https://issues.apache.org/jira/browse/HDFS-8457 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: HDFS-8457-HDFS-7240.01.patch, > HDFS-8457-HDFS-7240.02.patch, HDFS-8457-HDFS-7240.03.patch, > HDFS-8457-HDFS-7240.04.patch, HDFS-8457-HDFS-7240.05.patch, > HDFS-8457-HDFS-7240.06.patch, HDFS-8457-HDFS-7240.07.patch > > > FsDatasetSpi can be split up into HDFS-specific and HDFS-agnostic parts. The > HDFS-specific parts can continue to be retained in FsDataSpi while those > relating to volume management, block pools and upgrade can be moved to a > parent interface. > There will be no change to implementations of FsDatasetSpi. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8661) DataNode should filter the set of NameSpaceInfos passed to Datasets
[ https://issues.apache.org/jira/browse/HDFS-8661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8661: Fix Version/s: (was: HDFS-7240) > DataNode should filter the set of NameSpaceInfos passed to Datasets > --- > > Key: HDFS-8661 > URL: https://issues.apache.org/jira/browse/HDFS-8661 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: HDFS-7240 >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: HDFS-8661-HDFS-7240.01.patch, > HDFS-8661-HDFS-7240.02.patch, HDFS-8661-HDFS-7240.03.patch, > HDFS-8661-HDFS-7240.04.patch, v03-v04.diff > > > {{DataNode#refreshVolumes}} passes the list of NamespaceInfos to each dataset > when adding new volumes. > This list should be filtered by the correct NodeType(s) for each dataset. > e.g. in a shared HDFS+Ozone cluster, FsDatasets would be notified of NN block > pools and Ozone datasets would be notified of Ozone block pool(s). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8679) Move DatasetSpi to new package
[ https://issues.apache.org/jira/browse/HDFS-8679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8679: Fix Version/s: (was: HDFS-7240) > Move DatasetSpi to new package > -- > > Key: HDFS-8679 > URL: https://issues.apache.org/jira/browse/HDFS-8679 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: HDFS-8679-HDFS-7240.01.patch, > HDFS-8679-HDFS-7240.02.patch > > > The DatasetSpi and VolumeSpi interfaces are currently in > {{org.apache.hadoop.hdfs.server.datanode.fsdataset}}. They can be moved to a > new package {{org.apache.hadoop.hdfs.server.datanode.dataset}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8392) DataNode support for multiple datasets
[ https://issues.apache.org/jira/browse/HDFS-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8392: Fix Version/s: (was: HDFS-7240) > DataNode support for multiple datasets > -- > > Key: HDFS-8392 > URL: https://issues.apache.org/jira/browse/HDFS-8392 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: HDFS-8392-HDFS-7240.01.patch, > HDFS-8392-HDFS-7240.02.patch, HDFS-8392-HDFS-7240.03.patch > > > For HDFS-7240 we would like to share available DataNode storage across HDFS > blocks and Ozone objects. > The DataNode already supports sharing available storage across multiple block > pool IDs for the federation feature. However all federated block pools use > the same dataset implementation i.e. {{FsDatasetImpl}}. > We can extend the DataNode to support multiple dataset implementations so the > same storage space can be shared across one or more HDFS block pools and one > or more Ozone block pools. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8677) Ozone: Introduce KeyValueContainerDatasetSpi
[ https://issues.apache.org/jira/browse/HDFS-8677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal resolved HDFS-8677. - Resolution: Fixed We'll revisit FsDataset changes later. > Ozone: Introduce KeyValueContainerDatasetSpi > > > Key: HDFS-8677 > URL: https://issues.apache.org/jira/browse/HDFS-8677 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: HDFS-8677-HDFS-7240.01.patch, > HDFS-8677-HDFS-7240.02.patch, HDFS-8677-HDFS-7240.03.patch, > HDFS-8677-HDFS-7240.04.patch, HDFS-8677-HDFS-7240.05.patch > > > KeyValueContainerDatasetSpi will be a new interface for Ozone containers, > just as FsDatasetSpi is an interface for manipulating HDFS block files. > The interface will have support for both key-value containers for storing > Ozone metadata and blobs for storing user data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8679) Move DatasetSpi to new package
[ https://issues.apache.org/jira/browse/HDFS-8679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal resolved HDFS-8679. - Resolution: Later We'll revisit FsDataset changes later. > Move DatasetSpi to new package > -- > > Key: HDFS-8679 > URL: https://issues.apache.org/jira/browse/HDFS-8679 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Fix For: HDFS-7240 > > Attachments: HDFS-8679-HDFS-7240.01.patch, > HDFS-8679-HDFS-7240.02.patch > > > The DatasetSpi and VolumeSpi interfaces are currently in > {{org.apache.hadoop.hdfs.server.datanode.fsdataset}}. They can be moved to a > new package {{org.apache.hadoop.hdfs.server.datanode.dataset}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8661) DataNode should filter the set of NameSpaceInfos passed to Datasets
[ https://issues.apache.org/jira/browse/HDFS-8661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal resolved HDFS-8661. - Resolution: Later We'll revisit FsDataset changes later. > DataNode should filter the set of NameSpaceInfos passed to Datasets > --- > > Key: HDFS-8661 > URL: https://issues.apache.org/jira/browse/HDFS-8661 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: HDFS-7240 >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Fix For: HDFS-7240 > > Attachments: HDFS-8661-HDFS-7240.01.patch, > HDFS-8661-HDFS-7240.02.patch, HDFS-8661-HDFS-7240.03.patch, > HDFS-8661-HDFS-7240.04.patch, v03-v04.diff > > > {{DataNode#refreshVolumes}} passes the list of NamespaceInfos to each dataset > when adding new volumes. > This list should be filtered by the correct NodeType(s) for each dataset. > e.g. in a shared HDFS+Ozone cluster, FsDatasets would be notified of NN block > pools and Ozone datasets would be notified of Ozone block pool(s). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HDFS-8679) Move DatasetSpi to new package
[ https://issues.apache.org/jira/browse/HDFS-8679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal reopened HDFS-8679: - > Move DatasetSpi to new package > -- > > Key: HDFS-8679 > URL: https://issues.apache.org/jira/browse/HDFS-8679 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Fix For: HDFS-7240 > > Attachments: HDFS-8679-HDFS-7240.01.patch, > HDFS-8679-HDFS-7240.02.patch > > > The DatasetSpi and VolumeSpi interfaces are currently in > {{org.apache.hadoop.hdfs.server.datanode.fsdataset}}. They can be moved to a > new package {{org.apache.hadoop.hdfs.server.datanode.dataset}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HDFS-8677) Ozone: Introduce KeyValueContainerDatasetSpi
[ https://issues.apache.org/jira/browse/HDFS-8677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal reopened HDFS-8677: - > Ozone: Introduce KeyValueContainerDatasetSpi > > > Key: HDFS-8677 > URL: https://issues.apache.org/jira/browse/HDFS-8677 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: HDFS-8677-HDFS-7240.01.patch, > HDFS-8677-HDFS-7240.02.patch, HDFS-8677-HDFS-7240.03.patch, > HDFS-8677-HDFS-7240.04.patch, HDFS-8677-HDFS-7240.05.patch > > > KeyValueContainerDatasetSpi will be a new interface for Ozone containers, > just as FsDatasetSpi is an interface for manipulating HDFS block files. > The interface will have support for both key-value containers for storing > Ozone metadata and blobs for storing user data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8457) Ozone: Refactor FsDatasetSpi to pull up HDFS-agnostic functionality into parent interface
[ https://issues.apache.org/jira/browse/HDFS-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal resolved HDFS-8457. - Resolution: Later We'll revisit FsDataset changes later. > Ozone: Refactor FsDatasetSpi to pull up HDFS-agnostic functionality into > parent interface > - > > Key: HDFS-8457 > URL: https://issues.apache.org/jira/browse/HDFS-8457 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Fix For: HDFS-7240 > > Attachments: HDFS-8457-HDFS-7240.01.patch, > HDFS-8457-HDFS-7240.02.patch, HDFS-8457-HDFS-7240.03.patch, > HDFS-8457-HDFS-7240.04.patch, HDFS-8457-HDFS-7240.05.patch, > HDFS-8457-HDFS-7240.06.patch, HDFS-8457-HDFS-7240.07.patch > > > FsDatasetSpi can be split up into HDFS-specific and HDFS-agnostic parts. The > HDFS-specific parts can continue to be retained in FsDataSpi while those > relating to volume management, block pools and upgrade can be moved to a > parent interface. > There will be no change to implementations of FsDatasetSpi. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HDFS-8661) DataNode should filter the set of NameSpaceInfos passed to Datasets
[ https://issues.apache.org/jira/browse/HDFS-8661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal reopened HDFS-8661: - > DataNode should filter the set of NameSpaceInfos passed to Datasets > --- > > Key: HDFS-8661 > URL: https://issues.apache.org/jira/browse/HDFS-8661 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: HDFS-7240 >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Fix For: HDFS-7240 > > Attachments: HDFS-8661-HDFS-7240.01.patch, > HDFS-8661-HDFS-7240.02.patch, HDFS-8661-HDFS-7240.03.patch, > HDFS-8661-HDFS-7240.04.patch, v03-v04.diff > > > {{DataNode#refreshVolumes}} passes the list of NamespaceInfos to each dataset > when adding new volumes. > This list should be filtered by the correct NodeType(s) for each dataset. > e.g. in a shared HDFS+Ozone cluster, FsDatasets would be notified of NN block > pools and Ozone datasets would be notified of Ozone block pool(s). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8392) DataNode support for multiple datasets
[ https://issues.apache.org/jira/browse/HDFS-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal resolved HDFS-8392. - Resolution: Later We'll revisit FsDataset changes later. > DataNode support for multiple datasets > -- > > Key: HDFS-8392 > URL: https://issues.apache.org/jira/browse/HDFS-8392 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Fix For: HDFS-7240 > > Attachments: HDFS-8392-HDFS-7240.01.patch, > HDFS-8392-HDFS-7240.02.patch, HDFS-8392-HDFS-7240.03.patch > > > For HDFS-7240 we would like to share available DataNode storage across HDFS > blocks and Ozone objects. > The DataNode already supports sharing available storage across multiple block > pool IDs for the federation feature. However all federated block pools use > the same dataset implementation i.e. {{FsDatasetImpl}}. > We can extend the DataNode to support multiple dataset implementations so the > same storage space can be shared across one or more HDFS block pools and one > or more Ozone block pools. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HDFS-8457) Ozone: Refactor FsDatasetSpi to pull up HDFS-agnostic functionality into parent interface
[ https://issues.apache.org/jira/browse/HDFS-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal reopened HDFS-8457: - > Ozone: Refactor FsDatasetSpi to pull up HDFS-agnostic functionality into > parent interface > - > > Key: HDFS-8457 > URL: https://issues.apache.org/jira/browse/HDFS-8457 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Fix For: HDFS-7240 > > Attachments: HDFS-8457-HDFS-7240.01.patch, > HDFS-8457-HDFS-7240.02.patch, HDFS-8457-HDFS-7240.03.patch, > HDFS-8457-HDFS-7240.04.patch, HDFS-8457-HDFS-7240.05.patch, > HDFS-8457-HDFS-7240.06.patch, HDFS-8457-HDFS-7240.07.patch > > > FsDatasetSpi can be split up into HDFS-specific and HDFS-agnostic parts. The > HDFS-specific parts can continue to be retained in FsDataSpi while those > relating to volume management, block pools and upgrade can be moved to a > parent interface. > There will be no change to implementations of FsDatasetSpi. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HDFS-8392) DataNode support for multiple datasets
[ https://issues.apache.org/jira/browse/HDFS-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal reopened HDFS-8392: - > DataNode support for multiple datasets > -- > > Key: HDFS-8392 > URL: https://issues.apache.org/jira/browse/HDFS-8392 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Fix For: HDFS-7240 > > Attachments: HDFS-8392-HDFS-7240.01.patch, > HDFS-8392-HDFS-7240.02.patch, HDFS-8392-HDFS-7240.03.patch > > > For HDFS-7240 we would like to share available DataNode storage across HDFS > blocks and Ozone objects. > The DataNode already supports sharing available storage across multiple block > pool IDs for the federation feature. However all federated block pools use > the same dataset implementation i.e. {{FsDatasetImpl}}. > We can extend the DataNode to support multiple dataset implementations so the > same storage space can be shared across one or more HDFS block pools and one > or more Ozone block pools. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195976#comment-15195976 ] Andrew Wang commented on HDFS-3702: --- Since we're just adding a flag to the existing {{create}} flags enumset, it doesn't affect our API signature. Note there are no changes in FileSystem or DistributedFileSystem. It also doesn't involve any NN memory overhead, which is a nice bonus compared to a storage policy with xattrs. I also like this scheme since it gives us a lot of flexibility at the application level. For example, applications like distcp or the httpfs and nfs gateways might always want this flag on (no matter the destination folder), to avoid data load imbalance. For HBase's WAL, it would give them the flexibility to redo their filesystem layout, for instance if all WALs no longer go in a single "/logs" directory. Overall, it feels a lot like Linux-y filesystem hints like fadvise / madvise, and a good use of flags. > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, > HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, > HDFS-3702.008.patch, HDFS-3702_Design.pdf > > > This is useful for Write-Ahead-Logs: these files are written for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
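Since the change rides on the existing {{create}} flags EnumSet, its intended effect is easy to model: when the flag is set, the placement choice simply skips the node co-located with the client. The flag name NO_LOCAL_WRITE and the helper below are illustrative stand-ins, not the real DFSClient or BlockPlacementPolicy code:

```java
import java.util.EnumSet;
import java.util.List;
import java.util.stream.Collectors;

class PlacementSketch {
    // Toy version of the create-flag enum; only the last value is new here.
    enum CreateFlag { CREATE, OVERWRITE, NO_LOCAL_WRITE }

    // Choose replica targets, skipping the client-local node when the
    // NO_LOCAL_WRITE flag is present in the create flags.
    static List<String> chooseTargets(List<String> nodes, String localNode,
                                      int replication, EnumSet<CreateFlag> flags) {
        return nodes.stream()
                .filter(n -> !(flags.contains(CreateFlag.NO_LOCAL_WRITE)
                               && n.equals(localNode)))
                .limit(replication)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("dn-local", "dn-a", "dn-b", "dn-c");
        // Default behavior: the local node is a preferred first target.
        System.out.println(chooseTargets(nodes, "dn-local", 3,
                EnumSet.of(CreateFlag.CREATE)));
        // With the flag: all three replicas land on remote nodes.
        System.out.println(chooseTargets(nodes, "dn-local", 3,
                EnumSet.of(CreateFlag.CREATE, CreateFlag.NO_LOCAL_WRITE)));
    }
}
```

This also illustrates why the approach costs no NN memory: the decision is made per-create from the flag set, with nothing persisted.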
[jira] [Updated] (HDFS-8210) Ozone: Implement storage container manager
[ https://issues.apache.org/jira/browse/HDFS-8210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8210: Fix Version/s: (was: HDFS-7240) > Ozone: Implement storage container manager > --- > > Key: HDFS-8210 > URL: https://issues.apache.org/jira/browse/HDFS-8210 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Jitendra Nath Pandey >Assignee: Jitendra Nath Pandey > Attachments: HDFS-8210-HDFS-7240.1.patch, > HDFS-8210-HDFS-7240.2.patch, HDFS-8210-HDFS-7240.3.patch, > HDFS-8210-HDFS-7240.4.patch, HDFS-8210-HDFS-7240.5.patch > > > The storage container manager collects datanode heartbeats, manages > replication and exposes API to lookup containers. This jira implements > storage container manager by re-using the block manager implementation in > namenode. This jira provides initial implementation that works with > datanodes. The additional protocols will be added in subsequent jiras. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HDFS-8210) Ozone: Implement storage container manager
[ https://issues.apache.org/jira/browse/HDFS-8210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal reopened HDFS-8210: - We've rebased the branch and this change was left out. Reopening the Jira. > Ozone: Implement storage container manager > --- > > Key: HDFS-8210 > URL: https://issues.apache.org/jira/browse/HDFS-8210 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Jitendra Nath Pandey >Assignee: Jitendra Nath Pandey > Fix For: HDFS-7240 > > Attachments: HDFS-8210-HDFS-7240.1.patch, > HDFS-8210-HDFS-7240.2.patch, HDFS-8210-HDFS-7240.3.patch, > HDFS-8210-HDFS-7240.4.patch, HDFS-8210-HDFS-7240.5.patch > > > The storage container manager collects datanode heartbeats, manages > replication and exposes API to lookup containers. This jira implements > storage container manager by re-using the block manager implementation in > namenode. This jira provides initial implementation that works with > datanodes. The additional protocols will be added in subsequent jiras. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-10170) DiskBalancer: Force rebase diskbalancer branch
[ https://issues.apache.org/jira/browse/HDFS-10170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDFS-10170. - Resolution: Fixed > DiskBalancer: Force rebase diskbalancer branch > -- > > Key: HDFS-10170 > URL: https://issues.apache.org/jira/browse/HDFS-10170 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Affects Versions: HDFS-1312 >Reporter: Anu Engineer >Assignee: Anu Engineer >Priority: Minor > Fix For: HDFS-1312 > > > In one of patches we renamed – DiskbalancerException.java to > DiskBalancerException.java. The only change was the small b ==> B, This > causes issues on a Mac where the file system may not be case sensitive. > So when you clone the repo, git ends up creating DiskbalanceException.java > with a small letter ‘b’ and tries to rename it to big letter. However on a > Mac it fails and we get java files where the class name is different from the > file name. > We can fix this issue by re-writing the git history. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-10170) DiskBalancer: Force rebase diskbalancer branch
Anu Engineer created HDFS-10170: --- Summary: DiskBalancer: Force rebase diskbalancer branch Key: HDFS-10170 URL: https://issues.apache.org/jira/browse/HDFS-10170 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer & mover Affects Versions: HDFS-1312 Reporter: Anu Engineer Assignee: Anu Engineer Priority: Minor Fix For: HDFS-1312 In one of the patches we renamed DiskbalancerException.java to DiskBalancerException.java. The only change was the lowercase b ==> B. This causes issues on a Mac, where the file system may not be case sensitive. So when you clone the repo, git ends up creating DiskbalancerException.java with a lowercase ‘b’ and tries to rename it to the uppercase letter. However, on a Mac this fails and we get Java files where the class name is different from the file name. We can fix this issue by re-writing the git history. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7648) Verify the datanode directory layout
[ https://issues.apache.org/jira/browse/HDFS-7648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195908#comment-15195908 ] Colin Patrick McCabe commented on HDFS-7648: Hi [~rakesh_r], can you rebase the patch on trunk? {code} LOG.warn("Block: " + blockId + " has to be upgraded to block ID-based layout"); {code} Perhaps "Block XYZ is in the wrong directory" would be clearer? +1 once these are addressed. > Verify the datanode directory layout > > > Key: HDFS-7648 > URL: https://issues.apache.org/jira/browse/HDFS-7648 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo Nicholas Sze >Assignee: Rakesh R > Attachments: HDFS-7648-3.patch, HDFS-7648-4.patch, HDFS-7648-5.patch, > HDFS-7648.patch, HDFS-7648.patch > > > HDFS-6482 changed datanode layout to use block ID to determine the directory > to store the block. We should have some mechanism to verify it. Either > DirectoryScanner or block report generation could do the check. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7648) Verify that HDFS blocks are in the correct datanode directories
[ https://issues.apache.org/jira/browse/HDFS-7648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-7648: --- Summary: Verify that HDFS blocks are in the correct datanode directories (was: Verify the datanode directory layout) > Verify that HDFS blocks are in the correct datanode directories > --- > > Key: HDFS-7648 > URL: https://issues.apache.org/jira/browse/HDFS-7648 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo Nicholas Sze >Assignee: Rakesh R > Attachments: HDFS-7648-3.patch, HDFS-7648-4.patch, HDFS-7648-5.patch, > HDFS-7648.patch, HDFS-7648.patch > > > HDFS-6482 changed datanode layout to use block ID to determine the directory > to store the block. We should have some mechanism to verify it. Either > DirectoryScanner or block report generation could do the check. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-9955) DataNode won't self-heal after some block dirs were manually misplaced
[ https://issues.apache.org/jira/browse/HDFS-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe resolved HDFS-9955. Resolution: Duplicate > DataNode won't self-heal after some block dirs were manually misplaced > -- > > Key: HDFS-9955 > URL: https://issues.apache.org/jira/browse/HDFS-9955 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 > Environment: CentOS 6, Cloudera 5.4.4 (patched Hadoop 2.6.0) >Reporter: David Watzke > Labels: data-integrity > > I accidentally ran this tool on top of a DataNode's datadirs (of a > datanode that was shut down at the time): > https://github.com/killerwhile/volume-balancer > The tool makes assumptions about block directory placement that are no longer > valid in Hadoop 2.6.0, and it was just moving block dirs around between different > datadirs to balance the disk usage. OK, it was not a good idea to run > it, but my concern is the way the datanode was (not) handling the resulting > state. I've seen these messages in the DN log (see below), which means the DN knew > about this but didn't do anything to fix it (self-heal by copying the other > replica) - which seems like a bug to me. If you need any additional info, > please just ask. 
> {noformat} > 2016-03-04 12:40:06,008 WARN > org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding > block BP-680964103-A.B.C.D-1375882473930:blk_-3159875140074863904_0 on volume > /data/18/cdfs/dn > 2016-03-04 12:40:06,009 WARN > org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding > block BP-680964103-A.B.C.D-1375882473930:blk_8369468090548520777_0 on volume > /data/18/cdfs/dn > 2016-03-04 12:40:06,011 WARN > org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding > block BP-680964103-A.B.C.D-1375882473930:blk_1226431637_0 on volume > /data/18/cdfs/dn > 2016-03-04 12:40:06,012 WARN > org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding > block BP-680964103-A.B.C.D-1375882473930:blk_1169332185_0 on volume > /data/18/cdfs/dn > 2016-03-04 12:40:06,825 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > opReadBlock BP-680964103-A.B.C.D-1375882473930:blk_1226781281_1099829669050 > received exception java.io.IOException: BlockId 1226781281 is not valid. > 2016-03-04 12:40:06,825 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > DatanodeRegistration(X.Y.Z.30, > datanodeUuid=9da950ca-87ae-44ee-9391-0bca669c796b, infoPort=50075, > ipcPort=50020, > storageInfo=lv=-56;cid=cluster12;nsid=1625487778;c=1438754073236):Got > exception while serving > BP-680964103-A.B.C.D-1375882473930:blk_1226781281_1099829669050 to > /X.Y.Z.30:48146 > java.io.IOException: BlockId 1226781281 is not valid. 
> at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:650) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:641) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getMetaDataInputStream(FsDatasetImpl.java:214) > at > org.apache.hadoop.hdfs.server.datanode.BlockSender.(BlockSender.java:282) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:529) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:243) > at java.lang.Thread.run(Thread.java:745) > 2016-03-04 12:40:06,826 ERROR > org.apache.hadoop.hdfs.server.datanode.DataNode: > prg04-002.xyz.tld:50010:DataXceiver error processing READ_BLOCK operation > src: /X.Y.Z.30:48146 dst: /X.Y.Z.30:50010 > java.io.IOException: BlockId 1226781281 is not valid. > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:650) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:641) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getMetaDataInputStream(FsDatasetImpl.java:214) > at > org.apache.hadoop.hdfs.server.datanode.BlockSender.(BlockSender.java:282) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:529) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71) > at >
[jira] [Commented] (HDFS-9951) Use string constants for XML tags in OfflineImageReconstructor
[ https://issues.apache.org/jira/browse/HDFS-9951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195795#comment-15195795 ] Colin Patrick McCabe commented on HDFS-9951: Thanks for working on this. Can you put the string constants into {{PBImageXmlWriter.java}}? {{OfflineImageReconstructor#Node}} is not a public class. > Use string constants for XML tags in OfflineImageReconstructor > -- > > Key: HDFS-9951 > URL: https://issues.apache.org/jira/browse/HDFS-9951 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Lin Yiqun >Assignee: Lin Yiqun >Priority: Minor > Attachments: HDFS-9551.001.patch, HDFS-9551.002.patch > > > In class {{OfflineImageReconstructor}}, many {{SectionProcessors}} are used to > process XML files and load the subtree of the XML into a Node structure. But > there are lots of places where a node removes a key by directly writing the tag string > in methods rather than defining it first. Like this: > {code} > Node expiration = directive.removeChild("expiration"); > {code} > We could improve this by defining the constants in Node and invoking them like this: > {code} > Node expiration=directive.removeChild(Node.CACHE_MANAGER_SECTION_EXPIRATION); > {code} > And it will make it easier to manage node key names in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
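The point of the refactoring above is that the XML tag name should exist in exactly one place, shared by the writer and the reconstructor. A minimal sketch of the idea — the constant name and the toy Node class below are illustrative, not the actual {{PBImageXmlWriter}} or {{OfflineImageReconstructor}} code:

```java
import java.util.HashMap;
import java.util.Map;

class XmlTagConstantsSketch {
    // Tag name defined once; in the real patch this would live in
    // PBImageXmlWriter so both writer and reconstructor share it.
    static final String CACHE_MANAGER_SECTION_EXPIRATION = "expiration";

    // Toy stand-in for OfflineImageReconstructor's Node structure.
    static class Node {
        final Map<String, String> children = new HashMap<>();
        String removeChild(String name) {
            return children.remove(name);
        }
    }

    public static void main(String[] args) {
        Node directive = new Node();
        // Writer side and reader side reference the same constant, so a
        // renamed tag only needs to change in one place.
        directive.children.put(CACHE_MANAGER_SECTION_EXPIRATION, "never");
        System.out.println(directive.removeChild(CACHE_MANAGER_SECTION_EXPIRATION));
    }
}
```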
[jira] [Commented] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195791#comment-15195791 ] Arpit Agarwal commented on HDFS-3702: - Thanks [~stack], [~eddyxu]. It would be great if we could avoid one-off {{createFile}} parameters. What do you think of per-target block placement policies as [proposed in this comment|https://issues.apache.org/jira/browse/HDFS-3702?focusedCommentId=13420775=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13420775] e.g. setting a custom placement policy for /hbase/.logs/. The implementation will be easier now that we have extended attributes. > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, > HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, > HDFS-3702.008.patch, HDFS-3702_Design.pdf > > > This is useful for Write-Ahead-Logs: these files are written for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
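The per-directory placement-policy alternative (e.g. attaching a policy to /hbase/.logs/) boils down to a longest-prefix lookup from path to policy name at create time. A toy model of that lookup — the xattr-backed storage, the map, and the policy names here are all assumptions for illustration:

```java
import java.util.Map;
import java.util.TreeMap;

class PerDirPolicySketch {
    // Directory prefix -> placement policy name. In the real proposal this
    // mapping would be persisted via extended attributes on the directory.
    static final TreeMap<String, String> policies = new TreeMap<>();

    // Longest matching prefix wins: TreeMap iterates in sorted order, so a
    // longer nested prefix is visited after (and overrides) a shorter one.
    static String policyFor(String path) {
        String best = "DEFAULT";
        for (Map.Entry<String, String> e : policies.entrySet()) {
            if (path.startsWith(e.getKey())) {
                best = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        policies.put("/hbase/.logs/", "NO_LOCAL_WRITE");
        System.out.println(policyFor("/hbase/.logs/wal-0001")); // policy applies
        System.out.println(policyFor("/user/data/file"));       // falls through
    }
}
```

The trade-off against the create-flag approach is visible here: the mapping lives in namespace metadata rather than being chosen per-call by the application.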
[jira] [Comment Edited] (HDFS-9668) Optimize the locking in FsDatasetImpl
[ https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195782#comment-15195782 ] Colin Patrick McCabe edited comment on HDFS-9668 at 3/15/16 5:48 PM: - Hi [~jingcheng...@intel.com], Thanks again for your comments. I agree that consistency is always a headache. However, we are already "inconsistent" in a bunch of cases. For example, {{FsDatasetSpi#getStoredBlock}} returns a Block structure with a genstamp and block ID. But since it drops the lock when it returns, the genstamp may change in between the call to {{getStoredBlock}} and the actual usage of that information. bq. But still it is hard to remove the locks in createRbw, etc where the long-time blocking occur. I think this is what we have to tackle in the future. For {{createRbw}}, it seems like we could: 1. add the entry to the volumeMap 2. drop the lock and attempt to create the block file on-disk 3. if the creation failed, take back the lock and remove the entry from the volumeMap Step #1 would ensure that if another thread attempted to create the same RBW replica, it would fail. bq. But "synchronized" doesn't guarantee fairness, is it fair to ask lock to support fairness? Currently, read operations have no special advantage over write operations. Using a reader/writer lock changes that. It's easy to come up with a workload where read requests come in often enough so that there is no time at all for write requests. This is especially true since we are doing filesystem I/O while holding the reader lock. We have observed Java Reader/Writer locks to starve writers in practice. That's why there is an option for the FSNamesystem lock to be fair. Hmm. I wonder if, as a first step, we could try moving all the filesystem I/O that we can outside the lock? That would provide a huge performance boost just by itself. And it would make it much easier to have a reader/writer lock later if required. 
was (Author: cmccabe): Hi [~jingcheng...@intel.com], Thanks again for your comments. I agree that consistency is always a headache. However, we are already "inconsistent" in a bunch of cases. For example, {{FsDatasetSpi#getStoredBlock}} returns a Block structure with a genstamp and block ID. But since it drops the lock when it returns, the genstamp may change in between the call to {{getStoredBlock}} and the actual usage of that information. bq. But still it is hard to remove the locks in createRbw, etc where the long-time blocking occur. I think this is what we have to tackle in the future. For {{createRbw}}, it seems like we could: 1. add the entry to the volumeMap 2. drop the lock and attempt to create the block file on-disk 3. if the creation failed, take back the lock and remove the entry from the volumeMap Step #1 would ensure that if another thread attempted to create the same RBW replica, it would fail. bq. But "synchronized" doesn't guarantee fairness, is it fair to ask lock to support fairness? Currently, read operations have no special advantage over write operations. Using a reader/writer lock changes that. It's easy, even trivial, to come up with a workload where read requests come in often enough so that there is no time at all for write requests. This is especially true since we are doing filesystem I/O while holding the reader lock. We have observed Java Reader/Writer locks to starve writers in practice. That's why there is an option for the FSNamesystem lock to be fair. Hmm. I wonder if, as a first step, we could try moving all the filesystem I/O that we can outside the lock? That would provide a huge performance boost just by itself. And it would make it much easier to have a reader/writer lock later if required. 
> Optimize the locking in FsDatasetImpl > - > > Key: HDFS-9668 > URL: https://issues.apache.org/jira/browse/HDFS-9668 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Jingcheng Du >Assignee: Jingcheng Du > Attachments: HDFS-9668-1.patch, HDFS-9668-2.patch, execution_time.png > > > During the HBase test on a tiered storage of HDFS (WAL is stored in > SSD/RAMDISK, and all other files are stored in HDD), we observe many > long-time BLOCKED threads on FsDatasetImpl in DataNode. The following is part > of the jstack result: > {noformat} > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48521 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread > t@93336 >java.lang.Thread.State: BLOCKED > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:) > - waiting to lock <18324c9> (a >
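The three-step restructuring of {{createRbw}} proposed in the comment above (reserve the replica in the volumeMap under the lock, do the slow on-disk creation outside it, roll the reservation back on failure) can be illustrated in miniature. This is only a hedged sketch of the pattern under simplified assumptions -- the class, field, and method names below are hypothetical, not the actual FsDatasetImpl code:

```java
import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Miniature sketch of the lock-narrowing pattern proposed for createRbw.
// All names here are hypothetical; this is not the real FsDatasetImpl code.
class RbwSketch {
    private final Object datasetLock = new Object();
    private final Map<Long, String> volumeMap = new HashMap<>();

    void createRbw(long blockId, File dir) throws IOException {
        // Step 1: add the entry to the volumeMap while holding the lock,
        // so another thread creating the same RBW replica fails fast.
        synchronized (datasetLock) {
            if (volumeMap.containsKey(blockId)) {
                throw new IOException("Replica " + blockId + " is already being created");
            }
            volumeMap.put(blockId, "RBW");
        }
        // Step 2: drop the lock and attempt the slow on-disk file creation.
        boolean created = false;
        try {
            created = new File(dir, "blk_" + blockId).createNewFile();
        } finally {
            // Step 3: if creation failed, take back the lock and remove the
            // reservation so a later attempt can retry.
            if (!created) {
                synchronized (datasetLock) {
                    volumeMap.remove(blockId);
                }
            }
        }
        if (!created) {
            throw new IOException("Could not create block file for " + blockId);
        }
    }
}
```

The key property is that the disk I/O in step 2 happens with no lock held, while the map reservation in step 1 still makes a concurrent duplicate creation fail immediately.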
[jira] [Commented] (HDFS-9668) Optimize the locking in FsDatasetImpl
[ https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195782#comment-15195782 ] Colin Patrick McCabe commented on HDFS-9668: Hi [~jingcheng...@intel.com], Thanks again for your comments. I agree that consistency is always a headache. However, we are already "inconsistent" in a bunch of cases. For example, {{FsDatasetSpi#getStoredBlock}} returns a Block structure with a genstamp and block ID. But since it drops the lock when it returns, the genstamp may change in between the call to {{getStoredBlock}} and the actual usage of that information. bq. But still it is hard to remove the locks in createRbw, etc where the long-time blocking occur. I think this is what we have to tackle in the future. For {{createRbw}}, it seems like we could: 1. add the entry to the volumeMap 2. drop the lock and attempt to create the block file on-disk 3. if the creation failed, take back the lock and remove the entry from the volumeMap Step #1 would ensure that if another thread attempted to create the same RBW replica, it would fail. bq. But "synchronized" doesn't guarantee fairness, is it fair to ask lock to support fairness? Currently, read operations have no special advantage over write operations. Using a reader/writer lock changes that. It's easy, even trivial, to come up with a workload where read requests come in often enough so that there is no time at all for write requests. This is especially true since we are doing filesystem I/O while holding the reader lock. We have observed Java Reader/Writer locks to starve writers in practice. That's why there is an option for the FSNamesystem lock to be fair. Hmm. I wonder if, as a first step, we could try moving all the filesystem I/O that we can outside the lock? That would provide a huge performance boost just by itself. And it would make it much easier to have a reader/writer lock later if required. 
> Optimize the locking in FsDatasetImpl > - > > Key: HDFS-9668 > URL: https://issues.apache.org/jira/browse/HDFS-9668 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Jingcheng Du >Assignee: Jingcheng Du > Attachments: HDFS-9668-1.patch, HDFS-9668-2.patch, execution_time.png > > > During the HBase test on a tiered storage of HDFS (WAL is stored in > SSD/RAMDISK, and all other files are stored in HDD), we observe many > long-time BLOCKED threads on FsDatasetImpl in DataNode. The following is part > of the jstack result: > {noformat} > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48521 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread > t@93336 >java.lang.Thread.State: BLOCKED > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:) > - waiting to lock <18324c9> (a > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) owned by > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48520 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" t@93335 > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.(BlockReceiver.java:183) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235) > at java.lang.Thread.run(Thread.java:745) >Locked ownable synchronizers: > - None > > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48520 [Receiving block > 
BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" - Thread > t@93335 >java.lang.Thread.State: RUNNABLE > at java.io.UnixFileSystem.createFileExclusively(Native Method) > at java.io.File.createNewFile(File.java:1012) > at > org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:66) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:271) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:286) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:1140) > - locked <18324c9> (a >
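The writer-starvation concern raised in the comments above maps directly onto Java's {{ReentrantReadWriteLock}}, whose constructor takes a fairness flag: a fair lock grants the lock roughly in arrival order, so a waiting writer is not starved by a continuous stream of readers. A minimal self-contained sketch (the class and method here are illustrative, not HDFS code):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// ReentrantReadWriteLock(true) constructs a *fair* lock: threads acquire it
// roughly in arrival order, so a waiting writer is not starved by a stream
// of readers -- the failure mode described in the discussion above.
class FairLockSketch {
    static String describe(boolean fair) {
        ReentrantReadWriteLock rw = new ReentrantReadWriteLock(fair);
        rw.readLock().lock();
        try {
            return (rw.isFair() ? "fair" : "non-fair")
                    + ", readers=" + rw.getReadLockCount();
        } finally {
            rw.readLock().unlock();
        }
    }
}
```

Fairness trades some throughput for this ordering guarantee, which is why it is an opt-in flag on the FSNamesystem lock rather than the default.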
[jira] [Commented] (HDFS-9579) Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level
[ https://issues.apache.org/jira/browse/HDFS-9579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195706#comment-15195706 ] Ming Ma commented on HDFS-9579: --- Thanks [~liuml07]! Any comments from others? > Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level > - > > Key: HDFS-9579 > URL: https://issues.apache.org/jira/browse/HDFS-9579 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ming Ma >Assignee: Ming Ma > Attachments: HDFS-9579-2.patch, HDFS-9579-3.patch, HDFS-9579-4.patch, > HDFS-9579-5.patch, HDFS-9579-6.patch, HDFS-9579-7.patch, HDFS-9579-8.patch, > HDFS-9579-9.patch, HDFS-9579.patch, MR job counters.png > > > For cross DC distcp or other applications, it becomes useful to have insight > as to the traffic volume for each network distance to distinguish cross-DC > traffic, local-DC-remote-rack, etc. > FileSystem's existing {{bytesRead}} metrics tracks all the bytes read. To > provide additional metrics for each network distance, we can add additional > metrics to FileSystem level and have {{DFSInputStream}} update the value > based on the network distance between client and the datanode. > {{DFSClient}} will resolve client machine's network location as part of its > initialization. It doesn't need to resolve datanode's network location for > each read as {{DatanodeInfo}} already has the info. > There are existing HDFS specific metrics such as {{ReadStatistics}} and > {{DFSHedgedReadMetrics}}. But these metrics are only accessible via > {{DFSClient}} or {{DFSInputStream}}. Not something that application framework > such as MR and Tez can get to. That is the benefit of storing these new > metrics in FileSystem.Statistics. > This jira only includes metrics generation by HDFS. The consumption of these > metrics at MR and Tez will be tracked by separated jiras. > We can add similar metrics for HDFS write scenario later if it is necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9895) Remove DataNode#conf because there is already a copy in the base class
[ https://issues.apache.org/jira/browse/HDFS-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195689#comment-15195689 ] Colin Patrick McCabe commented on HDFS-9895: I think this JIRA was misleadingly titled. The patch is about removing the reference to the {{Configuration}} object inside {{DataNode.java}}, since we have a reference to the exact same configuration object in the base class. It doesn't change which aspects of the configuration we cache. I don't think this affects the thread-safety of anything, or the reconfiguration logic. Just like before the patch, {{DataNode.java}} is still playing with a reference to a thread-safe (but mutable) Configuration. Reconfiguration still happens by means of the {{ReconfigurationThread}} invoking the {{reconfigureProperty}} method. > Remove DataNode#conf because there is already a copy in the base class > -- > > Key: HDFS-9895 > URL: https://issues.apache.org/jira/browse/HDFS-9895 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-9895.000.patch > > > Since DataNode inherits ReconfigurableBase with Configured as base class > where configuration is maintained, DataNode#conf should be removed for the > purpose of brevity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9895) Remove DataNode#conf because there is already a reference to it in the base class
[ https://issues.apache.org/jira/browse/HDFS-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-9895: --- Summary: Remove DataNode#conf because there is already a reference to it in the base class (was: Remove DataNode#conf because there is already a copy in the base class) > Remove DataNode#conf because there is already a reference to it in the base > class > - > > Key: HDFS-9895 > URL: https://issues.apache.org/jira/browse/HDFS-9895 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-9895.000.patch > > > Since DataNode inherits ReconfigurableBase with Configured as base class > where configuration is maintained, DataNode#conf should be removed for the > purpose of brevity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9931) Remove NameNode#conf because there is already a reference to it in the base class
[ https://issues.apache.org/jira/browse/HDFS-9931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-9931: --- Summary: Remove NameNode#conf because there is already a reference to it in the base class (was: Remove all cached configuration from NameNode) > Remove NameNode#conf because there is already a reference to it in the base > class > - > > Key: HDFS-9931 > URL: https://issues.apache.org/jira/browse/HDFS-9931 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > > Since NameNode inherits ReconfigurableBase with Configured as base class > where configuration is maintained, all cached configurations in NameNode > should be removed for brevity and consistency purpose. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-9895) Remove DataNode#conf because there is already a reference to it in the base class
[ https://issues.apache.org/jira/browse/HDFS-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195689#comment-15195689 ] Colin Patrick McCabe edited comment on HDFS-9895 at 3/15/16 5:13 PM: - I think this JIRA was misleadingly titled. The patch is about removing the reference to the {{Configuration}} object inside {{DataNode.java}}, since we have a reference to the exact same configuration object in the base class. It doesn't change which aspects of the configuration we cache. I don't think this affects the thread-safety of anything, or the reconfiguration logic. Just like before the patch, {{DataNode.java}} is still playing with a reference to a thread-safe (but mutable) Configuration. Reconfiguration still happens by means of the {{ReconfigurationThread}} invoking the {{reconfigureProperty}} method. There is no case where we "swap a configuration instance" -- the {{Configured}} base class doesn't support swapping in a new object anyway. was (Author: cmccabe): I think this JIRA was misleadingly titled. The patch is about removing the reference to the {{Configuration}} object inside {{DataNode.java}}, since we have a reference to the exact same configuration object in the base class. It doesn't change which aspects of the configuration we cache. I don't think this affects the thread-safety of anything, or the reconfiguration logic. Just like before the patch, {{DataNode.java}} is still playing with a reference to a thread-safe (but mutable) Configuration. Reconfiguration still happens by means of the {{ReconfigurationThread}} invoking the {{reconfigureProperty}} method. 
> Remove DataNode#conf because there is already a reference to it in the base > class > - > > Key: HDFS-9895 > URL: https://issues.apache.org/jira/browse/HDFS-9895 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-9895.000.patch > > > Since DataNode inherits ReconfigurableBase with Configured as base class > where configuration is maintained, DataNode#conf should be removed for the > purpose of brevity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
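The cleanup discussed above -- a subclass keeping its own {{conf}} field that merely duplicates the reference already held by the {{Configured}} base class -- can be shown with a stripped-down sketch. The class names below (MiniConfigured, MiniDataNode) and the use of a plain Properties object in place of Hadoop's Configuration are hypothetical simplifications:

```java
import java.util.Properties;

// Stripped-down illustration of the HDFS-9895 cleanup: the base class holds
// the single configuration reference, so the subclass's duplicate field is
// removed in favor of getConf(). Names and types here are hypothetical; a
// plain Properties stands in for Hadoop's Configuration.
class MiniConfigured {
    private final Properties conf;            // single source of truth
    MiniConfigured(Properties conf) { this.conf = conf; }
    Properties getConf() { return conf; }
}

class MiniDataNode extends MiniConfigured {
    // Before the patch there was a second field here:
    //   private Properties conf;             // duplicate reference, removed
    MiniDataNode(Properties conf) { super(conf); }

    String heartbeatInterval() {
        // After the patch: always read through the base class accessor.
        return getConf().getProperty("dfs.heartbeat.interval", "3");
    }
}
```

Because both field and accessor would point at the exact same mutable object, removing the duplicate changes neither thread-safety nor reconfiguration behavior; it only removes a redundant alias.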
[jira] [Updated] (HDFS-9895) Remove DataNode#conf because there is already a copy in the base class
[ https://issues.apache.org/jira/browse/HDFS-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-9895: --- Description: Since DataNode inherits ReconfigurableBase with Configured as base class where configuration is maintained, DataNode#conf should be removed for the purpose of brevity. (was: Since DataNode inherits ReconfigurableBase with Configured as base class where configuration is maintained, all cached configurations in DataNode should be removed for brevity and consistency purpose.) > Remove DataNode#conf because there is already a copy in the base class > -- > > Key: HDFS-9895 > URL: https://issues.apache.org/jira/browse/HDFS-9895 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-9895.000.patch > > > Since DataNode inherits ReconfigurableBase with Configured as base class > where configuration is maintained, DataNode#conf should be removed for the > purpose of brevity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9895) Remove DataNode#conf because there is already a copy in the base class
[ https://issues.apache.org/jira/browse/HDFS-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-9895: --- Summary: Remove DataNode#conf because there is already a copy in the base class (was: Remove all cached configuration from DataNode) > Remove DataNode#conf because there is already a copy in the base class > -- > > Key: HDFS-9895 > URL: https://issues.apache.org/jira/browse/HDFS-9895 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-9895.000.patch > > > Since DataNode inherits ReconfigurableBase with Configured as base class > where configuration is maintained, all cached configurations in DataNode > should be removed for brevity and consistency purpose. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9847) HDFS configuration without time unit name should accept friendly time units
[ https://issues.apache.org/jira/browse/HDFS-9847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195658#comment-15195658 ] Arpit Agarwal commented on HDFS-9847: - Thank you [~linyiqun]. I don't think we need getLongTimeSeconds and getLongTimeMillis either. Callers can just use {{getTimeDuration}}. I also suggest adding a getTimeDuration overload that accepts defaultValue as a String so defaults can be defined with units. e.g. {code} conf.getTimeDuration(DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY, DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_DEFAULT, TimeUnit.SECONDS) // seconds for backwards compatibility. {code} where {{DFS_HEARTBEAT_INTERVAL_DEFAULT = "3s"}}. > HDFS configuration without time unit name should accept friendly time units > --- > > Key: HDFS-9847 > URL: https://issues.apache.org/jira/browse/HDFS-9847 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 2.7.1 >Reporter: Lin Yiqun >Assignee: Lin Yiqun > Attachments: HDFS-9847.001.patch, HDFS-9847.002.patch, > HDFS-9847.003.patch, timeduration-w-y.patch > > > In HDFS-9821, it talks about letting existing keys use friendly > units e.g. 60s, 5m, 1d, 6w etc. But some configuration key names already > contain a time unit name, like {{dfs.blockreport.intervalMsec}}, so we can make > the other configurations, whose names carry no time unit, accept friendly > time units. The time unit {{seconds}} is frequently used in HDFS. We can > update those configurations first. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
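For readers unfamiliar with {{getTimeDuration}}: it parses values carrying a unit suffix ("3s", "5m", "1d") into a caller-chosen TimeUnit, and treats a bare number as already being in the requested unit. The sketch below is an illustrative re-implementation of that suffix handling only, not Hadoop's actual code:

```java
import java.util.concurrent.TimeUnit;

// Illustrative sketch of suffix-based duration parsing in the spirit of
// Configuration.getTimeDuration: "60s", "5m", "1d" etc. are converted to the
// requested unit; a bare number is assumed to already be in that unit.
// This is NOT Hadoop's implementation, just a simplified sketch.
class TimeDurationSketch {
    static long parse(String value, TimeUnit unit) {
        String v = value.trim().toLowerCase();
        TimeUnit valueUnit;
        int cut = v.length();
        if (v.endsWith("ms"))     { valueUnit = TimeUnit.MILLISECONDS; cut -= 2; }
        else if (v.endsWith("s")) { valueUnit = TimeUnit.SECONDS;      cut -= 1; }
        else if (v.endsWith("m")) { valueUnit = TimeUnit.MINUTES;      cut -= 1; }
        else if (v.endsWith("h")) { valueUnit = TimeUnit.HOURS;        cut -= 1; }
        else if (v.endsWith("d")) { valueUnit = TimeUnit.DAYS;         cut -= 1; }
        else { return Long.parseLong(v); }   // no suffix: caller's unit
        return unit.convert(Long.parseLong(v.substring(0, cut)), valueUnit);
    }
}
```

Note the "ms" check runs before the "s" and "m" checks, since "500ms" would otherwise match the shorter suffixes first.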
[jira] [Commented] (HDFS-9904) testCheckpointCancellationDuringUpload occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-9904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195578#comment-15195578 ] Hudson commented on HDFS-9904: -- FAILURE: Integrated in Hadoop-trunk-Commit #9464 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9464/]) HDFS-9904. testCheckpointCancellationDuringUpload occasionally fails. (kihwal: rev d4574017845cfa7521e703f80efd404afd09b8c4) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyCheckpoints.java > testCheckpointCancellationDuringUpload occasionally fails > -- > > Key: HDFS-9904 > URL: https://issues.apache.org/jira/browse/HDFS-9904 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.3 >Reporter: Kihwal Lee >Assignee: Lin Yiqun > Fix For: 2.7.3 > > Attachments: HDFS-9904.001.patch, HDFS-9904.002.patch > > > The failure was at the end of the test case where the txid of the standby > (former active) is checked. Since the checkpoint/uploading was canceled , it > is not supposed to have the new checkpoint. Looking at the test log, that was > still the case, but the standby then did checkpoint on its own and bumped up > the txid, right before the check was performed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-9904) testCheckpointCancellationDuringUpload occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-9904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee resolved HDFS-9904. -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.7.3 > testCheckpointCancellationDuringUpload occasionally fails > -- > > Key: HDFS-9904 > URL: https://issues.apache.org/jira/browse/HDFS-9904 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.3 >Reporter: Kihwal Lee >Assignee: Lin Yiqun > Fix For: 2.7.3 > > Attachments: HDFS-9904.001.patch, HDFS-9904.002.patch > > > The failure was at the end of the test case where the txid of the standby > (former active) is checked. Since the checkpoint/uploading was canceled , it > is not supposed to have the new checkpoint. Looking at the test log, that was > still the case, but the standby then did checkpoint on its own and bumped up > the txid, right before the check was performed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9904) testCheckpointCancellationDuringUpload occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-9904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195554#comment-15195554 ] Kihwal Lee commented on HDFS-9904: -- I've committed this to trunk through branch-2.7. Thanks for working on this Lin Yiqun. > testCheckpointCancellationDuringUpload occasionally fails > -- > > Key: HDFS-9904 > URL: https://issues.apache.org/jira/browse/HDFS-9904 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.3 >Reporter: Kihwal Lee >Assignee: Lin Yiqun > Fix For: 2.7.3 > > Attachments: HDFS-9904.001.patch, HDFS-9904.002.patch > > > The failure was at the end of the test case where the txid of the standby > (former active) is checked. Since the checkpoint/uploading was canceled , it > is not supposed to have the new checkpoint. Looking at the test log, that was > still the case, but the standby then did checkpoint on its own and bumped up > the txid, right before the check was performed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9904) testCheckpointCancellationDuringUpload occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-9904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-9904: - Assignee: Lin Yiqun > testCheckpointCancellationDuringUpload occasionally fails > -- > > Key: HDFS-9904 > URL: https://issues.apache.org/jira/browse/HDFS-9904 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.3 >Reporter: Kihwal Lee >Assignee: Lin Yiqun > Attachments: HDFS-9904.001.patch, HDFS-9904.002.patch > > > The failure was at the end of the test case where the txid of the standby > (former active) is checked. Since the checkpoint/uploading was canceled , it > is not supposed to have the new checkpoint. Looking at the test log, that was > still the case, but the standby then did checkpoint on its own and bumped up > the txid, right before the check was performed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9904) testCheckpointCancellationDuringUpload occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-9904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195531#comment-15195531 ] Kihwal Lee commented on HDFS-9904: -- +1 I've verified that the config is only set for the specific test case. > testCheckpointCancellationDuringUpload occasionally fails > -- > > Key: HDFS-9904 > URL: https://issues.apache.org/jira/browse/HDFS-9904 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.3 >Reporter: Kihwal Lee > Attachments: HDFS-9904.001.patch, HDFS-9904.002.patch > > > The failure was at the end of the test case where the txid of the standby > (former active) is checked. Since the checkpoint/uploading was canceled , it > is not supposed to have the new checkpoint. Looking at the test log, that was > still the case, but the standby then did checkpoint on its own and bumped up > the txid, right before the check was performed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9694) Make existing DFSClient#getFileChecksum() work for striped blocks
[ https://issues.apache.org/jira/browse/HDFS-9694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195522#comment-15195522 ] Hadoop QA commented on HDFS-9694: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | 
{color:green} javadoc {color} | {color:green} 2m 10s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 18s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 2m 44s {color} | {color:red} hadoop-hdfs-project-jdk1.8.0_74 with JDK v1.8.0_74 generated 1 new + 48 unchanged - 1 fixed = 49 total (was 49) {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 19s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 4m 3s {color} | {color:red} hadoop-hdfs-project-jdk1.7.0_95 with JDK v1.7.0_95 generated 1 new + 50 unchanged - 1 fixed = 51 total (was 51) {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | 
{color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 0s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 25s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 6s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 51s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 58m
[jira] [Commented] (HDFS-9847) HDFS configuration without time unit name should accept friendly time units
[ https://issues.apache.org/jira/browse/HDFS-9847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195406#comment-15195406 ] Hadoop QA commented on HDFS-9847: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 49s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 3s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 44s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 40s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 8s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc 
{color} | {color:green} 2m 19s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 15s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 57s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 57s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 57s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 44s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 44s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 16s {color} | {color:red} root: patch generated 5 new + 1014 unchanged - 1 fixed = 1019 total (was 1015) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 13s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 11s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 6m 27s {color} | {color:red} hadoop-common in the patch failed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 50s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 57m 3s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 13s {color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. {color} | |
[jira] [Commented] (HDFS-9945) Datanode command for evicting writers
[ https://issues.apache.org/jira/browse/HDFS-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195386#comment-15195386 ] Kihwal Lee commented on HDFS-9945: -- HDFS-2043 TestHFlush HDFS-9780 TestRollingFileSystemSinkWithSecureHdfs HDFS-9950 TestDecommissioningStatus HDFS-10169 TestEditLog HDFS-9767 TestFileAppend HDFS-6532 TestCrcCorruption I will work on some of these. The two checkstyle warnings are about the existing method length being over 150 lines. > Datanode command for evicting writers > - > > Key: HDFS-9945 > URL: https://issues.apache.org/jira/browse/HDFS-9945 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-9945.patch, HDFS-9945.v2.patch > > > It would be useful to have a command to evict writers from a datanode. > When a set of datanodes are being decommissioned, they can get blocked by > slow writers at the end. This was rare in the old days since mapred jobs > didn't last too long, but with many different types of apps running on > today's YARN clusters, we often see a very long tail in datanode > decommissioning. > I propose adding a new dfsadmin command, {{evictWriters}}. I initially > thought about having the namenode automatically tell datanodes on > decommissioning, but realized that having a command is more flexible. E.g. > users can choose not to do this at all, choose when to evict writers, or > whether to try multiple times for whatever reason. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6532) Intermittent test failure org.apache.hadoop.hdfs.TestCrcCorruption.testCorruptionDuringWrt
[ https://issues.apache.org/jira/browse/HDFS-6532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195362#comment-15195362 ] Kihwal Lee commented on HDFS-6532: -- Still happening. {noformat} testCorruptionDuringWrt(org.apache.hadoop.hdfs.TestCrcCorruption) Time elapsed: 50.284 sec <<< ERROR! java.lang.Exception: test timed out after 5 milliseconds at java.lang.Object.wait(Native Method) at org.apache.hadoop.hdfs.DataStreamer.waitForAckedSeqno(DataStreamer.java:764) at org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:689) at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:770) at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:747) at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101) at org.apache.hadoop.hdfs.TestCrcCorruption.testCorruptionDuringWrt(TestCrcCorruption.java:136) {noformat} > Intermittent test failure > org.apache.hadoop.hdfs.TestCrcCorruption.testCorruptionDuringWrt > -- > > Key: HDFS-6532 > URL: https://issues.apache.org/jira/browse/HDFS-6532 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, hdfs-client >Affects Versions: 2.4.0 >Reporter: Yongjun Zhang > > Per https://builds.apache.org/job/Hadoop-Hdfs-trunk/1774/testReport, we had > the following failure. Local rerun is successful > {code} > Regression > org.apache.hadoop.hdfs.TestCrcCorruption.testCorruptionDuringWrt > Failing for the past 1 build (Since Failed#1774 ) > Took 50 sec. 
> Error Message > test timed out after 5 milliseconds > Stacktrace > java.lang.Exception: test timed out after 5 milliseconds > at java.lang.Object.wait(Native Method) > at > org.apache.hadoop.hdfs.DFSOutputStream.waitForAckedSeqno(DFSOutputStream.java:2024) > at > org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:2008) > at > org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2107) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:70) > at > org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:98) > at > org.apache.hadoop.hdfs.TestCrcCorruption.testCorruptionDuringWrt(TestCrcCorruption.java:133) > {code} > See relevant exceptions in log > {code} > 2014-06-14 11:56:15,283 WARN datanode.DataNode > (BlockReceiver.java:verifyChunks(404)) - Checksum error in block > BP-1675558312-67.195.138.30-1402746971712:blk_1073741825_1001 from > /127.0.0.1:41708 > org.apache.hadoop.fs.ChecksumException: Checksum error: > DFSClient_NONMAPREDUCE_-1139495951_8 at 64512 exp: 1379611785 got: -12163112 > at > org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:353) > at > org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:284) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.verifyChunks(BlockReceiver.java:402) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:537) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:734) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:741) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:234) > at java.lang.Thread.run(Thread.java:662) > 2014-06-14 11:56:15,285 WARN 
datanode.DataNode > (BlockReceiver.java:run(1207)) - IOException in BlockReceiver.run(): > java.io.IOException: Shutting down writer and responder due to a checksum > error in received data. The error response has been sent upstream. > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.sendAckUpstreamUnprotected(BlockReceiver.java:1352) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.sendAckUpstream(BlockReceiver.java:1278) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:1199) > at java.lang.Thread.run(Thread.java:662) > ... > {code}
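The ChecksumException in the quoted log comes from per-chunk verification: HDFS checksums fixed-size chunks of a block (512 bytes by default) rather than the block as a whole, which is why the error reports a byte offset. Below is a minimal toy model of that scheme using plain CRC32 — an illustrative sketch, not Hadoop's actual `DataChecksum` API.

```java
import java.util.zip.CRC32;

public class ChunkedCrc {
    static final int CHUNK = 512; // bytes per checksum chunk (HDFS's default)

    // Compute one CRC32 per fixed-size chunk of the payload.
    static long[] chunkedSums(byte[] data) {
        int n = (data.length + CHUNK - 1) / CHUNK;
        long[] sums = new long[n];
        for (int i = 0; i < n; i++) {
            CRC32 crc = new CRC32();
            crc.update(data, i * CHUNK, Math.min(CHUNK, data.length - i * CHUNK));
            sums[i] = crc.getValue();
        }
        return sums;
    }

    // Return the byte offset of the first corrupt chunk, or -1 if all verify.
    static int verify(byte[] data, long[] expected) {
        long[] actual = chunkedSums(data);
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] != expected[i]) {
                return i * CHUNK;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] payload = new byte[2000];
        long[] sums = chunkedSums(payload);
        payload[700] ^= 0xFF; // corrupt one byte inside the second chunk
        System.out.println(verify(payload, sums)); // prints 512
    }
}
```

Per-chunk checksums let the receiver pinpoint corruption (the log's "at 64512 exp: ... got: ..." form) instead of discarding the whole block.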
[jira] [Created] (HDFS-10169) TestEditLog.testBatchedSyncWithClosedLogs sometimes fails.
Kihwal Lee created HDFS-10169: - Summary: TestEditLog.testBatchedSyncWithClosedLogs sometimes fails. Key: HDFS-10169 URL: https://issues.apache.org/jira/browse/HDFS-10169 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee This failure has been seen in multiple precommit builds recently. {noformat} testBatchedSyncWithClosedLogs[1](org.apache.hadoop.hdfs.server.namenode.TestEditLog) Time elapsed: 0.377 sec <<< FAILURE! java.lang.AssertionError: logging edit without syncing should do not affect txid expected:<1> but was:<2> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.apache.hadoop.hdfs.server.namenode.TestEditLog.testBatchedSyncWithClosedLogs(TestEditLog.java:594) {noformat}
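The failed assertion checks a batched-sync invariant: appending an edit assigns it a transaction id, but the durably synced position must only advance when a sync batches the pending edits out. The toy write-ahead log below sketches that invariant — an illustrative model only, not Hadoop's FSEditLog, and all names in it are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class ToyEditLog {
    private long nextTxId = 1;   // id handed to the next edit appended
    private long syncedTxId = 0; // highest id known to be durable
    private final List<String> pending = new ArrayList<>();
    private final List<String> durable = new ArrayList<>();

    // Buffer an edit and assign it a txid; does NOT advance syncedTxId.
    long logEdit(String op) {
        pending.add(op);
        return nextTxId++;
    }

    // Batched sync: flush everything pending, then advance syncedTxId.
    void logSync() {
        durable.addAll(pending);
        pending.clear();
        syncedTxId = nextTxId - 1;
    }

    long getSyncedTxId() {
        return syncedTxId;
    }

    public static void main(String[] args) {
        ToyEditLog log = new ToyEditLog();
        log.logEdit("mkdir /a");
        log.logEdit("mkdir /b");
        System.out.println(log.getSyncedTxId()); // prints 0: nothing synced yet
        log.logSync();
        System.out.println(log.getSyncedTxId()); // prints 2: both edits in one sync
    }
}
```

The test failure above ("expected:<1> but was:<2>") indicates the observed txid moved when only a log append, not a sync, had happened.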
[jira] [Commented] (HDFS-2043) TestHFlush failing intermittently
[ https://issues.apache.org/jira/browse/HDFS-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195338#comment-15195338 ] Kihwal Lee commented on HDFS-2043: -- This seems to be an actual race in the code. > TestHFlush failing intermittently > - > > Key: HDFS-2043 > URL: https://issues.apache.org/jira/browse/HDFS-2043 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Aaron T. Myers > > I can't reproduce this failure reliably, but it seems like TestHFlush has > been failing intermittently, with the frequency increasing of late. > Note the following two pre-commit test runs from different JIRAs where > TestHFlush seems to have failed spuriously: > https://builds.apache.org/job/PreCommit-HDFS-Build/734//testReport/ > https://builds.apache.org/job/PreCommit-HDFS-Build/680//testReport/
[jira] [Commented] (HDFS-2043) TestHFlush failing intermittently
[ https://issues.apache.org/jira/browse/HDFS-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195335#comment-15195335 ] Kihwal Lee commented on HDFS-2043: -- This is how it fails nowadays in precommit. {noformat} testHFlushInterrupted(org.apache.hadoop.hdfs.TestHFlush) Time elapsed: 2.259 sec <<< ERROR! java.nio.channels.ClosedByInterruptException: null at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:501) at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63) at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142) at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159) at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117) at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) at java.io.DataOutputStream.flush(DataOutputStream.java:123) at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:653) {noformat} > TestHFlush failing intermittently > - > > Key: HDFS-2043 > URL: https://issues.apache.org/jira/browse/HDFS-2043 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Aaron T. Myers > > I can't reproduce this failure reliably, but it seems like TestHFlush has > been failing intermittently, with the frequency increasing of late. > Note the following two pre-commit test runs from different JIRAs where > TestHFlush seems to have failed spuriously: > https://builds.apache.org/job/PreCommit-HDFS-Build/734//testReport/ > https://builds.apache.org/job/PreCommit-HDFS-Build/680//testReport/
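The ClosedByInterruptException in the stack trace above is standard NIO behavior: when a thread blocked on an interruptible channel is interrupted, the channel is closed and the blocked thread receives ClosedByInterruptException. A self-contained demonstration (using a Pipe rather than Hadoop's socket streams, so it runs anywhere):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

public class InterruptDemo {
    public static void main(String[] args) throws Exception {
        Pipe pipe = Pipe.open();
        final Throwable[] caught = new Throwable[1];
        Thread reader = new Thread(() -> {
            try {
                // Blocks: nothing has been written to the pipe yet.
                pipe.source().read(ByteBuffer.allocate(16));
            } catch (IOException e) {
                caught[0] = e;
            }
        });
        reader.start();
        Thread.sleep(200);  // give the reader time to block in read()
        reader.interrupt(); // interrupt while blocked on channel I/O
        reader.join();
        // The interruptible channel closes itself and the blocked thread
        // gets java.nio.channels.ClosedByInterruptException.
        System.out.println(caught[0].getClass().getSimpleName());
    }
}
```

In TestHFlush's case the interrupt lands while DataStreamer is blocked in a socket write, so the same mechanism fires inside the streamer thread.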
[jira] [Commented] (HDFS-9961) Ozone: Add buckets commands to CLI
[ https://issues.apache.org/jira/browse/HDFS-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195302#comment-15195302 ] Hadoop QA commented on HDFS-9961: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 32s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 44s {color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 38s {color} | {color:green} HDFS-7240 passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 15s {color} | {color:green} HDFS-7240 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s {color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 31s {color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 24s {color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 8s {color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 45s {color} | {color:green} HDFS-7240 passed with JDK v1.8.0_74 {color} | | 
{color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 38s {color} | {color:green} HDFS-7240 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 36s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 11s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 11s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 29s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 36s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 25s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 114m 41s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 92m 42s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 256m 29s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_74 Failed junit tests | hadoop.hdfs.server.namenode.ha.TestEditLogTailer | | | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl | | | hadoop.hdfs.security.TestDelegationTokenForProxyUser | | | hadoop.hdfs.TestLocalDFS | | | hadoop.hdfs.TestFileAppend | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | | |
[jira] [Updated] (HDFS-9928) Make HDFS commands guide up to date
[ https://issues.apache.org/jira/browse/HDFS-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-9928: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.9.0 Target Version/s: 2.9.0 Status: Resolved (was: Patch Available) Committed to trunk and branch-2. Thanks, [~jojochuang]! > Make HDFS commands guide up to date > --- > > Key: HDFS-9928 > URL: https://issues.apache.org/jira/browse/HDFS-9928 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 2.9.0 >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Labels: documentation, supportability > Fix For: 2.9.0 > > Attachments: HDFS-9928-branch-2.002.patch, HDFS-9928-trunk.003.patch, > HDFS-9928.001.patch > > > A few HDFS subcommands and options are missing in the documentation. > # envvars: display computed Hadoop environment variables > I also noticed (in HDFS-9927) that a few OIV options are missing, and I'll be > looking for other missing options as well. > Filing this JIRA to fix them all.
[jira] [Commented] (HDFS-9928) Make HDFS commands guide up to date
[ https://issues.apache.org/jira/browse/HDFS-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195148#comment-15195148 ] Masatake Iwasaki commented on HDFS-9928: +1 > Make HDFS commands guide up to date > --- > > Key: HDFS-9928 > URL: https://issues.apache.org/jira/browse/HDFS-9928 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 2.9.0 >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Labels: documentation, supportability > Attachments: HDFS-9928-branch-2.002.patch, HDFS-9928-trunk.003.patch, > HDFS-9928.001.patch > > > A few HDFS subcommands and options are missing in the documentation. > # envvars: display computed Hadoop environment variables > I also noticed (in HDFS-9927) that a few OIV options are missing, and I'll be > looking for other missing options as well. > Filing this JIRA to fix them all.
[jira] [Commented] (HDFS-9960) OzoneHandler : Add localstorage support for keys
[ https://issues.apache.org/jira/browse/HDFS-9960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195096#comment-15195096 ] Hadoop QA commented on HDFS-9960: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 50s {color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 9s {color} | {color:green} HDFS-7240 passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s {color} | {color:green} HDFS-7240 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s {color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 17s {color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 35s {color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s {color} | {color:green} HDFS-7240 passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 15s {color} | {color:green} HDFS-7240 passed with JDK v1.7.0_95 {color} 
| | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 2s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 4s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 23s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 2 unchanged - 0 fixed = 3 total (was 2) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 4s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 57s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 10 new + 0 unchanged - 0 fixed = 10 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 10s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 58s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 30s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_74. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 122m 43s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 53s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 251m 24s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs-project/hadoop-hdfs | | | Found reliance on default encoding in org.apache.hadoop.ozone.web.client.OzoneBucket.getKey(String):in org.apache.hadoop.ozone.web.client.OzoneBucket.getKey(String): java.io.ByteArrayOutputStream.toString() At OzoneBucket.java:[line 343] | | | Found reliance on default encoding in org.apache.hadoop.ozone.web.client.OzoneBucket.putKey(String, String):in org.apache.hadoop.ozone.web.client.OzoneBucket.putKey(String, String):
[jira] [Commented] (HDFS-9951) Use string constants for XML tags in OfflineImageReconstructor
[ https://issues.apache.org/jira/browse/HDFS-9951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195069#comment-15195069 ] Hadoop QA commented on HDFS-9951: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 41s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 56s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc 
{color} | {color:green} 1m 47s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 20s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 14 new + 21 unchanged - 3 fixed = 35 total (was 24) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 57m 8s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_74. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 37s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 139m 3s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_74 Failed junit tests | hadoop.hdfs.TestHFlush | | | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency | | JDK v1.8.0_74 Timed out junit tests | org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer | | JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.TestHFlush | | | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints | | JDK v1.7.0_95 Timed out junit tests |
[jira] [Updated] (HDFS-9694) Make existing DFSClient#getFileChecksum() work for striped blocks
[ https://issues.apache.org/jira/browse/HDFS-9694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HDFS-9694: Attachment: HDFS-9694-v6.patch Thanks [~umamaheswararao] for the review and nice suggestions. The renaming makes a lot of sense given the follow-up work on a truly striped checksum approach for striped files and blocks. I incorporated your suggestions with a few adaptations, resulting in this updated patch. Change summary: * Javadoc update for {{getFileChecksum}}, done; * Renamed: StripedFileChecksumComputer => StripedFileNonStripedChecksumComputer; StripedBlockChecksumComputer => NonStripedBlockGroupChecksumComputer. Please let me know whether this works for you, thanks. > Make existing DFSClient#getFileChecksum() work for striped blocks > - > > Key: HDFS-9694 > URL: https://issues.apache.org/jira/browse/HDFS-9694 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Kai Zheng > Attachments: HDFS-9694-v1.patch, HDFS-9694-v2.patch, > HDFS-9694-v3.patch, HDFS-9694-v4.patch, HDFS-9694-v5.patch, HDFS-9694-v6.patch > > > This is a sub-task of HDFS-8430 and will get the existing API > {{FileSystem#getFileChecksum(path)}} work for striped files. It will also > refactor existing codes and layout basic work for subsequent tasks like > support of the new API proposed there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
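The idea behind making the existing getFileChecksum() API work for striped files is to compose one file-level digest from per-block-group digests, in the same MD5-of-MD5s spirit the replicated path uses. The sketch below is illustrative only: `composeFileChecksum` is a hypothetical helper, not an actual method of the checksum computer classes named in the patch.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative sketch: fold per-block-group MD5 digests into a single
// file-level checksum, mimicking the MD5-of-MD5s composition style.
public class FileChecksumSketch {
    // Combine per-block-group MD5 digests, in block order, into one digest.
    static byte[] composeFileChecksum(byte[][] blockGroupMd5s)
            throws NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        for (byte[] d : blockGroupMd5s) {
            md5.update(d);      // feed each block-group digest in order
        }
        return md5.digest();    // 16-byte file-level MD5-of-MD5s
    }

    public static void main(String[] args) throws Exception {
        byte[][] groups = {
            MessageDigest.getInstance("MD5").digest("group0".getBytes()),
            MessageDigest.getInstance("MD5").digest("group1".getBytes())
        };
        System.out.println(composeFileChecksum(groups).length); // prints 16
    }
}
```

Because the composition only consumes per-group digests, the client can gather each block group's checksum independently before folding them together.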
[jira] [Commented] (HDFS-9928) Make HDFS commands guide up to date
[ https://issues.apache.org/jira/browse/HDFS-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195015#comment-15195015 ] Hadoop QA commented on HDFS-9928: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 22s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 10m 21s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12793482/HDFS-9928-trunk.003.patch | | JIRA Issue | HDFS-9928 | | Optional Tests | asflicense mvnsite | | uname | Linux 74037850fcfc 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / eba66a6 | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/14823/artifact/patchprocess/whitespace-eol.txt | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/14823/console | | Powered by | Apache Yetus 0.2.0 http://yetus.apache.org | This message was automatically generated. > Make HDFS commands guide up to date > --- > > Key: HDFS-9928 > URL: https://issues.apache.org/jira/browse/HDFS-9928 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 2.9.0 >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Labels: documentation, supportability > Attachments: HDFS-9928-branch-2.002.patch, HDFS-9928-trunk.003.patch, > HDFS-9928.001.patch > > > A few HDFS subcommands and options are missing in the documentation. > # envvars: display computed Hadoop environment variables > I also noticed (in HDFS-9927) that a few OIV options are missing, and I'll be > looking for other missing options as well. > Filling this JIRA to fix them all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Deleted] (HDFS-10167) CLONE - Erasure Coding: when recovering lost blocks, logs can be too verbose and hurt performance
[ https://issues.apache.org/jira/browse/HDFS-10167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B deleted HDFS-10167: - > CLONE - Erasure Coding: when recovering lost blocks, logs can be too verbose > and hurt performance > - > > Key: HDFS-10167 > URL: https://issues.apache.org/jira/browse/HDFS-10167 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: dragon >Assignee: Rui Li > > When we test reading data with datanodes killed, > {{DFSInputStream::getBestNodeDNAddrPair}} becomes a hot spot method and > effectively blocks the client JVM. This log seems too verbose: > {code} > if (chosenNode == null) { > DFSClient.LOG.warn("No live nodes contain block " + block.getBlock() + > " after checking nodes = " + Arrays.toString(nodes) + > ", ignoredNodes = " + ignoredNodes); > return null; > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
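One common way to keep a warning like the one quoted above from dominating a hot path is to rate-limit it, so repeated occurrences inside a window are suppressed. The sketch below is an assumption-laden illustration, not the actual HDFS fix: `ThrottledWarn` is a hypothetical helper, and a real client would pass `System.currentTimeMillis()` rather than explicit timestamps.

```java
// Illustrative sketch: emit a hot-path warning at most once per interval.
public class ThrottledWarn {
    private final long intervalMillis;
    private long lastLogged;
    private boolean loggedOnce = false;

    ThrottledWarn(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    // Returns true when the caller should actually emit the warning.
    synchronized boolean shouldLog(long nowMillis) {
        if (!loggedOnce || nowMillis - lastLogged >= intervalMillis) {
            loggedOnce = true;
            lastLogged = nowMillis;
            return true;
        }
        return false;   // suppressed: still inside the throttle window
    }

    public static void main(String[] args) {
        ThrottledWarn t = new ThrottledWarn(1000);
        System.out.println(t.shouldLog(0));    // true: first occurrence logs
        System.out.println(t.shouldLog(500));  // false: inside the window
        System.out.println(t.shouldLog(1500)); // true: window has elapsed
    }
}
```

The expensive part of the quoted log line is building the `Arrays.toString(nodes)` message, so the guard must run before the string concatenation, not inside the logger.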
[jira] [Deleted] (HDFS-10130) CLONE - Erasure Coding: handle missing internal block locations in DFSStripedInputStream
[ https://issues.apache.org/jira/browse/HDFS-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B deleted HDFS-10130: - > CLONE - Erasure Coding: handle missing internal block locations in > DFSStripedInputStream > > > Key: HDFS-10130 > URL: https://issues.apache.org/jira/browse/HDFS-10130 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: dragon >Assignee: Jing Zhao > > Currently DFSStripedInputStream assumes we always have complete internal > block location information, i.e., we can always get all the DataNodes for a > striped block group. In a lot of scenarios the client cannot get complete > block location info, e.g., some internal blocks are missing and the NameNode > has not finished the recovery yet. We should add functionality to handle > missing block locations in DFSStripedInputStream. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
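The core invariant behind tolerating missing internal block locations is that an RS(6,3) block group stays readable as long as any 6 of its 9 internal blocks are reachable. The sketch below is a hypothetical helper illustrating that availability check, not actual DFSStripedInputStream code.

```java
// Illustrative sketch: with an RS(6,3) layout, a block group can be
// read (via decoding if needed) when at least 6 of its 9 internal
// block locations are known.
public class StripedAvailability {
    static final int DATA_UNITS = 6;
    static final int PARITY_UNITS = 3;

    // locations[i] == null models a missing internal block location.
    static boolean canReconstruct(String[] locations) {
        int available = 0;
        for (String loc : locations) {
            if (loc != null) available++;
        }
        return available >= DATA_UNITS;  // any 6 of 9 suffice for RS(6,3)
    }

    public static void main(String[] args) {
        String[] locs = new String[DATA_UNITS + PARITY_UNITS];
        for (int i = 0; i < locs.length; i++) locs[i] = "dn" + i;
        locs[2] = null; locs[7] = null; locs[8] = null; // 3 missing: still ok
        System.out.println(canReconstruct(locs));        // prints true
        locs[0] = null;                                  // 4 missing: too many
        System.out.println(canReconstruct(locs));        // prints false
    }
}
```

In the real read path the input stream would additionally have to choose *which* of the surviving blocks to fetch and trigger erasure decoding when a missing unit is a data block; this sketch only captures the feasibility test.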
[jira] [Deleted] (HDFS-10165) CLONE - Erasure coding: update EC command "-s" flag to "-p" when specifying policy
[ https://issues.apache.org/jira/browse/HDFS-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B deleted HDFS-10165: - > CLONE - Erasure coding: update EC command "-s" flag to "-p" when specifying > policy > -- > > Key: HDFS-10165 > URL: https://issues.apache.org/jira/browse/HDFS-10165 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: dragon >Assignee: Zhe Zhang > > HDFS-8833 missed this update. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Deleted] (HDFS-10160) CLONE - Erasure coding: fix 2 failed tests of DFSStripedOutputStream
[ https://issues.apache.org/jira/browse/HDFS-10160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B deleted HDFS-10160: - > CLONE - Erasure coding: fix 2 failed tests of DFSStripedOutputStream > > > Key: HDFS-10160 > URL: https://issues.apache.org/jira/browse/HDFS-10160 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: dragon >Assignee: Walter Su >Priority: Minor > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Deleted] (HDFS-10139) CLONE - Erasure Coding: the number of chunks in packet is not updated when writing parity data
[ https://issues.apache.org/jira/browse/HDFS-10139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B deleted HDFS-10139: - > CLONE - Erasure Coding: the number of chunks in packet is not updated when > writing parity data > -- > > Key: HDFS-10139 > URL: https://issues.apache.org/jira/browse/HDFS-10139 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: dragon >Assignee: Li Bo > > The member {{numChunks}} in {{DFSPacket}} is always zero if this packet > contains parity data. The calling of {{getNumChunks}} may cause potential > errors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
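The bug described above is a counter that is only bumped on the data-write path, so parity packets report zero chunks. The toy class below (not the real DFSPacket, whose internals differ) shows the invariant the fix needs: every written chunk, data or parity, increments the counter.

```java
// Illustrative toy, not the real DFSPacket: the chunk counter must be
// bumped for every chunk written, whether it carries data or parity --
// otherwise getNumChunks() reports 0 for parity packets.
public class PacketSketch {
    private int numChunks = 0;

    void writeChunk(byte[] chunk, boolean isParity) {
        // ... buffer the chunk bytes here ...
        numChunks++;   // count parity chunks too, not only data chunks
    }

    int getNumChunks() {
        return numChunks;
    }

    public static void main(String[] args) {
        PacketSketch p = new PacketSketch();
        p.writeChunk(new byte[512], false);   // data chunk
        p.writeChunk(new byte[512], true);    // parity chunk
        System.out.println(p.getNumChunks()); // prints 2
    }
}
```

With the counter maintained on both paths, any downstream caller of `getNumChunks()` sees the true chunk count regardless of whether the packet holds data or parity.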
[jira] [Deleted] (HDFS-10147) CLONE - Erasure Coding: add test for namenode process over replicated striped block
[ https://issues.apache.org/jira/browse/HDFS-10147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B deleted HDFS-10147: - > CLONE - Erasure Coding: add test for namenode process over replicated striped > block > --- > > Key: HDFS-10147 > URL: https://issues.apache.org/jira/browse/HDFS-10147 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: dragon >Assignee: Takuya Fukudome > -- This message was sent by Atlassian JIRA (v6.3.4#6332)