[jira] [Updated] (HDDS-1766) ContainerStateMachine is unable to increment lastAppliedTermIndex
     [ https://issues.apache.org/jira/browse/HDDS-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mukul Kumar Singh updated HDDS-1766:
------------------------------------
    Status: Patch Available  (was: Open)

> ContainerStateMachine is unable to increment lastAppliedTermIndex
> -----------------------------------------------------------------
>
>                 Key: HDDS-1766
>                 URL: https://issues.apache.org/jira/browse/HDDS-1766
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>            Reporter: Lokesh Jain
>            Assignee: Mukul Kumar Singh
>            Priority: Blocker
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> ContainerStateMachine#updateLastApplied currently updates the
> lastAppliedTermIndex using applyTransactionCompletionMap. Null entries in
> the applyTransactionCompletionMap prevent the lastAppliedIndex from being
> incremented.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
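The failure mode can be illustrated with a minimal sketch (class and method names here are hypothetical, not Ozone's actual ContainerStateMachine API): the last-applied index may only advance across a contiguous run of completed log indexes, so a missing (null) completion entry stalls it and every later index behind it.

```java
import java.util.concurrent.ConcurrentSkipListMap;

// Minimal sketch of contiguous last-applied-index tracking (hypothetical
// names; not the actual Ozone code). The index advances only while
// consecutive log indexes have completed; a gap stalls all later ones,
// which is the symptom this issue describes.
public class AppliedIndexTracker {
    private final ConcurrentSkipListMap<Long, Boolean> completed =
        new ConcurrentSkipListMap<>();
    private long lastApplied = 0;

    public synchronized void markCompleted(long index) {
        completed.put(index, Boolean.TRUE);
        // Advance only across the contiguous completed prefix.
        while (completed.remove(lastApplied + 1) != null) {
            lastApplied++;
        }
    }

    public synchronized long getLastApplied() {
        return lastApplied;
    }
}
```

With this shape, an out-of-order completion at index 3 leaves the tracker at 1 until index 2 also completes, mirroring how a stray null entry would pin lastAppliedTermIndex.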
[jira] [Updated] (HDDS-1782) Add an option to MiniOzoneChaosCluster to read files multiple times.
     [ https://issues.apache.org/jira/browse/HDDS-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mukul Kumar Singh updated HDDS-1782:
------------------------------------
    Status: Patch Available  (was: Open)

> Add an option to MiniOzoneChaosCluster to read files multiple times.
> ---------------------------------------------------------------------
>
>                 Key: HDDS-1782
>                 URL: https://issues.apache.org/jira/browse/HDDS-1782
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.4.0
>            Reporter: Mukul Kumar Singh
>            Assignee: Mukul Kumar Singh
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Right now MiniOzoneChaosCluster writes a file, reads it, and deletes it
> immediately. This jira proposes to add an option to read the file multiple
> times in MiniOzoneChaosCluster.
[jira] [Created] (HDDS-1783) Latency metric for applyTransaction in ContainerStateMachine
Supratim Deka created HDDS-1783:
-----------------------------------

             Summary: Latency metric for applyTransaction in ContainerStateMachine
                 Key: HDDS-1783
                 URL: https://issues.apache.org/jira/browse/HDDS-1783
             Project: Hadoop Distributed Data Store
          Issue Type: Improvement
          Components: Ozone Datanode
            Reporter: Supratim Deka

applyTransaction is invoked from the Ratis pipeline, and the ContainerStateMachine uses an async executor to complete the task. We require a latency metric to track the performance of log apply operations in the state machine. This will measure the end-to-end latency of apply, which includes the queueing delay in the executor queues. Combined with the latency measurement in HddsDispatcher, this will indicate whether the executors are overloaded.
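The key point in the proposal is where the clock starts. A minimal sketch (hypothetical names; Ozone's actual implementation would use Hadoop's metrics2 classes such as MutableRate rather than raw counters) of timing an async apply so that executor queueing delay is included:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Sketch of an end-to-end latency metric around an async apply call
// (hypothetical names). The clock starts BEFORE the task is enqueued,
// so the recorded time covers executor queueing delay plus the apply
// itself -- the measurement HDDS-1783 asks for.
public class ApplyLatencyMetric {
    private final AtomicLong totalNanos = new AtomicLong();
    private final AtomicLong samples = new AtomicLong();

    public <T> CompletableFuture<T> timeAsync(Supplier<T> apply,
                                              ExecutorService executor) {
        final long start = System.nanoTime();   // before enqueue: captures queueing delay
        return CompletableFuture.supplyAsync(apply, executor)
            .whenComplete((result, err) -> {
                totalNanos.addAndGet(System.nanoTime() - start);
                samples.incrementAndGet();
            });
    }

    public long averageLatencyNanos() {
        long n = samples.get();
        return n == 0 ? 0 : totalNanos.get() / n;
    }

    public long sampleCount() { return samples.get(); }

    // Small self-contained demonstration of the metric in use.
    public static ApplyLatencyMetric demo() {
        ApplyLatencyMetric m = new ApplyLatencyMetric();
        ExecutorService ex = Executors.newSingleThreadExecutor();
        m.timeAsync(() -> 42, ex).join();
        m.timeAsync(() -> 7, ex).join();
        ex.shutdown();
        return m;
    }
}
```

If this per-apply latency grows while the HddsDispatcher-side latency stays flat, the difference is queueing delay, i.e. overloaded executors.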
[jira] [Commented] (HDFS-14593) RBF: Implement deletion feature for expired records in State Store
     [ https://issues.apache.org/jira/browse/HDFS-14593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882632#comment-16882632 ]

Hadoop QA commented on HDFS-14593:
----------------------------------

(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 38s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 19m 37s | trunk passed |
| +1 | compile | 0m 36s | trunk passed |
| +1 | checkstyle | 0m 22s | trunk passed |
| +1 | mvnsite | 0m 37s | trunk passed |
| +1 | shadedclient | 12m 41s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 4s | trunk passed |
| +1 | javadoc | 0m 35s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 33s | the patch passed |
| +1 | compile | 0m 32s | the patch passed |
| +1 | javac | 0m 32s | the patch passed |
| +1 | checkstyle | 0m 16s | the patch passed |
| +1 | mvnsite | 0m 34s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 13m 4s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 4s | the patch passed |
| +1 | javadoc | 0m 32s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 24m 30s | hadoop-hdfs-rbf in the patch failed. |
| +1 | asflicense | 0m 27s | The patch does not generate ASF License warnings. |
| | | 78m 26s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.federation.router.TestRouterWithSecureStartup |
| | hadoop.hdfs.server.federation.security.TestRouterHttpDelegationToken |

|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.5 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-14593 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12974281/HDFS-14593.009.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux 2cedcc8737bc 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 555dabf |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/27201/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/27201/testReport/ |
| Max. process+thread count | 1587 (vs. ulimit
[jira] [Updated] (HDDS-1782) Add an option to MiniOzoneChaosCluster to read files multiple times.
     [ https://issues.apache.org/jira/browse/HDDS-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HDDS-1782:
---------------------------------
    Labels: pull-request-available  (was: )

> Add an option to MiniOzoneChaosCluster to read files multiple times.
> ---------------------------------------------------------------------
>
>                 Key: HDDS-1782
>                 URL: https://issues.apache.org/jira/browse/HDDS-1782
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.4.0
>            Reporter: Mukul Kumar Singh
>            Assignee: Mukul Kumar Singh
>            Priority: Major
>              Labels: pull-request-available
>
> Right now MiniOzoneChaosCluster writes a file, reads it, and deletes it
> immediately. This jira proposes to add an option to read the file multiple
> times in MiniOzoneChaosCluster.
[jira] [Work logged] (HDDS-1782) Add an option to MiniOzoneChaosCluster to read files multiple times.
     [ https://issues.apache.org/jira/browse/HDDS-1782?focusedWorklogId=275150=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275150 ]

ASF GitHub Bot logged work on HDDS-1782:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 11/Jul/19 03:23
            Start Date: 11/Jul/19 03:23
    Worklog Time Spent: 10m

Work Description: mukul1987 commented on pull request #1076: HDDS-1782. Add an option to MiniOzoneChaosCluster to read files multiple times. Contributed by Mukul Kumar Singh.
URL: https://github.com/apache/hadoop/pull/1076

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------
    Worklog Id: (was: 275150)
    Time Spent: 10m
    Remaining Estimate: 0h

> Add an option to MiniOzoneChaosCluster to read files multiple times.
> ---------------------------------------------------------------------
[jira] [Commented] (HDFS-14623) In NameNode Web UI, for Head the file (first 32K) old data is showing
     [ https://issues.apache.org/jira/browse/HDFS-14623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882625#comment-16882625 ]

Wei-Chiu Chuang commented on HDFS-14623:
----------------------------------------

[~hemanthboyina] thanks for the report & fix. Would it be possible to attach screenshots from before and after the fix, so that it is easier for other folks to verify? [~nandakumar131], this is related to your fix in HDFS-13823. Could you help review this one and push it forward?

> In NameNode Web UI, for Head the file (first 32K) old data is showing
> ----------------------------------------------------------------------
>
>                 Key: HDFS-14623
>                 URL: https://issues.apache.org/jira/browse/HDFS-14623
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: hemanthboyina
>            Assignee: hemanthboyina
>            Priority: Major
>         Attachments: HDFS-14623.patch
>
> In the NameNode Web UI, for "Head the file (first 32K)": after opening
> multiple files, clicking on "Head the file" shows stale data from a
> previously opened file.
[jira] [Created] (HDDS-1782) Add an option to MiniOzoneChaosCluster to read files multiple times.
Mukul Kumar Singh created HDDS-1782:
---------------------------------------

             Summary: Add an option to MiniOzoneChaosCluster to read files multiple times.
                 Key: HDDS-1782
                 URL: https://issues.apache.org/jira/browse/HDDS-1782
             Project: Hadoop Distributed Data Store
          Issue Type: Bug
          Components: test
    Affects Versions: 0.4.0
            Reporter: Mukul Kumar Singh
            Assignee: Mukul Kumar Singh

Right now MiniOzoneChaosCluster writes a file, reads it, and deletes it immediately. This jira proposes to add an option to read the file multiple times in MiniOzoneChaosCluster.
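The proposed behavior can be sketched outside the chaos cluster with a tiny stand-in harness (hypothetical class, plain local files, not the MiniOzoneChaosCluster API): write once, read and verify a configurable number of times, then delete.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Sketch of the proposed write/read-N-times/delete cycle (hypothetical
// harness; the real MiniOzoneChaosCluster reads through Ozone under
// induced failures). Repeated reads of the same file are what surface
// stale or inconsistent reads that a single read would miss.
public class ReadRepeatedly {
    public static int writeReadDelete(byte[] data, int numReads) throws Exception {
        Path path = Files.createTempFile("chaos", ".dat");
        Files.write(path, data);
        int verified = 0;
        for (int i = 0; i < numReads; i++) {
            // Re-read and verify each time, as the option proposes.
            if (Arrays.equals(Files.readAllBytes(path), data)) {
                verified++;
            }
        }
        Files.delete(path);
        return verified;
    }
}
```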
[jira] [Commented] (HDDS-1752) ConcurrentModificationException while handling DeadNodeHandler event
     [ https://issues.apache.org/jira/browse/HDDS-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882621#comment-16882621 ]

Mukul Kumar Singh commented on HDDS-1752:
-----------------------------------------

The problem in HDDS-1372 was with the replica map. The map being iterated here is Node2PipelineMap; these are two different data structures. I think the problem here is that we are accessing the pipelines from NodeManager while, at the same time, pipelines are being removed from NodeManager as they are destroyed.

> ConcurrentModificationException while handling DeadNodeHandler event
> --------------------------------------------------------------------
>
>                 Key: HDDS-1752
>                 URL: https://issues.apache.org/jira/browse/HDDS-1752
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: SCM
>    Affects Versions: 0.4.0
>            Reporter: Mukul Kumar Singh
>            Assignee: Hrishikesh Gadre
>            Priority: Major
>              Labels: MiniOzoneChaosCluster
>
> ConcurrentModificationException while handling DeadNodeHandler event
> {code}
> 2019-07-02 19:29:25,190 ERROR events.SingleThreadExecutor (SingleThreadExecutor.java:lambda$onMessage$1(88)) - Error on execution message 56591ec5-c9e4-416c-9a36-db0507739fe5{ip: 192.168.0.2, host: 192.168.0.2, networkLocation: /default-rack, certSerialId: null}
> java.util.ConcurrentModificationException
>         at java.util.HashMap$HashIterator.nextNode(HashMap.java:1442)
>         at java.util.HashMap$KeyIterator.next(HashMap.java:1466)
>         at java.lang.Iterable.forEach(Iterable.java:74)
>         at java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080)
>         at org.apache.hadoop.hdds.scm.node.DeadNodeHandler.lambda$destroyPipelines$1(DeadNodeHandler.java:99)
>         at java.util.Optional.ifPresent(Optional.java:159)
>         at org.apache.hadoop.hdds.scm.node.DeadNodeHandler.destroyPipelines(DeadNodeHandler.java:98)
>         at org.apache.hadoop.hdds.scm.node.DeadNodeHandler.onMessage(DeadNodeHandler.java:78)
>         at org.apache.hadoop.hdds.scm.node.DeadNodeHandler.onMessage(DeadNodeHandler.java:44)
>         at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:85)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
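The stack trace's failure mode, and one common remedy, can be reproduced in isolation (whether Ozone's eventual fix takes this exact shape is not stated in this thread): a plain HashMap's key set throws ConcurrentModificationException when the map is structurally modified mid-iteration, while iterating over a snapshot copy of the keys is safe.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Reproduces the CME from the stack trace above in miniature, plus one
// common remedy: iterate over a snapshot of the keys instead of the
// live key set. (Illustrative only; not the Ozone fix itself.)
public class CmeDemo {
    static Map<String, String> sampleMap(int n) {
        Map<String, String> m = new HashMap<>();
        for (int i = 0; i < n; i++) {
            m.put("k" + i, "v" + i);
        }
        return m;
    }

    // Removing through the map during key-set iteration is a structural
    // modification: the iterator's modCount check fails on the next step.
    public static boolean mutateDuringIteration(Map<String, String> map) {
        try {
            for (String key : map.keySet()) {
                map.remove(key);
            }
            return false;  // no exception observed
        } catch (ConcurrentModificationException e) {
            return true;   // the failure mode from the stack trace
        }
    }

    // Iterating a copy decouples the iteration from the mutation.
    public static int mutateOverSnapshot(Map<String, String> map) {
        for (String key : new ArrayList<>(map.keySet())) {
            map.remove(key);
        }
        return map.size();
    }
}
```

In the real code the mutation comes from another thread (pipeline teardown) rather than the iterating thread, so a snapshot or a concurrent collection on the shared Node2PipelineMap side would be the analogous cures.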
[jira] [Updated] (HDFS-14644) That replication of block failed leads to decommission is blocked when the number of replicas of block is greater than the number of datanode
     [ https://issues.apache.org/jira/browse/HDFS-14644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lisheng Sun updated HDFS-14644:
-------------------------------
    Component/s: namenode

> That replication of block failed leads to decommission is blocked when the
> number of replicas of block is greater than the number of datanode
> --------------------------------------------------------------------------
>
>                 Key: HDFS-14644
>                 URL: https://issues.apache.org/jira/browse/HDFS-14644
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 3.1.1, 2.9.2, 3.0.3, 2.8.5, 2.7.7
>            Reporter: Lisheng Sun
>            Priority: Major
>
> 2019-07-10,15:37:18,028 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 5 to reach 10 (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) All required storage types are unavailable: unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> 2019-07-10,15:37:18,028 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 5 to reach 10 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
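The constraint behind the "still in need of 5 to reach 10" warning can be stated in a few lines (hypothetical helper, not HDFS code): a block can hold at most one replica per live datanode, so a replication factor above the node count can never be satisfied, and decommissioning stalls waiting for the unreachable target.

```java
// Sketch of the placement constraint driving this issue (hypothetical
// helper; the real logic lives in BlockPlacementPolicy/DatanodeAdminManager).
// One replica per datanode means replication > #nodes is unsatisfiable.
public class ReplicaPlacement {
    public static int achievableReplicas(int requestedReplication, int liveDatanodes) {
        return Math.min(requestedReplication, liveDatanodes);
    }

    // Decommission waits for full replication, so an unsatisfiable
    // replication target blocks it indefinitely.
    public static boolean decommissionBlocked(int requestedReplication, int liveDatanodes) {
        return achievableReplicas(requestedReplication, liveDatanodes) < requestedReplication;
    }
}
```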
[jira] [Created] (HDDS-1781) Add ContainerCache metrics in ContainerMetrics
Supratim Deka created HDDS-1781:
-----------------------------------

             Summary: Add ContainerCache metrics in ContainerMetrics
                 Key: HDDS-1781
                 URL: https://issues.apache.org/jira/browse/HDDS-1781
             Project: Hadoop Distributed Data Store
          Issue Type: Improvement
            Reporter: Supratim Deka

ContainerCache caches handles to open container DB instances. This LRU cache is configured with a limited capacity (1024 entries by default). Add metrics to track the performance of this cache (hits vs. misses), and also track the average latency to acquire a DB handle on a cache miss.
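In the spirit of the proposed metrics, here is a minimal sketch of a capacity-bounded LRU cache instrumented with hit/miss counters (hypothetical class; the real ContainerCache caches DB handles and must also close evicted ones, and would publish the counters through Hadoop's metrics system rather than plain getters):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Capacity-bounded LRU cache with hit/miss counters (illustrative
// sketch, not the Ozone ContainerCache). LinkedHashMap in access order
// plus removeEldestEntry gives least-recently-used eviction.
public class MeteredLruCache<K, V> {
    private final int capacity;
    private long hits, misses;
    private final LinkedHashMap<K, V> map;

    public MeteredLruCache(int capacity) {
        this.capacity = capacity;
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict once the configured capacity is exceeded.
                return size() > MeteredLruCache.this.capacity;
            }
        };
    }

    public synchronized V get(K key) {
        V v = map.get(key);
        if (v == null) { misses++; } else { hits++; }
        return v;
    }

    public synchronized void put(K key, V value) { map.put(key, value); }
    public synchronized long hits() { return hits; }
    public synchronized long misses() { return misses; }
    public synchronized int size() { return map.size(); }
}
```

The hit:miss ratio tells you whether the 1024-entry default is large enough for the working set; the miss-path latency (opening a DB handle) is what the second proposed metric would time.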
[jira] [Commented] (HDFS-14593) RBF: Implement deletion feature for expired records in State Store
     [ https://issues.apache.org/jira/browse/HDFS-14593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882600#comment-16882600 ]

Hadoop QA commented on HDFS-14593:
----------------------------------

(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 29s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 18m 39s | trunk passed |
| +1 | compile | 0m 33s | trunk passed |
| +1 | checkstyle | 0m 23s | trunk passed |
| +1 | mvnsite | 0m 36s | trunk passed |
| +1 | shadedclient | 11m 40s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 53s | trunk passed |
| +1 | javadoc | 0m 40s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 27s | the patch passed |
| +1 | compile | 0m 25s | the patch passed |
| +1 | javac | 0m 25s | the patch passed |
| +1 | checkstyle | 0m 16s | the patch passed |
| +1 | mvnsite | 0m 27s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 11m 58s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 0s | the patch passed |
| +1 | javadoc | 0m 35s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 22m 58s | hadoop-hdfs-rbf in the patch failed. |
| +1 | asflicense | 0m 29s | The patch does not generate ASF License warnings. |
| | | 73m 29s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.federation.router.TestRouterRpc |
| | hadoop.hdfs.server.federation.security.TestRouterHttpDelegationToken |
| | hadoop.hdfs.server.federation.router.TestRouterWithSecureStartup |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14593 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12974279/HDFS-14593.008.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux 8463bbe84cdc 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 555dabf |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/27200/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt |
| Test Results |
[jira] [Commented] (HDFS-14593) RBF: Implement deletion feature for expired records in State Store
     [ https://issues.apache.org/jira/browse/HDFS-14593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882595#comment-16882595 ]

Takanobu Asanuma commented on HDFS-14593:
-----------------------------------------

Thanks for your help, [~ayushtkn]. Sorry, it should be MILLISECONDS. Uploaded 009.patch fixing it.

> RBF: Implement deletion feature for expired records in State Store
> ------------------------------------------------------------------
>
>                 Key: HDFS-14593
>                 URL: https://issues.apache.org/jira/browse/HDFS-14593
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: rbf
>            Reporter: Takanobu Asanuma
>            Assignee: Takanobu Asanuma
>            Priority: Major
>         Attachments: HDFS-14593.001.patch, HDFS-14593.002.patch,
>                      HDFS-14593.003.patch, HDFS-14593.004.patch,
>                      HDFS-14593.005.patch, HDFS-14593.006.patch,
>                      HDFS-14593.007.patch, HDFS-14593.008.patch,
>                      HDFS-14593.009.patch
>
> Currently, any router seems to exist in the Router Information eternally.
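The unit mix-up being corrected ("it should be MILLISECONDS") is easy to make when comparing epoch-millisecond timestamps against a configured interval. A minimal sketch of the expiry check (hypothetical names, not the State Store API) that converts the interval explicitly so the comparison is always millis-vs-millis:

```java
import java.util.concurrent.TimeUnit;

// Sketch of an expired-record check (hypothetical names). Timestamps
// are epoch milliseconds, so the configured expiry interval must be
// converted with TimeUnit.toMillis() before comparing -- the unit
// mistake the comment above corrects.
public class RecordExpiry {
    public static boolean isExpired(long lastHeartbeatMillis, long nowMillis,
                                    long expiry, TimeUnit expiryUnit) {
        long expiryMillis = expiryUnit.toMillis(expiry);
        return nowMillis - lastHeartbeatMillis > expiryMillis;
    }

    // Convenience wrapper for a minute-denominated config value.
    public static boolean isExpiredMinutes(long lastHeartbeatMillis,
                                           long nowMillis, long expiryMinutes) {
        return isExpired(lastHeartbeatMillis, nowMillis,
                         expiryMinutes, TimeUnit.MINUTES);
    }
}
```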
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
     [ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882594#comment-16882594 ]

Weiwei Yang commented on HDFS-12748:
------------------------------------

Thanks [~xkrogen], I've committed this to branch-3.1 and cherry-picked it to branch-3.0, branch-2, branch-2.9 and branch-2.8. It is now fixed on all major branches; closing it. Thanks for the reviews, [~xkrogen], [~hanishakoneru], [~daryn].

> NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
> -------------------------------------------------------------
>
>                 Key: HDFS-12748
>                 URL: https://issues.apache.org/jira/browse/HDFS-12748
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 2.8.2
>            Reporter: Jiandan Yang
>            Assignee: Weiwei Yang
>            Priority: Major
>             Fix For: 2.10.0, 3.0.4, 3.3.0, 2.8.6, 3.2.1, 2.9.3, 3.1.3
>
>         Attachments: HDFS-12748-branch-3.1.01.patch, HDFS-12748.001.patch,
>                      HDFS-12748.002.patch, HDFS-12748.003.patch,
>                      HDFS-12748.004.patch, HDFS-12748.005.patch
>
> In our production environment, the standby NN often does full GC; through MAT we found the largest object is FileSystem$Cache, which contains 7,844,890 DistributedFileSystem instances.
> By viewing the call hierarchy of FileSystem.get(), I found that only NamenodeWebHdfsMethods#get calls FileSystem.get(). I don't know why it creates a different DistributedFileSystem every time instead of getting a FileSystem from the cache.
> {code:java}
> case GETHOMEDIRECTORY: {
>   final String js = JsonUtil.toJsonString("Path",
>       FileSystem.get(conf != null ? conf : new Configuration())
>           .getHomeDirectory().toUri().getPath());
>   return Response.ok(js).type(MediaType.APPLICATION_JSON).build();
> }
> {code}
> When we close the FileSystem for GETHOMEDIRECTORY, the NN doesn't do full GC:
> {code:java}
> case GETHOMEDIRECTORY: {
>   FileSystem fs = null;
>   try {
>     fs = FileSystem.get(conf != null ? conf : new Configuration());
>     final String js = JsonUtil.toJsonString("Path",
>         fs.getHomeDirectory().toUri().getPath());
>     return Response.ok(js).type(MediaType.APPLICATION_JSON).build();
>   } finally {
>     if (fs != null) {
>       fs.close();
>     }
>   }
> }
> {code}
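The mechanism behind the leak and the fix can be modeled without Hadoop (hypothetical names; a toy stand-in, not Hadoop's FileSystem): each request materializes a resource that registers itself in a process-wide cache, and unless the request closes it, the cache grows without bound, which matches what the standby NameNode's heap dump showed.

```java
import java.io.Closeable;
import java.util.HashMap;
import java.util.Map;

// Toy model of the HDFS-12748 leak (hypothetical names). acquire() is
// analogous to FileSystem.get(new Configuration()): each call creates a
// new instance that lands in a process-wide cache. Without close(), the
// entries accumulate forever; closing releases the cache entry, which
// is exactly what the finally-block fix above does.
public class LeakDemo {
    static final Map<Integer, Resource> CACHE = new HashMap<>();
    static int nextId = 0;

    static class Resource implements Closeable {
        final int id;
        Resource(int id) { this.id = id; }
        @Override public void close() { CACHE.remove(id); }
    }

    static Resource acquire() {
        Resource r = new Resource(nextId++);
        CACHE.put(r.id, r);
        return r;
    }

    static void leakyRequest() {
        acquire();                        // never closed: entry stays cached
    }

    static void fixedRequest() {
        try (Resource r = acquire()) {    // try-with-resources closes on exit
            // handle the request using r
        }
    }
}
```

Since FileSystem implements Closeable, try-with-resources (as in fixedRequest above) is an equivalent, slightly tidier shape for the same finally-block fix.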
[jira] [Updated] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
     [ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weiwei Yang updated HDFS-12748:
-------------------------------
    Fix Version/s: 2.9.3
                   2.8.6

> NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
> -------------------------------------------------------------
[jira] [Updated] (HDFS-14593) RBF: Implement deletion feature for expired records in State Store
     [ https://issues.apache.org/jira/browse/HDFS-14593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Takanobu Asanuma updated HDFS-14593:
------------------------------------
    Attachment: HDFS-14593.009.patch

> RBF: Implement deletion feature for expired records in State Store
> ------------------------------------------------------------------
[jira] [Commented] (HDFS-14318) dn cannot be recognized and must be restarted to recognize the Repaired disk
     [ https://issues.apache.org/jira/browse/HDFS-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882592#comment-16882592 ]

hunshenshi commented on HDFS-14318:
-----------------------------------

yeah, I will

> dn cannot be recognized and must be restarted to recognize the Repaired disk
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-14318
>                 URL: https://issues.apache.org/jira/browse/HDFS-14318
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: hunshenshi
>            Priority: Major
>         Attachments: HDFS-14318.patch
>
> The DN detected that disk a had failed. After disk a is repaired, the DN
> does not recognize it again and must be restarted to do so.
>
> I made a patch so the DN recognizes the repaired disk without a restart.
[jira] [Updated] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
     [ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weiwei Yang updated HDFS-12748:
-------------------------------
    Fix Version/s: 2.10.0

> NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
> -------------------------------------------------------------
[jira] [Updated] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated HDFS-12748: --- Fix Version/s: 3.0.4 > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-12748-branch-3.1.01.patch, HDFS-12748.001.patch, > HDFS-12748.002.patch, HDFS-12748.003.patch, HDFS-12748.004.patch, > HDFS-12748.005.patch > > > In our production environment, the standby NN often does full GC; through MAT we > found that the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem instances. > Viewing the call hierarchy of FileSystem.get(), I found that only > NamenodeWebHdfsMethods#get calls FileSystem.get(). I don't know why a new > DistributedFileSystem is created every time instead of getting a FileSystem from the > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close the FileSystem in GETHOMEDIRECTORY, the NN no longer does full GC. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14644) Failed block replication blocks decommission when the number of replicas of a block exceeds the number of datanodes
[ https://issues.apache.org/jira/browse/HDFS-14644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lisheng Sun updated HDFS-14644: --- Issue Type: Bug (was: Improvement) > Failed block replication blocks decommission when the number of replicas of > a block exceeds the number of datanodes > - > > Key: HDFS-14644 > URL: https://issues.apache.org/jira/browse/HDFS-14644 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.1.1, 2.9.2, 3.0.3, 2.8.5, 2.7.7 >Reporter: Lisheng Sun >Priority: Major > > 2019-07-10,15:37:18,028 WARN > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to > place enough replicas, still in need of 5 to reach 10 > (unavailableStorages=[DISK, ARCHIVE], > storagePolicy=BlockStoragePolicy\{HOT:7, storageTypes=[DISK], > creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) All > required storage types are unavailable: unavailableStorages=[DISK, ARCHIVE], > storagePolicy=BlockStoragePolicy\{HOT:7, storageTypes=[DISK], > creationFallbacks=[], replicationFallbacks=[ARCHIVE]} > 2019-07-10,15:37:18,028 WARN > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to > place enough replicas, still in need of 5 to reach 10 > (unavailableStorages=[], storagePolicy=BlockStoragePolicy\{HOT:7, > storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, > newBlock=false) For more information, please enable DEBUG log level on > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14644) Failed block replication blocks decommission when the number of replicas of a block exceeds the number of datanodes
[ https://issues.apache.org/jira/browse/HDFS-14644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lisheng Sun updated HDFS-14644: --- Issue Type: Improvement (was: Bug) > Failed block replication blocks decommission when the number of replicas of > a block exceeds the number of datanodes > - > > Key: HDFS-14644 > URL: https://issues.apache.org/jira/browse/HDFS-14644 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.1.1, 2.9.2, 3.0.3, 2.8.5, 2.7.7 >Reporter: Lisheng Sun >Priority: Major > > 2019-07-10,15:37:18,028 WARN > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to > place enough replicas, still in need of 5 to reach 10 > (unavailableStorages=[DISK, ARCHIVE], > storagePolicy=BlockStoragePolicy\{HOT:7, storageTypes=[DISK], > creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) All > required storage types are unavailable: unavailableStorages=[DISK, ARCHIVE], > storagePolicy=BlockStoragePolicy\{HOT:7, storageTypes=[DISK], > creationFallbacks=[], replicationFallbacks=[ARCHIVE]} > 2019-07-10,15:37:18,028 WARN > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to > place enough replicas, still in need of 5 to reach 10 > (unavailableStorages=[], storagePolicy=BlockStoragePolicy\{HOT:7, > storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, > newBlock=false) For more information, please enable DEBUG log level on > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-14644) Failed block replication blocks decommission when the number of replicas of a block exceeds the number of datanodes
Lisheng Sun created HDFS-14644: -- Summary: Failed block replication blocks decommission when the number of replicas of a block exceeds the number of datanodes Key: HDFS-14644 URL: https://issues.apache.org/jira/browse/HDFS-14644 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.7.7, 2.8.5, 3.0.3, 2.9.2, 3.1.1 Reporter: Lisheng Sun 2019-07-10,15:37:18,028 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 5 to reach 10 (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy\{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) All required storage types are unavailable: unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy\{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]} 2019-07-10,15:37:18,028 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 5 to reach 10 (unavailableStorages=[], storagePolicy=BlockStoragePolicy\{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14573) Backport Standby Read to branch-3
[ https://issues.apache.org/jira/browse/HDFS-14573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882583#comment-16882583 ] Hadoop QA commented on HDFS-14573: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 18m 2s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 27 new or modified test files. {color} | || || || || {color:brown} branch-3.1 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 54s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 39s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 26s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 49s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 38s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 19m 51s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 7m 36s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 38s{color} | {color:green} branch-3.1 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 12m 23s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 12m 23s{color} | {color:red} root generated 1 new + 1275 unchanged - 1 fixed = 1276 total (was 1276) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 43s{color} | {color:orange} root: The patch generated 26 new + 2565 unchanged - 10 fixed = 2591 total (was 2575) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 36s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 8m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 37s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 40s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 45s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 94m 52s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m 46s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 75m 8s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 42s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}336m 25s{color} | {color:black} {color} | \\ \\ ||
[jira] [Updated] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated HDFS-12748: --- Fix Version/s: 3.1.3 > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Fix For: 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-12748-branch-3.1.01.patch, HDFS-12748.001.patch, > HDFS-12748.002.patch, HDFS-12748.003.patch, HDFS-12748.004.patch, > HDFS-12748.005.patch > > > In our production environment, the standby NN often does full GC; through MAT we > found that the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem instances. > Viewing the call hierarchy of FileSystem.get(), I found that only > NamenodeWebHdfsMethods#get calls FileSystem.get(). I don't know why a new > DistributedFileSystem is created every time instead of getting a FileSystem from the > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close the FileSystem in GETHOMEDIRECTORY, the NN no longer does full GC. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14593) RBF: Implement deletion feature for expired records in State Store
[ https://issues.apache.org/jira/browse/HDFS-14593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882582#comment-16882582 ] Ayush Saxena commented on HDFS-14593: - Thanx [~tasanuma] for the patch. Just started checking; a doubt: {code:java} MembershipState.setDeletionMs(conf.getTimeDuration( RBFConfigKeys.FEDERATION_STORE_MEMBERSHIP_EXPIRATION_DELETION_MS, RBFConfigKeys .FEDERATION_STORE_MEMBERSHIP_EXPIRATION_DELETION_MS_DEFAULT, TimeUnit.MICROSECONDS)); {code} Is MICROSECONDS intentional here, or should it be MILLISECONDS? > RBF: Implement deletion feature for expired records in State Store > -- > > Key: HDFS-14593 > URL: https://issues.apache.org/jira/browse/HDFS-14593 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Takanobu Asanuma >Assignee: Takanobu Asanuma >Priority: Major > Attachments: HDFS-14593.001.patch, HDFS-14593.002.patch, > HDFS-14593.003.patch, HDFS-14593.004.patch, HDFS-14593.005.patch, > HDFS-14593.006.patch, HDFS-14593.007.patch, HDFS-14593.008.patch > > > Currently, any router seems to exist in the Router Information eternally. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
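The unit argument matters because the setting's name ends in _MS, so downstream code presumably treats the returned value as milliseconds; reading it in MICROSECONDS would inflate it by a factor of 1000. A minimal illustration with plain java.util.concurrent.TimeUnit (not the Hadoop Configuration API):

```java
import java.util.concurrent.TimeUnit;

public class UnitMismatch {
    public static void main(String[] args) {
        // Suppose the configured deletion interval is 7 days.
        long days = 7;
        // The same duration expressed in milliseconds vs microseconds
        // differs by a factor of 1000; a consumer expecting milliseconds
        // but handed microseconds would wait 1000x too long.
        long asMillis = TimeUnit.DAYS.toMillis(days);
        long asMicros = TimeUnit.DAYS.toMicros(days);
        System.out.println(asMicros / asMillis); // prints 1000
    }
}
```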
[jira] [Commented] (HDFS-14593) RBF: Implement deletion feature for expired records in State Store
[ https://issues.apache.org/jira/browse/HDFS-14593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882560#comment-16882560 ] Takanobu Asanuma commented on HDFS-14593: - Thanks for your review, [~elgoiri]. The last patch contains a checkstyle issue. Uploaded 008.patch fixing it. > RBF: Implement deletion feature for expired records in State Store > -- > > Key: HDFS-14593 > URL: https://issues.apache.org/jira/browse/HDFS-14593 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Takanobu Asanuma >Assignee: Takanobu Asanuma >Priority: Major > Attachments: HDFS-14593.001.patch, HDFS-14593.002.patch, > HDFS-14593.003.patch, HDFS-14593.004.patch, HDFS-14593.005.patch, > HDFS-14593.006.patch, HDFS-14593.007.patch, HDFS-14593.008.patch > > > Currently, any router seems to exist in the Router Information eternally. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14593) RBF: Implement deletion feature for expired records in State Store
[ https://issues.apache.org/jira/browse/HDFS-14593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takanobu Asanuma updated HDFS-14593: Attachment: HDFS-14593.008.patch > RBF: Implement deletion feature for expired records in State Store > -- > > Key: HDFS-14593 > URL: https://issues.apache.org/jira/browse/HDFS-14593 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Takanobu Asanuma >Assignee: Takanobu Asanuma >Priority: Major > Attachments: HDFS-14593.001.patch, HDFS-14593.002.patch, > HDFS-14593.003.patch, HDFS-14593.004.patch, HDFS-14593.005.patch, > HDFS-14593.006.patch, HDFS-14593.007.patch, HDFS-14593.008.patch > > > Currently, any router seems to exist in the Router Information eternally. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14595) HDFS-11848 breaks API compatibility
[ https://issues.apache.org/jira/browse/HDFS-14595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882517#comment-16882517 ] Wei-Chiu Chuang commented on HDFS-14595: Thanks [~smeng]. bq. Shall we file a separate jira for updating the release doc? No, we don't need a separate jira. Just update the release doc in this jira. Would you also take care of the checkstyle warning? Specifically, is it necessary to deprecate {{getUnderConstructionFiles()}}? (Sorry, I don't have a Java editor with me) > HDFS-11848 breaks API compatibility > --- > > Key: HDFS-14595 > URL: https://issues.apache.org/jira/browse/HDFS-14595 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.2.0, 3.1.2 >Reporter: Wei-Chiu Chuang >Assignee: Siyao Meng >Priority: Blocker > Attachments: HDFS-14595.001.patch, hadoop_ > 36e1870eab904d5a6f12ecfb1fdb52ca08d95ac5 to > b241194d56f97ee372cbec7062bcf155bc3df662 compatibility report.htm > > > Our internal tool caught an API compatibility issue with HDFS-11848. > HDFS-11848 adds an additional parameter to > DistributedFileSystem.listOpenFiles(), but it doesn't keep the existing API. > This can cause issues when upgrading from Hadoop 2.9.0/2.8.3/3.0.0 to > 3.0.1/3.1.0 and above. > Suggest: > (1) Add back the old API (which was added in HDFS-10480), and mark it > deprecated. > (2) Update the release doc to enforce running an API compatibility check for each > release. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
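The shape of suggestion (1), restoring the removed signature as a deprecated overload that delegates to the new one, might look like this sketch. The names here (OpenFilesType, listOpenFiles's return type) are illustrative stand-ins, not the actual DistributedFileSystem API:

```java
import java.util.EnumSet;

// Illustrative sketch of restoring a removed overload as a deprecated
// delegating method; the types are hypothetical, not the real HDFS ones.
public class CompatSketch {
    enum OpenFilesType { ALL_OPEN_FILES, BLOCKING_DECOMMISSION }

    /** New API: takes an explicit filter parameter. */
    public String listOpenFiles(EnumSet<OpenFilesType> types) {
        return "listing " + types;
    }

    /** Old API, kept for source/binary compatibility. */
    @Deprecated
    public String listOpenFiles() {
        // Delegate with the old default behavior.
        return listOpenFiles(EnumSet.of(OpenFilesType.ALL_OPEN_FILES));
    }

    public static void main(String[] args) {
        System.out.println(new CompatSketch().listOpenFiles());
    }
}
```

Existing callers compiled against the zero-argument method keep working, while new code opts into the filtered variant.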
[jira] [Commented] (HDFS-14593) RBF: Implement deletion feature for expired records in State Store
[ https://issues.apache.org/jira/browse/HDFS-14593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882474#comment-16882474 ] Íñigo Goiri commented on HDFS-14593: Thanks [~tasanuma] for taking care of it. This looks good to me now. [~ayushtkn] do you mind taking a look too? > RBF: Implement deletion feature for expired records in State Store > -- > > Key: HDFS-14593 > URL: https://issues.apache.org/jira/browse/HDFS-14593 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Takanobu Asanuma >Assignee: Takanobu Asanuma >Priority: Major > Attachments: HDFS-14593.001.patch, HDFS-14593.002.patch, > HDFS-14593.003.patch, HDFS-14593.004.patch, HDFS-14593.005.patch, > HDFS-14593.006.patch, HDFS-14593.007.patch > > > Currently, any router seems to exist in the Router Information eternally. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14593) RBF: Implement deletion feature for expired records in State Store
[ https://issues.apache.org/jira/browse/HDFS-14593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882443#comment-16882443 ] Hadoop QA commented on HDFS-14593: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 43s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 22s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 17s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 1 new + 2 unchanged - 0 fixed = 3 total (was 2) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 7s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 24m 5s{color} | {color:red} hadoop-hdfs-rbf in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 77m 11s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.federation.router.TestRouterWithSecureStartup | | | hadoop.hdfs.server.federation.security.TestRouterHttpDelegationToken | \\ \\ || Subsystem || Report/Notes || | Docker | Client=18.09.7 Server=18.09.5 Image:yetus/hadoop:bdbca0e53b4 | | JIRA Issue | HDFS-14593 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12974262/HDFS-14593.007.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux cde4f048d168 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 6872efc | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_212 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/27198/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt | | unit |
[jira] [Commented] (HDDS-1529) BlockInputStream: Avoid buffer copy if the whole chunk is being read
[ https://issues.apache.org/jira/browse/HDDS-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882434#comment-16882434 ] Hanisha Koneru commented on HDDS-1529: -- [~hgadre], HDDS-1496 adds support to read partial chunks. When chunks are read from disk, they are stored in a local buffer and then the required part of the chunk is copied to the client buffer. This is required when the chunk boundary to be read does not coincide with the checksum boundary. But when we are reading the whole chunk, we do not need to do a double copy, i.e. copy from disk to a local buffer and then to the client buffer. We can copy the data directly from disk to the client buffer. Please let me know if this makes sense or if you have any questions. > BlockInputStream: Avoid buffer copy if the whole chunk is being read > > > Key: HDDS-1529 > URL: https://issues.apache.org/jira/browse/HDDS-1529 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Hanisha Koneru >Assignee: Hrishikesh Gadre >Priority: Major > > Currently, BlockInputStream reads chunk data from DNs and puts it in a local > buffer and then copies the data to the client's buffer. This is required for > partial chunk reads where more chunk data than requested might have to be > read so that checksum verification can be done. But if the whole chunk is > being read, we can copy the data directly into the client buffer and avoid double > buffer copies. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
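The read path being described can be sketched as follows. This is a toy model only: DISK_CHUNK, readChunkFromDisk, and read are hypothetical names, not the actual BlockInputStream API, and checksum verification is elided.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Toy model of whole-chunk vs partial-chunk reads; all names are
// illustrative, not the real HDDS BlockInputStream API.
public class ChunkReadSketch {
    static final byte[] DISK_CHUNK = "0123456789".getBytes(); // fake on-disk chunk

    // Simulates reading the whole chunk off disk.
    static ByteBuffer readChunkFromDisk() {
        return ByteBuffer.wrap(Arrays.copyOf(DISK_CHUNK, DISK_CHUNK.length));
    }

    static void read(ByteBuffer clientBuf, int off, int len) {
        if (off == 0 && len == DISK_CHUNK.length) {
            // Whole-chunk read: one copy, disk -> client buffer.
            clientBuf.put(readChunkFromDisk());
        } else {
            // Partial read: disk -> local buffer (so checksum scope can
            // exceed the request), then the requested slice -> client
            // buffer. This is the double copy the comment refers to.
            ByteBuffer local = readChunkFromDisk();
            local.position(off);
            local.limit(off + len);
            clientBuf.put(local);
        }
        clientBuf.flip();
    }

    public static void main(String[] args) {
        ByteBuffer whole = ByteBuffer.allocate(10);
        read(whole, 0, 10);
        ByteBuffer part = ByteBuffer.allocate(4);
        read(part, 2, 4);
        System.out.println(new String(whole.array()));              // 0123456789
        System.out.println(new String(part.array(), 0, part.limit())); // 2345
    }
}
```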
[jira] [Updated] (HDFS-14573) Backport Standby Read to branch-3
[ https://issues.apache.org/jira/browse/HDFS-14573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Liang updated HDFS-14573: -- Attachment: HDFS-14573-branch-3.1.003.patch > Backport Standby Read to branch-3 > - > > Key: HDFS-14573 > URL: https://issues.apache.org/jira/browse/HDFS-14573 > Project: Hadoop HDFS > Issue Type: Task > Components: hdfs >Reporter: Chen Liang >Assignee: Chen Liang >Priority: Major > Attachments: HDFS-14573-branch-3.0.001.patch, > HDFS-14573-branch-3.1.001.patch, HDFS-14573-branch-3.1.002.patch, > HDFS-14573-branch-3.1.003.patch, HDFS-14573-branch-3.2.001.patch, > HDFS-14573-branch-3.2.002.patch, HDFS-14573-branch-3.2.003.patch, > HDFS-14573-branch-3.2.004.patch > > > This Jira tracks backporting the feature consistent read from standby > (HDFS-12943) to branch-3.x, including 3.0, 3.1, 3.2. This is required for > backporting to branch-2. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14573) Backport Standby Read to branch-3
[ https://issues.apache.org/jira/browse/HDFS-14573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882413#comment-16882413 ] Chen Liang commented on HDFS-14573: --- Posted v003 patch to fix the TestHdfsConfigFields failure; the other failures are unrelated. > Backport Standby Read to branch-3 > - > > Key: HDFS-14573 > URL: https://issues.apache.org/jira/browse/HDFS-14573 > Project: Hadoop HDFS > Issue Type: Task > Components: hdfs >Reporter: Chen Liang >Assignee: Chen Liang >Priority: Major > Attachments: HDFS-14573-branch-3.0.001.patch, > HDFS-14573-branch-3.1.001.patch, HDFS-14573-branch-3.1.002.patch, > HDFS-14573-branch-3.1.003.patch, HDFS-14573-branch-3.2.001.patch, > HDFS-14573-branch-3.2.002.patch, HDFS-14573-branch-3.2.003.patch, > HDFS-14573-branch-3.2.004.patch > > > This Jira tracks backporting the feature consistent read from standby > (HDFS-12943) to branch-3.x, including 3.0, 3.1, 3.2. This is required for > backporting to branch-2. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14593) RBF: Implement deletion feature for expired records in State Store
[ https://issues.apache.org/jira/browse/HDFS-14593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882384#comment-16882384 ] Takanobu Asanuma commented on HDFS-14593: - Thanks, [~elgoiri]. bq. My only concern now is the fact that we are going to the backend (say ZooKeeper) twice, right? Yes, you are right. It's definitely a problem. Uploaded 007.patch. It queries the backend once, and it removes expired records from both the cache and the backend in {{overrideExpiredRecord()}}. To modify records in the cache, it includes two additional changes: * use ArrayList instead of Collections.singletonList in {{CachedRecordStore#overrideExpiredRecord(R record)}} * not use Collections.unmodifiableList in {{QueryResult#getRecords()}} I'm not sure whether the changes in 007.patch are acceptable. If not, it may be better just not to change the order in {{CachedRecordStore#loadCache()}}. We may be able to live with the behavior that I mentioned before. bq. If not changing the order, expired records keep existing in the cache (result at CachedRecordStore#118) after deleting them from the State Store by overrideExpiredRecords(). Then, they are deleted from the cache in the next loadCache. > RBF: Implement deletion feature for expired records in State Store > -- > > Key: HDFS-14593 > URL: https://issues.apache.org/jira/browse/HDFS-14593 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Takanobu Asanuma >Assignee: Takanobu Asanuma >Priority: Major > Attachments: HDFS-14593.001.patch, HDFS-14593.002.patch, > HDFS-14593.003.patch, HDFS-14593.004.patch, HDFS-14593.005.patch, > HDFS-14593.006.patch, HDFS-14593.007.patch > > > Currently, any router seems to exist in the Router Information eternally. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
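The single-pass flow being discussed, querying the backend once and dropping expired records from both the cache and the State Store in the same refresh, can be sketched roughly like this. Record, loadCache, and deleteFromBackend are hypothetical names for illustration, not the actual CachedRecordStore code:

```java
import java.util.ArrayList;
import java.util.List;

// Rough sketch of a cache refresh that removes expired records from the
// cache and the backend in one pass; names are illustrative only.
public class CacheRefreshSketch {
    static class Record {
        final String id;
        final long expiresAt;
        Record(String id, long expiresAt) { this.id = id; this.expiresAt = expiresAt; }
    }

    final List<Record> cache = new ArrayList<>();
    final List<String> deleted = new ArrayList<>();

    /** Query the backend once, then drop expired records from both places. */
    void loadCache(List<Record> fromBackend, long now) {
        // Mutable copy so expired entries can be removed before caching
        // (analogous to using ArrayList rather than an unmodifiable list).
        List<Record> fresh = new ArrayList<>(fromBackend);
        for (Record r : new ArrayList<>(fresh)) {
            if (r.expiresAt <= now) {
                fresh.remove(r);      // keep it out of the new cache
                deleteFromBackend(r); // and delete it from the State Store
            }
        }
        cache.clear();
        cache.addAll(fresh);
    }

    void deleteFromBackend(Record r) { deleted.add(r.id); }

    public static void main(String[] args) {
        CacheRefreshSketch s = new CacheRefreshSketch();
        List<Record> backend = new ArrayList<>();
        backend.add(new Record("router-1", 100)); // still valid at t=50
        backend.add(new Record("router-2", 5));   // expired at t=50
        s.loadCache(backend, 50);
        System.out.println(s.cache.size() + " cached, deleted=" + s.deleted);
    }
}
```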
[jira] [Updated] (HDFS-14593) RBF: Implement deletion feature for expired records in State Store
[ https://issues.apache.org/jira/browse/HDFS-14593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takanobu Asanuma updated HDFS-14593: Attachment: HDFS-14593.007.patch > RBF: Implement deletion feature for expired records in State Store > -- > > Key: HDFS-14593 > URL: https://issues.apache.org/jira/browse/HDFS-14593 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Takanobu Asanuma >Assignee: Takanobu Asanuma >Priority: Major > Attachments: HDFS-14593.001.patch, HDFS-14593.002.patch, > HDFS-14593.003.patch, HDFS-14593.004.patch, HDFS-14593.005.patch, > HDFS-14593.006.patch, HDFS-14593.007.patch > > > Currently, any router seems to exist in the Router Information eternally. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14640) [Dynamometer] Fix TestDynamometerInfra failures
[ https://issues.apache.org/jira/browse/HDFS-14640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882373#comment-16882373 ] Hadoop QA commented on HDFS-14640: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 48s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 0s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 41s{color} | {color:red} hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-infra in trunk has 5 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} shellcheck {color} | {color:green} 0m 2s{color} | {color:green} There were no new shellcheck issues. {color} | | {color:green}+1{color} | {color:green} shelldocs {color} | {color:green} 0m 18s{color} | {color:green} The patch generated 0 new + 106 unchanged - 132 fixed = 106 total (was 238) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 20s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 51s{color} | {color:green} hadoop-dynamometer-workload in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 26s{color} | {color:green} hadoop-dynamometer-infra in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 58m 40s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-14640 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12974249/HDFS-14640.000.patch | | Optional Tests | dupname asflicense compile
[jira] [Commented] (HDDS-1544) Support default Acls for volume, bucket, keys and prefix
[ https://issues.apache.org/jira/browse/HDDS-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882370#comment-16882370 ] Hadoop QA commented on HDDS-1544: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 14s{color} | {color:red} HDDS-1544 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDDS-1544 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12973809/HDDS-1544.00.patch | | Console output | https://builds.apache.org/job/PreCommit-HDDS-Build/2752/console | | versions | git=2.17.1 | | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org | This message was automatically generated. > Support default Acls for volume, bucket, keys and prefix > > > Key: HDDS-1544 > URL: https://issues.apache.org/jira/browse/HDDS-1544 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Labels: pull-request-available > Attachments: HDDS-1544.00.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Add dAcls for volume, bucket, keys and prefix -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14410) Make Dynamometer documentation properly compile onto the Hadoop site
[ https://issues.apache.org/jira/browse/HDFS-14410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882365#comment-16882365 ] Hadoop QA commented on HDFS-14410: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 31s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 15s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 30m 49s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch 1 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 38s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 47m 18s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-14410 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12974253/HDFS-14410.002.patch | | Optional Tests | dupname asflicense mvnsite xml | | uname | Linux 30242e9a4c2a 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / eccc9a4 | | maven | version: Apache Maven 3.3.9 | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/27196/artifact/out/whitespace-tabs.txt | | Max. process+thread count | 447 (vs. ulimit of 1) | | modules | C: hadoop-project hadoop-tools/hadoop-dynamometer U: . | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27196/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. 
> Make Dynamometer documentation properly compile onto the Hadoop site > > > Key: HDFS-14410 > URL: https://issues.apache.org/jira/browse/HDFS-14410 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14410.000.patch, HDFS-14410.001.patch, > HDFS-14410.002.patch, dynamometer_guide_rendered.001.png > > > The documentation included with Dynamometer doesn't properly appear on the > site, we need to twiddle with this a bit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1752) ConcurrentModificationException while handling DeadNodeHandler event
[ https://issues.apache.org/jira/browse/HDDS-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882366#comment-16882366 ] Hrishikesh Gadre commented on HDDS-1752: [~msingh] The patch for HDDS-1372 changed the type of set implementation used to store the container replicas ([https://github.com/apache/hadoop/pull/682]). In Ozone 0.4.0 we were using HashSet (which is not thread safe). This patch changed the implementation to use concurrent hash set (which is thread safe). Hence after this change we will not see the exception mentioned in the description of this jira. Can you please take a look and comment? > ConcurrentModificationException while handling DeadNodeHandler event > > > Key: HDDS-1752 > URL: https://issues.apache.org/jira/browse/HDDS-1752 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Hrishikesh Gadre >Priority: Major > Labels: MiniOzoneChaosCluster > > ConcurrentModificationException while handling DeadNodeHandler event > {code} > 2019-07-02 19:29:25,190 ERROR events.SingleThreadExecutor > (SingleThreadExecutor.java:lambda$onMessage$1(88)) - Error on execution > message 56591ec5-c9e4-416c-9a36-db0507739fe5{ip: 192.168.0.2, host: 192.16 > 8.0.2, networkLocation: /default-rack, certSerialId: null} > java.util.ConcurrentModificationException > at java.util.HashMap$HashIterator.nextNode(HashMap.java:1442) > at java.util.HashMap$KeyIterator.next(HashMap.java:1466) > at java.lang.Iterable.forEach(Iterable.java:74) > at > java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080) > at > org.apache.hadoop.hdds.scm.node.DeadNodeHandler.lambda$destroyPipelines$1(DeadNodeHandler.java:99) > at java.util.Optional.ifPresent(Optional.java:159) > at > org.apache.hadoop.hdds.scm.node.DeadNodeHandler.destroyPipelines(DeadNodeHandler.java:98) > at > org.apache.hadoop.hdds.scm.node.DeadNodeHandler.onMessage(DeadNodeHandler.java:78) > at > 
org.apache.hadoop.hdds.scm.node.DeadNodeHandler.onMessage(DeadNodeHandler.java:44) > at > org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:85) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
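The failure mode and the effect of the HDDS-1372 change can be reproduced with a few lines of plain Java. This is an illustrative sketch, not the SCM code (the class name and replica strings are made up): a fail-fast HashSet iterator throws ConcurrentModificationException when the set is modified mid-iteration, while the weakly consistent iterator of ConcurrentHashMap.newKeySet() tolerates it.

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class: iterate over a replica set while removing from it,
// as DeadNodeHandler effectively did when pipelines were destroyed.
public class ReplicaSetDemo {
    static boolean iterateWhileMutating(Set<String> replicas) {
        replicas.add("replica-1");
        replicas.add("replica-2");
        try {
            for (String r : replicas) {
                replicas.remove("replica-1"); // mutate mid-iteration
            }
            return true;      // iteration survived the modification
        } catch (ConcurrentModificationException e) {
            return false;     // fail-fast iterator detected the change
        }
    }

    public static void main(String[] args) {
        System.out.println(iterateWhileMutating(new HashSet<>()));               // fail-fast
        System.out.println(iterateWhileMutating(ConcurrentHashMap.newKeySet())); // weakly consistent
    }
}
```

Switching the backing set to a concurrent one, as the HDDS-1372 patch did, removes the exception, though any atomicity requirements across the whole iteration still need separate consideration.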
[jira] [Updated] (HDDS-1544) Support default Acls for volume, bucket, keys and prefix
[ https://issues.apache.org/jira/browse/HDDS-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-1544: - Status: Patch Available (was: Open) > Support default Acls for volume, bucket, keys and prefix > > > Key: HDDS-1544 > URL: https://issues.apache.org/jira/browse/HDDS-1544 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Labels: pull-request-available > Attachments: HDDS-1544.00.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Add dAcls for volume, bucket, keys and prefix -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1544) Support default Acls for volume, bucket, keys and prefix
[ https://issues.apache.org/jira/browse/HDDS-1544?focusedWorklogId=275009=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275009 ] ASF GitHub Bot logged work on HDDS-1544: Author: ASF GitHub Bot Created on: 10/Jul/19 19:12 Start Date: 10/Jul/19 19:12 Worklog Time Spent: 10m Work Description: ajayydv commented on pull request #1074: HDDS-1544. Support default Acls for volume, bucket, keys and prefix. Contributed by Ajay Kumar. URL: https://github.com/apache/hadoop/pull/1074 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275009) Time Spent: 10m Remaining Estimate: 0h > Support default Acls for volume, bucket, keys and prefix > > > Key: HDDS-1544 > URL: https://issues.apache.org/jira/browse/HDDS-1544 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Labels: pull-request-available > Attachments: HDDS-1544.00.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Add dAcls for volume, bucket, keys and prefix -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1544) Support default Acls for volume, bucket, keys and prefix
[ https://issues.apache.org/jira/browse/HDDS-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-1544: - Labels: pull-request-available (was: ) > Support default Acls for volume, bucket, keys and prefix > > > Key: HDDS-1544 > URL: https://issues.apache.org/jira/browse/HDDS-1544 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Labels: pull-request-available > Attachments: HDDS-1544.00.patch > > > Add dAcls for volume, bucket, keys and prefix -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14642) processMisReplicatedBlocks does not return correct processed count
[ https://issues.apache.org/jira/browse/HDFS-14642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882363#comment-16882363 ] Hrishikesh Gadre commented on HDFS-14642: - [~sodonnell] patch looks good. Thanks for reporting and fixing this issue (and sorry about introducing this bug in the first place). > processMisReplicatedBlocks does not return correct processed count > -- > > Key: HDFS-14642 > URL: https://issues.apache.org/jira/browse/HDFS-14642 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.2.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Attachments: HDFS-14642.001.patch > > > HDFS-14053 introduced a method "processMisReplicatedBlocks" to the > blockManager, and it is used by fsck to schedule mis-replicated blocks for > replication. > The method should return the number of blocks it processed, but it always > returns zero as "processed" is never incremented in the method. > It should also drop and re-take the write lock every "numBlocksPerIteration" blocks, > but as processed is never incremented, it never drops and re-takes the > write lock, potentially holding the write lock for a long time. 
> {code:java} > public int processMisReplicatedBlocks(List<BlockInfo> blocks) { > int processed = 0; > Iterator<BlockInfo> iter = blocks.iterator(); > try { > while (isPopulatingReplQueues() && namesystem.isRunning() > && !Thread.currentThread().isInterrupted() > && iter.hasNext()) { > int limit = processed + numBlocksPerIteration; > namesystem.writeLockInterruptibly(); > try { > while (iter.hasNext() && processed < limit) { > BlockInfo blk = iter.next(); > MisReplicationResult r = processMisReplicatedBlock(blk); > LOG.debug("BLOCK* processMisReplicatedBlocks: " + > "Re-scanned block {}, result is {}", blk, r); > } > } finally { > namesystem.writeUnlock(); > } > } > } catch (InterruptedException ex) { > LOG.info("Caught InterruptedException while scheduling replication work" + > " for mis-replicated blocks"); > Thread.currentThread().interrupt(); > } > return processed; > }{code} > Due to this, fsck causes a warning to be logged in the NN for every > mis-replicated file it schedules replication for, as it checks the processed > count: > {code:java} > 2019-07-10 15:46:14,790 WARN namenode.NameNode: Fsck: Block manager is able > to process only 0 mis-replicated blocks (Total count : 1 ) for path /...{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
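The missing increment is easy to see in reduced form. The sketch below is a simplified stand-in, not the real BlockManager (no namesystem lock, String instead of BlockInfo, all names hypothetical): with the `processed++` in the inner loop, the method both returns the correct count and exits the inner batch after `BLOCKS_PER_ITERATION` blocks, which is where the real code releases and re-takes the write lock; without it, `processed < limit` never advances and the method returns 0.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class: the batch loop shape from the description, with
// the increment that HDFS-14642 adds.
public class MisReplicatedBatchDemo {
    static final int BLOCKS_PER_ITERATION = 2;

    static int processMisReplicatedBlocks(List<String> blocks) {
        int processed = 0;
        Iterator<String> iter = blocks.iterator();
        while (iter.hasNext()) {
            int limit = processed + BLOCKS_PER_ITERATION;
            // in the real code: namesystem.writeLockInterruptibly() here
            while (iter.hasNext() && processed < limit) {
                String blk = iter.next();  // stand-in for processMisReplicatedBlock(blk)
                processed++;               // the increment missing from the original
            }
            // in the real code: namesystem.writeUnlock() here, so the write
            // lock is dropped between batches instead of held throughout
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(processMisReplicatedBlocks(
            Arrays.asList("blk_1", "blk_2", "blk_3")));
    }
}
```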
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882358#comment-16882358 ] Eric Yang commented on HDDS-1554: - [~elek] Thank you for the thorough review, and here are my answers: {quote}There are no clear separation between the two areas and it makes it impossible to run the same tests in any other environment. As an example: existing smoketests can be easily executed in kubernetes.{quote} The StartCluster/StopCluster methods only use docker-compose to start a distributed cluster at this time. They can be extended later to launch an Ozone cluster in a different environment, such as a YARN service, to keep the scope of the current feature set in check. The smoke test only runs in a closed environment; there is no exposure to an outside network. The Ozone cli client works inside a docker container, which prevents the tests from interacting with the Ozone cluster from an external network. Embedding the robot framework in the docker container also poses the risk of inseparable test artifacts that cannot be removed later. Test artifacts make up 400+MB of the current docker image, which is 100% more bloated than necessary. Even if the test artifacts are separated later, we have no way to be sure that the docker image functions properly without them, because the tests only work in docker containers. The fault injection tests map the ports manually to the external network, so they can provide better coverage today for testing the docker network against an external environment. The smoke test could be modified to work with an external network, but that would effectively double the installation of the test tool chain on the host system and double the shell script configuration templating needed to run the Robot framework on the host. Given that maven is already a dev toolchain set in stone for Hadoop, and with a minimalist mindset to make the best use of that tool chain, 
I chose to stay in the maven environment to let the existing tool chain do the style and error checking for these tests. I respect your approach of writing the smoke test in shell + robot framework, but fault injection tests can run more efficiently with the help of the Java tool chain, imho. {quote}Using `@FixMethodOrder(MethodSorters.NAME_ASCENDING)` is especially error-prone and shows that we use unit test in a way which is different from the normal usage (usually the @Test methods should be independent){quote} There is nothing wrong with using this feature; it was implemented based on popular request. It is developers who don't know how to use the feature correctly who create problems for themselves. I will group the entire flow into one test case since you don't like the annotation. {quote}B.) The current tests uses the approach to use external mounts to reuse the /data directory. It introduces new problems (see the workarounds with the id/UID). All of them can be avoided with using internal docker volumes which makes it possible to achieve the same without additional workarounds (ITDiskReadOnly doesn't require direct access to the volume ITDiskCorruption.addCorruption can be replaced with simple shell executions `echo "asd" > ../db`){quote} There are two possible cases where data becomes read-only: B.1) the disk is mounted as read-only, or B.2) the data file is read-only. It would be nice to have distinct error messages informing the system administrator either to adjust how the data disk is mounted, or that an error was made when copying data while servicing the server. The ITReadOnly test focuses on case #B.1 and can be expanded to case #B.2 if necessary. Using an internal path cannot clearly test case #B.1. {quote}C.) The unit test files contains a lot of code duplication inside and outside the cluster. For example ITDiskReadOnly.startCluster and ITDiskReadOnly.stopCluster are almost the same. 
The logic of waiting for the safe mode is duplicated in all of the subprojects.{quote} Good point, I will clean this up in the next patch. {quote}D.) I would use JUnit assertion everywhere. For me combining java and junit assertions are confusing (ITDiskReadOnly.startCluster/stopCluster) E.) I would use Autoclosable try-catch blocks (or at least finally blocks) to close opened resources. It would help to define the boundaries where the resources are used.{quote} Setup procedures are not tests. It is common to throw IOException and InterruptedException from a JUnit setup method; this is the reason they are written this way. You are correct that an autocloseable block is a nicer way to close file system resources, and I will make the change. {quote}F.) We don't need to catch `Exceptions` and `fail` as it's already handled by JUnit framework. Just remove try and catch block.{quote} I like to be able to identify the starting point of the log. This is the reason that I use try, catch and additional string
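Point E above, closing opened resources with an autocloseable block, can be sketched in plain Java (illustrative only; the class, method, and file names are made up, and this is not the HDDS test code). try-with-resources closes the Writer even when the body throws, making the resource's lifetime visible from the block's boundaries with no explicit finally:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: write a line with try-with-resources, then read
// it back to show the stream was flushed and closed at the block boundary.
public class AutoCloseDemo {
    static String writeAndRead() {
        try {
            Path tmp = Files.createTempFile("autoclose-demo", ".txt");
            try (Writer w = Files.newBufferedWriter(tmp)) {
                w.write("cluster is up");
            } // w.close() happens here automatically, even on exception
            String line = Files.readAllLines(tmp).get(0);
            Files.delete(tmp);
            return line;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeAndRead());
    }
}
```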
[jira] [Updated] (HDDS-1735) Create separated unit and integration test executor dev-support scripts
[ https://issues.apache.org/jira/browse/HDDS-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elek, Marton updated HDDS-1735: --- Summary: Create separated unit and integration test executor dev-support scripts (was: Create separate unit and integration test executor dev-support script) > Create separated unit and integration test executor dev-support scripts > --- > > Key: HDDS-1735 > URL: https://issues.apache.org/jira/browse/HDDS-1735 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Labels: pull-request-available > Attachments: Screen Shot 2019-07-02 at 3.25.33 PM.png > > Time Spent: 1h 40m > Remaining Estimate: 0h > > The hadoop-ozone/dev-support/checks directory contains multiple helper scripts to > execute different types of testing (findbugs, rat, unit, build). > They easily define how tests should be executed, with the following contract: > * The problems should be printed out to the console > * in case of a test failure a non-zero exit code should be used > > The tests are working well (in fact I have some experiments with executing > these scripts on k8s and argo where all the shell scripts are executed > in parallel) but we need some updates: > 1. Most important: the unit tests and integration tests can be separated. > Integration tests are more flaky and it's better to have a way to run only > the normal unit tests > 2. As HDDS-1115 introduced a pom.ozone.xml it's better to use it instead > of the magical "am pl hadoop-ozone-dist" trick > 3. To make it possible to run blockade tests in containers we should use the -T > flag with docker-compose > 4. checkstyle violations are printed out to the console -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13916) Distcp SnapshotDiff not completely implemented for supporting WebHdfs
[ https://issues.apache.org/jira/browse/HDFS-13916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882346#comment-16882346 ] Hadoop QA commented on HDFS-13916: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 32s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s{color} | {color:green} hadoop-tools/hadoop-distcp: The patch generated 0 new + 30 unchanged - 1 fixed = 30 total (was 31) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 22s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 51s{color} | {color:green} hadoop-distcp in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 59m 17s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-13916 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941443/HDFS-13916.005.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 916683e2dfc9 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 90b10a0 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_212 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/27195/testReport/ | | Max. process+thread count | 461 (vs. ulimit of 1) | | modules | C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27195/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Distcp SnapshotDiff not completely implemented for supporting WebHdfs >
[jira] [Commented] (HDFS-14499) Misleading REM_QUOTA value with snasphot and trash feature enabled for a directory
[ https://issues.apache.org/jira/browse/HDFS-14499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882340#comment-16882340 ] Tsz Wo Nicholas Sze commented on HDFS-14499: {code} + int id = lastSnapshotId != Snapshot.CURRENT_STATE_ID ? snapshotId : + this.lastSnapshotId; {code} It should be {{snapshotId != Snapshot.CURRENT_STATE_ID}}. The patch looks good other than that. > Misleading REM_QUOTA value with snasphot and trash feature enabled for a > directory > -- > > Key: HDFS-14499 > URL: https://issues.apache.org/jira/browse/HDFS-14499 > Project: Hadoop HDFS > Issue Type: Bug > Components: snapshots >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Attachments: HDFS-14499.000.patch, HDFS-14499.001.patch > > > This is the flow of steps where we see a discrepancy between REM_QUOTA and > new file operation failure. REM_QUOTA shows a value of 1 but file creation > operation does not succeed. > {code:java} > hdfs@c3265-node3 root$ hdfs dfs -mkdir /dir1 > hdfs@c3265-node3 root$ hdfs dfsadmin -setQuota 2 /dir1 > hdfs@c3265-node3 root$ hdfs dfsadmin -allowSnapshot /dir1 > Allowing snaphot on /dir1 succeeded > hdfs@c3265-node3 root$ hdfs dfs -touchz /dir1/file1 > hdfs@c3265-node3 root$ hdfs dfs -createSnapshot /dir1 snap1 > Created snapshot /dir1/.snapshot/snap1 > hdfs@c3265-node3 root$ hdfs dfs -count -v -q /dir1 > QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE > PATHNAME > 2 0 none inf 1 1 0 /dir1 > hdfs@c3265-node3 root$ hdfs dfs -rm /dir1/file1 > 19/03/26 11:20:25 INFO fs.TrashPolicyDefault: Moved: > 'hdfs://smajetinn/dir1/file1' to trash at: > hdfs://smajetinn/user/hdfs/.Trash/Current/dir1/file11553599225772 > hdfs@c3265-node3 root$ hdfs dfs -count -v -q /dir1 > QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE > PATHNAME > 2 1 none inf 1 0 0 /dir1 > hdfs@c3265-node3 root$ hdfs dfs -touchz /dir1/file1 > touchz: The NameSpace quota (directories and files) of 
directory /dir1 is > exceeded: quota=2 file count=3{code} > The issue here is that the count command takes only files and directories > into account, not the inode references. When trash is enabled, deleting a > file inside a directory actually performs a rename, as a result of which an > inode reference is maintained in the deleted list of the snapshot diff. That > reference is taken into account while computing the namespace quota, but the > count command (getContentSummary()) considers just the files and directories, > not the referenced entity, when calculating REM_QUOTA. The referenced entity > is taken into account for the space quota only. > InodeReference.java: > --- > {code:java} > @Override > public final ContentSummaryComputationContext computeContentSummary( > int snapshotId, ContentSummaryComputationContext summary) { > final int s = snapshotId < lastSnapshotId ? snapshotId : lastSnapshotId; > // only count storagespace for WithName > final QuotaCounts q = computeQuotaUsage( > summary.getBlockStoragePolicySuite(), getStoragePolicyID(), false, > s); > summary.getCounts().addContent(Content.DISKSPACE, q.getStorageSpace()); > summary.getCounts().addTypeSpaces(q.getTypeSpaces()); > return summary; > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
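The one-line condition flagged in the review comment above can be seen in a self-contained sketch. The class and sentinel constant below are simplified stand-ins for HDFS's {{Snapshot.CURRENT_STATE_ID}}, not the actual patch:

```java
// Illustrative sketch of the snapshot-id selection discussed above; the
// sentinel value and class names are simplified stand-ins, not HDFS code.
public class SnapshotIdSelection {
    // Stand-in for Snapshot.CURRENT_STATE_ID, which marks "no snapshot".
    static final int CURRENT_STATE_ID = Integer.MAX_VALUE - 1;

    // Form from the patch under review: tests lastSnapshotId instead of
    // snapshotId, so the sentinel can leak through as the chosen id.
    static int buggy(int snapshotId, int lastSnapshotId) {
        return lastSnapshotId != CURRENT_STATE_ID ? snapshotId : lastSnapshotId;
    }

    // Corrected form per the review comment: test snapshotId itself.
    static int fixed(int snapshotId, int lastSnapshotId) {
        return snapshotId != CURRENT_STATE_ID ? snapshotId : lastSnapshotId;
    }

    public static void main(String[] args) {
        // Caller passes the sentinel: we should fall back to lastSnapshotId.
        System.out.println(buggy(CURRENT_STATE_ID, 5)); // sentinel leaks through
        System.out.println(fixed(CURRENT_STATE_ID, 5)); // prints 5
    }
}
```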
[jira] [Commented] (HDDS-1611) Evaluate ACL on volume bucket key and prefix to authorize access
[ https://issues.apache.org/jira/browse/HDDS-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882338#comment-16882338 ] Hudson commented on HDDS-1611: -- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16883 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/16883/]) HDDS-1611.[Addendum] Evaluate ACL on volume bucket key and prefix to (aengineer: rev 6872efcabfd8fad5658642baa26df0e74399348b) * (edit) hadoop-ozone/ozonefs/src/main/java/org/apache/hadoop/fs/ozone/BasicOzoneFileSystem.java > Evaluate ACL on volume bucket key and prefix to authorize access > - > > Key: HDDS-1611 > URL: https://issues.apache.org/jira/browse/HDDS-1611 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0, 0.4.1 > > Attachments: HDDS-1611-fix-trunk.patch > > Time Spent: 6h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14642) processMisReplicatedBlocks does not return correct processed count
[ https://issues.apache.org/jira/browse/HDFS-14642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882337#comment-16882337 ] Hadoop QA commented on HDFS-14642: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 46s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 13s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 31s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}106m 2s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}163m 44s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.server.datanode.checker.TestThrottledAsyncChecker | \\ \\ || Subsystem || Report/Notes || | Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e53b4 | | JIRA Issue | HDFS-14642 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12974226/HDFS-14642.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux e6636f0f0bb1 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / bbf5844 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_212 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/27193/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/27193/testReport/ | | Max. process+thread count | 3059 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27193/console | |
[jira] [Resolved] (HDDS-1611) Evaluate ACL on volume bucket key and prefix to authorize access
[ https://issues.apache.org/jira/browse/HDDS-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-1611. Resolution: Fixed > Evaluate ACL on volume bucket key and prefix to authorize access > - > > Key: HDDS-1611 > URL: https://issues.apache.org/jira/browse/HDDS-1611 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0, 0.4.1 > > Attachments: HDDS-1611-fix-trunk.patch > > Time Spent: 6h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1611) Evaluate ACL on volume bucket key and prefix to authorize access
[ https://issues.apache.org/jira/browse/HDDS-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882336#comment-16882336 ] Anu Engineer commented on HDDS-1611: Attached and committed the build break patch provided by [~ajayydv]. Thank you for the quick response. > Evaluate ACL on volume bucket key and prefix to authorize access > - > > Key: HDDS-1611 > URL: https://issues.apache.org/jira/browse/HDDS-1611 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0, 0.4.1 > > Attachments: HDDS-1611-fix-trunk.patch > > Time Spent: 6h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1611) Evaluate ACL on volume bucket key and prefix to authorize access
[ https://issues.apache.org/jira/browse/HDDS-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDDS-1611: --- Attachment: HDDS-1611-fix-trunk.patch > Evaluate ACL on volume bucket key and prefix to authorize access > - > > Key: HDDS-1611 > URL: https://issues.apache.org/jira/browse/HDDS-1611 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0, 0.4.1 > > Attachments: HDDS-1611-fix-trunk.patch > > Time Spent: 6h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-1611) Evaluate ACL on volume bucket key and prefix to authorize access
[ https://issues.apache.org/jira/browse/HDDS-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882333#comment-16882333 ] Ajay Kumar edited comment on HDDS-1611 at 7/10/19 6:23 PM: --- [~anu], [~xyao] thanks for reviews. was (Author: ajayydv): [~anu], [~xyao] thanks reviews. > Evaluate ACL on volume bucket key and prefix to authorize access > - > > Key: HDDS-1611 > URL: https://issues.apache.org/jira/browse/HDDS-1611 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0, 0.4.1 > > Time Spent: 6h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1611) Evaluate ACL on volume bucket key and prefix to authorize access
[ https://issues.apache.org/jira/browse/HDDS-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882333#comment-16882333 ] Ajay Kumar commented on HDDS-1611: -- [~anu], [~xyao] thanks reviews. > Evaluate ACL on volume bucket key and prefix to authorize access > - > > Key: HDDS-1611 > URL: https://issues.apache.org/jira/browse/HDDS-1611 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0, 0.4.1 > > Time Spent: 6h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14410) Make Dynamometer documentation properly compile onto the Hadoop site
[ https://issues.apache.org/jira/browse/HDFS-14410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14410: --- Attachment: HDFS-14410.002.patch > Make Dynamometer documentation properly compile onto the Hadoop site > > > Key: HDFS-14410 > URL: https://issues.apache.org/jira/browse/HDFS-14410 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14410.000.patch, HDFS-14410.001.patch, > HDFS-14410.002.patch, dynamometer_guide_rendered.001.png > > > The documentation included with Dynamometer doesn't properly appear on the > site, we need to twiddle with this a bit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14410) Make Dynamometer documentation properly compile onto the Hadoop site
[ https://issues.apache.org/jira/browse/HDFS-14410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882331#comment-16882331 ] Erik Krogen commented on HDFS-14410: Rebased for v002 > Make Dynamometer documentation properly compile onto the Hadoop site > > > Key: HDFS-14410 > URL: https://issues.apache.org/jira/browse/HDFS-14410 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14410.000.patch, HDFS-14410.001.patch, > HDFS-14410.002.patch, dynamometer_guide_rendered.001.png > > > The documentation included with Dynamometer doesn't properly appear on the > site, we need to twiddle with this a bit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12703) Exceptions are fatal to decommissioning monitor
[ https://issues.apache.org/jira/browse/HDFS-12703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882329#comment-16882329 ] Hudson commented on HDFS-12703: --- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16882 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/16882/]) HDFS-12703. Exceptions are fatal to decommissioning monitor. Contributed (inigoiri: rev eccc9a40deda212cb367627f6f4cc35f5c619941) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminManager.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java > Exceptions are fatal to decommissioning monitor > --- > > Key: HDFS-12703 > URL: https://issues.apache.org/jira/browse/HDFS-12703 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: He Xiaoqiao >Priority: Critical > Fix For: 3.3.0, 3.2.1 > > Attachments: HDFS-12703.001.patch, HDFS-12703.002.patch, > HDFS-12703.003.patch, HDFS-12703.004.patch, HDFS-12703.005.patch, > HDFS-12703.006.patch, HDFS-12703.007.patch, HDFS-12703.008.patch, > HDFS-12703.009.patch, HDFS-12703.010.patch, HDFS-12703.011.patch, > HDFS-12703.012.patch, HDFS-12703.013.patch > > > The {{DecommissionManager.Monitor}} runs as an executor scheduled task. If > an exception occurs, all decommissioning ceases until the NN is restarted. > Per javadoc for {{executor#scheduleAtFixedRate}}: *If any execution of the > task encounters an exception, subsequent executions are suppressed*. The > monitor thread is alive but blocked waiting for an executor task that will > never come. The code currently disposes of the future so the actual > exception that aborted the task is gone. > Failover is insufficient since the task is also likely dead on the standby. 
> Replication queue init after the transition to active will fix the under > replication of blocks on currently decommissioning nodes but future nodes > never decommission. The standby must be bounced prior to failover – and > hopefully the error condition does not reoccur. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
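The executor behavior this issue hinges on can be demonstrated in isolation. The following is a minimal sketch of the failure mode and the catch-everything guard, not the actual DatanodeAdminManager change:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Demonstrates the behavior quoted above: per the ScheduledExecutorService
// javadoc, an exception that escapes a scheduleAtFixedRate task suppresses
// all subsequent executions, so a periodic monitor must catch throwables
// itself to stay alive.
public class MonitorSuppression {

    /** Runs two monitors for ~200ms; returns {naiveRuns, guardedRuns}. */
    static int[] runDemo() throws InterruptedException {
        ScheduledExecutorService exec = Executors.newScheduledThreadPool(2);
        AtomicInteger naive = new AtomicInteger();
        AtomicInteger guarded = new AtomicInteger();

        // Naive monitor: the exception escapes, so it only ever runs once.
        exec.scheduleAtFixedRate(() -> {
            naive.incrementAndGet();
            throw new RuntimeException("simulated monitor failure");
        }, 0, 20, TimeUnit.MILLISECONDS);

        // Guarded monitor: catches (and would log) the failure, keeps running.
        exec.scheduleAtFixedRate(() -> {
            try {
                guarded.incrementAndGet();
                throw new RuntimeException("simulated monitor failure");
            } catch (Throwable t) {
                // log and continue; never let it escape the periodic task
            }
        }, 0, 20, TimeUnit.MILLISECONDS);

        Thread.sleep(200);
        exec.shutdownNow();
        return new int[] {naive.get(), guarded.get()};
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = runDemo();
        System.out.println("naive runs:   " + r[0]); // 1
        System.out.println("guarded runs: " + r[1]); // several
    }
}
```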
[jira] [Commented] (HDFS-14622) [Dynamometer] State transition err when CCM( HDFS Centralized Cache Management) feature is used
[ https://issues.apache.org/jira/browse/HDFS-14622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882327#comment-16882327 ] Hudson commented on HDFS-14622: --- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16882 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/16882/]) HDFS-14622. [Dynamometer] Update XML FsImage parsing logic to ignore (xkrogen: rev 90b10a0d54c9f16ef423707a33be290b1e3f8c3f) * (edit) hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-blockgen/src/main/java/org/apache/hadoop/tools/dynamometer/blockgenerator/XMLParser.java * (edit) hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-blockgen/src/test/java/org/apache/hadoop/tools/dynamometer/blockgenerator/TestXMLParser.java > [Dynamometer] State transition err when CCM( HDFS Centralized Cache > Management) feature is used > --- > > Key: HDFS-14622 > URL: https://issues.apache.org/jira/browse/HDFS-14622 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: tools >Reporter: TanYuxin >Assignee: Erik Krogen >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14622.000.patch > > > When we used Dynamometer to test HDFS performance, the test encountered an > error when generating DataNode block info, and the generation process failed. 
> The error stack is > {code:java} > Error: java.io.IOException: State transition not allowed; from DEFAULT to > FILE_WITH_REPLICATION at > com.linkedin.dynamometer.blockgenerator.XMLParser.transitionTo(XMLParser.java:107) > at > com.linkedin.dynamometer.blockgenerator.XMLParser.parseLine(XMLParser.java:77) > at > com.linkedin.dynamometer.blockgenerator.XMLParserMapper.map(XMLParserMapper.java:53) > at > com.linkedin.dynamometer.blockgenerator.XMLParserMapper.map(XMLParserMapper.java:26) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:151) at > org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:828) at > org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) at > java.security.AccessController.doPrivileged(Native Method) at > javax.security.auth.Subject.doAs(Subject.java:415) at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1690) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168) > {code} > After checking the fsimage XML and the source code, I find that *XMLParser* > cannot parse these lines correctly; the lines look like > > {code:java} > <directive><id>8963</id><path>/user/somepath/path1</path><replication>3</replication><pool>cache_other_pool</pool><expiration><millis>1544454142310</millis><relative>false</relative></expiration></directive> > <directive><id>8964</id><path>/user/somepath/path2</path><replication>3</replication><pool>cache_hadoop-data_pool</pool><expiration><millis>1544497817686</millis><relative>false</relative></expiration></directive> > <directive><id>8965</id><path>/user/somepath/path3</path><replication>3</replication><pool>cache_hadoop-peisong_pool</pool><expiration><millis>1544451500312</millis><relative>false</relative></expiration></directive> > <directive><id>8967</id><path>/user/somepath/path4</path><replication>3</replication><pool>cache_other_pool</pool><expiration><millis>1544497602570</millis><relative>false</relative></expiration></directive> > {code} > > These fsimage XML lines are generated when the [HDFS Centralized Cache Management > (CCM)|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html] > feature is used. > I had a discussion with [~xkrogen] > [here|https://github.com/linkedin/dynamometer/pull/77], and the patches > provided there can fix the issue. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
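The commit above ("Update XML FsImage parsing logic to ignore ...") resolves this by skipping elements the block generator does not recognize instead of treating them as an illegal state transition. A simplified, self-contained sketch of that skip-unknown-elements idea follows; the tag whitelist, regex, and class name are illustrative stand-ins, not Dynamometer's actual XMLParser:

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of the failure mode above: a line-oriented fsimage-XML parser that
// only knows inode-related elements will choke on CacheManagerSection
// <directive> lines unless unknown elements are skipped rather than treated
// as an illegal state transition.
public class DirectiveSkippingParser {
    private static final Pattern ELEMENT = Pattern.compile("<(\\w+)>");
    // Elements the block generator actually cares about (illustrative list).
    private static final Set<String> KNOWN =
        Set.of("inode", "id", "type", "name", "replication", "blocks", "block");

    /** Returns true if every opening element on the line is recognized. */
    static boolean accepts(String line) {
        Matcher m = ELEMENT.matcher(line);
        while (m.find()) {
            if (!KNOWN.contains(m.group(1))) {
                // Old behavior: throw "State transition not allowed".
                // Fixed behavior: report unknown so the caller skips the line.
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String inodeLine = "<inode><id>8963</id><name>file1</name></inode>";
        String cacheLine =
            "<directive><id>8963</id><path>/user/somepath/path1</path></directive>";
        System.out.println(accepts(inodeLine)); // true
        System.out.println(accepts(cacheLine)); // false -> skipped, not fatal
    }
}
```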
[jira] [Commented] (HDDS-1611) Evaluate ACL on volume bucket key and prefix to authorize access
[ https://issues.apache.org/jira/browse/HDDS-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882328#comment-16882328 ] Hudson commented on HDDS-1611: -- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16882 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/16882/]) HDDS-1611. Evaluate ACL on volume bucket key and prefix to authorize (aengineer: rev cdb20adfcce22beb4f232f91822b190119d098ce) * (edit) hadoop-ozone/dist/src/main/smoketest/security/ozone-secure-fs.robot * (edit) hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/OzoneConfigKeys.java * (edit) hadoop-ozone/dist/src/main/smoketest/createbucketenv.robot * (edit) hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOzoneManager.java * (edit) hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/S3BucketManagerImpl.java * (edit) hadoop-ozone/dist/src/main/smoketest/s3/awss3.robot * (edit) hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOmBlockVersioning.java * (add) hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/security/acl/package-info.java * (edit) hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOmAcls.java * (edit) hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/web/client/TestVolume.java * (edit) hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/protocolPB/OzoneManagerRequestHandler.java * (edit) hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/protocolPB/OMPBHelper.java * (edit) hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java * (edit) hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/scm/cli/SQLCLI.java * (edit) hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/PrefixManagerImpl.java * (edit) hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/TestOzoneConfigurationFields.java * (edit) 
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/web/utils/OzoneUtils.java * (edit) hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/BucketManagerImpl.java * (edit) hadoop-ozone/dist/src/main/smoketest/commonlib.robot * (edit) hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java * (edit) hadoop-ozone/dist/src/main/compose/ozonesecure-mr/docker-config * (edit) hadoop-ozone/dist/src/main/compose/ozonesecure/docker-config * (edit) hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/IOzoneAcl.java * (edit) hadoop-ozone/dist/src/main/smoketest/createmrenv.robot * (edit) hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OzoneAcl.java * (edit) hadoop-ozone/dist/src/main/smoketest/kinit.robot * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/diskbalancer/TestDiskBalancer.java * (edit) hadoop-ozone/dist/src/main/smoketest/__init__.robot * (edit) hadoop-ozone/dist/src/main/smoketest/basic/ozone-shell.robot * (add) hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/security/acl/TestOzoneNativeAuthorizer.java * (edit) hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/VolumeManagerImpl.java * (edit) hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestOzoneRpcClientAbstract.java * (edit) hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/volume/ListVolumeHandler.java * (edit) hadoop-ozone/common/src/main/proto/OzoneManagerProtocol.proto * (edit) hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java * (edit) hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/security/acl/IAccessAuthorizer.java * (edit) hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManager.java * (edit) hadoop-ozone/ozonefs/src/main/java/org/apache/hadoop/fs/ozone/BasicOzoneFileSystem.java * (edit) 
hadoop-ozone/dist/src/main/smoketest/ozonefs/ozonefs.robot * (add) hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/security/acl/OzoneNativeAuthorizer.java * (edit) hadoop-ozone/common/src/test/java/org/apache/hadoop/ozone/TestOzoneAcls.java * (edit) hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/OmOzoneAclMap.java * (edit) hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/protocolPB/OzoneManagerProtocolClientSideTranslatorPB.java > Evaluate ACL on volume bucket key and prefix to authorize access > - > > Key: HDDS-1611 > URL: https://issues.apache.org/jira/browse/HDDS-1611 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0, 0.4.1 > > Time
[jira] [Commented] (HDFS-12703) Exceptions are fatal to decommissioning monitor
[ https://issues.apache.org/jira/browse/HDFS-12703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882323#comment-16882323 ] Íñigo Goiri commented on HDFS-12703: Thanks [~hexiaoqiao] and [~xuel1] for working on this and [~sodonnell] for the review. Committed to trunk and branch-3.2. > Exceptions are fatal to decommissioning monitor > --- > > Key: HDFS-12703 > URL: https://issues.apache.org/jira/browse/HDFS-12703 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: He Xiaoqiao >Priority: Critical > Fix For: 3.3.0, 3.2.1 > > Attachments: HDFS-12703.001.patch, HDFS-12703.002.patch, > HDFS-12703.003.patch, HDFS-12703.004.patch, HDFS-12703.005.patch, > HDFS-12703.006.patch, HDFS-12703.007.patch, HDFS-12703.008.patch, > HDFS-12703.009.patch, HDFS-12703.010.patch, HDFS-12703.011.patch, > HDFS-12703.012.patch, HDFS-12703.013.patch > > > The {{DecommissionManager.Monitor}} runs as an executor scheduled task. If > an exception occurs, all decommissioning ceases until the NN is restarted. > Per javadoc for {{executor#scheduleAtFixedRate}}: *If any execution of the > task encounters an exception, subsequent executions are suppressed*. The > monitor thread is alive but blocked waiting for an executor task that will > never come. The code currently disposes of the future so the actual > exception that aborted the task is gone. > Failover is insufficient since the task is also likely dead on the standby. > Replication queue init after the transition to active will fix the under > replication of blocks on currently decommissioning nodes but future nodes > never decommission. The standby must be bounced prior to failover – and > hopefully the error condition does not reoccur. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14640) [Dynamometer] Fix TestDynamometerInfra failures
[ https://issues.apache.org/jira/browse/HDFS-14640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882324#comment-16882324 ] Erik Krogen commented on HDFS-14640: The issue was being caused by how {{find}} works; the {{-depth}} argument was working fine on my local (Mac OS) development machine, but failing on Linux boxes. I've switched to using {{-maxdepth}} and {{-mindepth}} which appears to be more cross-platform compatible. I also updated the Surefire timeout to be higher; it was masking the test failure because Surefire was timing out at the same time the test timeout was being triggered. Finally I improved the logic in the test to exit more eagerly if the application fails to start up altogether. > [Dynamometer] Fix TestDynamometerInfra failures > --- > > Key: HDFS-14640 > URL: https://issues.apache.org/jira/browse/HDFS-14640 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: test, tools >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14640.000.patch > > > I've been seeing Jenkins reporting some failures of the > {{TestDynamometerInfra}} test (basically a big integration test). It seems > like it's timing out after 15 minutes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14640) [Dynamometer] Fix TestDynamometerInfra failures
[ https://issues.apache.org/jira/browse/HDFS-14640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14640: --- Status: Patch Available (was: In Progress) > [Dynamometer] Fix TestDynamometerInfra failures > --- > > Key: HDFS-14640 > URL: https://issues.apache.org/jira/browse/HDFS-14640 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: test, tools >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14640.000.patch > > > I've been seeing Jenkins reporting some failures of the > {{TestDynamometerInfra}} test (basically a big integration test). It seems > like it's timing out after 15 minutes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14640) [Dynamometer] Fix TestDynamometerInfra failures
[ https://issues.apache.org/jira/browse/HDFS-14640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14640: --- Attachment: HDFS-14640.000.patch > [Dynamometer] Fix TestDynamometerInfra failures > --- > > Key: HDFS-14640 > URL: https://issues.apache.org/jira/browse/HDFS-14640 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: test, tools >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14640.000.patch > > > I've been seeing Jenkins reporting some failures of the > {{TestDynamometerInfra}} test (basically a big integration test). It seems > like it's timing out after 15 minutes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1611) Evaluate ACL on volume bucket key and prefix to authorize access
[ https://issues.apache.org/jira/browse/HDDS-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882318#comment-16882318 ] Anu Engineer commented on HDDS-1611: Reverting this since this is causing a build break in Ozone. [INFO] Apache Hadoop Ozone FileSystem . FAILURE [ 1.672 s] [INFO] Apache Hadoop Ozone FileSystem Single Jar Library .. SKIPPED [INFO] Apache Hadoop Ozone FileSystem Legacy Jar Library .. SKIPPED [INFO] Apache Hadoop Ozone Tools .. SKIPPED [INFO] Apache Hadoop Ozone Datanode ... SKIPPED [INFO] Apache Hadoop Ozone In-Place Upgrade ... SKIPPED [INFO] Apache Hadoop Ozone Distribution ... SKIPPED [INFO] Apache Hadoop Ozone Fault Injection Tests .. SKIPPED [INFO] Apache Hadoop Ozone Network Tests .. SKIPPED [INFO] Apache Hadoop Ozone Main ... SKIPPED [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 01:43 min [INFO] Finished at: 2019-07-10T11:08:09-07:00 [INFO] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hadoop-ozone-filesystem: Compilation failure [ERROR] /Users/aengineer/apache/hadoop/hadoop-ozone/ozonefs/src/main/java/org/apache/hadoop/fs/ozone/BasicOzoneFileSystem.java:[635,27] method getFileStatus in interface org.apache.hadoop.fs.ozone.OzoneClientAdapter cannot be applied to given types; [ERROR] required: java.lang.String,java.net.URI,org.apache.hadoop.fs.Path,java.lang.String [ERROR] found: java.lang.String [ERROR] reason: actual and formal argument lists differ in length [ERROR] [ERROR] -> [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. 
[ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :hadoop-ozone-filesystem > Evaluate ACL on volume bucket key and prefix to authorize access > - > > Key: HDDS-1611 > URL: https://issues.apache.org/jira/browse/HDDS-1611 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0, 0.4.1 > > Time Spent: 6h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Reopened] (HDDS-1611) Evaluate ACL on volume bucket key and prefix to authorize access
[ https://issues.apache.org/jira/browse/HDDS-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer reopened HDDS-1611: > Evaluate ACL on volume bucket key and prefix to authorize access > - > > Key: HDDS-1611 > URL: https://issues.apache.org/jira/browse/HDDS-1611 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0, 0.4.1 > > Time Spent: 6h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12703) Exceptions are fatal to decommissioning monitor
[ https://issues.apache.org/jira/browse/HDFS-12703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-12703: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.2.1 3.3.0 Status: Resolved (was: Patch Available) > Exceptions are fatal to decommissioning monitor > --- > > Key: HDFS-12703 > URL: https://issues.apache.org/jira/browse/HDFS-12703 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: He Xiaoqiao >Priority: Critical > Fix For: 3.3.0, 3.2.1 > > Attachments: HDFS-12703.001.patch, HDFS-12703.002.patch, > HDFS-12703.003.patch, HDFS-12703.004.patch, HDFS-12703.005.patch, > HDFS-12703.006.patch, HDFS-12703.007.patch, HDFS-12703.008.patch, > HDFS-12703.009.patch, HDFS-12703.010.patch, HDFS-12703.011.patch, > HDFS-12703.012.patch, HDFS-12703.013.patch > > > The {{DecommissionManager.Monitor}} runs as an executor scheduled task. If > an exception occurs, all decommissioning ceases until the NN is restarted. > Per javadoc for {{executor#scheduleAtFixedRate}}: *If any execution of the > task encounters an exception, subsequent executions are suppressed*. The > monitor thread is alive but blocked waiting for an executor task that will > never come. The code currently disposes of the future so the actual > exception that aborted the task is gone. > Failover is insufficient since the task is also likely dead on the standby. > Replication queue init after the transition to active will fix the under > replication of blocks on currently decommissioning nodes but future nodes > never decommission. The standby must be bounced prior to failover – and > hopefully the error condition does not reoccur. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
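The suppression behavior quoted from the `scheduleAtFixedRate` javadoc can be demonstrated with a self-contained sketch (the class and task names are illustrative, not Hadoop code). A catch-all inside `run()` — essentially the shape of the fix for the decommissioning monitor — keeps the schedule alive:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class MonitorGuardDemo {
    public static void main(String[] args) throws Exception {
        ScheduledExecutorService executor = Executors.newScheduledThreadPool(2);

        // Unguarded: the first uncaught exception cancels all future runs,
        // so this latch is only counted down once.
        CountDownLatch unguardedRuns = new CountDownLatch(3);
        executor.scheduleAtFixedRate(() -> {
            unguardedRuns.countDown();
            throw new RuntimeException("simulated monitor failure");
        }, 0, 20, TimeUnit.MILLISECONDS);

        // Guarded: catching Throwable inside run() lets the schedule survive.
        CountDownLatch guardedRuns = new CountDownLatch(3);
        executor.scheduleAtFixedRate(() -> {
            try {
                guardedRuns.countDown();
                throw new RuntimeException("simulated monitor failure");
            } catch (Throwable t) {
                // log and continue; rethrowing would suppress future runs
            }
        }, 0, 20, TimeUnit.MILLISECONDS);

        System.out.println("unguarded ran 3x: " + unguardedRuns.await(1, TimeUnit.SECONDS));
        System.out.println("guarded ran 3x: " + guardedRuns.await(1, TimeUnit.SECONDS));
        executor.shutdownNow();
    }
}
```

This prints `unguarded ran 3x: false` and `guarded ran 3x: true`, matching the described failure mode: the executor silently drops the task, the monitor thread blocks forever, and the original exception is lost with the discarded future.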
[jira] [Resolved] (HDDS-1611) Evaluate ACL on volume bucket key and prefix to authorize access
[ https://issues.apache.org/jira/browse/HDDS-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-1611. Resolution: Fixed Fix Version/s: 0.4.1 0.5.0 Thanks for the patch. I have committed this patch to the trunk. I will cherry pick this to branch 0.4.1 shortly. > Evaluate ACL on volume bucket key and prefix to authorize access > - > > Key: HDDS-1611 > URL: https://issues.apache.org/jira/browse/HDDS-1611 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0, 0.4.1 > > Time Spent: 6h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1611) Evaluate ACL on volume bucket key and prefix to authorize access
[ https://issues.apache.org/jira/browse/HDDS-1611?focusedWorklogId=274989=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-274989 ] ASF GitHub Bot logged work on HDDS-1611: Author: ASF GitHub Bot Created on: 10/Jul/19 18:04 Start Date: 10/Jul/19 18:04 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #973: HDDS-1611. Evaluate ACL on volume bucket key and prefix to authorize access. Contributed by Ajay Kumar. URL: https://github.com/apache/hadoop/pull/973 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 274989) Time Spent: 6h 50m (was: 6h 40m) > Evaluate ACL on volume bucket key and prefix to authorize access > - > > Key: HDDS-1611 > URL: https://issues.apache.org/jira/browse/HDDS-1611 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0, 0.4.1 > > Time Spent: 6h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14458) Report pmem stats to namenode
[ https://issues.apache.org/jira/browse/HDFS-14458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882301#comment-16882301 ] Rakesh R commented on HDFS-14458: - Thanks [~PhiloHe] for taking this ahead. Added a few comments: # By default {{dfs.datanode.max.locked.memory}} is zero. Do you want to disable in-memory caching if PMem-cache is enabled? If yes, please add a log message to convey the same. Could you try adding a unit test to automate this behavior? {code} this.memCacheStats = new MemoryCacheStats(0L); {code} # I'd prefer to avoid {{if (isPmemCacheEnabled())}} checks inside FsDatasetCache. How about {{cacheLoader#initialize(this)}} returning {{memStats}}? {code} MemoryCacheStats stats = cacheLoader.initialize(this); {code} # It would be appreciated if you could add a unit test for the results of the PMem stats. Thanks! > Report pmem stats to namenode > - > > Key: HDFS-14458 > URL: https://issues.apache.org/jira/browse/HDFS-14458 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14458.000.patch, HDFS-14458.001.patch, > HDFS-14458.002.patch > > > Currently, two important stats should be reported to NameNode: cache used and > cache capacity. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
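The second review point can be sketched as follows — a hypothetical, simplified rendering of the suggestion in which stub types stand in for the real Hadoop classes (actual signatures may differ). Each loader hands back its own stats object from `initialize()`, so the caller never branches on whether the pmem cache is enabled:

```java
// Stub standing in for org.apache.hadoop...MemoryCacheStats.
class MemoryCacheStats {
    final long capacity;
    MemoryCacheStats(long capacity) { this.capacity = capacity; }
}

// Each loader decides which stats apply; no isPmemCacheEnabled() checks
// are needed in the calling code.
interface MappableBlockLoader {
    MemoryCacheStats initialize(long maxLockedMemory);
}

class MemoryMappableBlockLoader implements MappableBlockLoader {
    public MemoryCacheStats initialize(long maxLockedMemory) {
        // in-memory cache: capacity is the configured locked-memory budget
        return new MemoryCacheStats(maxLockedMemory);
    }
}

class PmemMappableBlockLoader implements MappableBlockLoader {
    public MemoryCacheStats initialize(long maxLockedMemory) {
        // pmem cache in use: the in-memory locked budget is not consulted
        return new MemoryCacheStats(0L);
    }
}

public class CacheInitSketch {
    public static void main(String[] args) {
        MappableBlockLoader loader = new PmemMappableBlockLoader();
        MemoryCacheStats stats = loader.initialize(64L * 1024 * 1024);
        System.out.println("capacity=" + stats.capacity);
    }
}
```

The design choice is ordinary polymorphism: pushing the branch into the loader implementations keeps FsDatasetCache agnostic about the cache backend.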
[jira] [Commented] (HDFS-14410) Make Dynamometer documentation properly compile onto the Hadoop site
[ https://issues.apache.org/jira/browse/HDFS-14410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882300#comment-16882300 ] Hadoop QA commented on HDFS-14410: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 13s{color} | {color:red} HDFS-14410 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-14410 | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27194/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Make Dynamometer documentation properly compile onto the Hadoop site > > > Key: HDFS-14410 > URL: https://issues.apache.org/jira/browse/HDFS-14410 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14410.000.patch, HDFS-14410.001.patch, > dynamometer_guide_rendered.001.png > > > The documentation included with Dynamometer doesn't properly appear on the > site, we need to twiddle with this a bit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13916) Distcp SnapshotDiff not completely implemented for supporting WebHdfs
[ https://issues.apache.org/jira/browse/HDFS-13916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-13916: - Description: [~ljain] has worked on the JIRA: HDFS-13052 to provide the possibility to make DistCP of SnapshotDiff with WebHDFSFileSystem. However, in the patch, there is no modification for the real java class which is used by launching the command "hadoop distcp ..." You can check in the latest version here: [https://github.com/apache/hadoop/blob/branch-3.1.1/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpSync.java#L96-L100] In the method "preSyncCheck" of the class "DistCpSync", we still check if the file system is DFS. So I propose to change the class DistCpSync in order to take into consideration what was committed by Lokesh Jain. was: [~ljain] has worked on the JIRA: https://issues.apache.org/jira/browse/HDFS-13052 to provide the possibility to make DistCP of SnapshotDiff with WebHDFSFileSystem. However, in the patch, there is no modification for the real java class which is used by launching the command "hadoop distcp ..." You can check in the latest version here: [https://github.com/apache/hadoop/blob/branch-3.1.1/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpSync.java#L96-L100] In the method "preSyncCheck" of the class "DistCpSync", we still check if the file system is DFS. So I propose to change the class DistCpSync in order to take into consideration what was committed by Lokesh Jain. 
> Distcp SnapshotDiff not completely implemented for supporting WebHdfs > - > > Key: HDFS-13916 > URL: https://issues.apache.org/jira/browse/HDFS-13916 > Project: Hadoop HDFS > Issue Type: Improvement > Components: distcp, webhdfs >Affects Versions: 3.0.1, 3.1.1 >Reporter: Xun REN >Assignee: Xun REN >Priority: Major > Labels: easyfix, newbie, patch > Attachments: HDFS-13916.002.patch, HDFS-13916.003.patch, > HDFS-13916.004.patch, HDFS-13916.005.patch, HDFS-13916.patch > > > [~ljain] has worked on the JIRA: HDFS-13052 to provide the possibility to > make DistCP of SnapshotDiff with WebHDFSFileSystem. However, in the patch, > there is no modification for the real java class which is used by launching > the command "hadoop distcp ..." > > You can check in the latest version here: > [https://github.com/apache/hadoop/blob/branch-3.1.1/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpSync.java#L96-L100] > In the method "preSyncCheck" of the class "DistCpSync", we still check if the > file system is DFS. > So I propose to change the class DistCpSync in order to take into > consideration what was committed by Lokesh Jain. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
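A minimal sketch of the kind of change being proposed (stub classes stand in for the Hadoop filesystem types, and the `supportsSnapshotDiff` predicate is hypothetical): instead of `preSyncCheck` requiring the filesystem to be a `DistributedFileSystem`, accept any filesystem known to support snapshot diffs.

```java
// Stubs standing in for the real Hadoop classes.
class FileSystem {}
class DistributedFileSystem extends FileSystem {}
class WebHdfsFileSystem extends FileSystem {}

public class PreSyncCheckSketch {
    // Hypothetical predicate replacing the bare instanceof-DFS check
    // described in the issue.
    static boolean supportsSnapshotDiff(FileSystem fs) {
        return fs instanceof DistributedFileSystem
            || fs instanceof WebHdfsFileSystem;
    }

    public static void main(String[] args) {
        System.out.println(supportsSnapshotDiff(new WebHdfsFileSystem())); // true
        System.out.println(supportsSnapshotDiff(new FileSystem()));        // false
    }
}
```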
[jira] [Commented] (HDFS-12703) Exceptions are fatal to decommissioning monitor
[ https://issues.apache.org/jira/browse/HDFS-12703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882293#comment-16882293 ] Xue Liu commented on HDFS-12703: +1 on the latest patch Looks great to me! Thanks [~hexiaoqiao] for working on the patch and guys for reviewing! > Exceptions are fatal to decommissioning monitor > --- > > Key: HDFS-12703 > URL: https://issues.apache.org/jira/browse/HDFS-12703 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: He Xiaoqiao >Priority: Critical > Attachments: HDFS-12703.001.patch, HDFS-12703.002.patch, > HDFS-12703.003.patch, HDFS-12703.004.patch, HDFS-12703.005.patch, > HDFS-12703.006.patch, HDFS-12703.007.patch, HDFS-12703.008.patch, > HDFS-12703.009.patch, HDFS-12703.010.patch, HDFS-12703.011.patch, > HDFS-12703.012.patch, HDFS-12703.013.patch > > > The {{DecommissionManager.Monitor}} runs as an executor scheduled task. If > an exception occurs, all decommissioning ceases until the NN is restarted. > Per javadoc for {{executor#scheduleAtFixedRate}}: *If any execution of the > task encounters an exception, subsequent executions are suppressed*. The > monitor thread is alive but blocked waiting for an executor task that will > never come. The code currently disposes of the future so the actual > exception that aborted the task is gone. > Failover is insufficient since the task is also likely dead on the standby. > Replication queue init after the transition to active will fix the under > replication of blocks on currently decommissioning nodes but future nodes > never decommission. The standby must be bounced prior to failover – and > hopefully the error condition does not reoccur. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12733) Option to disable to namenode local edits
[ https://issues.apache.org/jira/browse/HDFS-12733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882282#comment-16882282 ] Ayush Saxena commented on HDFS-12733: - Thanks [~hexiaoqiao]. To me, guarding this with an additional config seems fair enough. {quote} with one more unit test is ready. {quote} If no one objects to the approach, I guess you may proceed with updating the tests. > Option to disable to namenode local edits > - > > Key: HDFS-12733 > URL: https://issues.apache.org/jira/browse/HDFS-12733 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode, performance >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula >Priority: Major > Attachments: HDFS-12733-001.patch, HDFS-12733-002.patch, > HDFS-12733-003.patch, HDFS-12733.004.patch, HDFS-12733.005.patch > > > As of now, Edits will be written in local and shared locations which will be > redundant and local edits never used in HA setup. > Disabling local edits gives little performance improvement. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12733) Option to disable to namenode local edits
[ https://issues.apache.org/jira/browse/HDFS-12733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882271#comment-16882271 ] He Xiaoqiao commented on HDFS-12733: [~brahmareddy],[~elgoiri],[~ayushtkn],[~shv] any furthermore suggestions for this issue? > Option to disable to namenode local edits > - > > Key: HDFS-12733 > URL: https://issues.apache.org/jira/browse/HDFS-12733 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode, performance >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula >Priority: Major > Attachments: HDFS-12733-001.patch, HDFS-12733-002.patch, > HDFS-12733-003.patch, HDFS-12733.004.patch, HDFS-12733.005.patch > > > As of now, Edits will be written in local and shared locations which will be > redundant and local edits never used in HA setup. > Disabling local edits gives little performance improvement. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14318) dn cannot be recognized and must be restarted to recognize the Repaired disk
[ https://issues.apache.org/jira/browse/HDFS-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882270#comment-16882270 ] He Xiaoqiao commented on HDFS-14318: Hi [~hunhun], it is an interesting feature; would you like to continue pushing it forward? > dn cannot be recognized and must be restarted to recognize the Repaired disk > > > Key: HDFS-14318 > URL: https://issues.apache.org/jira/browse/HDFS-14318 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: hunshenshi >Priority: Major > Attachments: HDFS-14318.patch > > > dn detected that disk a has failed. After disk a is repaired, dn cannot > recognize it and must be restarted to do so > > I made a patch for dn to recognize the repaired disk without restarting dn -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14281) Dynamometer Phase 2
[ https://issues.apache.org/jira/browse/HDFS-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14281: --- Component/s: (was: namenode) > Dynamometer Phase 2 > --- > > Key: HDFS-14281 > URL: https://issues.apache.org/jira/browse/HDFS-14281 > Project: Hadoop HDFS > Issue Type: Task > Components: test, tools >Reporter: Siyao Meng >Assignee: Siyao Meng >Priority: Major > > Phase 1: HDFS-12345 > This is the Phase 2 umbrella jira. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14281) Dynamometer Phase 2
[ https://issues.apache.org/jira/browse/HDFS-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14281: --- Component/s: tools > Dynamometer Phase 2 > --- > > Key: HDFS-14281 > URL: https://issues.apache.org/jira/browse/HDFS-14281 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode, test, tools >Reporter: Siyao Meng >Assignee: Siyao Meng >Priority: Major > > Phase 1: HDFS-12345 > This is the Phase 2 umbrella jira. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14640) [Dynamometer] Fix TestDynamometerInfra failures
[ https://issues.apache.org/jira/browse/HDFS-14640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14640: --- Component/s: tools test > [Dynamometer] Fix TestDynamometerInfra failures > --- > > Key: HDFS-14640 > URL: https://issues.apache.org/jira/browse/HDFS-14640 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: test, tools >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > > I've been seeing Jenkins reporting some failures of the > {{TestDynamometerInfra}} test (basically a big integration test). It seems > like it's timing out after 15 minutes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14622) [Dynamometer] State transition err when CCM( HDFS Centralized Cache Management) feature is used
[ https://issues.apache.org/jira/browse/HDFS-14622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14622: --- Issue Type: Sub-task (was: Bug) Parent: HDFS-14281 > [Dynamometer] State transition err when CCM( HDFS Centralized Cache > Management) feature is used > --- > > Key: HDFS-14622 > URL: https://issues.apache.org/jira/browse/HDFS-14622 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: tools >Reporter: TanYuxin >Assignee: Erik Krogen >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14622.000.patch > > > When we used dynamometer to test HDFS performance, the test encountered a > error when generate DataNode Block info, then the generation process failed. > The error stack is > {code:java} > Error: java.io.IOException: State transition not allowed; from DEFAULT to > FILE_WITH_REPLICATION at > com.linkedin.dynamometer.blockgenerator.XMLParser.transitionTo(XMLParser.java:107) > at > com.linkedin.dynamometer.blockgenerator.XMLParser.parseLine(XMLParser.java:77) > at > com.linkedin.dynamometer.blockgenerator.XMLParserMapper.map(XMLParserMapper.java:53) > at > com.linkedin.dynamometer.blockgenerator.XMLParserMapper.map(XMLParserMapper.java:26) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:151) at > org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:828) at > org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) at > java.security.AccessController.doPrivileged(Native Method) at > javax.security.auth.Subject.doAs(Subject.java:415) at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1690) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168) > {code} > After checking Fsimage xml and the source code, I find that *XMLParser* can > not parse the lines correctly, these lines are like > > {code:java} > 8963/user/somepath/path13cache_other_pool1544454142310false > > 
8964/user/somepath/path23cache_hadoop-data_pool1544497817686false > > 8965/user/somepath/path33cache_hadoop-peisong_pool1544451500312false > > 8967/user/somepath/path43cache_other_pool1544497602570false > {code} > > These fsimage xml lines are generated when [HDFS Centralized Cache Management > (CCM)|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html] > feature is used. > I have a discuss with [~xkrogen] > [here|[https://github.com/linkedin/dynamometer/pull/77]], and some patches > provided can fix the issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
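The failure amounts to a missing edge in a small parser state machine: the stack trace shows `transitionTo` rejecting `DEFAULT -> FILE_WITH_REPLICATION` when a cached-file entry appears. A hypothetical sketch of the bug class and its fix (state names follow the stack trace; the transition table is illustrative, not the Dynamometer source):

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

public class TransitionSketch {
    enum State { DEFAULT, FILE_WITH_REPLICATION, DIRECTORY }

    static final Map<State, EnumSet<State>> ALLOWED = new EnumMap<>(State.class);
    static {
        // Fixed table: cached-file entries may follow DEFAULT directly.
        // The bug class is simply this edge being absent.
        ALLOWED.put(State.DEFAULT,
            EnumSet.of(State.FILE_WITH_REPLICATION, State.DIRECTORY));
        ALLOWED.put(State.FILE_WITH_REPLICATION, EnumSet.of(State.DEFAULT));
        ALLOWED.put(State.DIRECTORY, EnumSet.of(State.DEFAULT));
    }

    static State transitionTo(State from, State to) {
        if (!ALLOWED.getOrDefault(from, EnumSet.noneOf(State.class)).contains(to)) {
            throw new IllegalStateException(
                "State transition not allowed; from " + from + " to " + to);
        }
        return to;
    }

    public static void main(String[] args) {
        State s = transitionTo(State.DEFAULT, State.FILE_WITH_REPLICATION);
        System.out.println("now in " + s);
    }
}
```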
[jira] [Commented] (HDFS-14622) [Dynamometer] State transition err when CCM( HDFS Centralized Cache Management) feature is used
[ https://issues.apache.org/jira/browse/HDFS-14622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882268#comment-16882268 ] Erik Krogen commented on HDFS-14622: Thanks [~jojochuang]. I'm working on addressing the findbugs issue in HADOOP-16418. I just committed this to trunk. > [Dynamometer] State transition err when CCM( HDFS Centralized Cache > Management) feature is used > --- > > Key: HDFS-14622 > URL: https://issues.apache.org/jira/browse/HDFS-14622 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Reporter: TanYuxin >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14622.000.patch > > > When we used dynamometer to test HDFS performance, the test encountered a > error when generate DataNode Block info, then the generation process failed. > The error stack is > {code:java} > Error: java.io.IOException: State transition not allowed; from DEFAULT to > FILE_WITH_REPLICATION at > com.linkedin.dynamometer.blockgenerator.XMLParser.transitionTo(XMLParser.java:107) > at > com.linkedin.dynamometer.blockgenerator.XMLParser.parseLine(XMLParser.java:77) > at > com.linkedin.dynamometer.blockgenerator.XMLParserMapper.map(XMLParserMapper.java:53) > at > com.linkedin.dynamometer.blockgenerator.XMLParserMapper.map(XMLParserMapper.java:26) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:151) at > org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:828) at > org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) at > java.security.AccessController.doPrivileged(Native Method) at > javax.security.auth.Subject.doAs(Subject.java:415) at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1690) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168) > {code} > After checking Fsimage xml and the source code, I find that *XMLParser* can > not parse the lines correctly, these lines are like > > {code:java} > 
8963/user/somepath/path13cache_other_pool1544454142310false > > 8964/user/somepath/path23cache_hadoop-data_pool1544497817686false > > 8965/user/somepath/path33cache_hadoop-peisong_pool1544451500312false > > 8967/user/somepath/path43cache_other_pool1544497602570false > {code} > > These fsimage xml lines are generated when [HDFS Centralized Cache Management > (CCM)|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html] > feature is used. > I have a discuss with [~xkrogen] > [here|[https://github.com/linkedin/dynamometer/pull/77]], and some patches > provided can fix the issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14622) [Dynamometer] State transition err when CCM( HDFS Centralized Cache Management) feature is used
[ https://issues.apache.org/jira/browse/HDFS-14622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14622: --- Resolution: Fixed Fix Version/s: 3.3.0 Status: Resolved (was: Patch Available) > [Dynamometer] State transition err when CCM( HDFS Centralized Cache > Management) feature is used > --- > > Key: HDFS-14622 > URL: https://issues.apache.org/jira/browse/HDFS-14622 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Reporter: TanYuxin >Assignee: Erik Krogen >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14622.000.patch > > > When we used dynamometer to test HDFS performance, the test encountered a > error when generate DataNode Block info, then the generation process failed. > The error stack is > {code:java} > Error: java.io.IOException: State transition not allowed; from DEFAULT to > FILE_WITH_REPLICATION at > com.linkedin.dynamometer.blockgenerator.XMLParser.transitionTo(XMLParser.java:107) > at > com.linkedin.dynamometer.blockgenerator.XMLParser.parseLine(XMLParser.java:77) > at > com.linkedin.dynamometer.blockgenerator.XMLParserMapper.map(XMLParserMapper.java:53) > at > com.linkedin.dynamometer.blockgenerator.XMLParserMapper.map(XMLParserMapper.java:26) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:151) at > org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:828) at > org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) at > java.security.AccessController.doPrivileged(Native Method) at > javax.security.auth.Subject.doAs(Subject.java:415) at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1690) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168) > {code} > After checking Fsimage xml and the source code, I find that *XMLParser* can > not parse the lines correctly, these lines are like > > {code:java} > 8963/user/somepath/path13cache_other_pool1544454142310false > > 
8964/user/somepath/path23cache_hadoop-data_pool1544497817686false > > 8965/user/somepath/path33cache_hadoop-peisong_pool1544451500312false > > 8967/user/somepath/path43cache_other_pool1544497602570false > {code} > > These fsimage xml lines are generated when [HDFS Centralized Cache Management > (CCM)|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html] > feature is used. > I have a discuss with [~xkrogen] > [here|[https://github.com/linkedin/dynamometer/pull/77]], and some patches > provided can fix the issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14410) Make Dynamometer documentation properly compile onto the Hadoop site
[ https://issues.apache.org/jira/browse/HDFS-14410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14410: --- Attachment: (was: HDFS-14410.001.patch) > Make Dynamometer documentation properly compile onto the Hadoop site > > > Key: HDFS-14410 > URL: https://issues.apache.org/jira/browse/HDFS-14410 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14410.000.patch, HDFS-14410.001.patch, > dynamometer_guide_rendered.001.png > > > The documentation included with Dynamometer doesn't properly appear on the > site, we need to twiddle with this a bit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14410) Make Dynamometer documentation properly compile onto the Hadoop site
[ https://issues.apache.org/jira/browse/HDFS-14410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14410: --- Attachment: HDFS-14410.001.patch > Make Dynamometer documentation properly compile onto the Hadoop site > > > Key: HDFS-14410 > URL: https://issues.apache.org/jira/browse/HDFS-14410 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14410.000.patch, HDFS-14410.001.patch, > dynamometer_guide_rendered.001.png > > > The documentation included with Dynamometer doesn't properly appear on the > site, we need to twiddle with this a bit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14410) Make Dynamometer documentation properly compile onto the Hadoop site
[ https://issues.apache.org/jira/browse/HDFS-14410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14410: --- Attachment: dynamometer_guide_rendered.001.png > Make Dynamometer documentation properly compile onto the Hadoop site > > > Key: HDFS-14410 > URL: https://issues.apache.org/jira/browse/HDFS-14410 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14410.000.patch, HDFS-14410.001.patch, > dynamometer_guide_rendered.001.png > > > The documentation included with Dynamometer doesn't properly appear on the > site, we need to twiddle with this a bit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14410) Make Dynamometer documentation properly compile onto the Hadoop site
[ https://issues.apache.org/jira/browse/HDFS-14410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882266#comment-16882266 ] Erik Krogen commented on HDFS-14410: Thanks for the review [~linyiqun]. I've added the "step" descriptions and a whole section (with diagrams) regarding the architecture, as well as a few links to external resources. I've attached a v001 patch and a screenshot of the rendered output. > Make Dynamometer documentation properly compile onto the Hadoop site > > > Key: HDFS-14410 > URL: https://issues.apache.org/jira/browse/HDFS-14410 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14410.000.patch > > > The documentation included with Dynamometer doesn't properly appear on the > site, we need to twiddle with this a bit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1775) Make OM KeyDeletingService compatible with HA model
[ https://issues.apache.org/jira/browse/HDDS-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanisha Koneru updated HDDS-1775: - Status: Patch Available (was: Open) > Make OM KeyDeletingService compatible with HA model > --- > > Key: HDDS-1775 > URL: https://issues.apache.org/jira/browse/HDDS-1775 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Currently OM KeyDeletingService directly deletes all the keys in DeletedTable > after deleting the corresponding blocks through SCM. For HA compatibility, > the key purging should happen through the OM Ratis server. This Jira > introduces PurgeKeys request in OM protocol. This request will be submitted > to OMs Ratis server after SCM deletes blocks corresponding to deleted keys. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1649) On installSnapshot notification from OM leader, download checkpoint and reload OM state
[ https://issues.apache.org/jira/browse/HDDS-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanisha Koneru updated HDDS-1649: - Status: Patch Available (was: Open) > On installSnapshot notification from OM leader, download checkpoint and > reload OM state > --- > > Key: HDDS-1649 > URL: https://issues.apache.org/jira/browse/HDDS-1649 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Major > Labels: pull-request-available > Time Spent: 2h > Remaining Estimate: 0h > > When an OM follower receives installSnapshot notification from OM leader, it > should initiate a new checkpoint on the OM leader and download that > checkpoint. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882253#comment-16882253 ] Eric Yang commented on HDDS-1554: - Thank you for the review [~arp]. {quote}I think the docker-compose.yaml files should wrap ${UID}:${GID} in quotes.{quote} Agreed, I found this is required for docker-compose on Mac, but not on Linux. I will make the correction for this. {quote}Another issue I ran into: Since the docker inline build with -Ddocker-build generates the image with ${user.name}/ozone, should we use the same in the docker-compose files instead of apache/ozone?{quote} The compose file uses docker.image as a reference to the image name. This allows inheritance of the image name from the top level. The only inconsistency is allowing two distinct defaults based on whether the docker-build profile is activated or not. The concern was raised in HDDS-1667, but [~elek] would not review my patch unless I followed exactly what he specified. Marton's argument was that Ozone Kubernetes development requires a distinct prefix repository to pull the docker image for the distributed environment during development. The fix could easily be to supply -Ddocker.image on the command line to customize the value. Instead, he insisted on using -Pdocker-build to activate the ${username}/ozone default value. This is the reason that we need to pass -Pdocker-build for the fault injection test when we are not building a docker image. We can also run the fault injection test with: {code}mvn clean verify -Pit -Ddocker.image=${user.name}/ozone:0.5.0-SNAPSHOT{code} I think these double defaults are not intuitive; a single default of apache/ozone:${project.version} would make the user experience much better, as it allows users to do development without having to specify the docker image name. The SNAPSHOT string in the docker image tag is enough to determine if it is a local image. 
A Kubernetes developer can configure -Ddocker.image in settings.xml to customize the docker image without making ${user.name}/ozone:0.5.0-SNAPSHOT mandatory, which otherwise forces the -Pdocker-build flag or -Ddocker.image= flag on non-Kubernetes developers. Unless [~elek] agrees this is a required change, it will be hard to clean up the messy maven code that was forced in HDDS-1667. {quote}There is quite a bit of duplication of the configuration files and YAML files. Do you think there is a way to reduce the duplication?{quote} It is possible to clean up the duplication using docker compose inheritance. I will add this to my next patch. Thank you for the review. > Create disk tests for fault injection test > -- > > Key: HDDS-1554 > URL: https://issues.apache.org/jira/browse/HDDS-1554 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: build >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Labels: pull-request-available > Attachments: HDDS-1554.001.patch, HDDS-1554.002.patch, > HDDS-1554.003.patch, HDDS-1554.004.patch, HDDS-1554.005.patch, > HDDS-1554.006.patch, HDDS-1554.007.patch, HDDS-1554.008.patch, > HDDS-1554.009.patch, HDDS-1554.010.patch, HDDS-1554.011.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > The current plan for the fault injection disk tests is: > # Scenario 1 - Read/Write test > ## Run docker-compose to bring up a cluster > ## Initialize scm and om > ## Upload data to Ozone cluster > ## Verify data is correct > ## Shutdown cluster > # Scenario 2 - Read/Only test > ## Repeat Scenario 1 > ## Mount data disk as read only > ## Try to write data to Ozone cluster > ## Validate error message is correct > ## Shutdown cluster > # Scenario 3 - Corruption test > ## Repeat Scenario 2 > ## Shutdown cluster > ## Modify data disk data > ## Restart cluster > ## Validate error message for read from corrupted data > ## Validate error message for write to corrupted volume -- This message was sent by 
Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
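A minimal sketch of the settings.xml approach mentioned in the comment above. The profile id and the repository prefix "myrepo" are illustrative values, not from the patch; only the docker.image property name comes from the discussion:

```xml
<!-- ~/.m2/settings.xml -- hypothetical profile; activating it makes every
     build pick up the custom docker.image without extra command-line flags -->
<settings>
  <profiles>
    <profile>
      <id>ozone-k8s-dev</id>
      <properties>
        <docker.image>myrepo/ozone:0.5.0-SNAPSHOT</docker.image>
      </properties>
    </profile>
  </profiles>
  <activeProfiles>
    <activeProfile>ozone-k8s-dev</activeProfile>
  </activeProfiles>
</settings>
```

With such a profile active, `mvn clean verify -Pit` would resolve ${docker.image} from settings.xml, so neither -Pdocker-build nor -Ddocker.image= would need to be passed on the command line.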
[jira] [Commented] (HDDS-1778) Fix existing blockade tests
[ https://issues.apache.org/jira/browse/HDDS-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882250#comment-16882250 ] Hudson commented on HDDS-1778: -- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16881 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/16881/]) HDDS-1778. Fix existing blockade tests. (#1068) (arp7: rev efb916457fc5af868cb7003ee99e0ce3a050a4d2) * (add) hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/ozone/client.py * (edit) hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/test_blockade_scm_isolation.py * (delete) hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/blockadeUtils/blockade.py * (add) hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/ozone/blockade.py * (add) hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/ozone/constants.py * (add) hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/ozone/exceptions.py * (delete) hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/util.py * (edit) hadoop-ozone/dist/src/main/compose/ozoneblockade/docker-config * (delete) hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/blockadeUtils/__init__.py * (edit) hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/test_blockade_mixed_failure_two_nodes.py * (edit) hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/test_blockade_client_failure.py * (edit) hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/ozone/cluster.py * (edit) hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/test_blockade_mixed_failure.py * (add) hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/ozone/container.py * (edit) hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/test_blockade_flaky.py * (add) hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/ozone/util.py * (edit) 
hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/test_blockade_datanode_isolation.py * (edit) hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/test_blockade_mixed_failure_three_nodes_isolate.py * (delete) hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/clusterUtils/__init__.py * (delete) hadoop-ozone/fault-injection-test/network-tests/src/test/blockade/clusterUtils/cluster_utils.py > Fix existing blockade tests > --- > > Key: HDDS-1778 > URL: https://issues.apache.org/jira/browse/HDDS-1778 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.4.1 > > Time Spent: 0.5h > Remaining Estimate: 0h > > This jira is to track and fix existing blockade test cases. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1778) Fix existing blockade tests
[ https://issues.apache.org/jira/browse/HDDS-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-1778: Resolution: Fixed Fix Version/s: 0.4.1 Target Version/s: (was: 0.4.1) Status: Resolved (was: Patch Available) Committed this to trunk. Thanks for the contribution [~nandakumar131]! > Fix existing blockade tests > --- > > Key: HDDS-1778 > URL: https://issues.apache.org/jira/browse/HDDS-1778 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.4.1 > > Time Spent: 0.5h > Remaining Estimate: 0h > > This jira is to track and fix existing blockade test cases. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1778) Fix existing blockade tests
[ https://issues.apache.org/jira/browse/HDDS-1778?focusedWorklogId=274947=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-274947 ] ASF GitHub Bot logged work on HDDS-1778: Author: ASF GitHub Bot Created on: 10/Jul/19 16:44 Start Date: 10/Jul/19 16:44 Worklog Time Spent: 10m Work Description: arp7 commented on pull request #1068: HDDS-1778. Fix existing blockade tests. URL: https://github.com/apache/hadoop/pull/1068 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 274947) Time Spent: 0.5h (was: 20m) > Fix existing blockade tests > --- > > Key: HDDS-1778 > URL: https://issues.apache.org/jira/browse/HDDS-1778 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.4.1 > > Time Spent: 0.5h > Remaining Estimate: 0h > > This jira is to track and fix existing blockade test cases. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1780) TestFailureHandlingByClient tests are flaky
[ https://issues.apache.org/jira/browse/HDDS-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee updated HDDS-1780: -- Status: Patch Available (was: Open) > TestFailureHandlingByClient tests are flaky > --- > > Key: HDDS-1780 > URL: https://issues.apache.org/jira/browse/HDDS-1780 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Affects Versions: 0.5.0 >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 10m > Remaining Estimate: 0h > > The tests seem to fail because, when the datanode goes down with the stale node > interval set to a low value, containers may get closed early and client > writes might fail with a closed container exception rather than the pipeline > failure/timeout exceptions expected by the tests. The fix made here is to > tune the stale node interval. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882234#comment-16882234 ] Erik Krogen commented on HDFS-12748: +1 on branch-3.1 backport > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Fix For: 3.3.0, 3.2.1 > > Attachments: HDFS-12748-branch-3.1.01.patch, HDFS-12748.001.patch, > HDFS-12748.002.patch, HDFS-12748.003.patch, HDFS-12748.004.patch, > HDFS-12748.005.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? 
conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
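The try/finally fix shown in the comment above can also be written with try-with-resources, since FileSystem implements java.io.Closeable. A self-contained sketch of the pattern, using a hypothetical CachingResource stand-in instead of the real Hadoop classes so it runs on its own:

```java
import java.io.Closeable;

public class HomeDirectoryExample {
    // Stand-in for an expensive cache-backed handle such as DistributedFileSystem.
    static final class CachingResource implements Closeable {
        boolean closed = false;
        String getHomeDirectory() { return "/user/test"; }
        @Override public void close() { closed = true; }
    }

    static String homeDirectoryJson(CachingResource resource) {
        // try-with-resources guarantees close() runs even if the body throws,
        // which is what stops the cache from accumulating unclosed instances.
        try (CachingResource fs = resource) {
            return "{\"Path\":\"" + fs.getHomeDirectory() + "\"}";
        }
    }

    public static void main(String[] args) {
        CachingResource r = new CachingResource();
        System.out.println(homeDirectoryJson(r) + " closed=" + r.closed);
    }
}
```

The behavior matches the explicit finally block in the patch; try-with-resources just makes the guaranteed cleanup harder to get wrong.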
[jira] [Commented] (HDDS-1777) JVM crash while shutting down Ozone datanode in ShutdownHook
[ https://issues.apache.org/jira/browse/HDDS-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882228#comment-16882228 ] Mukul Kumar Singh commented on HDDS-1777: - I was able to reproduce this after running MiniOzoneChaosCluster. However, this does not reproduce every time. I have seen it only once so far, out of close to 100+ runs. > JVM crash while shutting down Ozone datanode in ShutdownHook > > > Key: HDDS-1777 > URL: https://issues.apache.org/jira/browse/HDDS-1777 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Priority: Major > Labels: MiniOzoneChaosCluster > Attachments: hs_err_pid1459.log > > > JVM crash while shutting down Ozone datanode in ShutdownHook with the > following exception. > {code} > Stack: [0x70008791,0x700087a1], sp=0x700087a0db20, free > space=1014k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native > code) > C [libsystem_platform.dylib+0x1d09] _platform_memmove$VARIANT$Haswell+0x29 > C [libzip.dylib+0x3399] newEntry+0x65b > C [libzip.dylib+0x352d] ZIP_GetEntry2+0xd4 > C [libzip.dylib+0x2238] Java_java_util_zip_ZipFile_getEntry+0xcf > J 108 java.util.zip.ZipFile.getEntry(J[BZ)J (0 bytes) @ 0x0001087d23ce > [0x0001087d2300+0xce] > J 4302 C2 > java.util.jar.JarFile.getEntry(Ljava/lang/String;)Ljava/util/zip/ZipEntry; > (22 bytes) @ 0x000108d659e8 [0x000108d65660+0x388] > J 4583 C2 > sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource; > (85 bytes) @ 0x000108954b24 [0x000108954aa0+0x84] > J 25559 C2 java.net.URLClassLoader$2.run()Ljava/lang/Object; (5 bytes) @ > 0x00010c2c04c8 [0x00010c2c0380+0x148] > v ~StubRoutines::call_stub > V [libjvm.dylib+0x2ef1f6] > V [libjvm.dylib+0x34fb24] > J 4197 > java.security.AccessController.doPrivileged(Ljava/security/PrivilegedAction;Ljava/security/AccessControlContext;)Ljava/lang/Object; > (0 bytes) @ 0x000108e36055 
[0x000108e35f80+0xd5] > J 25557 C2 > java.net.URLClassLoader.findResource(Ljava/lang/String;)Ljava/net/URL; (37 > bytes) @ 0x000109c8505c [0x000109c84fc0+0x9c] > J 25556 C2 > java.lang.ClassLoader.getResource(Ljava/lang/String;)Ljava/net/URL; (36 > bytes) @ 0x00010c2bb984 [0x00010c2bb640+0x344] > j > org.apache.hadoop.conf.Configuration.getResource(Ljava/lang/String;)Ljava/net/URL;+5 > j > org.apache.hadoop.conf.Configuration.getStreamReader(Lorg/apache/hadoop/conf/Configuration$Resource;Z)Lorg/codehaus/stax2/XMLStreamReader2;+51 > J 7480 C1 > org.apache.hadoop.conf.Configuration.loadResource(Ljava/util/Properties;Lorg/apache/hadoop/conf/Configuration$Resource;Z)Lorg/apache/hadoop/conf/Configuration$Resource; > (322 bytes) @ 0x000109964bf4 [0x000109964700+0x4f4] > j > org.apache.hadoop.conf.Configuration.loadResources(Ljava/util/Properties;Ljava/util/ArrayList;Z)V+50 > J 8094 C2 > org.apache.hadoop.conf.Configuration.getProps()Ljava/util/Properties; (162 > bytes) @ 0x000109af1fc0 [0x000109af1d40+0x280] > J 15086 C2 > org.apache.hadoop.conf.Configuration.get(Ljava/lang/String;)Ljava/lang/String; > (64 bytes) @ 0x00010ae9ee78 [0x00010ae9eb20+0x358] > J 21716 C1 > org.apache.hadoop.conf.Configuration.getTimeDuration(Ljava/lang/String;JLjava/util/concurrent/TimeUnit;)J > (25 bytes) @ 0x00010b8ab4d4 [0x00010b8ab3c0+0x114] > j > org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(Lorg/apache/hadoop/conf/Configuration;)J+9 > j > org.apache.hadoop.util.ShutdownHookManager$HookEntry.(Ljava/lang/Runnable;I)V+10 > j > org.apache.hadoop.util.ShutdownHookManager.removeShutdownHook(Ljava/lang/Runnable;)Z+30 > j org.apache.hadoop.ozone.container.common.volume.VolumeSet.shutdown()V+22 > j org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.stop()V+43 > j > org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.close()V+159 > j > org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.stopDaemon()V+25 > j 
org.apache.hadoop.ozone.HddsDatanodeService.stop()V+101 > j org.apache.hadoop.ozone.MiniOzoneClusterImpl.restartHddsDatanode(IZ)V+15 > j org.apache.hadoop.ozone.MiniOzoneChaosCluster.shutdownNodes()V+103 > j org.apache.hadoop.ozone.MiniOzoneChaosCluster.fail()V+48 > j org.apache.hadoop.ozone.MiniOzoneChaosCluster$$Lambda$507.run()V+4 > J 22514 C2 > java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object; (14 > bytes) @ 0x00010ba89fec [0x00010ba89fa0+0x4c] > J 23026 C1 java.util.concurrent.FutureTask.runAndReset()Z (128 bytes) @ > 0x00010bd067ec [0x00010bd06580+0x26c] > J 22790 C2 >
[jira] [Commented] (HDFS-14593) RBF: Implement deletion feature for expired records in State Store
[ https://issues.apache.org/jira/browse/HDFS-14593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882223#comment-16882223 ] Íñigo Goiri commented on HDFS-14593: My only concern now is the fact that we are going to the backend (say ZooKeeper) twice, right? Is there a way we can just do the get once, and not have to go a second time? > RBF: Implement deletion feature for expired records in State Store > -- > > Key: HDFS-14593 > URL: https://issues.apache.org/jira/browse/HDFS-14593 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Takanobu Asanuma >Assignee: Takanobu Asanuma >Priority: Major > Attachments: HDFS-14593.001.patch, HDFS-14593.002.patch, > HDFS-14593.003.patch, HDFS-14593.004.patch, HDFS-14593.005.patch, > HDFS-14593.006.patch > > > Currently, any router seems to exist in the Router Information eternally. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
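The "get once" idea raised in the comment above is commonly shaped as: fetch the records a single time, partition them locally into alive and expired, and hand the expired batch to one delete call. A hedged sketch with hypothetical stand-in types (StoreRecord and the deleteAll callback are not the actual RBF State Store API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ExpiredRecordSweep {
    // Hypothetical stand-in for a State Store membership record.
    record StoreRecord(String id, long lastHeartbeatMs) {}

    /** Partitions one fetched snapshot; expired records go to a single batched delete. */
    static List<StoreRecord> sweep(List<StoreRecord> fetched, long nowMs, long expiryMs,
                                   Consumer<List<StoreRecord>> deleteAll) {
        List<StoreRecord> alive = new ArrayList<>();
        List<StoreRecord> expired = new ArrayList<>();
        for (StoreRecord r : fetched) {
            if (nowMs - r.lastHeartbeatMs() > expiryMs) {
                expired.add(r);
            } else {
                alive.add(r);
            }
        }
        if (!expired.isEmpty()) {
            deleteAll.accept(expired);  // one backend call, no second full scan
        }
        return alive;
    }

    public static void main(String[] args) {
        List<StoreRecord> deleted = new ArrayList<>();
        List<StoreRecord> alive = sweep(
            List.of(new StoreRecord("router-1", 1_000L),
                    new StoreRecord("router-2", 9_000L)),
            10_000L, 5_000L, deleted::addAll);
        System.out.println(alive.size() + " alive, " + deleted.size() + " deleted");
    }
}
```

Whether this maps onto the actual patch depends on whether the backend exposes a batched delete; if not, the second trip Íñigo mentions may be unavoidable.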
[jira] [Commented] (HDDS-1771) Add slow IO disk test to fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882218#comment-16882218 ] Eric Yang commented on HDDS-1771: - {quote}Thanks to work on this [~eyang]. Fix me if I am wrong, but it tests if the ozone can be started on a very slow disk.{quote} This test shows the timing profile of how rocksdb behaves under stress. Here is an example of the stack trace when rocksdb cannot commit to disk in a timely fashion: {code:java} 2019-07-09 17:15:37,499 ERROR io.BlockOutputStreamEntryPool (BlockOutputStreamEntryPool.java:allocateBlockIfNeeded(299)) - Try to allocate more blocks for write failed, already allocated 0 blocks for this write. INTERNAL_ERROR org.apache.hadoop.ozone.om.exceptions.OMException: Allocated 0 blocks. Requested 1 blocks at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:720) at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.allocateBlock(OzoneManagerProtocolClientSideTranslatorPB.java:752) at org.apache.hadoop.ozone.client.io.BlockOutputStreamEntryPool.allocateNewBlock(BlockOutputStreamEntryPool.java:248) at org.apache.hadoop.ozone.client.io.BlockOutputStreamEntryPool.allocateBlockIfNeeded(BlockOutputStreamEntryPool.java:296) at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:201) at org.apache.hadoop.ozone.client.io.KeyOutputStream.write(KeyOutputStream.java:193) at org.apache.hadoop.fs.ozone.OzoneFSOutputStream.write(OzoneFSOutputStream.java:46) at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:57) at java.io.DataOutputStream.write(DataOutputStream.java:107) at java.io.FilterOutputStream.write(FilterOutputStream.java:97) at org.apache.hadoop.ozone.ITDiskHang.testUpload(ITDiskHang.java:155) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at junit.framework.TestCase.runTest(TestCase.java:176) at junit.framework.TestCase.runBare(TestCase.java:141) at junit.framework.TestResult$1.protect(TestResult.java:122) at junit.framework.TestResult.runProtected(TestResult.java:142) at junit.framework.TestResult.run(TestResult.java:125) at junit.framework.TestCase.run(TestCase.java:129) at junit.framework.TestSuite.runTest(TestSuite.java:255) at junit.framework.TestSuite.run(TestSuite.java:250) at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84) at org.junit.runners.Suite.runChild(Suite.java:127) at org.junit.runners.Suite.runChild(Suite.java:26) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at org.junit.runners.ParentRunner.run(ParentRunner.java:309) at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55) at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137) at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107) at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83) at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75) at org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:158) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) {code} {quote}I think the real goal is to test if random read operations are slowed down. (eg. 1 read from every 100 read is significant slower).{quote} Quoting myself from [this comment|https://issues.apache.org/jira/browse/HDDS-1773?focusedCommentId=16882206=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16882206], buffers will mask the slowed-down operations. It is also difficult to collate the actual nano second that throttle occurs to
[jira] [Commented] (HDFS-14637) Namenode may not replicate blocks to meet the policy after enabling upgradeDomain
[ https://issues.apache.org/jira/browse/HDFS-14637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882215#comment-16882215 ] Hadoop QA commented on HDFS-14637:
--
-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 28s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 8m 8s | Maven dependency ordering for branch |
| +1 | mvninstall | 18m 29s | trunk passed |
| +1 | compile | 16m 52s | trunk passed |
| +1 | checkstyle | 2m 15s | trunk passed |
| +1 | mvnsite | 1m 59s | trunk passed |
| +1 | shadedclient | 15m 51s | branch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 0m 49s | hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-infra in trunk has 5 extant Findbugs warnings. |
| +1 | javadoc | 1m 35s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 20s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 29s | the patch passed |
| +1 | compile | 16m 21s | the patch passed |
| +1 | javac | 16m 21s | the patch passed |
| -0 | checkstyle | 2m 10s | root: The patch generated 3 new + 117 unchanged - 1 fixed = 120 total (was 118) |
| +1 | mvnsite | 1m 45s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 3s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 11s | the patch passed |
| +1 | javadoc | 1m 35s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 84m 1s | hadoop-hdfs in the patch failed. |
| -1 | unit | 15m 41s | hadoop-dynamometer-infra in the patch failed. |
| -1 | asflicense | 0m 53s | The patch generated 1 ASF License warnings. |
| | | 205m 41s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
| | hadoop.hdfs.server.datanode.checker.TestThrottledAsyncCheckerTimeout |
| | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14637 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12974168/HDFS-14637.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 3fb1dbe8e653 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64
[jira] [Commented] (HDDS-1777) JVM crash while shutting down Ozone datanode in ShutdownHook
[ https://issues.apache.org/jira/browse/HDDS-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882212#comment-16882212 ] Anu Engineer commented on HDDS-1777:

Thanks for filing this, any idea how we can reproduce this?

> JVM crash while shutting down Ozone datanode in ShutdownHook
>
> Key: HDDS-1777
> URL: https://issues.apache.org/jira/browse/HDDS-1777
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Datanode
> Affects Versions: 0.4.0
> Reporter: Mukul Kumar Singh
> Priority: Major
> Labels: MiniOzoneChaosCluster
> Attachments: hs_err_pid1459.log
>
> JVM crash while shutting down Ozone datanode in ShutdownHook with the following exception.
> {code}
> Stack: [0x70008791,0x700087a1], sp=0x700087a0db20, free space=1014k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> C [libsystem_platform.dylib+0x1d09] _platform_memmove$VARIANT$Haswell+0x29
> C [libzip.dylib+0x3399] newEntry+0x65b
> C [libzip.dylib+0x352d] ZIP_GetEntry2+0xd4
> C [libzip.dylib+0x2238] Java_java_util_zip_ZipFile_getEntry+0xcf
> J 108 java.util.zip.ZipFile.getEntry(J[BZ)J (0 bytes) @ 0x0001087d23ce [0x0001087d2300+0xce]
> J 4302 C2 java.util.jar.JarFile.getEntry(Ljava/lang/String;)Ljava/util/zip/ZipEntry; (22 bytes) @ 0x000108d659e8 [0x000108d65660+0x388]
> J 4583 C2 sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource; (85 bytes) @ 0x000108954b24 [0x000108954aa0+0x84]
> J 25559 C2 java.net.URLClassLoader$2.run()Ljava/lang/Object; (5 bytes) @ 0x00010c2c04c8 [0x00010c2c0380+0x148]
> v ~StubRoutines::call_stub
> V [libjvm.dylib+0x2ef1f6]
> V [libjvm.dylib+0x34fb24]
> J 4197 java.security.AccessController.doPrivileged(Ljava/security/PrivilegedAction;Ljava/security/AccessControlContext;)Ljava/lang/Object; (0 bytes) @ 0x000108e36055 [0x000108e35f80+0xd5]
> J 25557 C2 java.net.URLClassLoader.findResource(Ljava/lang/String;)Ljava/net/URL; (37 bytes) @ 0x000109c8505c [0x000109c84fc0+0x9c]
> J 25556 C2 java.lang.ClassLoader.getResource(Ljava/lang/String;)Ljava/net/URL; (36 bytes) @ 0x00010c2bb984 [0x00010c2bb640+0x344]
> j org.apache.hadoop.conf.Configuration.getResource(Ljava/lang/String;)Ljava/net/URL;+5
> j org.apache.hadoop.conf.Configuration.getStreamReader(Lorg/apache/hadoop/conf/Configuration$Resource;Z)Lorg/codehaus/stax2/XMLStreamReader2;+51
> J 7480 C1 org.apache.hadoop.conf.Configuration.loadResource(Ljava/util/Properties;Lorg/apache/hadoop/conf/Configuration$Resource;Z)Lorg/apache/hadoop/conf/Configuration$Resource; (322 bytes) @ 0x000109964bf4 [0x000109964700+0x4f4]
> j org.apache.hadoop.conf.Configuration.loadResources(Ljava/util/Properties;Ljava/util/ArrayList;Z)V+50
> J 8094 C2 org.apache.hadoop.conf.Configuration.getProps()Ljava/util/Properties; (162 bytes) @ 0x000109af1fc0 [0x000109af1d40+0x280]
> J 15086 C2 org.apache.hadoop.conf.Configuration.get(Ljava/lang/String;)Ljava/lang/String; (64 bytes) @ 0x00010ae9ee78 [0x00010ae9eb20+0x358]
> J 21716 C1 org.apache.hadoop.conf.Configuration.getTimeDuration(Ljava/lang/String;JLjava/util/concurrent/TimeUnit;)J (25 bytes) @ 0x00010b8ab4d4 [0x00010b8ab3c0+0x114]
> j org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(Lorg/apache/hadoop/conf/Configuration;)J+9
> j org.apache.hadoop.util.ShutdownHookManager$HookEntry.<init>(Ljava/lang/Runnable;I)V+10
> j org.apache.hadoop.util.ShutdownHookManager.removeShutdownHook(Ljava/lang/Runnable;)Z+30
> j org.apache.hadoop.ozone.container.common.volume.VolumeSet.shutdown()V+22
> j org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.stop()V+43
> j org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.close()V+159
> j org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.stopDaemon()V+25
> j org.apache.hadoop.ozone.HddsDatanodeService.stop()V+101
> j org.apache.hadoop.ozone.MiniOzoneClusterImpl.restartHddsDatanode(IZ)V+15
> j org.apache.hadoop.ozone.MiniOzoneChaosCluster.shutdownNodes()V+103
> j org.apache.hadoop.ozone.MiniOzoneChaosCluster.fail()V+48
> j org.apache.hadoop.ozone.MiniOzoneChaosCluster$$Lambda$507.run()V+4
> J 22514 C2 java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object; (14 bytes) @ 0x00010ba89fec [0x00010ba89fa0+0x4c]
> J 23026 C1 java.util.concurrent.FutureTask.runAndReset()Z (128 bytes) @ 0x00010bd067ec [0x00010bd06580+0x26c]
> J 22790 C2 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()V (59 bytes) @ 0x00010bdabd8c [0x00010bdabb20+0x26c]
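Reading the frames bottom-up: VolumeSet.shutdown() calls ShutdownHookManager.removeShutdownHook(), whose HookEntry constructor lazily reads the shutdown timeout via Configuration.getTimeDuration, which in turn loads XML resources out of jar files (ZipFile.getEntry) while the JVM is already tearing down. A minimal sketch of the general defensive pattern, resolving anything that needs config parsing or resource loading eagerly at registration time so nothing is loaded inside the shutdown path. The names here are hypothetical illustrations, not the Hadoop ShutdownHookManager API:

```java
import java.util.concurrent.TimeUnit;

public class SafeHooks {
    // Hypothetical holder for a hook plus its timeout. The timeout is
    // captured once, at registration time, instead of being looked up
    // lazily during shutdown.
    static final class Entry {
        final Runnable hook;
        final long timeoutMs;
        Entry(Runnable hook, long timeoutMs) {
            this.hook = hook;
            this.timeoutMs = timeoutMs;  // eager copy; no config read later
        }
    }

    // Stand-in for a configuration lookup. In the crashing path this ran
    // inside the shutdown sequence and forced jar/XML loading; here it
    // runs up front, while classloaders and jar mappings are intact.
    static long readTimeoutFromConfig() {
        return TimeUnit.SECONDS.toMillis(30);
    }

    public static Entry register(Runnable hook) {
        return new Entry(hook, readTimeoutFromConfig());
    }

    public static void main(String[] args) {
        Entry e = register(() -> System.out.println("cleaning up"));
        System.out.println("timeout captured at registration: " + e.timeoutMs);
    }
}
```

The design choice being illustrated: anything a shutdown hook (or hook bookkeeping) depends on should be a plain field resolved before shutdown begins, since lazy classloading during JVM teardown can touch jar files that are no longer safely mapped.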
[jira] [Commented] (HDFS-12703) Exceptions are fatal to decommissioning monitor
[ https://issues.apache.org/jira/browse/HDFS-12703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882213#comment-16882213 ] Íñigo Goiri commented on HDFS-12703:

The unit test ran correctly: https://builds.apache.org/job/PreCommit-HDFS-Build/27191/testReport/org.apache.hadoop.hdfs/TestDecommission/testPendingNodeButDecommissioned/

+1 on [^HDFS-12703.013.patch]. I assume [~sodonnell]'s review stands. I'll give a day or so for [~xuel1] to do a final check before committing.

> Exceptions are fatal to decommissioning monitor
> -----------------------------------------------
>
> Key: HDFS-12703
> URL: https://issues.apache.org/jira/browse/HDFS-12703
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.7.0
> Reporter: Daryn Sharp
> Assignee: He Xiaoqiao
> Priority: Critical
> Attachments: HDFS-12703.001.patch, HDFS-12703.002.patch, HDFS-12703.003.patch, HDFS-12703.004.patch, HDFS-12703.005.patch, HDFS-12703.006.patch, HDFS-12703.007.patch, HDFS-12703.008.patch, HDFS-12703.009.patch, HDFS-12703.010.patch, HDFS-12703.011.patch, HDFS-12703.012.patch, HDFS-12703.013.patch
>
> The {{DecommissionManager.Monitor}} runs as an executor scheduled task. If an exception occurs, all decommissioning ceases until the NN is restarted. Per the javadoc for {{executor#scheduleAtFixedRate}}: *If any execution of the task encounters an exception, subsequent executions are suppressed*. The monitor thread is alive but blocked waiting for an executor task that will never come. The code currently disposes of the future, so the actual exception that aborted the task is gone.
> Failover is insufficient since the task is also likely dead on the standby. Replication queue init after the transition to active will fix the under-replication of blocks on currently decommissioning nodes, but future nodes never decommission. The standby must be bounced prior to failover – and hopefully the error condition does not reoccur.
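The scheduleAtFixedRate suppression behavior described in the issue is easy to demonstrate, and the usual mitigation is to wrap the periodic task body in a catch-all so no Throwable ever escapes to the executor. A minimal sketch of that guard against the plain JDK scheduler; this is not the Hadoop DecommissionManager code, just an illustration of the failure mode and fix:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class MonitorGuard {
    static final AtomicInteger ticks = new AtomicInteger();

    // Wraps a periodic task so no Throwable escapes to the executor.
    // Without this, scheduleAtFixedRate silently suppresses every later
    // run after the first exception, which is the bug described above.
    static Runnable guarded(Runnable body) {
        return () -> {
            try {
                body.run();
            } catch (Throwable t) {
                // Log and keep going; the schedule stays alive.
                System.err.println("monitor tick failed: " + t.getMessage());
            }
        };
    }

    // Runs a deliberately failing task for `millis` milliseconds and
    // reports how many ticks fired. With the guard in place the count
    // keeps climbing even though every single run throws.
    static int runFor(long millis) throws InterruptedException {
        ticks.set(0);
        ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor();
        exec.scheduleAtFixedRate(guarded(() -> {
            ticks.incrementAndGet();
            throw new RuntimeException("simulated monitor failure");
        }), 0, 10, TimeUnit.MILLISECONDS);
        Thread.sleep(millis);
        exec.shutdownNow();
        return ticks.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("ticks fired: " + runFor(200));
    }
}
```

Remove the guarded() wrapper and the count stops at 1: the exception cancels the periodic task, while the future that holds the cause is never inspected, matching the "monitor thread is alive but blocked" symptom in the report.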