[jira] [Work logged] (HDDS-1569) Add ability to SCM for creating multiple pipelines with same datanode
[ https://issues.apache.org/jira/browse/HDDS-1569?focusedWorklogId=328363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328363 ]

ASF GitHub Bot logged work on HDDS-1569:
Author: ASF GitHub Bot
Created on: 15/Oct/19 06:46
Start Date: 15/Oct/19 06:46
Worklog Time Spent: 10m

Work Description: timmylicheng commented on pull request #13: HDDS-1569 Support creating multiple pipelines with same datanode
URL: https://github.com/apache/hadoop-ozone/pull/13

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 328363)
Time Spent: 8h (was: 7h 50m)

> Add ability to SCM for creating multiple pipelines with same datanode
> -
>
> Key: HDDS-1569
> URL: https://issues.apache.org/jira/browse/HDDS-1569
> Project: Hadoop Distributed Data Store
> Issue Type: Sub-task
> Components: SCM
> Reporter: Siddharth Wagle
> Assignee: Li Cheng
> Priority: Major
> Labels: pull-request-available
> Time Spent: 8h
> Remaining Estimate: 0h
>
> - Refactor _RatisPipelineProvider.create()_ to be able to create pipelines with datanodes that are not yet part of sufficient pipelines
> - Define soft and hard upper bounds for pipeline membership
> - Create SCMAllocationManager that can be leveraged to get a candidate set of datanodes based on placement policies
> - Add the datanodes to internal data structures

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
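The description above asks for soft and hard upper bounds on pipeline membership per datanode. A minimal sketch of how such bounds might be tracked (all class, method, and limit names here are hypothetical illustrations, not the actual SCM code):

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of per-datanode pipeline limits (not the actual SCM implementation). */
public class PipelineLimits {
    static final int SOFT_LIMIT = 2;  // placement policy prefers nodes below this count
    static final int HARD_LIMIT = 4;  // a node must never exceed this count

    private final Map<String, Integer> pipelinesPerNode = new HashMap<>();

    /** A node remains a candidate only while under the hard bound. */
    public boolean canJoin(String nodeId) {
        return pipelinesPerNode.getOrDefault(nodeId, 0) < HARD_LIMIT;
    }

    /** Nodes under the soft bound would be preferred by the placement policy. */
    public boolean isPreferred(String nodeId) {
        return pipelinesPerNode.getOrDefault(nodeId, 0) < SOFT_LIMIT;
    }

    /** Record the node joining one more pipeline, enforcing the hard bound. */
    public void addPipeline(String nodeId) {
        if (!canJoin(nodeId)) {
            throw new IllegalStateException("Hard pipeline limit reached: " + nodeId);
        }
        pipelinesPerNode.merge(nodeId, 1, Integer::sum);
    }
}
```

With limits like these, a provider such as _RatisPipelineProvider.create()_ could first draw from preferred nodes and fall back to any node that can still join.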
[jira] [Created] (HDDS-2307) ContextFactory.java contains Windows '^M' at end of each line
Sammi Chen created HDDS-2307:
---
Summary: ContextFactory.java contains Windows '^M' at end of each line
Key: HDDS-2307
URL: https://issues.apache.org/jira/browse/HDDS-2307
Project: Hadoop Distributed Data Store
Issue Type: Bug
Reporter: Sammi Chen

Convert the file to Unix format.
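The ^M characters are carriage returns (CR) from Windows CRLF line endings; converting the file to Unix format means dropping the CR before each LF, which is exactly what dos2unix does. A tiny illustration of the transformation in Java (a generic sketch, unrelated to the actual contents of ContextFactory.java):

```java
/** Sketch: normalize Windows CRLF line endings to Unix LF. */
public class ToUnix {
    /** Replace every CRLF pair with a bare LF; lone LFs are untouched. */
    public static String toUnix(String text) {
        return text.replace("\r\n", "\n");
    }
}
```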
[jira] [Work logged] (HDDS-2034) Async RATIS pipeline creation and destroy through heartbeat commands
[ https://issues.apache.org/jira/browse/HDDS-2034?focusedWorklogId=328346&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328346 ]

ASF GitHub Bot logged work on HDDS-2034:
Author: ASF GitHub Bot
Created on: 15/Oct/19 06:06
Start Date: 15/Oct/19 06:06
Worklog Time Spent: 10m

Work Description: hadoop-yetus commented on issue #1650: HDDS-2034. Async RATIS pipeline creation and destroy through datanode…
URL: https://github.com/apache/hadoop/pull/1650#issuecomment-542052548

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| 0 | reexec | 41 | Docker mode activated. |
||| _ Prechecks _ |
| +1 | dupname | 2 | No case conflicting files found. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 16 new or modified test files. |
||| _ trunk Compile Tests _ |
| 0 | mvndep | 67 | Maven dependency ordering for branch |
| -1 | mvninstall | 37 | hadoop-hdds in trunk failed. |
| -1 | mvninstall | 40 | hadoop-ozone in trunk failed. |
| -1 | compile | 20 | hadoop-hdds in trunk failed. |
| -1 | compile | 16 | hadoop-ozone in trunk failed. |
| +1 | checkstyle | 60 | trunk passed |
| +1 | mvnsite | 0 | trunk passed |
| +1 | shadedclient | 861 | branch has no errors when building and testing our client artifacts. |
| -1 | javadoc | 23 | hadoop-hdds in trunk failed. |
| -1 | javadoc | 20 | hadoop-ozone in trunk failed. |
| 0 | spotbugs | 964 | Used deprecated FindBugs config; considering switching to SpotBugs. |
| -1 | findbugs | 34 | hadoop-hdds in trunk failed. |
| -1 | findbugs | 21 | hadoop-ozone in trunk failed. |
||| _ Patch Compile Tests _ |
| 0 | mvndep | 29 | Maven dependency ordering for patch |
| -1 | mvninstall | 34 | hadoop-hdds in the patch failed. |
| -1 | mvninstall | 36 | hadoop-ozone in the patch failed. |
| -1 | compile | 24 | hadoop-hdds in the patch failed. |
| -1 | compile | 19 | hadoop-ozone in the patch failed. |
| -1 | cc | 24 | hadoop-hdds in the patch failed. |
| -1 | cc | 19 | hadoop-ozone in the patch failed. |
| -1 | javac | 24 | hadoop-hdds in the patch failed. |
| -1 | javac | 19 | hadoop-ozone in the patch failed. |
| +1 | checkstyle | 57 | the patch passed |
| +1 | mvnsite | 0 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | xml | 2 | The patch has no ill-formed XML file. |
| +1 | shadedclient | 717 | patch has no errors when building and testing our client artifacts. |
| -1 | javadoc | 23 | hadoop-hdds in the patch failed. |
| -1 | javadoc | 20 | hadoop-ozone in the patch failed. |
| -1 | findbugs | 31 | hadoop-hdds in the patch failed. |
| -1 | findbugs | 21 | hadoop-ozone in the patch failed. |
||| _ Other Tests _ |
| -1 | unit | 28 | hadoop-hdds in the patch failed. |
| -1 | unit | 27 | hadoop-ozone in the patch failed. |
| +1 | asflicense | 34 | The patch does not generate ASF License warnings. |
| | | 2480 | |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | Client=19.03.3 Server=19.03.3 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1650/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/1650 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml cc |
| uname | Linux 9ab173466796 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 336abbd |
| Default Java | 1.8.0_222 |
| mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1650/2/artifact/out/branch-mvninstall-hadoop-hdds.txt |
| mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1650/2/artifact/out/branch-mvninstall-hadoop-ozone.txt |
| compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1650/2/artifact/out/branch-compile-hadoop-hdds.txt |
| compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1650/2/artifact/out/branch-compile-hadoop-ozone.txt |
| javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1650/2/artifact/out/branch-javadoc-hadoop-hdds.txt |
| javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1650/2/artifact/out/branch-javadoc-hadoop-ozone.txt |
| findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1650/2/artifact/out/branch-findbugs-hadoop-hdds.txt |
| findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1650/2/artifact/out/branch-findbugs-hadoop-ozone.txt |
| mvninstall | https://builds.apache.org/j
[jira] [Commented] (HDFS-14802) The feature of protect directories should be used in RenameOp
[ https://issues.apache.org/jira/browse/HDFS-14802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951613#comment-16951613 ]

Ayush Saxena commented on HDFS-14802:
---
Thanks [~ferhui] for the report. IMO we shouldn't allow deletion of any protected directory in any way, and I agree with restricting rename for protected directories to prevent this hacky way of deleting them. But to my knowledge this would be an incompatible change too, since it changes the behavior of the rename API. One other way I can think of: at rename time, we could carry the deletion protection forward so that the renamed directory also can't be deleted, but I'm not sure what the best way to do that is. Let me check if we can get some more opinions/help here. [~elgoiri] [~vinayakumarb] [~ste...@apache.org] [~aajisaka] any opinions here?

> The feature of protect directories should be used in RenameOp
> -
>
> Key: HDFS-14802
> URL: https://issues.apache.org/jira/browse/HDFS-14802
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Affects Versions: 3.0.4, 3.3.0, 3.2.1, 3.1.3
> Reporter: Fei Hui
> Assignee: Fei Hui
> Priority: Major
> Attachments: HDFS-14802.001.patch, HDFS-14802.002.patch, HDFS-14802.003.patch
>
> Now we can set fs.protected.directories to prevent users from deleting important directories, but users can delete directories around the limitation:
> 1. Rename the directories and then delete them.
> 2. Move the directories to trash, and the namenode will delete them.
> So I think we should use the protected-directories feature in RenameOp.
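The proposal above would apply the fs.protected.directories check at rename time as well as at delete time. A toy sketch of the idea in plain Java (class and method names are hypothetical; the real check lives in the namenode's delete/rename ops):

```java
import java.util.Set;
import java.util.TreeSet;

/** Sketch: extend a protected-directories check from delete to rename (hypothetical names). */
public class ProtectedDirs {
    private final Set<String> protectedDirs = new TreeSet<>();

    public void addProtected(String dir) {
        protectedDirs.add(dir);
    }

    /** True if the path is a protected directory or lies under one. */
    public boolean isProtected(String path) {
        for (String dir : protectedDirs) {
            if (path.equals(dir) || path.startsWith(dir + "/")) {
                return true;
            }
        }
        return false;
    }

    /** Rejecting a protected source at rename time closes the
     *  rename-then-delete (and move-to-trash) loophole described above. */
    public void checkRename(String src) {
        if (isProtected(src)) {
            throw new SecurityException("Cannot rename protected directory: " + src);
        }
    }
}
```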
[jira] [Commented] (HDFS-14638) [Dynamometer] Fix scripts to refer to current build structure
[ https://issues.apache.org/jira/browse/HDFS-14638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951593#comment-16951593 ]

Takanobu Asanuma commented on HDFS-14638:
---
Thanks for filing it. I'd like to work on this jira.

> [Dynamometer] Fix scripts to refer to current build structure
> -
>
> Key: HDFS-14638
> URL: https://issues.apache.org/jira/browse/HDFS-14638
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode, test
> Reporter: Erik Krogen
> Priority: Major
>
> The scripts within the Dynamometer build dirs all refer to the old distribution structure with a single {{bin}} directory and a single {{lib}} directory. We need to update them to refer to the Hadoop-standard layout.
> Also as pointed out by [~pingsutw]:
> {quote}
> Because dynamometer was renamed to hadoop-dynamometer in hadoop-tools, the scripts still use the old jar name:
> {code}
> "$hadoop_cmd" jar "${script_pwd}"/lib/dynamometer-infra-*.jar org.apache.hadoop.tools.dynamometer.Client "$@"
> {code}
> We should rename these jars inside the scripts.
> {quote}
[jira] [Assigned] (HDFS-14638) [Dynamometer] Fix scripts to refer to current build structure
[ https://issues.apache.org/jira/browse/HDFS-14638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Takanobu Asanuma reassigned HDFS-14638:
---
Assignee: Takanobu Asanuma
[jira] [Commented] (HDFS-14907) [Dynamometer] DataNode can't find junit jar when using Hadoop-3 binary
[ https://issues.apache.org/jira/browse/HDFS-14907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951584#comment-16951584 ]

Takanobu Asanuma commented on HDFS-14907:
---
[~aajisaka] Thanks for your comments. I tried it, but it didn't solve the problem. It has no effect on the datanode, which is started by start-dynamometer-cluster.sh. The wrong jar name is tracked in HDFS-14638.

> [Dynamometer] DataNode can't find junit jar when using Hadoop-3 binary
> --
>
> Key: HDFS-14907
> URL: https://issues.apache.org/jira/browse/HDFS-14907
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Takanobu Asanuma
> Priority: Major
>
> When executing {{start-dynamometer-cluster.sh}} with Hadoop-3 binary, datanodes fail to run with the following log and {{start-dynamometer-cluster.sh}} fails.
> {noformat}
> LogType:stderr
> LogLastModifiedTime:Wed Oct 09 15:03:09 +0900 2019
> LogLength:1386
> LogContents:
> Exception in thread "main" java.lang.NoClassDefFoundError: org/junit/Assert
>     at org.apache.hadoop.test.GenericTestUtils.assertExists(GenericTestUtils.java:299)
>     at org.apache.hadoop.test.GenericTestUtils.getTestDir(GenericTestUtils.java:243)
>     at org.apache.hadoop.test.GenericTestUtils.getTestDir(GenericTestUtils.java:252)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.getBaseDirectory(MiniDFSCluster.java:2982)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.determineDfsBaseDir(MiniDFSCluster.java:2972)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.formatDataNodeDirs(MiniDFSCluster.java:2834)
>     at org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(SimulatedDataNodes.java:123)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>     at org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.main(SimulatedDataNodes.java:88)
> Caused by: java.lang.ClassNotFoundException: org.junit.Assert
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>     ... 9 more
> ./start-component.sh: line 317: kill: (2261) - No such process
> {noformat}
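The NoClassDefFoundError above means org/junit/Assert is simply not on the datanode's classpath. A generic probe for diagnosing this kind of problem (an illustrative sketch, not Dynamometer code):

```java
/** Sketch: check whether a class is visible on the current classpath. */
public class ClassProbe {
    /** Returns true if the named class can be loaded, false if it is missing. */
    public static boolean onClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

Running such a probe for "org.junit.Assert" inside the container would confirm that the junit jar was never shipped with the datanode's libraries.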
[jira] [Work logged] (HDDS-2204) Avoid buffer copying in checksum verification
[ https://issues.apache.org/jira/browse/HDDS-2204?focusedWorklogId=328310&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328310 ]

ASF GitHub Bot logged work on HDDS-2204:
Author: ASF GitHub Bot
Created on: 15/Oct/19 04:01
Start Date: 15/Oct/19 04:01
Worklog Time Spent: 10m

Work Description: mukul1987 commented on pull request #1593: HDDS-2204. Avoid buffer copying in checksum verification.
URL: https://github.com/apache/hadoop/pull/1593

Issue Time Tracking
---
Worklog Id: (was: 328310)
Time Spent: 1h 10m (was: 1h)

> Avoid buffer copying in checksum verification
> 
>
> Key: HDDS-2204
> URL: https://issues.apache.org/jira/browse/HDDS-2204
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Client
> Reporter: Tsz-wo Sze
> Assignee: Tsz-wo Sze
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.5.0
>
> Attachments: o2204_20190930.patch, o2204_20190930b.patch, o2204_20191001.patch
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> In Checksum.verifyChecksum(ByteString, ..), it first converts the ByteString to a byte array. This leads to an unnecessary buffer copy.
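The fix is about not materializing a byte[] from the ByteString. The same no-copy idea can be illustrated with plain java.util.zip.CRC32, whose update(ByteBuffer) overload consumes a buffer view directly (a sketch only; the actual Ozone Checksum class uses its own checksum types):

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

/** Sketch: checksum a buffer without first copying it into a byte[]. */
public class NoCopyChecksum {
    /** CRC32.update(ByteBuffer) reads the buffer in place, avoiding the
     *  intermediate array that a toByteArray()-style conversion would allocate. */
    public static long crcOf(ByteBuffer data) {
        CRC32 crc = new CRC32();
        crc.update(data.duplicate()); // duplicate: leave the caller's position untouched
        return crc.getValue();
    }

    /** Verify by recomputing over the same (uncopied) buffer view. */
    public static boolean verify(ByteBuffer data, long expected) {
        return crcOf(data) == expected;
    }
}
```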
[jira] [Resolved] (HDDS-2204) Avoid buffer copying in checksum verification
[ https://issues.apache.org/jira/browse/HDDS-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mukul Kumar Singh resolved HDDS-2204.
---
Fix Version/s: 0.5.0
Resolution: Fixed

I have committed this to master. Thanks [~szetszwo] for the contribution and [~shashikant] for the review.
[jira] [Created] (HDDS-2306) Fix TestWatchForCommit failure
Mukul Kumar Singh created HDDS-2306:
---
Summary: Fix TestWatchForCommit failure
Key: HDDS-2306
URL: https://issues.apache.org/jira/browse/HDDS-2306
Project: Hadoop Distributed Data Store
Issue Type: Bug
Components: Ozone Client
Affects Versions: 0.4.1
Reporter: Mukul Kumar Singh

{code}
[ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 203.385 s <<< FAILURE! - in org.apache.hadoop.ozone.client.rpc.TestWatchForCommit
[ERROR] test2WayCommitForTimeoutException(org.apache.hadoop.ozone.client.rpc.TestWatchForCommit) Time elapsed: 27.093 s <<< ERROR!
java.util.concurrent.TimeoutException
    at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1771)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
    at org.apache.hadoop.hdds.scm.XceiverClientRatis.watchForCommit(XceiverClientRatis.java:283)
    at org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.test2WayCommitForTimeoutException(TestWatchForCommit.java:391)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
    at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
{code}
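The stack trace shows the failure surfacing from CompletableFuture.get with a timeout inside watchForCommit. Stripped of Ratis specifics, the waiting pattern looks like this (hypothetical helper, not the XceiverClientRatis code):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Sketch of a bounded wait on an async reply, as in a watch-for-commit call. */
public class TimedWait {
    /** Block up to the deadline for the reply; null signals a timeout
     *  (the test in HDDS-2306 hit exactly this TimeoutException path). */
    public static <T> T awaitOrNull(CompletableFuture<T> reply, long millis) {
        try {
            return reply.get(millis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return null; // no commit arrived within the watch window
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
    }
}
```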
[jira] [Updated] (HDDS-2220) HddsVolume needs a toString method
[ https://issues.apache.org/jira/browse/HDDS-2220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bharat Viswanadham updated HDDS-2220:
---
Fix Version/s: 0.5.0
Resolution: Fixed
Status: Resolved (was: Patch Available)

> HddsVolume needs a toString method
> --
>
> Key: HDDS-2220
> URL: https://issues.apache.org/jira/browse/HDDS-2220
> Project: Hadoop Distributed Data Store
> Issue Type: Task
> Reporter: Marton Elek
> Assignee: YiSheng Lien
> Priority: Major
> Labels: newbie, pull-request-available
> Fix For: 0.5.0
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> This is logged to the console of datanodes:
> {code:java}
> 2019-10-01 11:37:59 INFO HddsVolumeChecker:202 - Scheduled health check for volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 11:52:59 INFO ThrottledAsyncChecker:139 - Scheduling a check for org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 11:52:59 INFO HddsVolumeChecker:202 - Scheduled health check for volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 12:07:59 INFO ThrottledAsyncChecker:139 - Scheduling a check for org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 12:07:59 INFO HddsVolumeChecker:202 - Scheduled health check for volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 12:22:59 INFO ThrottledAsyncChecker:139 - Scheduling a check for org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 12:22:59 INFO HddsVolumeChecker:202 - Scheduled health check for volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 12:37:59 INFO ThrottledAsyncChecker:139 - Scheduling a check for org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 12:37:59 INFO HddsVolumeChecker:202 - Scheduled health check for volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 12:52:59 INFO ThrottledAsyncChecker:139 - Scheduling a check for org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 12:52:59 INFO HddsVolumeChecker:202 - Scheduled health check for volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> {code}
> Without a proper HddsVolume.toString it's hard to say which volume is checked...
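The unreadable HddsVolume@5460cf3a in the log above is the default Object.toString (class name plus hash code). Overriding toString to report the volume's root directory is all that's needed; a minimal sketch (the field and message format are assumptions, not necessarily the committed patch):

```java
/** Sketch of a volume class whose toString names its root directory. */
public class VolumeSketch {
    private final String storageDir;

    public VolumeSketch(String storageDir) {
        this.storageDir = storageDir;
    }

    /** Without this override, loggers print only ClassName@hashcode. */
    @Override
    public String toString() {
        return "HddsVolume: " + storageDir;
    }
}
```

With such an override, the "Scheduling a check for …" lines would identify the disk being checked instead of an opaque hash code.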
[jira] [Work logged] (HDDS-2220) HddsVolume needs a toString method
[ https://issues.apache.org/jira/browse/HDDS-2220?focusedWorklogId=328286&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328286 ]

ASF GitHub Bot logged work on HDDS-2220:
Author: ASF GitHub Bot
Created on: 15/Oct/19 03:17
Start Date: 15/Oct/19 03:17
Worklog Time Spent: 10m

Work Description: bharatviswa504 commented on pull request #3: HDDS-2220. HddsVolume needs a toString method.
URL: https://github.com/apache/hadoop-ozone/pull/3

Issue Time Tracking
---
Worklog Id: (was: 328286)
Time Spent: 50m (was: 40m)
[jira] [Work logged] (HDDS-2194) Replication of Container fails with "Only closed containers could be exported"
[ https://issues.apache.org/jira/browse/HDDS-2194?focusedWorklogId=328270&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328270 ]

ASF GitHub Bot logged work on HDDS-2194:
Author: ASF GitHub Bot
Created on: 15/Oct/19 02:39
Start Date: 15/Oct/19 02:39
Worklog Time Spent: 10m

Work Description: bharatviswa504 commented on pull request #1632: HDDS-2194. Replication of Container fails with Only closed containers…
URL: https://github.com/apache/hadoop/pull/1632

Issue Time Tracking
---
Worklog Id: (was: 328270)
Time Spent: 2h (was: 1h 50m)

> Replication of Container fails with "Only closed containers could be exported"
> --
>
> Key: HDDS-2194
> URL: https://issues.apache.org/jira/browse/HDDS-2194
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Datanode
> Affects Versions: 0.5.0
> Reporter: Mukul Kumar Singh
> Assignee: Bharat Viswanadham
> Priority: Major
> Labels: pull-request-available
> Time Spent: 2h
> Remaining Estimate: 0h
>
> Replication of Container fails with "Only closed containers could be exported"
> cc: [~nanda]
> {code}
> 2019-09-26 15:00:17,640 [grpc-default-executor-13] INFO replication.GrpcReplicationService (GrpcReplicationService.java:download(57)) - Streaming container data (37) to other datanode
> Sep 26, 2019 3:00:17 PM org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor run
> SEVERE: Exception while executing runnable org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed@70e641f2
> java.lang.IllegalStateException: Only closed containers could be exported: ContainerId=37
>     at org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.exportContainerData(KeyValueContainer.java:527)
>     at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.exportContainer(KeyValueHandler.java:875)
>     at org.apache.hadoop.ozone.container.ozoneimpl.ContainerController.exportContainer(ContainerController.java:134)
>     at org.apache.hadoop.ozone.container.replication.OnDemandContainerReplicationSource.copyData(OnDemandContainerReplicationSource.java:64)
>     at org.apache.hadoop.ozone.container.replication.GrpcReplicationService.download(GrpcReplicationService.java:63)
>     at org.apache.hadoop.hdds.protocol.datanode.proto.IntraDatanodeProtocolServiceGrpc$MethodHandlers.invoke(IntraDatanodeProtocolServiceGrpc.java:217)
>     at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
>     at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
>     at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(S
> 2019-09-26 15:00:17,644 [grpc-default-executor-17] ERROR replication.GrpcReplicationClient (GrpcReplicationClient.java:onError(142)) - Container download was unsuccessfull
> org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNKNOWN
>     at org.apache.ratis.thirdparty.io.grpc.Status.asRuntimeException(Status.java:526)
>     at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:434)
>     at org.apache.ratis.thirdparty.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
>     at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
>     at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
>     at org.apache.ratis.thirdparty.io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:678)
>     at org.apache.ratis.thirdparty.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
>     at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
>     at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClient
{code}
[jira] [Work logged] (HDDS-2278) Run S3 test suite on OM HA cluster
[ https://issues.apache.org/jira/browse/HDDS-2278?focusedWorklogId=328269&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328269 ]

ASF GitHub Bot logged work on HDDS-2278:
Author: ASF GitHub Bot
Created on: 15/Oct/19 02:38
Start Date: 15/Oct/19 02:38
Worklog Time Spent: 10m

Work Description: bharatviswa504 commented on pull request #1643: HDDS-2278. Run S3 test suite on OM HA cluster.
URL: https://github.com/apache/hadoop/pull/1643

Issue Time Tracking
---
Worklog Id: (was: 328269)
Time Spent: 50m (was: 40m)

> Run S3 test suite on OM HA cluster
> --
>
> Key: HDDS-2278
> URL: https://issues.apache.org/jira/browse/HDDS-2278
> Project: Hadoop Distributed Data Store
> Issue Type: Sub-task
> Reporter: Bharat Viswanadham
> Assignee: Bharat Viswanadham
> Priority: Major
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> This will add a new compose setup with 3 OMs and start SCM, S3G, and a Datanode, then run the existing test suite against this new docker-compose cluster.
[jira] [Comment Edited] (HDFS-14907) [Dynamometer] DataNode can't find junit jar when using Hadoop-3 binary
[ https://issues.apache.org/jira/browse/HDFS-14907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951500#comment-16951500 ] Akira Ajisaka edited comment on HDFS-14907 at 10/15/19 2:38 AM: It seems that the {{script_pwd}} in the following script is not expected. {code:title=start-dynamometer-cluster.sh} script_pwd="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)/.." {code} should be {code} script_pwd="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)/../../.." {code} to specify {{share/hadoop/tools/}} to include the libraries to hadoop-tools. In addition, {{dynamometer-infra-\*.jar}} should be {{hadoop-dynamometer-infra-\*.jar}}. was (Author: ajisakaa): It seems that the {{script_pwd}} in the following script is not expected. {code:title=start-dynamometer-cluster.sh} script_pwd="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)/.." {code} should be {code} script_pwd="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)/../../.." {code} to specify {{share/hadoop/tools/}} to include the libraries to hadoop-tools. In addition, {{dynamometer-infra-*.jar}} should be {{hadoop-dynamometer-infra-*.jar}}. > [Dynamometer] DataNode can't find junit jar when using Hadoop-3 binary > -- > > Key: HDFS-14907 > URL: https://issues.apache.org/jira/browse/HDFS-14907 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Takanobu Asanuma >Priority: Major > > When executing {{start-dynamometer-cluster.sh}} with Hadoop-3 binary, > datanodes fail to run with the following log and > {{start-dynamometer-cluster.sh}} fails. 
> {noformat} > LogType:stderr > LogLastModifiedTime:Wed Oct 09 15:03:09 +0900 2019 > LogLength:1386 > LogContents: > Exception in thread "main" java.lang.NoClassDefFoundError: org/junit/Assert > at > org.apache.hadoop.test.GenericTestUtils.assertExists(GenericTestUtils.java:299) > at > org.apache.hadoop.test.GenericTestUtils.getTestDir(GenericTestUtils.java:243) > at > org.apache.hadoop.test.GenericTestUtils.getTestDir(GenericTestUtils.java:252) > at > org.apache.hadoop.hdfs.MiniDFSCluster.getBaseDirectory(MiniDFSCluster.java:2982) > at > org.apache.hadoop.hdfs.MiniDFSCluster.determineDfsBaseDir(MiniDFSCluster.java:2972) > at > org.apache.hadoop.hdfs.MiniDFSCluster.formatDataNodeDirs(MiniDFSCluster.java:2834) > at > org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(SimulatedDataNodes.java:123) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at > org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.main(SimulatedDataNodes.java:88) > Caused by: java.lang.ClassNotFoundException: org.junit.Assert > at java.net.URLClassLoader.findClass(URLClassLoader.java:382) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > ... 9 more > ./start-component.sh: line 317: kill: (2261) - No such process > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14907) [Dynamometer] DataNode can't find junit jar when using Hadoop-3 binary
[ https://issues.apache.org/jira/browse/HDFS-14907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951500#comment-16951500 ] Akira Ajisaka commented on HDFS-14907: -- It seems that the {{script_pwd}} in the following script is not expected. {code:title=start-dynamometer-cluster.sh} script_pwd="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)/.." {code} should be {code} script_pwd="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)/../../.." {code} to specify {{share/hadoop/tools/}} to include the libraries to hadoop-tools. In addition, {{dynamometer-infra-*.jar}} should be {{hadoop-dynamometer-infra-*.jar}}. > [Dynamometer] DataNode can't find junit jar when using Hadoop-3 binary > -- > > Key: HDFS-14907 > URL: https://issues.apache.org/jira/browse/HDFS-14907 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Takanobu Asanuma >Priority: Major > > When executing {{start-dynamometer-cluster.sh}} with Hadoop-3 binary, > datanodes fail to run with the following log and > {{start-dynamometer-cluster.sh}} fails. 
> {noformat} > LogType:stderr > LogLastModifiedTime:Wed Oct 09 15:03:09 +0900 2019 > LogLength:1386 > LogContents: > Exception in thread "main" java.lang.NoClassDefFoundError: org/junit/Assert > at > org.apache.hadoop.test.GenericTestUtils.assertExists(GenericTestUtils.java:299) > at > org.apache.hadoop.test.GenericTestUtils.getTestDir(GenericTestUtils.java:243) > at > org.apache.hadoop.test.GenericTestUtils.getTestDir(GenericTestUtils.java:252) > at > org.apache.hadoop.hdfs.MiniDFSCluster.getBaseDirectory(MiniDFSCluster.java:2982) > at > org.apache.hadoop.hdfs.MiniDFSCluster.determineDfsBaseDir(MiniDFSCluster.java:2972) > at > org.apache.hadoop.hdfs.MiniDFSCluster.formatDataNodeDirs(MiniDFSCluster.java:2834) > at > org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(SimulatedDataNodes.java:123) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at > org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.main(SimulatedDataNodes.java:88) > Caused by: java.lang.ClassNotFoundException: org.junit.Assert > at java.net.URLClassLoader.findClass(URLClassLoader.java:382) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > ... 9 more > ./start-component.sh: line 317: kill: (2261) - No such process > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
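The {{script_pwd}} fix quoted in the comment above can be checked with a small shell sketch. The directory layout below is an assumption (it mirrors a typical Hadoop 3 binary tarball, with the launcher under {{share/hadoop/tools/dynamometer/dynamometer-infra/bin/}}); the point is only that {{/..}} stops one level up while {{/../../..}} reaches {{share/hadoop/tools/}}:

```shell
# Recreate the assumed layout of a Hadoop 3 binary distribution in a
# throwaway directory, then resolve both the original and the fixed
# relative paths from the script's bin/ directory.
prefix="$(cd "$(mktemp -d)" && pwd)"
script_dir="$prefix/share/hadoop/tools/dynamometer/dynamometer-infra/bin"
mkdir -p "$script_dir"

old_pwd="$(cd "$script_dir" && pwd)/.."        # original: dynamometer-infra/
new_pwd="$(cd "$script_dir" && pwd)/../../.."  # fixed: share/hadoop/tools/

echo "old -> $(cd "$old_pwd" && pwd)"
echo "new -> $(cd "$new_pwd" && pwd)"
```

The fixed form lands in the directory that holds the hadoop-tools libraries, which is what the DataNode classpath needs.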
[jira] [Work logged] (HDDS-2278) Run S3 test suite on OM HA cluster
[ https://issues.apache.org/jira/browse/HDDS-2278?focusedWorklogId=328267&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328267 ] ASF GitHub Bot logged work on HDDS-2278: Author: ASF GitHub Bot Created on: 15/Oct/19 02:34 Start Date: 15/Oct/19 02:34 Worklog Time Spent: 10m Work Description: bharatviswa504 commented on pull request #27: HDDS-2278. Run S3 test suite on OM HA cluste. URL: https://github.com/apache/hadoop-ozone/pull/27 Add a docker compose OM HA cluster with S3. Run the S3 test suite on the OM HA cluster. https://issues.apache.org/jira/browse/HDDS-2278 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 328267) Time Spent: 40m (was: 0.5h) > Run S3 test suite on OM HA cluster > -- > > Key: HDDS-2278 > URL: https://issues.apache.org/jira/browse/HDDS-2278 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > This will add a new compose setup with 3 OM's and start SCM, S3G, Datanode. > Run the existing test suite against this new docker-compose cluster. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2305) Update Ozone to latest ratis snapshot(0.5.0-3f446aa-SNAPSHOT)
[ https://issues.apache.org/jira/browse/HDDS-2305?focusedWorklogId=328265&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328265 ] ASF GitHub Bot logged work on HDDS-2305: Author: ASF GitHub Bot Created on: 15/Oct/19 02:33 Start Date: 15/Oct/19 02:33 Worklog Time Spent: 10m Work Description: mukul1987 commented on pull request #26: HDDS-2305. Update Ozone to latest ratis snapshot(0.5.0-3f446aa-SNAPSHOT). Contributed by Mukul Kumar Singh. URL: https://github.com/apache/hadoop-ozone/pull/26 ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.) For more details, please see https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 328265) Remaining Estimate: 0h Time Spent: 10m > Update Ozone to latest ratis snapshot(0.5.0-3f446aa-SNAPSHOT) > - > > Key: HDDS-2305 > URL: https://issues.apache.org/jira/browse/HDDS-2305 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This jira will update ozone to latest ratis snapshot. for commit > corresponding to > {code} > commit 3f446aaf27704b0bf929bd39887637a6a71b4418 (HEAD -> master, > origin/master, origin/HEAD) > Author: Tsz Wo Nicholas Sze > Date: Fri Oct 11 16:35:38 2019 +0800 > RATIS-705. GrpcClientProtocolClient#close Interrupts itself. 
Contributed > by Lokesh Jain > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2305) Update Ozone to latest ratis snapshot(0.5.0-3f446aa-SNAPSHOT)
[ https://issues.apache.org/jira/browse/HDDS-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2305: - Labels: pull-request-available (was: ) > Update Ozone to latest ratis snapshot(0.5.0-3f446aa-SNAPSHOT) > - > > Key: HDDS-2305 > URL: https://issues.apache.org/jira/browse/HDDS-2305 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Labels: pull-request-available > > This jira will update ozone to latest ratis snapshot. for commit > corresponding to > {code} > commit 3f446aaf27704b0bf929bd39887637a6a71b4418 (HEAD -> master, > origin/master, origin/HEAD) > Author: Tsz Wo Nicholas Sze > Date: Fri Oct 11 16:35:38 2019 +0800 > RATIS-705. GrpcClientProtocolClient#close Interrupts itself. Contributed > by Lokesh Jain > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2305) Update Ozone to latest ratis snapshot(0.5.0-3f446aa-SNAPSHOT)
[ https://issues.apache.org/jira/browse/HDDS-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-2305: Summary: Update Ozone to latest ratis snapshot(0.5.0-3f446aa-SNAPSHOT) (was: Update Ozone to later ratis snapshot.) > Update Ozone to latest ratis snapshot(0.5.0-3f446aa-SNAPSHOT) > - > > Key: HDDS-2305 > URL: https://issues.apache.org/jira/browse/HDDS-2305 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Mukul Kumar Singh >Priority: Major > > This jira will update ozone to latest ratis snapshot. for commit > corresponding to > {code} > commit 3f446aaf27704b0bf929bd39887637a6a71b4418 (HEAD -> master, > origin/master, origin/HEAD) > Author: Tsz Wo Nicholas Sze > Date: Fri Oct 11 16:35:38 2019 +0800 > RATIS-705. GrpcClientProtocolClient#close Interrupts itself. Contributed > by Lokesh Jain > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2305) Update Ozone to latest ratis snapshot(0.5.0-3f446aa-SNAPSHOT)
[ https://issues.apache.org/jira/browse/HDDS-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDDS-2305: --- Assignee: Mukul Kumar Singh > Update Ozone to latest ratis snapshot(0.5.0-3f446aa-SNAPSHOT) > - > > Key: HDDS-2305 > URL: https://issues.apache.org/jira/browse/HDDS-2305 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > > This jira will update ozone to latest ratis snapshot. for commit > corresponding to > {code} > commit 3f446aaf27704b0bf929bd39887637a6a71b4418 (HEAD -> master, > origin/master, origin/HEAD) > Author: Tsz Wo Nicholas Sze > Date: Fri Oct 11 16:35:38 2019 +0800 > RATIS-705. GrpcClientProtocolClient#close Interrupts itself. Contributed > by Lokesh Jain > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14907) [Dynamometer] DataNode can't find junit jar when using Hadoop-3 binary
[ https://issues.apache.org/jira/browse/HDFS-14907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951459#comment-16951459 ] Takanobu Asanuma commented on HDFS-14907: - There is a workaround by adding the jar to classpath in hadoop-env.sh. {code:bash} export HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:${HADOOP_TOOLS_HOME}/${HADOOP_TOOLS_LIB_JARS_DIR}/junit-4.12.jar" {code} > [Dynamometer] DataNode can't find junit jar when using Hadoop-3 binary > -- > > Key: HDFS-14907 > URL: https://issues.apache.org/jira/browse/HDFS-14907 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Takanobu Asanuma >Priority: Major > > When executing {{start-dynamometer-cluster.sh}} with Hadoop-3 binary, > datanodes fail to run with the following log and > {{start-dynamometer-cluster.sh}} fails. > {noformat} > LogType:stderr > LogLastModifiedTime:Wed Oct 09 15:03:09 +0900 2019 > LogLength:1386 > LogContents: > Exception in thread "main" java.lang.NoClassDefFoundError: org/junit/Assert > at > org.apache.hadoop.test.GenericTestUtils.assertExists(GenericTestUtils.java:299) > at > org.apache.hadoop.test.GenericTestUtils.getTestDir(GenericTestUtils.java:243) > at > org.apache.hadoop.test.GenericTestUtils.getTestDir(GenericTestUtils.java:252) > at > org.apache.hadoop.hdfs.MiniDFSCluster.getBaseDirectory(MiniDFSCluster.java:2982) > at > org.apache.hadoop.hdfs.MiniDFSCluster.determineDfsBaseDir(MiniDFSCluster.java:2972) > at > org.apache.hadoop.hdfs.MiniDFSCluster.formatDataNodeDirs(MiniDFSCluster.java:2834) > at > org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(SimulatedDataNodes.java:123) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at > org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.main(SimulatedDataNodes.java:88) > Caused by: java.lang.ClassNotFoundException: org.junit.Assert > at java.net.URLClassLoader.findClass(URLClassLoader.java:382) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at 
sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > ... 9 more > ./start-component.sh: line 317: kill: (2261) - No such process > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
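A hedged variant of the workaround quoted above, made defensive so the export only happens when the jar is actually present; the install root created here is a stand-in for a real Hadoop distribution, and the junit version/path are the ones quoted in the comment and may differ between releases:

```shell
# Stand-in install root so the sketch is self-contained; in a real
# deployment HADOOP_TOOLS_HOME/HADOOP_TOOLS_LIB_JARS_DIR are set by hadoop-env.
HADOOP_TOOLS_HOME="$(mktemp -d)"
HADOOP_TOOLS_LIB_JARS_DIR="share/hadoop/tools/lib"
mkdir -p "${HADOOP_TOOLS_HOME}/${HADOOP_TOOLS_LIB_JARS_DIR}"
: > "${HADOOP_TOOLS_HOME}/${HADOOP_TOOLS_LIB_JARS_DIR}/junit-4.12.jar"

# Append the junit jar to HADOOP_CLASSPATH only if the file exists.
junit_jar="${HADOOP_TOOLS_HOME}/${HADOOP_TOOLS_LIB_JARS_DIR}/junit-4.12.jar"
if [ -f "${junit_jar}" ]; then
  export HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:${junit_jar}"
fi
echo "${HADOOP_CLASSPATH}"
```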
[jira] [Created] (HDFS-14907) [Dynamometer] DataNode can't find junit jar when using Hadoop-3 binary
Takanobu Asanuma created HDFS-14907: --- Summary: [Dynamometer] DataNode can't find junit jar when using Hadoop-3 binary Key: HDFS-14907 URL: https://issues.apache.org/jira/browse/HDFS-14907 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Takanobu Asanuma When executing {{start-dynamometer-cluster.sh}} with Hadoop-3 binary, datanodes fail to run with the following log and {{start-dynamometer-cluster.sh}} fails. {noformat} LogType:stderr LogLastModifiedTime:Wed Oct 09 15:03:09 +0900 2019 LogLength:1386 LogContents: Exception in thread "main" java.lang.NoClassDefFoundError: org/junit/Assert at org.apache.hadoop.test.GenericTestUtils.assertExists(GenericTestUtils.java:299) at org.apache.hadoop.test.GenericTestUtils.getTestDir(GenericTestUtils.java:243) at org.apache.hadoop.test.GenericTestUtils.getTestDir(GenericTestUtils.java:252) at org.apache.hadoop.hdfs.MiniDFSCluster.getBaseDirectory(MiniDFSCluster.java:2982) at org.apache.hadoop.hdfs.MiniDFSCluster.determineDfsBaseDir(MiniDFSCluster.java:2972) at org.apache.hadoop.hdfs.MiniDFSCluster.formatDataNodeDirs(MiniDFSCluster.java:2834) at org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(SimulatedDataNodes.java:123) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.main(SimulatedDataNodes.java:88) Caused by: java.lang.ClassNotFoundException: org.junit.Assert at java.net.URLClassLoader.findClass(URLClassLoader.java:382) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ... 9 more ./start-component.sh: line 317: kill: (2261) - No such process {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14802) The feature of protect directories should be used in RenameOp
[ https://issues.apache.org/jira/browse/HDFS-14802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951441#comment-16951441 ] Fei Hui commented on HDFS-14802: [~ayushtkn] Could you please take a look? Does it make sense? > The feature of protect directories should be used in RenameOp > - > > Key: HDFS-14802 > URL: https://issues.apache.org/jira/browse/HDFS-14802 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.0.4, 3.3.0, 3.2.1, 3.1.3 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-14802.001.patch, HDFS-14802.002.patch, > HDFS-14802.003.patch > > > Now we could set fs.protected.directories to prevent users from deleting > important directories. But users can delete directories around the limitation. > 1. Rename the directories and delete them. > 2. move the directories to trash and namenode will delete them. > So I think we should use the feature of protected directories in RenameOp -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2301) Write path: Reduce read contention in rocksDB
[ https://issues.apache.org/jira/browse/HDDS-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HDDS-2301: --- Summary: Write path: Reduce read contention in rocksDB (was: Write path: Reducing read contention in rocksDB) > Write path: Reduce read contention in rocksDB > - > > Key: HDDS-2301 > URL: https://issues.apache.org/jira/browse/HDDS-2301 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Rajesh Balamohan >Assignee: Nanda kumar >Priority: Major > Attachments: om_write_profile.png > > > Benchmark: > > Simple benchmark which creates 100 and 1000s of keys (empty directory) in > OM. This is done in a tight loop and multiple threads from client side to add > enough load on CPU. Note that intention is to understand the bottlenecks in > OM (intentionally avoiding interactions with SCM & DN). > Observation: > - > During write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This > internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every > write operation. This turns out to be expensive and chokes the write path. > [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155] > [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63] > In most of the cases, directory creation would be fresh entry. In such cases, > it would be good to try with {{RocksDB::keyMayExist.}} > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951383#comment-16951383 ] Jonathan Hung commented on HDFS-14305: -- Unmarking as a 2.10.0 blocker. At the very least, if NNs stay the same, users should be able to upgrade from 2.x -> 2.10 without any incompatibility risk. Cases such as add/remove/reorder NNs should be addressed separately, IMO. > Serial number in BlockTokenSecretManager could overlap between different > namenodes > -- > > Key: HDFS-14305 > URL: https://issues.apache.org/jira/browse/HDFS-14305 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, security >Reporter: Chao Sun >Assignee: Konstantin Shvachko >Priority: Major > Labels: multi-sbnn, release-blocker > Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, > HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, > HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch > > > Currently, a {{BlockTokenSecretManager}} starts with a random integer as the > initial serial number, and then use this formula to rotate it: > {code:java} > this.intRange = Integer.MAX_VALUE / numNNs; > this.nnRangeStart = intRange * nnIndex; > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > while {{numNNs}} is the total number of NameNodes in the cluster, and > {{nnIndex}} is the index of the current NameNode specified in the > configuration {{dfs.ha.namenodes.}}. > However, with this approach, different NameNode could have overlapping ranges > for serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, > and we have 2 NameNodes {{nn1}} and {{nn2}} in configuration. Then the ranges > for these two are: > {code} > nn1 -> [-49, 49] > nn2 -> [1, 99] > {code} > This is because the initial serial number could be any negative integer. 
> Moreover, when the keys are updated, the serial number will again be updated > with the formula: > {code} > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > which means the new serial number could be updated to a range that belongs to > a different NameNode, thus increasing the chance of collision again. > When the collision happens, DataNodes could overwrite an existing key which > will cause clients to fail because of {{InvalidToken}} error. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
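The simplified example in the quoted description (Integer.MAX_VALUE taken as 100, two NameNodes) can be reproduced with a short sketch; the helper below is ours, not code from any attached patch. Java's {{%}} keeps the sign of the dividend, and POSIX shell arithmetic behaves the same way, so {{serialNo % intRange}} spans -(intRange-1)..(intRange-1) when the initial serialNo may be any signed int:

```shell
# Range of (serialNo % intRange) + nnRangeStart over all possible initial
# serial numbers, using the issue's simplification INT_MAX=100, 2 NameNodes.
int_max=100
num_nns=2
int_range=$(( int_max / num_nns ))   # 50

range_for() {  # $1 = nnIndex; prints "lo hi"
  start=$(( int_range * $1 ))
  echo "$(( start - (int_range - 1) )) $(( start + int_range - 1 ))"
}

echo "nn1: $(range_for 0)"   # -49 49
echo "nn2: $(range_for 1)"   # 1 99
```

The two ranges intersect on [1, 49], which is exactly the overlap described in the issue.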
[jira] [Updated] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hung updated HDFS-14305: - Labels: multi-sbnn (was: multi-sbnn release-blocker) > Serial number in BlockTokenSecretManager could overlap between different > namenodes > -- > > Key: HDFS-14305 > URL: https://issues.apache.org/jira/browse/HDFS-14305 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, security >Reporter: Chao Sun >Assignee: Konstantin Shvachko >Priority: Major > Labels: multi-sbnn > Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, > HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, > HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch > > > Currently, a {{BlockTokenSecretManager}} starts with a random integer as the > initial serial number, and then use this formula to rotate it: > {code:java} > this.intRange = Integer.MAX_VALUE / numNNs; > this.nnRangeStart = intRange * nnIndex; > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > while {{numNNs}} is the total number of NameNodes in the cluster, and > {{nnIndex}} is the index of the current NameNode specified in the > configuration {{dfs.ha.namenodes.}}. > However, with this approach, different NameNode could have overlapping ranges > for serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, > and we have 2 NameNodes {{nn1}} and {{nn2}} in configuration. Then the ranges > for these two are: > {code} > nn1 -> [-49, 49] > nn2 -> [1, 99] > {code} > This is because the initial serial number could be any negative integer. > Moreover, when the keys are updated, the serial number will again be updated > with the formula: > {code} > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > which means the new serial number could be updated to a range that belongs to > a different NameNode, thus increasing the chance of collision again. 
> When the collision happens, DataNodes could overwrite an existing key which > will cause clients to fail because of {{InvalidToken}} error. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951378#comment-16951378 ] Konstantin Shvachko commented on HDFS-14305: ??This patch was committed over my valid technical objection. I hope you will respect that?? Totally respect technical objections. I was under the impression you agreed with my reasoning. But I see I was wrong. Addressing your questions. ??the mitigation for the incompatibility.?? I don't think incompatible changes could be "mitigated". They are not "better or worse", they are unacceptable. For minor versions it is documented, but I would extend it to major versions as well, since this is the reason people now cannot upgrade to 3.x. To this issue. There are different "cases" of overlapping ranges here. # Restarting the same NameNodes on the same binaries and configuration can lead to overlapping ranges. This is the problem that was originally reported here. The idea was to choose an initial serial number randomly within the range designated to current NameNode. But due to an incorrect formula if the random number is negative the initial serial number falls outside the designated range and therefore causes intersection with ranges designated to other NameNodes. My patch v08 fixes just that. # Changing the number of NameNodes on the cluster can cause ranges overlapping. This is not solved in current version. There is a work around mentioned above, but I agree with [~arp] it should be properly solved. It was _partly_ solved by the reverted approach v06 patch, but sacrificed compatibility. # Rolling upgrade from version that does not contain this change to the one that does. No problem for v08, but a problem for v06. # Changing the order of NameNode in the configuration. Not solved by any of the approaches. I think we should prevent all these cases of overlapping ranges. In a compatible way in the next jira. [~arp] would you agree? 
> Serial number in BlockTokenSecretManager could overlap between different > namenodes > -- > > Key: HDFS-14305 > URL: https://issues.apache.org/jira/browse/HDFS-14305 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, security >Reporter: Chao Sun >Assignee: Konstantin Shvachko >Priority: Major > Labels: multi-sbnn, release-blocker > Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, > HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, > HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch > > > Currently, a {{BlockTokenSecretManager}} starts with a random integer as the > initial serial number, and then use this formula to rotate it: > {code:java} > this.intRange = Integer.MAX_VALUE / numNNs; > this.nnRangeStart = intRange * nnIndex; > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > while {{numNNs}} is the total number of NameNodes in the cluster, and > {{nnIndex}} is the index of the current NameNode specified in the > configuration {{dfs.ha.namenodes.}}. > However, with this approach, different NameNode could have overlapping ranges > for serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, > and we have 2 NameNodes {{nn1}} and {{nn2}} in configuration. Then the ranges > for these two are: > {code} > nn1 -> [-49, 49] > nn2 -> [1, 99] > {code} > This is because the initial serial number could be any negative integer. > Moreover, when the keys are updated, the serial number will again be updated > with the formula: > {code} > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > which means the new serial number could be updated to a range that belongs to > a different NameNode, thus increasing the chance of collision again. > When the collision happens, DataNodes could overwrite an existing key which > will cause clients to fail because of {{InvalidToken}} error. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2298) Fix maven warning about duplicated metrics-core jar
[ https://issues.apache.org/jira/browse/HDDS-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao resolved HDDS-2298. -- Fix Version/s: 0.5.0 Resolution: Fixed Thanks [~elek] for the contribution and all for the reviews. The change has been merged. > Fix maven warning about duplicated metrics-core jar > --- > > Key: HDDS-2298 > URL: https://issues.apache.org/jira/browse/HDDS-2298 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: build >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Maven build of Ozone is starting with a warning: > {code:java} > [WARNING] > [WARNING] Some problems were encountered while building the effective model > for org.apache.hadoop:hadoop-ozone-tools:jar:0.5.0-SNAPSHOT > [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must > be unique: io.dropwizard.metrics:metrics-core:jar -> version 3.2.4 vs (?) @ > line 94, column 17 > [WARNING] > [WARNING] It is highly recommended to fix these problems because they > threaten the stability of your build. > [WARNING] > [WARNING] For this reason, future Maven versions might no longer support > building such malformed projects. > [WARNING] > {code} > It's better to avoid it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14887) RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable
[ https://issues.apache.org/jira/browse/HDFS-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951345#comment-16951345 ] Íñigo Goiri commented on HDFS-14887: Thanks [~hemanthboyina], TestMetricsBase makes much more sense. For making it easier to read, I would create a function like: {code} public static Map<String, String> getNameserviceStateMap(JSONObject jsonObject) { Map<String, String> map = new TreeMap<>(); Iterator keys = jsonObject.keys(); while (keys.hasNext()) { String key = (String) keys.next(); JSONObject json = jsonObject.getJSONObject(key); String nsId = json.getString("nameserviceId"); String state = json.getString("state"); map.put(nsId, state); } return map; } {code} Then you can do a much more readable assert like: {code} assertTrue("Cannot find ns0 in map: " + map, map.containsKey("ns0")); assertEquals("OBSERVER", map.get("ns0")); {code} > RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable > -- > > Key: HDFS-14887 > URL: https://issues.apache.org/jira/browse/HDFS-14887 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: 14887.after.png, 14887.before.png, HDFS-14887.001.patch, > HDFS-14887.002.patch, HDFS-14887.003.patch, HDFS-14887.004.patch, > HDFS-14887.005.patch, HDFS-14887.006.patch > > > In Router Web UI, Observer Namenode Information displaying as Unavailable. > We should show a proper icon for them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14887) RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable
[ https://issues.apache.org/jira/browse/HDFS-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951342#comment-16951342 ] Hadoop QA commented on HDFS-14887: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 50s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 17s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 11s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 59s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 61m 3s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.3 Server=19.03.3 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-14887 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12982991/HDFS-14887.006.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux e3a11d48322e 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 336abbd | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28087/testReport/ | | Max. process+thread count | 2723 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28087/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable > --
[jira] [Updated] (HDDS-1987) Fix listStatus API
[ https://issues.apache.org/jira/browse/HDDS-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siyao Meng updated HDDS-1987: - Description: This Jira is to fix listStatus API in HA code path. In HA, we have an in-memory cache, where we put the result to in-memory cache and return the response, later it will be picked by double buffer thread and it will flush to disk. So, now when do listStatus, it should use both in-memory cache and rocksdb key table to listStatus in a bucket. was: This Jira is to fix listKeys API in HA code path. In HA, we have an in-memory cache, where we put the result to in-memory cache and return the response, later it will be picked by double buffer thread and it will flush to disk. So, now when do listStatus, it should use both in-memory cache and rocksdb key table to list Status of Keys in a bucket. > Fix listStatus API > -- > > Key: HDDS-1987 > URL: https://issues.apache.org/jira/browse/HDDS-1987 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: Siyao Meng >Priority: Major > > This Jira is to fix listStatus API in HA code path. > In HA, we have an in-memory cache, where we put the result to in-memory cache > and return the response, later it will be picked by double buffer thread and > it will flush to disk. So, now when do listStatus, it should use both > in-memory cache and rocksdb key table to listStatus in a bucket. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
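The cache-plus-table read described above (consult the in-memory double-buffer cache and the RocksDB key table together) can be sketched with plain Java collections. This is an illustrative model only, not actual OzoneManager code: `keyTable` stands in for the persisted RocksDB key table, `cache` for the in-memory cache, and a `null` cache value marks a key deleted in memory but not yet flushed.

```java
import java.util.Map;
import java.util.TreeMap;

public class ListStatusSketch {
  /**
   * Lists keys under a bucket by overlaying the in-memory cache on the
   * persisted key table. Cache entries win over table entries; a null
   * cache value marks a key deleted in memory but not yet flushed.
   */
  public static Map<String, String> listStatus(Map<String, String> keyTable,
      Map<String, String> cache, String bucketPrefix) {
    Map<String, String> result = new TreeMap<>();
    // Start from the flushed state in the key table.
    for (Map.Entry<String, String> e : keyTable.entrySet()) {
      if (e.getKey().startsWith(bucketPrefix)) {
        result.put(e.getKey(), e.getValue());
      }
    }
    // Overlay the cache: newer values replace, nulls remove.
    for (Map.Entry<String, String> e : cache.entrySet()) {
      if (!e.getKey().startsWith(bucketPrefix)) {
        continue;
      }
      if (e.getValue() == null) {
        result.remove(e.getKey());
      } else {
        result.put(e.getKey(), e.getValue());
      }
    }
    return result;
  }
}
```

The TreeMap keeps the merged listing in sorted key order, which is what a paginated listStatus over a sorted RocksDB table would also return.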
[jira] [Updated] (HDDS-2240) Command line tool for OM Admin
[ https://issues.apache.org/jira/browse/HDDS-2240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-2240: Status: Patch Available (was: Open) > Command line tool for OM Admin > -- > > Key: HDDS-2240 > URL: https://issues.apache.org/jira/browse/HDDS-2240 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Major > Labels: pull-request-available > Time Spent: 3h 10m > Remaining Estimate: 0h > > A command line tool (*ozone omha*) to get information related to OM HA. > This Jira proposes to add the _getServiceState_ option for OM HA which lists > all the OMs in the service and their corresponding Ratis server roles > (LEADER/ FOLLOWER). > We can later add more options to this tool. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1985) Fix listVolumes API
[ https://issues.apache.org/jira/browse/HDDS-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951311#comment-16951311 ] Arpit Agarwal commented on HDDS-1985: - I believe this was going to be resolved as "Won't Fix". [~bharat] can you confirm and resolve? > Fix listVolumes API > --- > > Key: HDDS-1985 > URL: https://issues.apache.org/jira/browse/HDDS-1985 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > > This Jira is to fix the listVolumes API in the HA code path. > In HA, we have an in-memory cache, where we put the result to in-memory cache > and return the response, later it will be picked by the double buffer thread and > it will flush to disk. So, now when doing listVolumes, it should use both the > in-memory cache and the rocksdb volume table to list volumes for a user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14887) RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable
[ https://issues.apache.org/jira/browse/HDFS-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hemanthboyina updated HDFS-14887: - Attachment: HDFS-14887.006.patch > RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable > -- > > Key: HDFS-14887 > URL: https://issues.apache.org/jira/browse/HDFS-14887 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: 14887.after.png, 14887.before.png, HDFS-14887.001.patch, > HDFS-14887.002.patch, HDFS-14887.003.patch, HDFS-14887.004.patch, > HDFS-14887.005.patch, HDFS-14887.006.patch > > > In Router Web UI, Observer Namenode Information displaying as Unavailable. > We should show a proper icon for them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1228) Chunk Scanner Checkpoints
[ https://issues.apache.org/jira/browse/HDDS-1228?focusedWorklogId=328078&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328078 ] ASF GitHub Bot logged work on HDDS-1228: Author: ASF GitHub Bot Created on: 14/Oct/19 19:32 Start Date: 14/Oct/19 19:32 Worklog Time Spent: 10m Work Description: adoroszlai commented on pull request #1622: HDDS-1228. Chunk Scanner Checkpoints URL: https://github.com/apache/hadoop/pull/1622#discussion_r334630866 ## File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/ContainerDataScanner.java ## @@ -95,14 +97,19 @@ public void runIteration() { while (!stopping && itr.hasNext()) { Container c = itr.next(); if (c.shouldScanData()) { +ContainerData containerData = c.getContainerData(); +long containerId = containerData.getContainerID(); try { + logScanStart(containerData); if (!c.scanData(throttler, canceler)) { metrics.incNumUnHealthyContainers(); -controller.markContainerUnhealthy( -c.getContainerData().getContainerID()); +controller.markContainerUnhealthy(containerId); Review comment: I would avoid this for two reasons: 1. The full scan includes a scan of the metadata, too, and the failure may be due to metadata problem. Eg. if the `.container` file is missing or invalid etc. In that case we cannot update the timestamp in the file. 2. Unhealthy containers are skipped during further iterations, so the timestamp would not make much difference anyway. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 328078) Time Spent: 2h 10m (was: 2h) > Chunk Scanner Checkpoints > - > > Key: HDDS-1228 > URL: https://issues.apache.org/jira/browse/HDDS-1228 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Datanode >Reporter: Supratim Deka >Assignee: Attila Doroszlai >Priority: Critical > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Checkpoint the progress of the chunk verification scanner. > Save the checkpoint persistently to support scanner resume from checkpoint - > after a datanode restart. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14887) RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable
[ https://issues.apache.org/jira/browse/HDFS-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hemanthboyina updated HDFS-14887: - Attachment: HDFS-14887.005.patch > RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable > -- > > Key: HDFS-14887 > URL: https://issues.apache.org/jira/browse/HDFS-14887 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: 14887.after.png, 14887.before.png, HDFS-14887.001.patch, > HDFS-14887.002.patch, HDFS-14887.003.patch, HDFS-14887.004.patch, > HDFS-14887.005.patch > > > In Router Web UI, Observer Namenode Information displaying as Unavailable. > We should show a proper icon for them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1228) Chunk Scanner Checkpoints
[ https://issues.apache.org/jira/browse/HDDS-1228?focusedWorklogId=328075&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328075 ] ASF GitHub Bot logged work on HDDS-1228: Author: ASF GitHub Bot Created on: 14/Oct/19 19:28 Start Date: 14/Oct/19 19:28 Worklog Time Spent: 10m Work Description: adoroszlai commented on pull request #1622: HDDS-1228. Chunk Scanner Checkpoints URL: https://github.com/apache/hadoop/pull/1622#discussion_r334629787 ## File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerData.java ## @@ -89,7 +91,9 @@ private HddsVolume volume; private String checksum; - public static final Charset CHARSET_ENCODING = Charset.forName("UTF-8"); + private Long dataScanTimestamp; Review comment: Thanks for the comments. I will address these and update the pull request in the new repo. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 328075) Time Spent: 2h (was: 1h 50m) > Chunk Scanner Checkpoints > - > > Key: HDDS-1228 > URL: https://issues.apache.org/jira/browse/HDDS-1228 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Datanode >Reporter: Supratim Deka >Assignee: Attila Doroszlai >Priority: Critical > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 2h > Remaining Estimate: 0h > > Checkpoint the progress of the chunk verification scanner. > Save the checkpoint persistently to support scanner resume from checkpoint - > after a datanode restart. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14887) RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable
[ https://issues.apache.org/jira/browse/HDFS-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951259#comment-16951259 ] Íñigo Goiri commented on HDFS-14887: That sounds good. > RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable > -- > > Key: HDFS-14887 > URL: https://issues.apache.org/jira/browse/HDFS-14887 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: 14887.after.png, 14887.before.png, HDFS-14887.001.patch, > HDFS-14887.002.patch, HDFS-14887.003.patch, HDFS-14887.004.patch > > > In Router Web UI, Observer Namenode Information displaying as Unavailable. > We should show a proper icon for them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951255#comment-16951255 ] Surendra Singh Lilhore commented on HDFS-14768: --- Thanks [~gjhkael] for the patch. I will review it tomorrow. > In some cases, erasure blocks are corruption when they are reconstruct. > > > Key: HDFS-14768 > URL: https://issues.apache.org/jira/browse/HDFS-14768 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding, hdfs, namenode >Affects Versions: 3.0.2 >Reporter: guojh >Assignee: guojh >Priority: Major > Labels: patch > Fix For: 3.3.0 > > Attachments: 1568275810244.jpg, 1568276338275.jpg, 1568771471942.jpg, > HDFS-14768.000.patch, HDFS-14768.001.patch, HDFS-14768.002.patch, > HDFS-14768.003.patch, HDFS-14768.jpg, guojh_UT_after_deomission.txt, > guojh_UT_before_deomission.txt, zhaoyiming_UT_after_deomission.txt, > zhaoyiming_UT_beofre_deomission.txt > > > Policy is RS-6-3-1024K, version is hadoop 3.0.2; > We suppose a file's block Index is [0,1,2,3,4,5,6,7,8], and we decommission > index [3,4] and increase the index 6 datanode's > pendingReplicationWithoutTargets to make it larger than > replicationStreamsHardLimit (we set 14). Then, after the method > chooseSourceDatanodes of BlockManager, the liveBlockIndices is > [0,1,2,3,4,5,7,8], and the Block Counter is Live:7, Decommission:2. > In the method scheduleReconstruction of BlockManager, the additionalReplRequired > is 9 - 7 = 2. After the Namenode chooses two target Datanodes, it will assign an > erasureCode task to the target datanode. > When the datanode gets the task, it will build targetIndices from liveBlockIndices > and the target length. The code is below.
> {code:java}
> targetIndices = new short[targets.length];
>
> private void initTargetIndices() {
>   BitSet bitset = reconstructor.getLiveBitSet();
>   int m = 0;
>   hasValidTargets = false;
>   for (int i = 0; i < dataBlkNum + parityBlkNum; i++) {
>     if (!bitset.get(i)) {
>       if (reconstructor.getBlockLen(i) > 0) {
>         if (m < targets.length) {
>           targetIndices[m++] = (short) i;
>           hasValidTargets = true;
>         }
>       }
>     }
>   }
> }
> {code}
> targetIndices[0] = 6, and targetIndices[1] is always 0 from its initial value. > The StripedReader always creates readers from the first 6 block indices, that is > [0,1,2,3,4,5]. > Using the indices [0,1,2,3,4,5] to build target indices [6,0] will trigger the ISA-L > bug: block index 6's data is corrupted (all data is zero). > I wrote a unit test that reproduces this reliably. > {code:java} > private int replicationStreamsHardLimit = > DFSConfigKeys.DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_DEFAULT; > numDNs = dataBlocks + parityBlocks + 10; > @Test(timeout = 24) > public void testFileDecommission() throws Exception { > LOG.info("Starting test testFileDecommission"); > final Path ecFile = new Path(ecDir, "testFileDecommission"); > int writeBytes = cellSize * dataBlocks; > writeStripedFile(dfs, ecFile, writeBytes); > Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks()); > FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes); > final INodeFile fileNode = cluster.getNamesystem().getFSDirectory() > .getINode4Write(ecFile.toString()).asFile(); > LocatedBlocks locatedBlocks = > StripedFileTestUtil.getLocatedBlocks(ecFile, dfs); > LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0) > .get(0); > DatanodeInfo[] dnLocs = lb.getLocations(); > LocatedStripedBlock lastBlock = > (LocatedStripedBlock)locatedBlocks.getLastLocatedBlock(); > DatanodeInfo[] storageInfos = lastBlock.getLocations(); > // > DatanodeDescriptor datanodeDescriptor = > cluster.getNameNode().getNamesystem() > > 
.getBlockManager().getDatanodeManager().getDatanode(storageInfos[6].getDatanodeUuid()); > BlockInfo firstBlock = fileNode.getBlocks()[0]; > DatanodeStorageInfo[] dStorageInfos = bm.getStorages(firstBlock); > // the first heartbeat will consume 3 replica tasks > for (int i = 0; i <= replicationStreamsHardLimit + 3; i++) { > BlockManagerTestUtil.addBlockToBeReplicated(datanodeDescriptor, new > Block(i), > new DatanodeStorageInfo[]{dStorageInfos[0]}); > } > assertEquals(dataBlocks + parityBlocks, dnLocs.length); > int[] decommNodeIndex = {3, 4}; > final List<DatanodeInfo> decommisionNodes = new ArrayList<>(); > // add the node which will be decommissioning > decommisionNodes.add(dnLocs[decommNodeIndex[0]]); > decommisionNodes.add(dnLocs[decommNodeIndex[1]]); > decommissionNode(0, decommisionNodes, AdminStates.DECOMMISSIONED); > assertEquals(decommisionNodes.size(), fsn.getNumDecomLiveDataNodes());
[jira] [Commented] (HDFS-14887) RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable
[ https://issues.apache.org/jira/browse/HDFS-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951247#comment-16951247 ] hemanthboyina commented on HDFS-14887: -- how about having the test of [^HDFS-14887.004.patch] in TestMetricsBase > RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable > -- > > Key: HDFS-14887 > URL: https://issues.apache.org/jira/browse/HDFS-14887 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: 14887.after.png, 14887.before.png, HDFS-14887.001.patch, > HDFS-14887.002.patch, HDFS-14887.003.patch, HDFS-14887.004.patch > > > In Router Web UI, Observer Namenode Information displaying as Unavailable. > We should show a proper icon for them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14886) In NameNode Web UI's Startup Progress page, Loading edits always shows 0 sec
[ https://issues.apache.org/jira/browse/HDFS-14886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore updated HDFS-14886: -- Fix Version/s: 3.2.2 3.1.4 3.3.0 Resolution: Fixed Status: Resolved (was: Patch Available) Thanks [~hemanthboyina] for contribution. Committed to trunk, branch-3.2, branch-3.1 > In NameNode Web UI's Startup Progress page, Loading edits always shows 0 sec > > > Key: HDFS-14886 > URL: https://issues.apache.org/jira/browse/HDFS-14886 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Fix For: 3.3.0, 3.1.4, 3.2.2 > > Attachments: HDFS-14886.001.patch, HDFS-14886.002.patch, > HDFS-14886.003.patch, HDFS-14886_After.png, HDFS-14886_before.png > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14854) Create improved decommission monitor implementation
[ https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951233#comment-16951233 ] Íñigo Goiri commented on HDFS-14854: Thanks [~sodonnell] for the update. Regarding the sharing, just one minor one: what are the issues with sharing maxConcurrentTrackedNodes and numBlocksChecked? {{outOfServiceNodeBlocks}} could also potentially just be a Map and internally use TreeMap but I understand it has issues with the API so I'm fine with it. Other than that, I like the new approach. > Create improved decommission monitor implementation > --- > > Key: HDFS-14854 > URL: https://issues.apache.org/jira/browse/HDFS-14854 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Attachments: Decommission_Monitor_V2_001.pdf, HDFS-14854.001.patch, > HDFS-14854.002.patch, HDFS-14854.003.patch, HDFS-14854.004.patch, > HDFS-14854.005.patch, HDFS-14854.006.patch, HDFS-14854.007.patch, > HDFS-14854.008.patch > > > In HDFS-13157, we discovered a series of problems with the current > decommission monitor implementation, such as: > * Blocks are replicated sequentially disk by disk and node by node, and > hence the load is not spread well across the cluster > * Adding a node for decommission can cause the namenode write lock to be > held for a long time. > * Decommissioning nodes floods the replication queue and under replicated > blocks from a future node or disk failure may wait for a long time before they > are replicated. > * Blocks pending replication are checked many times under a write lock > before they are sufficiently replicated, wasting resources > In this Jira I propose to create a new implementation of the decommission > monitor that resolves these issues. 
As it will be difficult to prove one > implementation is better than another, the new implementation can be enabled > or disabled giving the option of the existing implementation or the new one. > I will attach a pdf with some more details on the design and then a version 1 > patch shortly. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2298) Fix maven warning about duplicated metrics-core jar
[ https://issues.apache.org/jira/browse/HDDS-2298?focusedWorklogId=328038&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328038 ] ASF GitHub Bot logged work on HDDS-2298: Author: ASF GitHub Bot Created on: 14/Oct/19 18:25 Start Date: 14/Oct/19 18:25 Worklog Time Spent: 10m Work Description: xiaoyuyao commented on pull request #18: HDDS-2298. Fix maven warning about duplicated metrics-core jar URL: https://github.com/apache/hadoop-ozone/pull/18 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 328038) Time Spent: 20m (was: 10m) > Fix maven warning about duplicated metrics-core jar > --- > > Key: HDDS-2298 > URL: https://issues.apache.org/jira/browse/HDDS-2298 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: build >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Maven build of Ozone is starting with a warning: > {code:java} > [WARNING] > [WARNING] Some problems were encountered while building the effective model > for org.apache.hadoop:hadoop-ozone-tools:jar:0.5.0-SNAPSHOT > [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must > be unique: io.dropwizard.metrics:metrics-core:jar -> version 3.2.4 vs (?) @ > line 94, column 17 > [WARNING] > [WARNING] It is highly recommended to fix these problems because they > threaten the stability of your build. > [WARNING] > [WARNING] For this reason, future Maven versions might no longer support > building such malformed projects. > [WARNING] > {code} > It's better to avoid it. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14886) In NameNode Web UI's Startup Progress page, Loading edits always shows 0 sec
[ https://issues.apache.org/jira/browse/HDFS-14886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951227#comment-16951227 ] Hudson commented on HDFS-14886: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17534 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17534/]) HDFS-14886. In NameNode Web UI's Startup Progress page, Loading edits (surendralilhore: rev 336abbd8737f3dff38f7bdad9721511c711c522b) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java > In NameNode Web UI's Startup Progress page, Loading edits always shows 0 sec > > > Key: HDFS-14886 > URL: https://issues.apache.org/jira/browse/HDFS-14886 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14886.001.patch, HDFS-14886.002.patch, > HDFS-14886.003.patch, HDFS-14886_After.png, HDFS-14886_before.png > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951226#comment-16951226 ] Íñigo Goiri commented on HDFS-14284: I may be wrong, but I think that if we use the FileSystem interface instead of the ClientProtocol, we can use mkdirs and get the actual exception instead of RemoteException which requires checking the message. > RBF: Log Router identifier when reporting exceptions > > > Key: HDFS-14284 > URL: https://issues.apache.org/jira/browse/HDFS-14284 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14284.001.patch, HDFS-14284.002.patch, > HDFS-14284.003.patch, HDFS-14284.004.patch, HDFS-14284.005.patch, > HDFS-14284.006.patch > > > The typical setup is to use multiple Routers through > ConfiguredFailoverProxyProvider. > In a regular HA Namenode setup, it is easy to know which NN was used. > However, in RBF, any Router can be the one reporting the exception and it is > hard to know which was the one. > We should have a way to identify which Router/Namenode was the one triggering > the exception. > This would also apply with Observer Namenodes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
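The point about RemoteException versus the actual exception can be illustrated with a toy wrapper. This is a sketch that only models the behaviour of Hadoop's RPC-layer wrapper (which carries the wrapped exception's class name and can re-create it on unwrap); it is not the real org.apache.hadoop.ipc.RemoteException:

```java
public class UnwrapSketch {
  /** Toy stand-in for an RPC-layer exception wrapper. */
  public static class RemoteishException extends RuntimeException {
    private final String wrappedClassName;

    public RemoteishException(String wrappedClassName, String message) {
      // The wrapper only carries the original type as a string in the message.
      super(wrappedClassName + ": " + message);
      this.wrappedClassName = wrappedClassName;
    }

    /** Re-create the original exception type when it is one we expect. */
    public Exception unwrap() {
      if (wrappedClassName.equals(IllegalArgumentException.class.getName())) {
        return new IllegalArgumentException(getMessage());
      }
      return this; // unknown type: fall back to the wrapper itself
    }
  }
}
```

With only the wrapper, a caller (or a test) has to string-match on the message; after unwrapping, it can use an `instanceof` check or a typed catch clause, which is what a FileSystem-level `mkdirs` call gives you compared to raw ClientProtocol.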
[jira] [Commented] (HDFS-14854) Create improved decommission monitor implementation
[ https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951217#comment-16951217 ] Hadoop QA commented on HDFS-14854: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 50s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 18s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 42s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 11 new + 462 unchanged - 5 fixed = 473 total (was 467) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 8s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}106m 4s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}167m 35s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.tools.TestDFSZKFailoverController | | | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.3 Server=19.03.3 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-14854 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12982972/HDFS-14854.008.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux c8b148555e6a 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 5cc7873 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28086/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | unit | https://builds.apache.org/job/PreCommi
[jira] [Updated] (HDDS-2181) Ozone Manager should send correct ACL type in ACL requests to Authorizer
[ https://issues.apache.org/jira/browse/HDDS-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vivek Ratnavel Subramanian updated HDDS-2181: - Status: Patch Available (was: In Progress) > Ozone Manager should send correct ACL type in ACL requests to Authorizer > > > Key: HDDS-2181 > URL: https://issues.apache.org/jira/browse/HDDS-2181 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.4.1 >Reporter: Vivek Ratnavel Subramanian >Assignee: Vivek Ratnavel Subramanian >Priority: Major > Labels: pull-request-available > Time Spent: 10h 50m > Remaining Estimate: 0h > > Currently, Ozone Manager sends "WRITE" as ACLType for key create, key delete > and bucket create operations. Fix the ACL type in all requests to the > authorizer. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2255) Improve Acl Handler Messages
[ https://issues.apache.org/jira/browse/HDDS-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YiSheng Lien reassigned HDDS-2255: -- Assignee: YiSheng Lien > Improve Acl Handler Messages > > > Key: HDDS-2255 > URL: https://issues.apache.org/jira/browse/HDDS-2255 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: om >Reporter: Hanisha Koneru >Assignee: YiSheng Lien >Priority: Minor > Labels: newbie > > In Add/Remove/Set Acl Key/Bucket/Volume Handlers, we print a message about > whether the operation was successful or not. If we are trying to add an ACL > that already exists, we convey the message that the operation failed. > It would be better if the message conveyed more clearly why the operation > failed, i.e. that the ACL already exists. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1868) Ozone pipelines should be marked as ready only after the leader election is complete
[ https://issues.apache.org/jira/browse/HDDS-1868?focusedWorklogId=328006&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328006 ] ASF GitHub Bot logged work on HDDS-1868: Author: ASF GitHub Bot Created on: 14/Oct/19 17:37 Start Date: 14/Oct/19 17:37 Worklog Time Spent: 10m Work Description: swagle commented on pull request #23: HDDS-1868. Ozone pipelines should be marked as ready only after the leader election is complete. URL: https://github.com/apache/hadoop-ozone/pull/23 Ozone pipelines, on create and restart, start in the allocated state. They are moved into the open state after all pipeline reports for them have been received. However, this can potentially lead to an issue where the pipeline is still not ready to accept any incoming IO operations. The pipelines should be marked as ready only after the leader election is complete and the leader is ready to accept incoming IO. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 328006) Time Spent: 3.5h (was: 3h 20m) > Ozone pipelines should be marked as ready only after the leader election is > complete > > > Key: HDDS-1868 > URL: https://issues.apache.org/jira/browse/HDDS-1868 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode, SCM >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Attachments: HDDS-1868.01.patch, HDDS-1868.02.patch, > HDDS-1868.03.patch, HDDS-1868.04.patch, HDDS-1868.05.patch, HDDS-1868.06.patch > > Time Spent: 3.5h > Remaining Estimate: 0h > > Ozone pipelines, on create and restart, start in the allocated state. They are > moved into the open state after all pipeline reports for them have been received. 
However, > this can potentially lead to an issue where the pipeline is still not ready > to accept any incoming IO operations. > The pipelines should be marked as ready only after the leader election is > complete and the leader is ready to accept incoming IO. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1868) Ozone pipelines should be marked as ready only after the leader election is complete
[ https://issues.apache.org/jira/browse/HDDS-1868?focusedWorklogId=328004&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328004 ] ASF GitHub Bot logged work on HDDS-1868: Author: ASF GitHub Bot Created on: 14/Oct/19 17:35 Start Date: 14/Oct/19 17:35 Worklog Time Spent: 10m Work Description: swagle commented on pull request #22: HDDS-1868. Ozone pipelines should be marked as ready only after the leader election is complete. URL: https://github.com/apache/hadoop-ozone/pull/22 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 328004) Time Spent: 3h 20m (was: 3h 10m) > Ozone pipelines should be marked as ready only after the leader election is > complete > > > Key: HDDS-1868 > URL: https://issues.apache.org/jira/browse/HDDS-1868 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode, SCM >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Attachments: HDDS-1868.01.patch, HDDS-1868.02.patch, > HDDS-1868.03.patch, HDDS-1868.04.patch, HDDS-1868.05.patch, HDDS-1868.06.patch > > Time Spent: 3h 20m > Remaining Estimate: 0h > > Ozone pipelines, on create and restart, start in the allocated state. They are > moved into the open state after all pipeline reports for them have been received. However, > this can potentially lead to an issue where the pipeline is still not ready > to accept any incoming IO operations. > The pipelines should be marked as ready only after the leader election is > complete and the leader is ready to accept incoming IO. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1868) Ozone pipelines should be marked as ready only after the leader election is complete
[ https://issues.apache.org/jira/browse/HDDS-1868?focusedWorklogId=327996&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327996 ] ASF GitHub Bot logged work on HDDS-1868: Author: ASF GitHub Bot Created on: 14/Oct/19 17:34 Start Date: 14/Oct/19 17:34 Worklog Time Spent: 10m Work Description: swagle commented on pull request #22: HDDS-1868. Ozone pipelines should be marked as ready only after the leader election is complete. URL: https://github.com/apache/hadoop-ozone/pull/22 Ozone pipelines, on create and restart, start in the allocated state. They are moved into the open state after all pipeline reports for them have been received. However, this can potentially lead to an issue where the pipeline is still not ready to accept any incoming IO operations. The pipelines should be marked as ready only after the leader election is complete and the leader is ready to accept incoming IO. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327996) Time Spent: 3h 10m (was: 3h) > Ozone pipelines should be marked as ready only after the leader election is > complete > > > Key: HDDS-1868 > URL: https://issues.apache.org/jira/browse/HDDS-1868 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode, SCM >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Attachments: HDDS-1868.01.patch, HDDS-1868.02.patch, > HDDS-1868.03.patch, HDDS-1868.04.patch, HDDS-1868.05.patch, HDDS-1868.06.patch > > Time Spent: 3h 10m > Remaining Estimate: 0h > > Ozone pipelines, on create and restart, start in the allocated state. They are > moved into the open state after all pipeline reports for them have been received. 
However, > this can potentially lead to an issue where the pipeline is still not ready > to accept any incoming IO operations. > The pipelines should be marked as ready only after the leader election is > complete and the leader is ready to accept incoming IO. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2301) Write path: Reducing read contention in rocksDB
[ https://issues.apache.org/jira/browse/HDDS-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar reassigned HDDS-2301: - Assignee: Nanda kumar > Write path: Reducing read contention in rocksDB > --- > > Key: HDDS-2301 > URL: https://issues.apache.org/jira/browse/HDDS-2301 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Rajesh Balamohan >Assignee: Nanda kumar >Priority: Major > Attachments: om_write_profile.png > > > Benchmark: > > A simple benchmark that creates 100s and 1000s of keys (empty directories) in > OM. This is done in a tight loop with multiple threads from the client side to add > enough load on CPU. Note that the intention is to understand the bottlenecks in > OM (intentionally avoiding interactions with SCM & DN). > Observation: > - > During the write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This > internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every > write operation. This turns out to be expensive and chokes the write path. > [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155] > [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63] > In most cases, the directory being created would be a fresh entry. In such cases, > it would be good to try {{RocksDB::keyMayExist}}. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
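The suggestion above — skip the full RocksDB read when the key is almost certainly absent — leans on the fact that keyMayExist can answer "definitely not present" from in-memory structures (memtable and bloom filters) without touching disk. Below is a minimal, hypothetical sketch of that idea in plain Java; it is NOT the RocksDB API, just a bloom-filter stand-in showing why the fresh-directory case gets cheap.

```java
import java.util.BitSet;

public class KeyMayExistSketch {
    private static final int SIZE = 1 << 16;
    private final BitSet bits = new BitSet(SIZE);

    // Two cheap derived hash positions (sketch-quality, not production).
    private int[] hashes(String key) {
        int h = key.hashCode();
        return new int[] { Math.floorMod(h, SIZE), Math.floorMod(h * 31 + 17, SIZE) };
    }

    public void put(String key) {
        for (int pos : hashes(key)) {
            bits.set(pos);
        }
    }

    // false => key is definitely not in the table: no disk read needed.
    // true  => key may exist: fall back to a real get().
    public boolean keyMayExist(String key) {
        for (int pos : hashes(key)) {
            if (!bits.get(pos)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        KeyMayExistSketch table = new KeyMayExistSketch();
        table.put("/vol/bucket/existingDir");
        System.out.println(table.keyMayExist("/vol/bucket/existingDir")); // prints "true"
        // A name never inserted is usually rejected here without any I/O.
        System.out.println(table.keyMayExist("/vol/bucket/freshDir"));
    }
}
```

Note the asymmetry: a `false` answer is exact, while a `true` answer can be a false positive, which is why a real implementation still issues the full lookup on `true`.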
[jira] [Created] (HDDS-2305) Update Ozone to later ratis snapshot.
Mukul Kumar Singh created HDDS-2305: --- Summary: Update Ozone to later ratis snapshot. Key: HDDS-2305 URL: https://issues.apache.org/jira/browse/HDDS-2305 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Mukul Kumar Singh This jira will update Ozone to the latest Ratis snapshot, corresponding to the commit: {code} commit 3f446aaf27704b0bf929bd39887637a6a71b4418 (HEAD -> master, origin/master, origin/HEAD) Author: Tsz Wo Nicholas Sze Date: Fri Oct 11 16:35:38 2019 +0800 RATIS-705. GrpcClientProtocolClient#close Interrupts itself. Contributed by Lokesh Jain {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2304) ozone token cli output can be improved.
Xiaoyu Yao created HDDS-2304: Summary: ozone token cli output can be improved. Key: HDDS-2304 URL: https://issues.apache.org/jira/browse/HDDS-2304 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Some output does not end with a newline. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14856) Add ability to import file ACLs from remote store
[ https://issues.apache.org/jira/browse/HDFS-14856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951157#comment-16951157 ] Hudson commented on HDFS-14856: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17533 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17533/]) HDFS-14856. Fetch file ACLs while mounting external store. (#1478) (virajith: rev fabd41fa480303f86bfe7b6ae0277bc0b6015f80) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * (edit) hadoop-tools/hadoop-fs2img/src/test/java/org/apache/hadoop/hdfs/server/namenode/RandomTreeWalk.java * (edit) hadoop-tools/hadoop-fs2img/src/main/java/org/apache/hadoop/hdfs/server/namenode/UGIResolver.java * (edit) hadoop-tools/hadoop-fs2img/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSingleUGIResolver.java * (edit) hadoop-tools/hadoop-fs2img/src/main/java/org/apache/hadoop/hdfs/server/namenode/TreePath.java * (edit) hadoop-tools/hadoop-fs2img/src/main/java/org/apache/hadoop/hdfs/server/namenode/TreeWalk.java * (add) hadoop-tools/hadoop-fs2img/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSTreeWalk.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * (edit) hadoop-tools/hadoop-fs2img/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSTreeWalk.java * (edit) hadoop-tools/hadoop-fs2img/src/main/java/org/apache/hadoop/hdfs/server/namenode/SingleUGIResolver.java > Add ability to import file ACLs from remote store > - > > Key: HDFS-14856 > URL: https://issues.apache.org/jira/browse/HDFS-14856 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ashvin Agrawal >Assignee: Ashvin Agrawal >Priority: Major > > Provided storage (HDFS-9806) allows data on external storage systems to > seamlessly appear as files on HDFS. However, in the implementation today, the > external store scanner, {{FsTreeWalk,}} ignores any ACLs on the data. 
In a > secure HDFS setup where the external storage system and HDFS belong to the same > security domain, uniform enforcement of the authorization policies may be > desired. This task aims to extend the ability of the external store scanner > to support this use case. When configured, the scanner should attempt to > fetch ACLs and provide them to the consumer. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-2303) [IGNORE] Test Jira
[ https://issues.apache.org/jira/browse/HDDS-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951153#comment-16951153 ] Arpit Agarwal edited comment on HDDS-2303 at 10/14/19 4:57 PM: --- Test comment (edited). was (Author: arpitagarwal): Test comment. > [IGNORE] Test Jira > -- > > Key: HDDS-2303 > URL: https://issues.apache.org/jira/browse/HDDS-2303 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Arpit Agarwal >Priority: Major > > Ignore this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2303) [IGNORE] Test Jira
[ https://issues.apache.org/jira/browse/HDDS-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951153#comment-16951153 ] Arpit Agarwal commented on HDDS-2303: - Test comment. > [IGNORE] Test Jira > -- > > Key: HDDS-2303 > URL: https://issues.apache.org/jira/browse/HDDS-2303 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Arpit Agarwal >Priority: Major > > Ignore this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2303) [IGNORE] Test Jira
Arpit Agarwal created HDDS-2303: --- Summary: [IGNORE] Test Jira Key: HDDS-2303 URL: https://issues.apache.org/jira/browse/HDDS-2303 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Arpit Agarwal Ignore this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14856) Add ability to import file ACLs from remote store
[ https://issues.apache.org/jira/browse/HDFS-14856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-14856: -- Resolution: Fixed Status: Resolved (was: Patch Available) > Add ability to import file ACLs from remote store > - > > Key: HDFS-14856 > URL: https://issues.apache.org/jira/browse/HDFS-14856 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ashvin Agrawal >Assignee: Ashvin Agrawal >Priority: Major > > Provided storage (HDFS-9806) allows data on external storage systems to > seamlessly appear as files on HDFS. However, in the implementation today, the > external store scanner, {{FsTreeWalk}}, ignores any ACLs on the data. In a > secure HDFS setup where the external storage system and HDFS belong to the same > security domain, uniform enforcement of the authorization policies may be > desired. This task aims to extend the ability of the external store scanner > to support this use case. When configured, the scanner should attempt to > fetch ACLs and provide them to the consumer. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2288) Delete hadoop-ozone and hadoop-hdds subprojects from apache trunk
[ https://issues.apache.org/jira/browse/HDDS-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marton Elek reassigned HDDS-2288: - Assignee: (was: Marton Elek) > Delete hadoop-ozone and hadoop-hdds subprojects from apache trunk > - > > Key: HDDS-2288 > URL: https://issues.apache.org/jira/browse/HDDS-2288 > Project: Hadoop Distributed Data Store > Issue Type: Task >Reporter: Marton Elek >Priority: Major > > As described in the HDDS-2287 ozone/hdds sources are moving to the > apache/hadoop-ozone git repository. > All the remaining ozone/hdds files can be removed from trunk (including hdds > profile in main pom.xml) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14854) Create improved decommission monitor implementation
[ https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951070#comment-16951070 ] Stephen O'Donnell commented on HDFS-14854: -- I have just uploaded another patch. This uses ReflectionUtils to create the monitor instance and adds DatanodeAdminMonitorBase which the two monitor classes extend to allow some code to be shared. I still want to make as few changes as possible to the original decommission class, so I have not gone to a great effort to avoid any duplicated code between the two monitors, but I have pulled some obvious stuff into the Base class. I also have a long term view that the "DefaultMonitor" should be deprecated if we get the BackoffMonitor tried, tested and proven to be an improvement. I think the 008 patch addresses all the review comments so far. > Create improved decommission monitor implementation > --- > > Key: HDFS-14854 > URL: https://issues.apache.org/jira/browse/HDFS-14854 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Attachments: Decommission_Monitor_V2_001.pdf, HDFS-14854.001.patch, > HDFS-14854.002.patch, HDFS-14854.003.patch, HDFS-14854.004.patch, > HDFS-14854.005.patch, HDFS-14854.006.patch, HDFS-14854.007.patch, > HDFS-14854.008.patch > > > In HDFS-13157, we discovered a series of problems with the current > decommission monitor implementation, such as: > * Blocks are replicated sequentially disk by disk and node by node, and > hence the load is not spread well across the cluster > * Adding a node for decommission can cause the namenode write lock to be > held for a long time. > * Decommissioning nodes floods the replication queue and under replicated > blocks from a future node or disk failure may wait for a long time before they > are replicated. 
> * Blocks pending replication are checked many times under a write lock > before they are sufficiently replicated, wasting resources > In this Jira I propose to create a new implementation of the decommission > monitor that resolves these issues. As it will be difficult to prove one > implementation is better than another, the new implementation can be enabled > or disabled, giving the option of the existing implementation or the new one. > I will attach a pdf with some more details on the design and then a version 1 > patch shortly. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
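The ReflectionUtils approach mentioned in the comment above — picking the monitor implementation from a configured class name so the existing and new monitors can be swapped without code changes — can be sketched in plain Java. All class names here are simplified stand-ins, not the actual HDFS-14854 classes:

```java
public class MonitorFactory {
    public interface DatanodeAdminMonitor {
        String name();
    }

    public static class DefaultMonitor implements DatanodeAdminMonitor {
        public String name() { return "default"; }
    }

    public static class BackoffMonitor implements DatanodeAdminMonitor {
        public String name() { return "backoff"; }
    }

    // Instantiate whichever implementation the configuration names,
    // via its no-argument constructor.
    public static DatanodeAdminMonitor create(String className) throws Exception {
        Class<?> clazz = Class.forName(className);
        return (DatanodeAdminMonitor) clazz.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // In HDFS this string would come from a configuration key; hard-coded here.
        String configured = BackoffMonitor.class.getName();
        DatanodeAdminMonitor monitor = create(configured);
        System.out.println(monitor.name()); // prints "backoff"
    }
}
```

The design benefit is exactly what the comment describes: the default monitor stays untouched, and deprecating it later is a configuration default change rather than a code rewrite.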
[jira] [Updated] (HDFS-14854) Create improved decommission monitor implementation
[ https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen O'Donnell updated HDFS-14854: - Attachment: HDFS-14854.008.patch > Create improved decommission monitor implementation > --- > > Key: HDFS-14854 > URL: https://issues.apache.org/jira/browse/HDFS-14854 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Attachments: Decommission_Monitor_V2_001.pdf, HDFS-14854.001.patch, > HDFS-14854.002.patch, HDFS-14854.003.patch, HDFS-14854.004.patch, > HDFS-14854.005.patch, HDFS-14854.006.patch, HDFS-14854.007.patch, > HDFS-14854.008.patch > > > In HDFS-13157, we discovered a series of problems with the current > decommission monitor implementation, such as: > * Blocks are replicated sequentially disk by disk and node by node, and > hence the load is not spread well across the cluster > * Adding a node for decommission can cause the namenode write lock to be > held for a long time. > * Decommissioning nodes floods the replication queue and under replicated > blocks from a future node or disk failure may wait for a long time before they > are replicated. > * Blocks pending replication are checked many times under a write lock > before they are sufficiently replicated, wasting resources > In this Jira I propose to create a new implementation of the decommission > monitor that resolves these issues. As it will be difficult to prove one > implementation is better than another, the new implementation can be enabled > or disabled, giving the option of the existing implementation or the new one. > I will attach a pdf with some more details on the design and then a version 1 > patch shortly. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2290) Rename pom.ozone.xml to pom.xml
[ https://issues.apache.org/jira/browse/HDDS-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar resolved HDDS-2290. --- Fix Version/s: 0.5.0 Resolution: Fixed > Rename pom.ozone.xml to pom.xml > --- > > Key: HDDS-2290 > URL: https://issues.apache.org/jira/browse/HDDS-2290 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Since we have a separate git repository for Ozone now, we should rename > {{pom.ozone.xml}} to {{pom.xml}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1569) Add ability to SCM for creating multiple pipelines with same datanode
[ https://issues.apache.org/jira/browse/HDDS-1569?focusedWorklogId=327829&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327829 ] ASF GitHub Bot logged work on HDDS-1569: Author: ASF GitHub Bot Created on: 14/Oct/19 14:09 Start Date: 14/Oct/19 14:09 Worklog Time Spent: 10m Work Description: elek commented on issue #1431: HDDS-1569 Support creating multiple pipelines with same datanode URL: https://github.com/apache/hadoop/pull/1431#issuecomment-541700703 > CI build failed on new PR: apache/hadoop-ozone#13. Could you please take a look? Sorry, this is my fault. One commit is missing from all of the created PR branches (the one that restores the README.txt). 1. You can rebase to the latest master (the safest choice) 2. Or you can locally create an empty README.txt as a workaround 3. I also modified the CI script to handle all of these branches, so it should work from now on. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327829) Time Spent: 7h 50m (was: 7h 40m) > Add ability to SCM for creating multiple pipelines with same datanode > - > > Key: HDDS-1569 > URL: https://issues.apache.org/jira/browse/HDDS-1569 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Siddharth Wagle >Assignee: Li Cheng >Priority: Major > Labels: pull-request-available > Time Spent: 7h 50m > Remaining Estimate: 0h > > - Refactor _RatisPipelineProvider.create()_ to be able to create pipelines > with datanodes that are not a part of sufficient pipelines > - Define soft and hard upper bounds for pipeline membership > - Create SCMAllocationManager that can be leveraged to get a candidate set of > datanodes based on placement policies > - Add the datanodes to internal datastructures -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2302) Manage common pom versions in one common place
[ https://issues.apache.org/jira/browse/HDDS-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2302: - Labels: pull-request-available (was: ) > Manage common pom versions in one common place > -- > > Key: HDDS-2302 > URL: https://issues.apache.org/jira/browse/HDDS-2302 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: build >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > > Some of the versions (eg. ozone.version, hdds.version, ratis.version) are > required for both ozone and hdds subprojects. As we have a common pom.xml it > can be safer to manage them in one common place at the root pom.xml instead > of managing them multiple times. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2302) Manage common pom versions in one common place
[ https://issues.apache.org/jira/browse/HDDS-2302?focusedWorklogId=327812&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327812 ] ASF GitHub Bot logged work on HDDS-2302: Author: ASF GitHub Bot Created on: 14/Oct/19 13:48 Start Date: 14/Oct/19 13:48 Worklog Time Spent: 10m Work Description: elek commented on pull request #21: HDDS-2302. Manage common pom versions in one common place URL: https://github.com/apache/hadoop-ozone/pull/21 ## What changes were proposed in this pull request? Some of the versions (eg. ozone.version, hdds.version, ratis.version) are required for both ozone and hdds subprojects. As we have a common pom.xml it can be safer to manage them in one common place at the root pom.xml instead of managing them multiple times. I would move some of the properties to the root pom.xml to make it easier to manage/change them. (For example change the ratis.version at only one place) ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2302 ## How this patch can be tested? Do a normal build. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327812) Remaining Estimate: 0h Time Spent: 10m > Manage common pom versions in one common place > -- > > Key: HDDS-2302 > URL: https://issues.apache.org/jira/browse/HDDS-2302 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: build >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Some of the versions (eg. ozone.version, hdds.version, ratis.version) are > required for both ozone and hdds subprojects. 
As we have a common pom.xml it > can be safer to manage them in one common place at the root pom.xml instead > of managing them multiple times. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
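A minimal sketch of what the proposed root-level version management could look like; the property names come from the issue text, while the version values and the child dependency shown are illustrative:

```xml
<!-- root pom.xml: each shared version is declared exactly once -->
<properties>
  <hdds.version>0.5.0-SNAPSHOT</hdds.version>
  <ozone.version>0.5.0-SNAPSHOT</ozone.version>
  <ratis.version>0.5.0-SNAPSHOT</ratis.version> <!-- illustrative values -->
</properties>

<!-- a child pom (hdds or ozone subproject) then only references the property -->
<dependency>
  <groupId>org.apache.ratis</groupId>
  <artifactId>ratis-server</artifactId>
  <version>${ratis.version}</version>
</dependency>
```

Bumping ratis.version then happens in a single place instead of once per subproject pom.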
[jira] [Updated] (HDDS-2301) Write path: Reducing read contention in rocksDB
[ https://issues.apache.org/jira/browse/HDDS-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HDDS-2301: --- Description: Benchmark: Simple benchmark which creates 100 and 1000s of keys (empty directory) in OM. This is done in a tight loop and multiple threads from client side to add enough load on CPU. Note that intention is to understand the bottlenecks in OM (intentionally avoiding interactions with SCM & DN). Observation: - During write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every write operation. This turns out to be expensive and chokes the write path. [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155] [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63] In most of the cases, directory creation would be fresh entry. In such cases, it would be good to try with {{RocksDB::keyMayExist.}} was: Benchmark: Simple benchmark which creates 100 and 1000s of keys (empty directory) in OM. This is done in a tight loop and multiple threads from client side to add enough load on CPU. Note that intention is to understand the bottlenecks in OM (intentionally avoiding interactions with SCM & DN). Observation: - During write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every write operation. This turns out to be expensive and chokes the entire read path. 
[https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155] [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63] In most of the cases, directory creation would be fresh entry. In such cases, it would be good to try with {{RocksDB::keyMayExist.}} > Write path: Reducing read contention in rocksDB > --- > > Key: HDDS-2301 > URL: https://issues.apache.org/jira/browse/HDDS-2301 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Rajesh Balamohan >Priority: Major > Attachments: om_write_profile.png > > > Benchmark: > > Simple benchmark which creates 100 and 1000s of keys (empty directory) in > OM. This is done in a tight loop and multiple threads from client side to add > enough load on CPU. Note that intention is to understand the bottlenecks in > OM (intentionally avoiding interactions with SCM & DN). > Observation: > - > During write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This > internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every > write operation. This turns out to be expensive and chokes the write path. > [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155] > [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63] > In most of the cases, directory creation would be fresh entry. In such cases, > it would be good to try with {{RocksDB::keyMayExist.}} > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2302) Manage common pom versions in one common place
Marton Elek created HDDS-2302: - Summary: Manage common pom versions in one common place Key: HDDS-2302 URL: https://issues.apache.org/jira/browse/HDDS-2302 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: build Reporter: Marton Elek Assignee: Marton Elek Some of the versions (eg. ozone.version, hdds.version, ratis.version) are required for both ozone and hdds subprojects. As we have a common pom.xml it can be safer to manage them in one common place at the root pom.xml instead of managing them multiple times. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2301) Write path: Reducing read contention in rocksDB
[ https://issues.apache.org/jira/browse/HDDS-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HDDS-2301: --- Attachment: (was: om_write_profile.png) > Write path: Reducing read contention in rocksDB > --- > > Key: HDDS-2301 > URL: https://issues.apache.org/jira/browse/HDDS-2301 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Rajesh Balamohan >Priority: Major > Attachments: om_write_profile.png > > > Benchmark: > > Simple benchmark which creates 100 and 1000s of keys (empty directory) in > OM. This is done in a tight loop and multiple threads from client side to add > enough load on CPU. Note that intention is to understand the bottlenecks in > OM (intentionally avoiding interactions with SCM & DN). > Observation: > - > During write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This > internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every > write operation. This turns out to be expensive and chokes the entire read > path. > [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155] > [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63] > In most of the cases, directory creation would be fresh entry. In such cases, > it would be good to try with {{RocksDB::keyMayExist.}} > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2301) Write path: Reducing read contention in rocksDB
[ https://issues.apache.org/jira/browse/HDDS-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HDDS-2301: --- Description: Benchmark: Simple benchmark which creates 100 and 1000s of keys (empty directory) in OM. This is done in a tight loop and multiple threads from client side to add enough load on CPU. Note that intention is to understand the bottlenecks in OM (intentionally avoiding interactions with SCM & DN). Observation: - During write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every write operation. This turns out to be expensive and chokes the entire read path. [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155] [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63] In most of the cases, directory creation would be fresh entry. In such cases, it would be good to try with {{RocksDB::keyMayExist.}} was: Benchmark: Simple benchmark which creates 100 and 1000s of keys (empty directory) in OM. This is done in a tight loop and multiple threads from client side to add enough load on CPU. Note that intention is to understand the bottlenecks in OM (intentionally avoiding interactions with SCM & DN). Observation: - During write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every write operation. This turns out to be expensive and chokes the entire read path. 
https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155 https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63 In most of the cases, directory creation would be fresh entry. In such cases, it would be good to try with {{RocksDB::keyMayExist}} > Write path: Reducing read contention in rocksDB > --- > > Key: HDDS-2301 > URL: https://issues.apache.org/jira/browse/HDDS-2301 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Rajesh Balamohan >Priority: Major > Attachments: om_write_profile.png, om_write_profile.png > > > Benchmark: > > Simple benchmark which creates 100 and 1000s of keys (empty directory) in > OM. This is done in a tight loop and multiple threads from client side to add > enough load on CPU. Note that intention is to understand the bottlenecks in > OM (intentionally avoiding interactions with SCM & DN). > Observation: > - > During write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This > internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every > write operation. This turns out to be expensive and chokes the entire read > path. > [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155] > [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63] > In most of the cases, directory creation would be fresh entry. In such cases, > it would be good to try with {{RocksDB::keyMayExist.}} > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2301) Write path: Reducing read contention in rocksDB
[ https://issues.apache.org/jira/browse/HDDS-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HDDS-2301: --- Attachment: om_write_profile.jpg > Write path: Reducing read contention in rocksDB > --- > > Key: HDDS-2301 > URL: https://issues.apache.org/jira/browse/HDDS-2301 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Rajesh Balamohan >Priority: Major > Attachments: om_write_profile.png, om_write_profile.png > > > Benchmark: > > Simple benchmark which creates 100 and 1000s of keys (empty directory) in OM. > This is done in a tight loop and multiple threads from client side to add > enough load on CPU. Note that intention is to understand the bottlenecks in > OM (intentionally avoiding interactions with SCM & DN). > Observation: > - > During write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This > internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every > write operation. This turns out to be expensive and chokes the entire read > path. > https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155 > https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63 > In most of the cases, directory creation would be fresh entry. In such cases, > it would be good to try with {{RocksDB::keyMayExist}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2301) Write path: Reducing read contention in rocksDB
[ https://issues.apache.org/jira/browse/HDDS-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HDDS-2301: --- Attachment: om_write_profile.png > Write path: Reducing read contention in rocksDB > --- > > Key: HDDS-2301 > URL: https://issues.apache.org/jira/browse/HDDS-2301 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Rajesh Balamohan >Priority: Major > Attachments: om_write_profile.png, om_write_profile.png > > > Benchmark: > > Simple benchmark which creates 100 and 1000s of keys (empty directory) in OM. > This is done in a tight loop and multiple threads from client side to add > enough load on CPU. Note that intention is to understand the bottlenecks in > OM (intentionally avoiding interactions with SCM & DN). > Observation: > - > During write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This > internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every > write operation. This turns out to be expensive and chokes the entire read > path. > https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155 > https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63 > In most of the cases, directory creation would be fresh entry. In such cases, > it would be good to try with {{RocksDB::keyMayExist}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2301) Write path: Reducing read contention in rocksDB
[ https://issues.apache.org/jira/browse/HDDS-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HDDS-2301: --- Attachment: (was: om_write_profile.jpg) > Write path: Reducing read contention in rocksDB > --- > > Key: HDDS-2301 > URL: https://issues.apache.org/jira/browse/HDDS-2301 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Rajesh Balamohan >Priority: Major > Attachments: om_write_profile.png, om_write_profile.png > > > Benchmark: > > Simple benchmark which creates 100 and 1000s of keys (empty directory) in OM. > This is done in a tight loop and multiple threads from client side to add > enough load on CPU. Note that intention is to understand the bottlenecks in > OM (intentionally avoiding interactions with SCM & DN). > Observation: > - > During write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This > internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every > write operation. This turns out to be expensive and chokes the entire read > path. > https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155 > https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63 > In most of the cases, directory creation would be fresh entry. In such cases, > it would be good to try with {{RocksDB::keyMayExist}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2301) Write path: Reducing read contention in rocksDB
[ https://issues.apache.org/jira/browse/HDDS-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HDDS-2301: --- Attachment: om_write_profile.png > Write path: Reducing read contention in rocksDB > --- > > Key: HDDS-2301 > URL: https://issues.apache.org/jira/browse/HDDS-2301 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Rajesh Balamohan >Priority: Major > Attachments: om_write_profile.png > > > Benchmark: > > Simple benchmark which creates 100 and 1000s of keys (empty directory) in OM. > This is done in a tight loop and multiple threads from client side to add > enough load on CPU. Note that intention is to understand the bottlenecks in > OM (intentionally avoiding interactions with SCM & DN). > Observation: > - > During write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This > internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every > write operation. This turns out to be expensive and chokes the entire read > path. > https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155 > https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63 > In most of the cases, directory creation would be fresh entry. In such cases, > it would be good to try with {{RocksDB::keyMayExist}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2301) Write path: Reducing read contention in rocksDB
Rajesh Balamohan created HDDS-2301: -- Summary: Write path: Reducing read contention in rocksDB Key: HDDS-2301 URL: https://issues.apache.org/jira/browse/HDDS-2301 Project: Hadoop Distributed Data Store Issue Type: Bug Affects Versions: 0.5.0 Reporter: Rajesh Balamohan Attachments: om_write_profile.png Benchmark: Simple benchmark which creates 100 and 1000s of keys (empty directory) in OM. This is done in a tight loop and multiple threads from client side to add enough load on CPU. Note that intention is to understand the bottlenecks in OM (intentionally avoiding interactions with SCM & DN). Observation: - During write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every write operation. This turns out to be expensive and chokes the entire read path. https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155 https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63 In most of the cases, directory creation would be fresh entry. In such cases, it would be good to try with {{RocksDB::keyMayExist}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
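The keyMayExist idea above can be sketched without the RocksDB JNI binding (whose exact keyMayExist signature varies across RocksJava versions). In this self-contained, hypothetical sketch, KeyTable stands in for omMetadataManager.getKeyTable() and mayExist() plays the role of RocksDB::keyMayExist — a cheap check that may return false positives but never false negatives, so the expensive get() runs only on possible hits:

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: names here are hypothetical, not the OM API.
public class KeyMayExistSketch {
    static class KeyTable {
        private final Map<String, String> store = new HashMap<>();
        private final BitSet filter = new BitSet(1 << 16); // crude bloom-style filter
        int expensiveGets = 0; // counts full lookups, for demonstration

        void put(String key, String value) {
            store.put(key, value);
            filter.set(hash(key));
        }

        // Cheap check: false positives possible, false negatives impossible.
        boolean mayExist(String key) {
            return filter.get(hash(key));
        }

        // The "expensive" full read, i.e. keyTable.get(dbKeyName).
        String get(String key) {
            expensiveGets++;
            return store.get(key);
        }

        private int hash(String key) {
            return (key.hashCode() & 0x7fffffff) % (1 << 16);
        }
    }

    // verifyFilesInPath-style check: skip the full get() for fresh entries.
    static boolean existsInPath(KeyTable table, String dbKeyName) {
        if (!table.mayExist(dbKeyName)) {
            return false; // definitely absent: no expensive read needed
        }
        return table.get(dbKeyName) != null; // confirm the possible hit
    }

    public static void main(String[] args) {
        KeyTable table = new KeyTable();
        table.put("/vol/bucket/dir1", "info");
        System.out.println(existsInPath(table, "/vol/bucket/dir1")); // true
        System.out.println(existsInPath(table, "/vol/bucket/dir2")); // false
    }
}
```

Since freshly created directories are the common case, most writes would take the cheap negative path and never touch the key table.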
[jira] [Work logged] (HDDS-2196) Add CLI Commands and Protobuf messages to trigger decom states
[ https://issues.apache.org/jira/browse/HDDS-2196?focusedWorklogId=327775&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327775 ] ASF GitHub Bot logged work on HDDS-2196: Author: ASF GitHub Bot Created on: 14/Oct/19 12:02 Start Date: 14/Oct/19 12:02 Worklog Time Spent: 10m Work Description: sodonnel commented on pull request #17: HDDS-2196 Add CLI Commands and Protobuf messages to trigger decom states URL: https://github.com/apache/hadoop-ozone/pull/17 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327775) Time Spent: 1h 20m (was: 1h 10m) > Add CLI Commands and Protobuf messages to trigger decom states > -- > > Key: HDDS-2196 > URL: https://issues.apache.org/jira/browse/HDDS-2196 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM, SCM Client >Affects Versions: 0.5.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > To allow nodes to be decommissioned, recommissioned and put into maintenance, > we need a few commands. > These will be added to the existing "scm cli". 3 commands are proposed: > Decommission: > ozone scmcli dnadmin decommission hosta hostb hostc:port ... > Put nodes into maintenance: > ozone scmcli dnadmin maintenance hosta hostb hostc:port ... 
<-endHours> > Take nodes out of maintenance or halt decommission: > ozone scmcli dnadmin recommission hosta hostb hostc:port > These 3 commands will call 3 new protobuf messages and they will be part of > the "StorageContainerLocationProtocol": > * DecommissionNodesRequestProto > * RecommissionNodesRequestProto > * StartMaintenanceNodesRequestProto > In addition, a new class NodeDecommissionManager will be introduced that > will receive these commands and carry out the decommission steps. > In this patch NodeDecommissionManager is only a skeleton implementation to > receive the commands as this patch is mainly focused on getting the CLI > commands and protobuf messages in place. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2196) Add CLI Commands and Protobuf messages to trigger decom states
[ https://issues.apache.org/jira/browse/HDDS-2196?focusedWorklogId=327774&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327774 ] ASF GitHub Bot logged work on HDDS-2196: Author: ASF GitHub Bot Created on: 14/Oct/19 12:02 Start Date: 14/Oct/19 12:02 Worklog Time Spent: 10m Work Description: sodonnel commented on pull request #20: HDDS-2196 Add CLI Commands and Protobuf messages to trigger decom states URL: https://github.com/apache/hadoop-ozone/pull/20 This change provides 3 new CLI commands: ``` scmcli dnadmin decommission hostname1 hostname2 hostname3 scmcli dnadmin maintenance hostname1 hostname2 hostname3 < --end time from now to end maintenance in hours> scmcli dnadmin recommission hostname1 hostname2 hostname3 ``` To allow for cases where many DNs are on the same host, the hostname can also have a port appended, eg: ``` scmcli dnadmin decommission hostname1:5678 hostname1:6789 hostname1:7890 ``` These commands make use of 3 new protobuf messages, defined in StorageContainerLocationProtocol: ``` DecommissionNodesRequestProto + DecommissionNodesResponseProto RecommissionNodesRequestProto + RecommissionNodesResponseProto StartMaintenanceNodesRequestProto + StartMaintenanceNodesResponseProto ``` All 3 accept a list of strings (for hostnames) and the maintenance message also allows an int to specify the end time in hours. These 3 commands make a call to a new class NodeDecommissionManager which takes the list of hosts and validates them. If any host is invalid or not part of the cluster, the entire command is failed and the CLI will show an error. Assuming the validation passes OK, the list of nodes will be switched into DECOMMISSIONING, ENTERING_MAINTENANCE or back into IN_SERVICE. At this point in time, there is no decommission logic present, the nodes will simply remain in the interim state forever. The actual decommissioning logic will be added in a further Jira. This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327774) Time Spent: 1h 10m (was: 1h) > Add CLI Commands and Protobuf messages to trigger decom states > -- > > Key: HDDS-2196 > URL: https://issues.apache.org/jira/browse/HDDS-2196 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM, SCM Client >Affects Versions: 0.5.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > To allow nodes to be decommissioned, recommissioned and put into maintenance, > we need a few commands. > These will be added to the existing "scm cli". 3 commands are proposed: > Decommission: > ozone scmcli dnadmin decommission hosta hostb hostc:port ... > Put nodes into maintenance: > ozone scmcli dnadmin maintenance hosta hostb hostc:port ... <-endHours> > Take nodes out of maintenance or halt decommission: > ozone scmcli dnadmin recommission hosta hostb hostc:port > These 3 commands will call 3 new protobuf messages and they will be part of > the "StorageContainerLocationProtocol": > * DecommissionNodesRequestProto > * RecommissionNodesRequestProto > * StartMaintenanceNodesRequestProto > In addition, a new class NodeDecommissionManager will be introduced that > will receive these commands and carry out the decommission steps. > In this patch NodeDecommissionManager is only a skeleton implementation to > receive the commands as this patch is mainly focused on getting the CLI > commands and protobuf messages in place. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
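The host-list handling described in the PR (plain "host" or "host:port" entries, all-or-nothing validation against the cluster) can be sketched as follows; DecomHostValidator and its method names are hypothetical helpers, not the actual NodeDecommissionManager API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the CLI-side host validation described above.
public class DecomHostValidator {
    static final int INVALID_PORT = -1;

    // "hostc:9858" -> "hostc"; "hosta" -> "hosta"
    static String hostOf(String entry) {
        int idx = entry.lastIndexOf(':');
        return idx < 0 ? entry : entry.substring(0, idx);
    }

    // "hostc:9858" -> 9858; no (or malformed) port -> INVALID_PORT
    static int portOf(String entry) {
        int idx = entry.lastIndexOf(':');
        if (idx < 0) return INVALID_PORT;
        try {
            return Integer.parseInt(entry.substring(idx + 1));
        } catch (NumberFormatException e) {
            return INVALID_PORT;
        }
    }

    // Fail the entire command if any host is not part of the cluster.
    static List<String> validate(List<String> entries, Set<String> clusterHosts) {
        List<String> accepted = new ArrayList<>();
        for (String entry : entries) {
            String host = hostOf(entry);
            if (!clusterHosts.contains(host)) {
                throw new IllegalArgumentException("Unknown host: " + host);
            }
            accepted.add(entry);
        }
        return accepted;
    }

    public static void main(String[] args) {
        Set<String> cluster = new HashSet<>(Arrays.asList("hosta", "hostb", "hostc"));
        System.out.println(validate(Arrays.asList("hosta", "hostc:9858"), cluster));
    }
}
```

The port suffix covers the case the PR mentions where several datanodes share one physical host.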
[jira] [Updated] (HDDS-2299) BlockManager should allocate a block in excluded pipelines if none other left
[ https://issues.apache.org/jira/browse/HDDS-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2299: - Labels: pull-request-available (was: ) > BlockManager should allocate a block in excluded pipelines if none other left > - > > Key: HDDS-2299 > URL: https://issues.apache.org/jira/browse/HDDS-2299 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Labels: pull-request-available > > In SCM, BlockManager#allocateBlock does not allocate a block in the excluded > pipelines or datanodes if requested by the client. But there can be cases > where excluded pipelines and datanodes are the only ones left. In such a case > SCM should allocate a block in such pipelines and return to the client. The > client can choose to use or discard the block. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2299) BlockManager should allocate a block in excluded pipelines if none other left
[ https://issues.apache.org/jira/browse/HDDS-2299?focusedWorklogId=327748&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327748 ] ASF GitHub Bot logged work on HDDS-2299: Author: ASF GitHub Bot Created on: 14/Oct/19 11:21 Start Date: 14/Oct/19 11:21 Worklog Time Spent: 10m Work Description: lokeshj1703 commented on pull request #19: HDDS-2299. BlockManager should allocate a block in excluded pipelines if none other left URL: https://github.com/apache/hadoop-ozone/pull/19 In SCM, BlockManager#allocateBlock does not allocate a block in the excluded pipelines or datanodes if requested by the client. But there can be cases where excluded pipelines or datanodes are the only ones left. In such a case SCM should allocate a block in such pipelines and return to the client. The client can choose to use or discard the block. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327748) Remaining Estimate: 0h Time Spent: 10m > BlockManager should allocate a block in excluded pipelines if none other left > - > > Key: HDDS-2299 > URL: https://issues.apache.org/jira/browse/HDDS-2299 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > In SCM, BlockManager#allocateBlock does not allocate a block in the excluded > pipelines or datanodes if requested by the client. But there can be cases > where excluded pipelines and datanodes are the only ones left. In such a case > SCM should allocate a block in such pipelines and return to the client. The > client can choose to use or discard the block. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2299) BlockManager should allocate a block in excluded pipelines if none other left
[ https://issues.apache.org/jira/browse/HDDS-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lokesh Jain updated HDDS-2299: -- Description: In SCM, BlockManager#allocateBlock does not allocate a block in the excluded pipelines or datanodes if requested by the client. But there can be cases where excluded pipelines and datanodes are the only ones left. In such a case SCM should allocate a block in such pipelines and return to the client. The client can choose to use or discard the block. (was: In SCM, BlockManager#allocateBlock does not allocate a block in the excluded pipelines or datanodes if requested by the client. But there can be cases where excluded pipelines are the only pipelines left. In such a case SCM should allocate a block in such pipelines and return to the client. The client can choose to use or discard the block.) > BlockManager should allocate a block in excluded pipelines if none other left > - > > Key: HDDS-2299 > URL: https://issues.apache.org/jira/browse/HDDS-2299 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > > In SCM, BlockManager#allocateBlock does not allocate a block in the excluded > pipelines or datanodes if requested by the client. But there can be cases > where excluded pipelines and datanodes are the only ones left. In such a case > SCM should allocate a block in such pipelines and return to the client. The > client can choose to use or discard the block. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
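The requested fallback can be sketched generically; pickPipeline is a hypothetical helper illustrating the selection rule, not SCM's actual BlockManager#allocateBlock signature:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: prefer pipelines outside the client's exclude list, but fall back
// to excluded ones when nothing else is available, leaving the client free
// to use or discard the resulting block.
public class AllocateFallback {
    static <T> T pickPipeline(List<T> all, List<T> excluded) {
        List<T> preferred = new ArrayList<>(all);
        preferred.removeAll(excluded);
        if (!preferred.isEmpty()) {
            return preferred.get(0); // normal case: honour the exclusion
        }
        // Only excluded pipelines remain: allocate anyway rather than fail.
        return all.isEmpty() ? null : all.get(0);
    }
}
```

A real implementation would pick by placement policy rather than taking the first element; the point here is only the two-tier preference.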
[jira] [Commented] (HDDS-2300) Publish normalized Ratis metrics via the prometheus endpoint
[ https://issues.apache.org/jira/browse/HDDS-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16950896#comment-16950896 ] Marton Elek commented on HDDS-2300: --- This patch requires RATIS-702. The WIP state can be found in elek/hadoop-ozone HDDS-2300 branch. > Publish normalized Ratis metrics via the prometheus endpoint > > > Key: HDDS-2300 > URL: https://issues.apache.org/jira/browse/HDDS-2300 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > > Latest Ratis contains very good metrics about the status of the ratis ring. > After RATIS-702 it will be possible to adjust the reporter of the Dropwizard > based ratis metrics and export them directly to the /prom http endpoint (used > by ozone insight and ratis). > Unfortunately Dropwizard is very simple, there is no tag support. All of the > instance specific strings are part of the metric name. For example: > {code:java} > "ratis_grpc.log_appender.72caaf3a-fb1c-4da4-9cc0-a2ce21bb8e67@group" > + "-72caaf3a-fb1c-4da4-9cc0-a2ce21bb8e67" > + ".grpc_log_appender_follower_75fa730a-59f0-4547" > + "-bd68-216162c263eb_latency", {code} > In this patch I will use a simple method: during the export of the dropwizard > metrics based on the well known format of the ratis metrics, they are > converted to proper prometheus metrics where the instance information is > included as tags: > {code:java} > ratis_grpc.log_appender.grpc_log_appender_follower_latency{instance="72caaf3a-fb1c-4da4-9cc0-a2ce21bb8e67"} > {code} > With this approach we can: > 1. monitor easily all the Ratis pipelines with one simple query > 2. Use the metrics for ozone insight which will show health state of the > Ratis pipeline -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2300) Publish normalized Ratis metrics via the prometheus endpoint
Marton Elek created HDDS-2300: - Summary: Publish normalized Ratis metrics via the prometheus endpoint Key: HDDS-2300 URL: https://issues.apache.org/jira/browse/HDDS-2300 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Marton Elek Assignee: Marton Elek Latest Ratis contains very good metrics about the status of the Ratis ring. After RATIS-702 it will be possible to adjust the reporter of the Dropwizard-based Ratis metrics and export them directly to the /prom http endpoint (used by ozone insight and ratis). Unfortunately Dropwizard is very simple; there is no tag support. All of the instance-specific strings are part of the metric name. For example: {code:java} "ratis_grpc.log_appender.72caaf3a-fb1c-4da4-9cc0-a2ce21bb8e67@group" + "-72caaf3a-fb1c-4da4-9cc0-a2ce21bb8e67" + ".grpc_log_appender_follower_75fa730a-59f0-4547" + "-bd68-216162c263eb_latency", {code} In this patch I will use a simple method: during the export of the Dropwizard metrics, based on the well-known format of the Ratis metrics, they are converted to proper Prometheus metrics where the instance information is included as tags: {code:java} ratis_grpc.log_appender.grpc_log_appender_follower_latency{instance="72caaf3a-fb1c-4da4-9cc0-a2ce21bb8e67"} {code} With this approach we can: 1. Monitor all the Ratis pipelines easily with one simple query 2. Use the metrics for ozone insight, which will show the health state of the Ratis pipeline -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
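The normalization described above can be sketched as a small regex-based converter: find the instance UUID in the Dropwizard metric name, strip every UUID out of the name, and re-attach the first one as a Prometheus-style tag. This is an illustrative sketch under stated assumptions, not the actual HDDS-2300 implementation; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the idea in HDDS-2300: move the instance UUID
// out of a Ratis/Dropwizard metric name and into an {instance="..."} tag.
public class RatisMetricNormalizer {

    // Matches a UUID such as 72caaf3a-fb1c-4da4-9cc0-a2ce21bb8e67
    private static final String UUID_RE =
        "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}";
    private static final Pattern UUID_PATTERN = Pattern.compile(UUID_RE);

    public static String normalize(String dropwizardName) {
        Matcher m = UUID_PATTERN.matcher(dropwizardName);
        if (!m.find()) {
            return dropwizardName; // nothing instance-specific to strip
        }
        String instance = m.group(); // first UUID identifies the instance
        String name = dropwizardName
            // drop the ".<uuid>@group-<uuid>" segment from the name
            .replaceAll("\\.?" + UUID_RE + "@group-[0-9a-f-]+", "")
            // drop any remaining "_<uuid>" fragments (e.g. follower id)
            .replaceAll("_" + UUID_RE, "");
        return name + "{instance=\"" + instance + "\"}";
    }

    public static void main(String[] args) {
        String raw = "ratis_grpc.log_appender"
            + ".72caaf3a-fb1c-4da4-9cc0-a2ce21bb8e67@group"
            + "-72caaf3a-fb1c-4da4-9cc0-a2ce21bb8e67"
            + ".grpc_log_appender_follower_75fa730a-59f0-4547"
            + "-bd68-216162c263eb_latency";
        System.out.println(normalize(raw));
    }
}
```

Running the sketch on the example name from the issue yields the tagged form quoted above.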
[jira] [Work logged] (HDDS-2298) Fix maven warning about duplicated metrics-core jar
[ https://issues.apache.org/jira/browse/HDDS-2298?focusedWorklogId=327723&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327723 ] ASF GitHub Bot logged work on HDDS-2298: Author: ASF GitHub Bot Created on: 14/Oct/19 10:25 Start Date: 14/Oct/19 10:25 Worklog Time Spent: 10m Work Description: elek commented on pull request #18: HDDS-2298. Fix maven warning about duplicated metrics-core jar URL: https://github.com/apache/hadoop-ozone/pull/18 ## What changes were proposed in this pull request? Maven build of Ozone is starting with a warning: ``` [WARNING] [WARNING] Some problems were encountered while building the effective model for org.apache.hadoop:hadoop-ozone-tools:jar:0.5.0-SNAPSHOT [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must be unique: io.dropwizard.metrics:metrics-core:jar -> version 3.2.4 vs (?) @ line 94, column 17 [WARNING] [WARNING] It is highly recommended to fix these problems because they threaten the stability of your build. [WARNING] [WARNING] For this reason, future Maven versions might no longer support building such malformed projects. [WARNING] ``` It's better to avoid it. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2298 ## How this patch can be tested? ``` mvn clean install | head -n 20 ``` Without patch: there are warnings. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327723) Remaining Estimate: 0h Time Spent: 10m > Fix maven warning about duplicated metrics-core jar > --- > > Key: HDDS-2298 > URL: https://issues.apache.org/jira/browse/HDDS-2298 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: build >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Maven build of Ozone is starting with a warning: > {code:java} > [WARNING] > [WARNING] Some problems were encountered while building the effective model > for org.apache.hadoop:hadoop-ozone-tools:jar:0.5.0-SNAPSHOT > [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must > be unique: io.dropwizard.metrics:metrics-core:jar -> version 3.2.4 vs (?) @ > line 94, column 17 > [WARNING] > [WARNING] It is highly recommended to fix these problems because they > threaten the stability of your build. > [WARNING] > [WARNING] For this reason, future Maven versions might no longer support > building such malformed projects. > [WARNING] > {code} > It's better to avoid it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
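The warning above points at a duplicated `metrics-core` `<dependency>` entry in hadoop-ozone-tools' pom (line 94 per the message). A sketch of the usual fix, assuming the duplicate declaration is simply deleted so that a single entry remains:

```xml
<!-- Illustrative only: keep exactly one declaration of metrics-core in
     hadoop-ozone-tools/pom.xml and remove the second, identical entry
     that triggers the "must be unique" warning. -->
<dependency>
  <groupId>io.dropwizard.metrics</groupId>
  <artifactId>metrics-core</artifactId>
  <version>3.2.4</version>
</dependency>
```

Maven treats `(groupId, artifactId, type, classifier)` as the identity of a dependency, so two entries with that same identity are a model error even when the versions agree.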
[jira] [Updated] (HDDS-2298) Fix maven warning about duplicated metrics-core jar
[ https://issues.apache.org/jira/browse/HDDS-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2298: - Labels: pull-request-available (was: ) > Fix maven warning about duplicated metrics-core jar > --- > > Key: HDDS-2298 > URL: https://issues.apache.org/jira/browse/HDDS-2298 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: build >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > > Maven build of Ozone is starting with a warning: > {code:java} > [WARNING] > [WARNING] Some problems were encountered while building the effective model > for org.apache.hadoop:hadoop-ozone-tools:jar:0.5.0-SNAPSHOT > [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must > be unique: io.dropwizard.metrics:metrics-core:jar -> version 3.2.4 vs (?) @ > line 94, column 17 > [WARNING] > [WARNING] It is highly recommended to fix these problems because they > threaten the stability of your build. > [WARNING] > [WARNING] For this reason, future Maven versions might no longer support > building such malformed projects. > [WARNING] > {code} > It's better to avoid it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6524) Choosing datanode retries times considering with block replica number
[ https://issues.apache.org/jira/browse/HDFS-6524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16950878#comment-16950878 ] Surendra Singh Lilhore commented on HDFS-6524: -- -1 for this change. There is no relation between replication and failures. > Choosing datanode retries times considering with block replica number > -- > > Key: HDFS-6524 > URL: https://issues.apache.org/jira/browse/HDFS-6524 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 3.0.0-alpha1 >Reporter: Liang Xie >Assignee: Lisheng Sun >Priority: Minor > Labels: BB2015-05-TBR > Attachments: HDFS-6524.001.patch, HDFS-6524.002.patch, > HDFS-6524.003.patch, HDFS-6524.004.patch, HDFS-6524.005(2).patch, > HDFS-6524.005.patch, HDFS-6524.006.patch, HDFS-6524.007.patch, HDFS-6524.txt > > > Currently the chooseDataNode() does retry with the setting: > dfsClientConf.maxBlockAcquireFailures, which by default is 3 > (DFS_CLIENT_MAX_BLOCK_ACQUIRE_FAILURES_DEFAULT = 3), it would be better > having another option, block replication factor. One cluster with only two > block replica setting, or using Reed-solomon encoding solution with one > replica factor. It helps to reduce the long tail latency. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6524) Choosing datanode retries times considering with block replica number
[ https://issues.apache.org/jira/browse/HDFS-6524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16950871#comment-16950871 ] Ayush Saxena commented on HDFS-6524: bq. Looks like a good improvement, thanks Lisheng Sun. [~weichiu] This doesn't seem to be an improvement to me; it looks like the wrong fix. We can't make this change: these failures have no relation to the replication count. It is the count for refetching block locations from the Namenode, which is independent of the number of replicas. I am not sure what I am missing. [~surendrasingh], can you also take a look? > Choosing datanode retries times considering with block replica number > -- > > Key: HDFS-6524 > URL: https://issues.apache.org/jira/browse/HDFS-6524 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 3.0.0-alpha1 >Reporter: Liang Xie >Assignee: Lisheng Sun >Priority: Minor > Labels: BB2015-05-TBR > Attachments: HDFS-6524.001.patch, HDFS-6524.002.patch, > HDFS-6524.003.patch, HDFS-6524.004.patch, HDFS-6524.005(2).patch, > HDFS-6524.005.patch, HDFS-6524.006.patch, HDFS-6524.007.patch, HDFS-6524.txt > > > Currently the chooseDataNode() does retry with the setting: > dfsClientConf.maxBlockAcquireFailures, which by default is 3 > (DFS_CLIENT_MAX_BLOCK_ACQUIRE_FAILURES_DEFAULT = 3), it would be better > having another option, block replication factor. One cluster with only two > block replica setting, or using Reed-solomon encoding solution with one > replica factor. It helps to reduce the long tail latency. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
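For readers following the thread: the proposal under discussion amounts to bounding the client's retry count by the block's replica count instead of using only the fixed `dfs.client.max.block.acquire.failures` setting (default 3). A hypothetical sketch of that idea follows; the names are illustrative, not the actual DFSClient code, and note that the reviewers above argue against making this change.

```java
// Hypothetical illustration of the HDFS-6524 proposal (contested above):
// cap the number of block-acquire retries by the replication factor.
public class RetryPolicySketch {

    // Mirrors DFS_CLIENT_MAX_BLOCK_ACQUIRE_FAILURES_DEFAULT = 3
    static final int DEFAULT_MAX_BLOCK_ACQUIRE_FAILURES = 3;

    static int maxRetries(int configuredMax, int replicationFactor) {
        // With only 1-2 replicas (e.g. a two-replica cluster, or an
        // erasure-coded layout with one replica), retrying 3 times cannot
        // reach more replicas, so the proposal bounds retries by the
        // replica count to cut long-tail latency.
        return Math.min(configuredMax, replicationFactor);
    }

    public static void main(String[] args) {
        System.out.println(maxRetries(DEFAULT_MAX_BLOCK_ACQUIRE_FAILURES, 2));
        System.out.println(maxRetries(DEFAULT_MAX_BLOCK_ACQUIRE_FAILURES, 5));
    }
}
```

The counter-argument in the comments is that this setting counts location refetches from the Namenode, which is unrelated to how many replicas a block has.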
[jira] [Created] (HDDS-2299) BlockManager should allocate a block in excluded pipelines if none other left
Lokesh Jain created HDDS-2299: - Summary: BlockManager should allocate a block in excluded pipelines if none other left Key: HDDS-2299 URL: https://issues.apache.org/jira/browse/HDDS-2299 Project: Hadoop Distributed Data Store Issue Type: Bug Components: SCM Reporter: Lokesh Jain Assignee: Lokesh Jain In SCM, BlockManager#allocateBlock does not allocate a block in the excluded pipelines or datanodes if requested by the client. But there can be cases where excluded pipelines are the only pipelines left. In such a case, SCM should allocate a block in one of the excluded pipelines and return it to the client. The client can choose to use or discard the block. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
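The fallback behaviour proposed above can be sketched as follows. The types and method names are hypothetical stand-ins, not SCM's actual BlockManager API: prefer pipelines outside the client's exclude list, but fall back to an excluded pipeline when nothing else remains.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the HDDS-2299 idea: honor the exclude list when
// possible, but return an excluded pipeline rather than failing when the
// excluded pipelines are the only ones left.
public class AllocationSketch {

    static String pickPipeline(List<String> pipelines, List<String> excluded) {
        List<String> preferred = new ArrayList<>(pipelines);
        preferred.removeAll(excluded);
        if (!preferred.isEmpty()) {
            return preferred.get(0); // normal case: a non-excluded pipeline exists
        }
        if (!pipelines.isEmpty()) {
            return pipelines.get(0); // fallback: only excluded pipelines remain
        }
        return null; // no pipelines at all
    }
}
```

In the fallback case the client receives the block anyway and, as the issue notes, may use it or discard it.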
[jira] [Work logged] (HDDS-2196) Add CLI Commands and Protobuf messages to trigger decom states
[ https://issues.apache.org/jira/browse/HDDS-2196?focusedWorklogId=327701&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327701 ] ASF GitHub Bot logged work on HDDS-2196: Author: ASF GitHub Bot Created on: 14/Oct/19 09:43 Start Date: 14/Oct/19 09:43 Worklog Time Spent: 10m Work Description: sodonnel commented on pull request #1615: HDDS-2196 Add CLI Commands and Protobuf messages to trigger decom states URL: https://github.com/apache/hadoop/pull/1615 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327701) Time Spent: 1h (was: 50m) > Add CLI Commands and Protobuf messages to trigger decom states > -- > > Key: HDDS-2196 > URL: https://issues.apache.org/jira/browse/HDDS-2196 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM, SCM Client >Affects Versions: 0.5.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > To allow nodes to be decommissioned, recommissioned and put into maintenance, > we need a few commands. > These will be added to the existing "scm cli". 3 commands are proposed: > Decommission: > ozone scmcli dnadmin decommission hosta hostb hostc:port ... > Put nodes into maintenance: > ozone scmcli dnadmin maintenance hosta hostb hostc:port ... 
<-endHours> > Take nodes out of maintenance or halt decommission: > ozone scmcli dnadmin recommission hosta hostb hostc:port > These 3 commands will call 3 new protobuf messages and they will be part of > the "StorageContainerLocationProtocol": > * DecommissionNodesRequestProto > * RecommissionNodesRequestProto > * StartMaintenanceNodesRequestProto > In addition, a new class NodeDecommissionManager will be introduced that > will receive these commands and carry out the decommission steps. > In this patch NodeDecommissionManager is only a skeleton implementation to > receive the commands as this patch is mainly focused on getting the CLI > commands and protobuf messages in place. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2196) Add CLI Commands and Protobuf messages to trigger decom states
[ https://issues.apache.org/jira/browse/HDDS-2196?focusedWorklogId=327699&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327699 ] ASF GitHub Bot logged work on HDDS-2196: Author: ASF GitHub Bot Created on: 14/Oct/19 09:42 Start Date: 14/Oct/19 09:42 Worklog Time Spent: 10m Work Description: sodonnel commented on pull request #17: HDDS-2196 Add CLI Commands and Protobuf messages to trigger decom states URL: https://github.com/apache/hadoop-ozone/pull/17 This change provides 3 new CLI commands: ``` scmcli dnadmin decommission hostname1 hostname2 hostname3 scmcli dnadmin maintenance hostname1 hostname2 hostname3 < --end time from now to end maintenance in hours> scmcli dnadmin recommission hostname1 hostname2 hostname3 ``` To allow for cases where many DNs are on the same host, the hostname can also have a port appended, eg: ``` scmcli dnadmin decommission hostname1:5678 hostname1:6789 hostname1:7890 ``` These commands make use of 3 new protobuf messages, defined in StorageContainerLocationProtocol: ``` DecommissionNodesRequestProto + DecommissionNodesResponseProto RecommissionNodesRequestProto + RecommissionNodesResponseProto StartMaintenanceNodesRequestProto + StartMaintenanceNodesResponseProto ``` All 3 accept a list of strings (for hostnames) and the maintenance message also allows an int to specify the end time in hours. These 3 commands make a call to a new class NodeDecommissionManager which takes the list of hosts and validates them. If any host is invalid or not part of the cluster, the entire command fails and the CLI will show an error. Assuming the validation passes OK, the list of nodes will be switched into DECOMMISSIONING, ENTERING_MAINTENANCE or back into IN_SERVICE. At this point in time, there is no decommission logic present; the nodes will simply remain in the interim state forever. The actual decommissioning logic will be added in a further Jira. This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327699) Time Spent: 50m (was: 40m) > Add CLI Commands and Protobuf messages to trigger decom states > -- > > Key: HDDS-2196 > URL: https://issues.apache.org/jira/browse/HDDS-2196 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM, SCM Client >Affects Versions: 0.5.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > To allow nodes to be decommissioned, recommissioned and put into maintenance, > we need a few commands. > These will be added to the existing "scm cli". 3 commands are proposed: > Decommission: > ozone scmcli dnadmin decommission hosta hostb hostc:port ... > Put nodes into maintenance: > ozone scmcli dnadmin maintenance hosta hostb hostc:port ... <-endHours> > Take nodes out of maintenance or halt decommission: > ozone scmcli dnadmin recommission hosta hostb hostc:port > These 3 commands will call 3 new protobuf messages and they will be part of > the "StorageContainerLocationProtocol": > * DecommissionNodesRequestProto > * RecommissionNodesRequestProto > * StartMaintenanceNodesRequestProto > In addition, a new class NodeDecommissionManager will be introduced that > will receive these commands and carry out the decommission steps. > In this patch NodeDecommissionManager is only a skeleton implementation to > receive the commands as this patch is mainly focused on getting the CLI > commands and protobuf messages in place. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2298) Fix maven warning about duplicated metrics-core jar
Marton Elek created HDDS-2298: - Summary: Fix maven warning about duplicated metrics-core jar Key: HDDS-2298 URL: https://issues.apache.org/jira/browse/HDDS-2298 Project: Hadoop Distributed Data Store Issue Type: Bug Components: build Reporter: Marton Elek Assignee: Marton Elek Maven build of Ozone is starting with a warning: {code:java} [WARNING] [WARNING] Some problems were encountered while building the effective model for org.apache.hadoop:hadoop-ozone-tools:jar:0.5.0-SNAPSHOT [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must be unique: io.dropwizard.metrics:metrics-core:jar -> version 3.2.4 vs (?) @ line 94, column 17 [WARNING] [WARNING] It is highly recommended to fix these problems because they threaten the stability of your build. [WARNING] [WARNING] For this reason, future Maven versions might no longer support building such malformed projects. [WARNING] {code} It's better to avoid it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2297) Enable Opentracing for new Freon tests
[ https://issues.apache.org/jira/browse/HDDS-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2297: - Labels: pull-request-available (was: ) > Enable Opentracing for new Freon tests > -- > > Key: HDDS-2297 > URL: https://issues.apache.org/jira/browse/HDDS-2297 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: freon >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > > HDDS-2022 introduced new freon tests, but the initial root span of > opentracing is not created before the test execution. We need to enable > opentracing to get better view about the executions of the new freon test. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2297) Enable Opentracing for new Freon tests
[ https://issues.apache.org/jira/browse/HDDS-2297?focusedWorklogId=327695&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327695 ] ASF GitHub Bot logged work on HDDS-2297: Author: ASF GitHub Bot Created on: 14/Oct/19 09:32 Start Date: 14/Oct/19 09:32 Worklog Time Spent: 10m Work Description: elek commented on pull request #16: HDDS-2297. Enable Opentracing for new Freon tests URL: https://github.com/apache/hadoop-ozone/pull/16 ## What changes were proposed in this pull request? HDDS-2022 introduced new freon tests, but the initial root span of opentracing is not created before the test execution. We need to enable opentracing to get a better view of the executions of the new freon tests. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2297 ## How this patch can be tested? Start an ozoneperf cluster: ``` cd hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/compose/ozoneperf/ docker-compose up -d --scale datanode=3 ``` If HDDS-2296 is not yet merged, stop old-style freon: ``` docker-compose stop freon ``` Start new freon test: ``` docker-compose exec scm bash ozone freon ockg -n 10 ``` Open the jaeger web ui: http://localhost:16686/ Choose _freon_ under the service and click on _Search_. On the right side you should see results with the name _freon: OzoneClientKeyGenerator_ and with multiple sub-spans. Without the patch you can see entries with a strange name (...span without root...) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327695) Remaining Estimate: 0h Time Spent: 10m > Enable Opentracing for new Freon tests > -- > > Key: HDDS-2297 > URL: https://issues.apache.org/jira/browse/HDDS-2297 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: freon >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HDDS-2022 introduced new freon tests, but the initial root span of > opentracing is not created before the test execution. We need to enable > opentracing to get better view about the executions of the new freon test. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1569) Add ability to SCM for creating multiple pipelines with same datanode
[ https://issues.apache.org/jira/browse/HDDS-1569?focusedWorklogId=327693&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327693 ] ASF GitHub Bot logged work on HDDS-1569: Author: ASF GitHub Bot Created on: 14/Oct/19 09:30 Start Date: 14/Oct/19 09:30 Worklog Time Spent: 10m Work Description: timmylicheng commented on issue #1431: HDDS-1569 Support creating multiple pipelines with same datanode URL: https://github.com/apache/hadoop/pull/1431#issuecomment-541576917 > > Ok I will send out a new PR. How to do mvn build under new repo now? I was not able to do it under hadoop-ozone directory. > > Thank you very much, to take care about the migration of your PRs. (Unfortunately I can't do it with github api as I can't fake the reporter and I would like to keep it) > > Regarding the build in the new repo. You can do it from the root level of the project: > > 1. do `mvn clean install -f pom.ozone.xml -DskipTests` > 2. or rebase and do a simple `mvn clean install -DskipTests` > > (( > > 1. One of the benefit to use separated repo is to create new README/CONTRIBUTION.md where we can add these information. I opened HDDS-2292 and HDDS-2293. > 2. The other benefit is to use simple top level pom.xml. I just merged #10, but you need to rebase to use it. >)) CI build failed on new PR: https://github.com/apache/hadoop-ozone/pull/13. Could you please take a look? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327693) Time Spent: 7h 40m (was: 7.5h) > Add ability to SCM for creating multiple pipelines with same datanode > - > > Key: HDDS-1569 > URL: https://issues.apache.org/jira/browse/HDDS-1569 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Siddharth Wagle >Assignee: Li Cheng >Priority: Major > Labels: pull-request-available > Time Spent: 7h 40m > Remaining Estimate: 0h > > - Refactor _RatisPipelineProvider.create()_ to be able to create pipelines > with datanodes that are not a part of sufficient pipelines > - Define soft and hard upper bounds for pipeline membership > - Create SCMAllocationManager that can be leveraged to get a candidate set of > datanodes based on placement policies > - Add the datanodes to internal datastructures -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14740) HDFS read cache persistence support
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16950849#comment-16950849 ] Feilong He commented on HDFS-14740: --- [~Rui Mo], please prepare a design doc and test doc, then upload them to this Jira. Thanks! > HDFS read cache persistence support > --- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Rui Mo >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts; thus, the cache warm-up > time can be saved for the user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2297) Enable Opentracing for new Freon tests
Marton Elek created HDDS-2297: - Summary: Enable Opentracing for new Freon tests Key: HDDS-2297 URL: https://issues.apache.org/jira/browse/HDDS-2297 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: freon Reporter: Marton Elek Assignee: Marton Elek HDDS-2022 introduced new freon tests, but the initial root span of opentracing is not created before the test execution. We need to enable opentracing to get a better view of the executions of the new freon tests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2296) ozoneperf compose cluster shouldn't start freon by default
[ https://issues.apache.org/jira/browse/HDDS-2296?focusedWorklogId=327682&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327682 ] ASF GitHub Bot logged work on HDDS-2296: Author: ASF GitHub Bot Created on: 14/Oct/19 09:06 Start Date: 14/Oct/19 09:06 Worklog Time Spent: 10m Work Description: elek commented on pull request #15: HDDS-2296. ozoneperf compose cluster shouldn't start freon by default URL: https://github.com/apache/hadoop-ozone/pull/15 ## What changes were proposed in this pull request? During the original creation of the `compose/ozoneperf` we added an example freon execution to make it clear how the data can be generated. This freon process starts all the time when the ozoneperf cluster is started (usually I notice it when my CPU starts to use 100% of the available resources). Since the creation of this cluster definition we implemented multiple types of freon tests and it's hard to predict which tests should be executed. I propose to remove the default execution of the random key generation but keep the opportunity to run any of the tests. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2296 ## How this patch can be tested? Go to the `compose/ozoneperf` dir and follow the updated README ;-) (TLDR; start the cluster, wait, start freon with the documented command) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327682) Remaining Estimate: 0h Time Spent: 10m > ozoneperf compose cluster shouldn't start freon by default > - > > Key: HDDS-2296 > URL: https://issues.apache.org/jira/browse/HDDS-2296 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: docker >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > During the original creation of the compose/ozoneperf we added an example > freon execution to make it clear how the data can be generated. This freon > process starts all the time when the ozoneperf cluster is started (usually I > notice it when my CPU starts to use 100% of the available resources). > Since the creation of this cluster definition we implemented multiple types of > freon tests and it's hard to predict which tests should be executed. I propose > to remove the default execution of the random key generation but keep the > opportunity to run any of the tests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2296) ozoneperf compose cluster shouldn't start freon by default
[ https://issues.apache.org/jira/browse/HDDS-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2296: - Labels: pull-request-available (was: ) > ozoneperf compose cluster shouldn't start freon by default > - > > Key: HDDS-2296 > URL: https://issues.apache.org/jira/browse/HDDS-2296 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: docker >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > > During the original creation of the compose/ozoneperf we added an example > freon execution to make it clear how the data can be generated. This freon > process starts all the time when the ozoneperf cluster is started (usually I > notice it when my CPU starts to use 100% of the available resources). > Since the creation of this cluster definition we implemented multiple types of > freon tests and it's hard to predict which tests should be executed. I propose > to remove the default execution of the random key generation but keep the > opportunity to run any of the tests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2296) ozoneperf compose cluster shouldn't start freon by default
Marton Elek created HDDS-2296: - Summary: ozoneperf compose cluster shouldn't start freon by default Key: HDDS-2296 URL: https://issues.apache.org/jira/browse/HDDS-2296 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: docker Reporter: Marton Elek Assignee: Marton Elek During the original creation of the compose/ozoneperf we added an example freon execution to make it clear how the data can be generated. This freon process starts all the time when the ozoneperf cluster is started (usually I notice it when my CPU starts to use 100% of the available resources). Since the creation of this cluster definition we implemented multiple types of freon tests and it's hard to predict which tests should be executed. I propose to remove the default execution of the random key generation but keep the opportunity to run any of the tests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2295) Display log of freon on the standard output
[ https://issues.apache.org/jira/browse/HDDS-2295?focusedWorklogId=327676&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327676 ]

ASF GitHub Bot logged work on HDDS-2295:
    Author: ASF GitHub Bot
    Created on: 14/Oct/19 08:50
    Start Date: 14/Oct/19 08:50
    Worklog Time Spent: 10m

Work Description: elek commented on pull request #14: HDDS-2295. Display log of freon on the standard output
URL: https://github.com/apache/hadoop-ozone/pull/14

## What changes were proposed in this pull request?

HDDS-2042 disabled the console logging for all of the Ozone command-line tools, including freon. But freon is different: it has a different error-handling model. For freon we need all of the logs on the console:

1. To follow all the different errors
2. To get information about the used (random) prefix, which can be reused during the validation phase.

I propose to restore the original behavior for Ozone.

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-2295

## How was this patch tested?

Start any of the docker compose clusters and run:

```
ozone freon ockg -n10
```

You should see the logs (for example, the notice about the used prefix).

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
    Worklog Id: (was: 327676)
    Remaining Estimate: 0h
    Time Spent: 10m

> Display log of freon on the standard output
> -
>
> Key: HDDS-2295
> URL: https://issues.apache.org/jira/browse/HDDS-2295
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Reporter: Marton Elek
> Assignee: Marton Elek
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> HDDS-2042 disabled the console logging for all of the ozone command line
> tools including freon.
> But freon is different: it has a different error-handling model. For freon we
> need all of the logs on the console:
> 1. To follow all the different errors
> 2. To get information about the used (random) prefix, which can be reused
> during the validation phase.
>
> I propose to restore the original behavior for Ozone.
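One way to picture the change being asked for is a logger override in the log4j configuration: keep the CLI quiet by default (the intent of HDDS-2042) but let freon's own package log to the console. The fragment below is only a sketch of that idea — the log4j 1.x properties syntax, appender name, and the `org.apache.hadoop.ozone.freon` package are assumptions, and the actual patch may restore the behavior differently (e.g. programmatically):

```properties
# Illustrative log4j.properties fragment -- names are assumptions,
# not taken from the actual HDDS-2295 patch.

# Console appender for anything we do want on stdout.
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p %c{1} - %m%n

# Keep the other ozone CLI tools quiet, as HDDS-2042 intended...
log4j.rootLogger=ERROR, stdout

# ...but let freon log at INFO, so errors and the notice about the
# generated (random) key prefix are visible on the console.
log4j.logger.org.apache.hadoop.ozone.freon=INFO
```

The point of the override is that the prefix notice printed at INFO level is needed to re-run the validation phase against the same keys, so silencing freon loses information the user cannot easily recover.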