[jira] [Commented] (HDFS-14924) RenameSnapshot not updating new modification time
[ https://issues.apache.org/jira/browse/HDFS-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973999#comment-16973999 ] Hadoop QA commented on HDFS-14924:

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 51s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| trunk Compile Tests ||
| +1 | mvninstall | 19m 8s | trunk passed |
| +1 | compile | 1m 0s | trunk passed |
| +1 | checkstyle | 0m 51s | trunk passed |
| +1 | mvnsite | 1m 4s | trunk passed |
| +1 | shadedclient | 14m 38s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 15s | trunk passed |
| +1 | javadoc | 1m 15s | trunk passed |
|| Patch Compile Tests ||
| +1 | mvninstall | 0m 59s | the patch passed |
| +1 | compile | 0m 55s | the patch passed |
| +1 | javac | 0m 55s | the patch passed |
| -0 | checkstyle | 0m 47s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 622 unchanged - 0 fixed = 623 total (was 622) |
| +1 | mvnsite | 1m 0s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 31s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 20s | the patch passed |
| +1 | javadoc | 1m 13s | the patch passed |
|| Other Tests ||
| -1 | unit | 101m 45s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 59s | The patch does not generate ASF License warnings. |
| | | 164m 29s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
| | hadoop.hdfs.server.namenode.ha.TestBootstrapAliasmap |
| | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14924 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985812/HDFS-14924.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux d02ff0ab9add 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 73a386a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28310/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28310/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test
[jira] [Updated] (HDDS-2478) Sonar : remove temporary variable in XceiverClientSpi.sendCommand
[ https://issues.apache.org/jira/browse/HDDS-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-2478: Description: Sonar issues : https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV1=AW5md_AGKcVY8lQ4ZsV1 https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV2=AW5md_AGKcVY8lQ4ZsV2 was: Sonar issue : https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV1=AW5md_AGKcVY8lQ4ZsV1 > Sonar : remove temporary variable in XceiverClientSpi.sendCommand > - > > Key: HDDS-2478 > URL: https://issues.apache.org/jira/browse/HDDS-2478 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Minor > > Sonar issues : > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV1=AW5md_AGKcVY8lQ4ZsV1 > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV2=AW5md_AGKcVY8lQ4ZsV2 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2478) Sonar : remove temporary variable in XceiverClientGrpc.sendCommand
[ https://issues.apache.org/jira/browse/HDDS-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-2478: Summary: Sonar : remove temporary variable in XceiverClientGrpc.sendCommand (was: Sonar : remove temporary variable in XceiverClientSpi.sendCommand) > Sonar : remove temporary variable in XceiverClientGrpc.sendCommand > -- > > Key: HDDS-2478 > URL: https://issues.apache.org/jira/browse/HDDS-2478 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Minor > > Sonar issues : > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV1=AW5md_AGKcVY8lQ4ZsV1 > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV2=AW5md_AGKcVY8lQ4ZsV2 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2480) Sonar : remove log spam for exceptions inside XceiverClientGrpc.reconnect
Supratim Deka created HDDS-2480: --- Summary: Sonar : remove log spam for exceptions inside XceiverClientGrpc.reconnect Key: HDDS-2480 URL: https://issues.apache.org/jira/browse/HDDS-2480 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: SCM Reporter: Supratim Deka Assignee: Supratim Deka Sonar issue: https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsWE=AW5md_AGKcVY8lQ4ZsWE -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
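The ticket does not include the patch, but the usual shape of this Sonar finding is "either log or rethrow the exception, not both": a reconnect path that logs each failed attempt and also propagates the exception produces one log line per retry. As a hedged, simplified sketch (the class and method names below are stand-ins, not the actual XceiverClientGrpc code):

```java
// Hypothetical sketch of the "either log or rethrow" fix; ReconnectSketch is
// illustrative only, not the real XceiverClientGrpc.reconnect implementation.
import java.io.IOException;

public class ReconnectSketch {
    static int attempts = 0;

    static void connectOnce() throws IOException {
        attempts++;
        if (attempts < 3) {
            throw new IOException("connection refused");
        }
    }

    // Before: catch, LOG.error(e), then rethrow -- the same failure is logged
    // on every retry, producing log spam. After: let the exception propagate
    // and let the caller decide how (and how often) to log it.
    static void reconnect() throws IOException {
        connectOnce(); // no catch-log-rethrow here
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            try {
                reconnect();
            } catch (IOException e) {
                // single place that records the failure
            }
        }
        System.out.println(attempts);
    }
}
```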
[jira] [Created] (HDDS-2479) Sonar : replace instanceof with catch block in XceiverClientGrpc.sendCommandWithRetry
Supratim Deka created HDDS-2479: --- Summary: Sonar : replace instanceof with catch block in XceiverClientGrpc.sendCommandWithRetry Key: HDDS-2479 URL: https://issues.apache.org/jira/browse/HDDS-2479 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: SCM Reporter: Supratim Deka Assignee: Supratim Deka Sonar issue: https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV_=AW5md_AGKcVY8lQ4ZsV_ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
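The rule behind this ticket: instead of catching a broad exception type and branching on `e instanceof SomeException`, catch the specific type in its own catch block. A minimal illustration (names are hypothetical, not the actual sendCommandWithRetry code):

```java
// Sketch of replacing an instanceof check with a dedicated catch block.
import java.util.concurrent.ExecutionException;

public class CatchBlockSketch {
    static String handle(boolean fail) {
        try {
            if (fail) {
                throw new ExecutionException(new RuntimeException("boom"));
            }
            return "ok";
        } catch (ExecutionException e) {
            // Before: catch (Exception e) { if (e instanceof ExecutionException) ... }
            // After: the specific catch block replaces the instanceof test.
            return "execution-failed";
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(false)); // ok
        System.out.println(handle(true));  // execution-failed
    }
}
```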
[jira] [Created] (HDDS-2478) Sonar : remove temporary variable in XceiverClientSpi.sendCommand
Supratim Deka created HDDS-2478: --- Summary: Sonar : remove temporary variable in XceiverClientSpi.sendCommand Key: HDDS-2478 URL: https://issues.apache.org/jira/browse/HDDS-2478 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: SCM Reporter: Supratim Deka Assignee: Supratim Deka Sonar issue : https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV1=AW5md_AGKcVY8lQ4ZsV1 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
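The fix behind this class of Sonar finding ("local variables should not be declared and then immediately returned") is to return the expression directly. A hedged, self-contained illustration; the names below are stand-ins, not the actual XceiverClientSpi.sendCommand signature:

```java
// Illustrative sketch of removing an immediately-returned temporary variable.
public class TempVarSketch {
    // Before (flagged by Sonar):
    //   String response = buildResponse(id);
    //   return response;
    // After: no temporary, return the expression directly.
    static String sendCommand(int id) {
        return buildResponse(id);
    }

    static String buildResponse(int id) {
        return "response-" + id;
    }

    public static void main(String[] args) {
        System.out.println(sendCommand(7)); // prints response-7
    }
}
```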
[jira] [Work logged] (HDDS-2473) Fix code reliability issues found by Sonar in Ozone Recon module.
[ https://issues.apache.org/jira/browse/HDDS-2473?focusedWorklogId=343136=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343136 ] ASF GitHub Bot logged work on HDDS-2473: Author: ASF GitHub Bot Created on: 14/Nov/19 05:24 Start Date: 14/Nov/19 05:24 Worklog Time Spent: 10m Work Description: avijayanhwx commented on pull request #162: HDDS-2473. Fix code reliability issues found by Sonar in Ozone Recon module. URL: https://github.com/apache/hadoop-ozone/pull/162 ## What changes were proposed in this pull request? Sonarcloud.io has flagged a number of code reliability issues in Ozone recon (https://sonarcloud.io/code?id=hadoop-ozone=hadoop-ozone%3Ahadoop-ozone%2Frecon%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fozone%2Frecon). Following issues have been fixed. - Double Brace Initialization should not be used - Resources should be closed - InterruptedException should not be ignored ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2473 ## How was this patch tested? Ran unit tests in 'recon' module. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343136) Remaining Estimate: 0h Time Spent: 10m > Fix code reliability issues found by Sonar in Ozone Recon module. 
> - > > Key: HDDS-2473 > URL: https://issues.apache.org/jira/browse/HDDS-2473 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Recon >Affects Versions: 0.5.0 >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 10m > Remaining Estimate: 0h > > sonarcloud.io has flagged a number of code reliability issues in Ozone recon > (https://sonarcloud.io/code?id=hadoop-ozone=hadoop-ozone%3Ahadoop-ozone%2Frecon%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fozone%2Frecon). > Following issues will be triaged / fixed. > * Double Brace Initialization should not be used > * Resources should be closed > * InterruptedException should not be ignored -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
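The three reliability rules fixed in this PR have well-known remedies; a compact sketch of each (simplified stand-ins, not the actual Recon code):

```java
// Sketches of the three Sonar fixes: no double-brace initialization,
// close resources with try-with-resources, never swallow InterruptedException.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

public class ReliabilityFixes {
    // Double-brace initialization creates an anonymous subclass per call site
    // and can leak the enclosing instance; build the map explicitly instead.
    static Map<String, Integer> buildMap() {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        return m;
    }

    // "Resources should be closed": try-with-resources closes the reader
    // on every exit path, including exceptions.
    static String readFirstLine(String text) throws IOException {
        try (BufferedReader r = new BufferedReader(new StringReader(text))) {
            return r.readLine();
        }
    }

    // "InterruptedException should not be ignored": restore the interrupt
    // flag so callers up the stack can observe the interruption.
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildMap().size());          // 2
        System.out.println(readFirstLine("x\ny"));      // x
        sleepQuietly(1);
    }
}
```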
[jira] [Updated] (HDDS-2473) Fix code reliability issues found by Sonar in Ozone Recon module.
[ https://issues.apache.org/jira/browse/HDDS-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2473: - Labels: pull-request-available (was: ) > Fix code reliability issues found by Sonar in Ozone Recon module. > - > > Key: HDDS-2473 > URL: https://issues.apache.org/jira/browse/HDDS-2473 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Recon >Affects Versions: 0.5.0 >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > > sonarcloud.io has flagged a number of code reliability issues in Ozone recon > (https://sonarcloud.io/code?id=hadoop-ozone=hadoop-ozone%3Ahadoop-ozone%2Frecon%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fozone%2Frecon). > Following issues will be triaged / fixed. > * Double Brace Initialization should not be used > * Resources should be closed > * InterruptedException should not be ignored -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2472) Use try-with-resources while creating FlushOptions in RDBStore.
[ https://issues.apache.org/jira/browse/HDDS-2472?focusedWorklogId=343135=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343135 ] ASF GitHub Bot logged work on HDDS-2472: Author: ASF GitHub Bot Created on: 14/Nov/19 05:22 Start Date: 14/Nov/19 05:22 Worklog Time Spent: 10m Work Description: avijayanhwx commented on pull request #161: HDDS-2472. Use try-with-resources while creating FlushOptions in RDBStore URL: https://github.com/apache/hadoop-ozone/pull/161 ## What changes were proposed in this pull request? Use try-with-resources while creating FlushOptions in RDBStore class. Remove code duplication in getCheckpoint method. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2472 ## How was this patch tested? Unit tested. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343135) Time Spent: 0.5h (was: 20m) > Use try-with-resources while creating FlushOptions in RDBStore. > --- > > Key: HDDS-2472 > URL: https://issues.apache.org/jira/browse/HDDS-2472 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Link to the sonar issue flag - > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-zwKcVY8lQ4ZsJ4=AW5md-zwKcVY8lQ4ZsJ4. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
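The pattern in this PR: RocksDB option objects are native-backed and must be closed. The sketch below uses a stand-in `FlushOptions` class (the real one is `org.rocksdb.FlushOptions`, also an `AutoCloseable`) to show how try-with-resources guarantees `close()` runs on every path:

```java
// Hedged sketch: try-with-resources around an AutoCloseable options object.
// FlushOptions here is a local stand-in, not the org.rocksdb class.
public class FlushSketch {
    static final StringBuilder LOG = new StringBuilder();

    static class FlushOptions implements AutoCloseable {
        FlushOptions setWaitForFlush(boolean wait) { return this; }
        @Override
        public void close() { LOG.append("closed"); }
    }

    static void flush() {
        // Closed automatically, even if the body throws -- before the fix,
        // an exception between construction and a manual close() leaked it.
        try (FlushOptions opts = new FlushOptions()) {
            opts.setWaitForFlush(true);
            // db.flush(opts) would go here in the real RDBStore code
        }
    }

    public static void main(String[] args) {
        flush();
        System.out.println(LOG); // the options object was closed
    }
}
```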
[jira] [Commented] (HDFS-14924) RenameSnapshot not updating new modification time
[ https://issues.apache.org/jira/browse/HDFS-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973933#comment-16973933 ] hemanthboyina commented on HDFS-14924: -- There was a similar issue with createSnapshot in HDFS-14922. I have updated the patch; please review. > RenameSnapshot not updating new modification time > - > > Key: HDFS-14924 > URL: https://issues.apache.org/jira/browse/HDFS-14924 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14924.001.patch, HDFS-14924.002.patch > > > RenameSnapshot doesnt updating modification time -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
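The likely shape of the fix, by analogy with the createSnapshot change referenced above, is to stamp the operation time when the snapshot is renamed. A hedged, heavily simplified stand-in (not the actual INodeDirectory/SnapshotManager code):

```java
// Illustrative sketch: renameSnapshot should update the modification time,
// mirroring what createSnapshot already does. All names are hypothetical.
public class SnapshotDirSketch {
    private String snapshotName;
    private long modificationTime;

    SnapshotDirSketch(String name, long mtime) {
        this.snapshotName = name;
        this.modificationTime = mtime;
    }

    // Before the fix, only the name changed; stamping `now` is the missing step.
    void renameSnapshot(String newName, long now) {
        this.snapshotName = newName;
        this.modificationTime = now; // record the rename time
    }

    String getSnapshotName() { return snapshotName; }
    long getModificationTime() { return modificationTime; }

    public static void main(String[] args) {
        SnapshotDirSketch dir = new SnapshotDirSketch("s1", 1000L);
        dir.renameSnapshot("s2", 2000L);
        System.out.println(dir.getSnapshotName() + " " + dir.getModificationTime());
    }
}
```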
[jira] [Updated] (HDFS-14924) RenameSnapshot not updating new modification time
[ https://issues.apache.org/jira/browse/HDFS-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hemanthboyina updated HDFS-14924: - Attachment: HDFS-14924.002.patch > RenameSnapshot not updating new modification time > - > > Key: HDFS-14924 > URL: https://issues.apache.org/jira/browse/HDFS-14924 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14924.001.patch, HDFS-14924.002.patch > > > RenameSnapshot doesnt updating modification time -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-14987) EC: EC file blockId location info displaying as "null" with hdfs fsck -blockId command
[ https://issues.apache.org/jira/browse/HDFS-14987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravuri Sushma sree reassigned HDFS-14987: - Assignee: Ravuri Sushma sree > EC: EC file blockId location info displaying as "null" with hdfs fsck > -blockId command > -- > > Key: HDFS-14987 > URL: https://issues.apache.org/jira/browse/HDFS-14987 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec, tools >Affects Versions: 3.1.2 >Reporter: Souryakanta Dwivedy >Assignee: Ravuri Sushma sree >Priority: Major > Attachments: EC_file_block_info.PNG, > image-2019-11-13-18-34-00-067.png, image-2019-11-13-18-36-29-063.png, > image-2019-11-13-18-38-18-899.png > > > EC file blockId location info displaying as "null" with hdfs fsck -blockId > command > * Check the blockId information of an EC enabled file with "hdfs fsck > -blockId": the blockId location related info will display as null, which > needs to be rectified. > Check the attachment "EC_file_block_info" > === > !image-2019-11-13-18-34-00-067.png! > > * Check the output of a normal file block to compare > !image-2019-11-13-18-36-29-063.png! > === > !image-2019-11-13-18-38-18-899.png! > * Actual Output :- null > * Expected output :- It should display the blockId location related info as > (nodes, racks) of the block as specified in the usage info of fsck -blockId > option. [like : Block replica on > datanode/rack: BLR1xx038/default-rack is HEALTHY] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973916#comment-16973916 ] Aiphago commented on HDFS-14986: I mean I can fix the deadlock problem by adding the dataset lock; my first comment describes the solution. > ReplicaCachingGetSpaceUsed throws ConcurrentModificationException > -- > > Key: HDFS-14986 > URL: https://issues.apache.org/jira/browse/HDFS-14986 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Reporter: Ryan Wu >Assignee: Ryan Wu >Priority: Major > > Running DU across lots of disks is very expensive . We applied the patch > HDFS-14313 to get used space from ReplicaInfo in memory.However, new du > threads throw the exception > {code:java} > // 2019-11-08 18:07:13,858 ERROR > [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] > > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: > ReplicaCachingGetSpaceUsed refresh error > java.util.ConcurrentModificationException: Tree has been modified outside of > iterator > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311) > > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256) > > at java.util.AbstractCollection.addAll(AbstractCollection.java:343) > at java.util.HashSet.(HashSet.java:120) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052) > > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73) > > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178) > > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: 
hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973910#comment-16973910 ] Lisheng Sun commented on HDFS-14986: [~Aiphag0] If we add the dataset lock in FsDatasetImpl#deepCopyReplica on trunk, a deadlock will happen. I didn't use FsDatasetImpl#datasetLock, because FsDatasetImpl#addBlockPool, while holding datasetLock, calls FsDatasetImpl#deepCopyReplica in another thread when the DataNode starts. https://issues.apache.org/jira/browse/HDFS-14313?focusedCommentId=16887859=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16887859 > ReplicaCachingGetSpaceUsed throws ConcurrentModificationException > -- > > Key: HDFS-14986 > URL: https://issues.apache.org/jira/browse/HDFS-14986 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Reporter: Ryan Wu >Assignee: Ryan Wu >Priority: Major > > Running DU across lots of disks is very expensive . We applied the patch > HDFS-14313 to get used space from ReplicaInfo in memory.However, new du > threads throw the exception > {code:java} > // 2019-11-08 18:07:13,858 ERROR > [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] > > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: > ReplicaCachingGetSpaceUsed refresh error > java.util.ConcurrentModificationException: Tree has been modified outside of > iterator > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311) > > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256) > > at java.util.AbstractCollection.addAll(AbstractCollection.java:343) > at java.util.HashSet.(HashSet.java:120) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052) > > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73) > > at 
> org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178) > > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973906#comment-16973906 ] Aiphago commented on HDFS-14986: We use a branch before 2.8 with synchronized, and we fixed the deadlock problem there. I think it's better to add the dataset lock in FsDatasetImpl#deepCopyReplica on trunk; the method to solve the deadlock problem is the same. > ReplicaCachingGetSpaceUsed throws ConcurrentModificationException > -- > > Key: HDFS-14986 > URL: https://issues.apache.org/jira/browse/HDFS-14986 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Reporter: Ryan Wu >Assignee: Ryan Wu >Priority: Major > > Running DU across lots of disks is very expensive . We applied the patch > HDFS-14313 to get used space from ReplicaInfo in memory.However, new du > threads throw the exception > {code:java} > // 2019-11-08 18:07:13,858 ERROR > [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] > > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: > ReplicaCachingGetSpaceUsed refresh error > java.util.ConcurrentModificationException: Tree has been modified outside of > iterator > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311) > > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256) > > at java.util.AbstractCollection.addAll(AbstractCollection.java:343) > at java.util.HashSet.(HashSet.java:120) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052) > > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73) > > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178) > > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira 
(v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
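The ConcurrentModificationException above happens because the refresh thread copies the replica set while writer threads mutate it. The idea debated in this thread, copying under the same lock the writers hold, can be sketched as follows (a simplified stand-in, not the actual FsDatasetImpl code; the deadlock caveat about addBlockPool discussed above still applies to the real class):

```java
// Hedged sketch: take the snapshot copy under the writers' lock so the
// backing set cannot change mid-iteration. Names are illustrative.
import java.util.HashSet;
import java.util.Set;

public class ReplicaMapSketch {
    private final Object datasetLock = new Object(); // stand-in for the DN dataset lock
    private final Set<String> replicas = new HashSet<>();

    void addReplica(String blockId) {
        synchronized (datasetLock) {
            replicas.add(blockId);
        }
    }

    // Copying inside the lock prevents the mid-iteration mutation that
    // raises ConcurrentModificationException in the refresh thread.
    Set<String> deepCopyReplica() {
        synchronized (datasetLock) {
            return new HashSet<>(replicas);
        }
    }

    public static void main(String[] args) {
        ReplicaMapSketch map = new ReplicaMapSketch();
        map.addReplica("blk_1");
        map.addReplica("blk_2");
        Set<String> copy = map.deepCopyReplica();
        map.addReplica("blk_3");        // later writes do not affect the copy
        System.out.println(copy.size()); // 2
    }
}
```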
[jira] [Commented] (HDDS-1847) Datanode Kerberos principal and keytab config key looks inconsistent
[ https://issues.apache.org/jira/browse/HDDS-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973897#comment-16973897 ] Chris Teoh commented on HDDS-1847: -- Thanks very much for your help [~aengineer]. Really appreciate it! :)(y) > Datanode Kerberos principal and keytab config key looks inconsistent > > > Key: HDDS-1847 > URL: https://issues.apache.org/jira/browse/HDDS-1847 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Eric Yang >Assignee: Chris Teoh >Priority: Major > Labels: newbie, pull-request-available > Fix For: 0.5.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Ozone Kerberos configuration can be very confusing: > | config name | Description | > | hdds.scm.kerberos.principal | SCM service principal | > | hdds.scm.kerberos.keytab.file | SCM service keytab file | > | ozone.om.kerberos.principal | Ozone Manager service principal | > | ozone.om.kerberos.keytab.file | Ozone Manager keytab file | > | hdds.scm.http.kerberos.principal | SCM service spnego principal | > | hdds.scm.http.kerberos.keytab.file | SCM service spnego keytab file | > | ozone.om.http.kerberos.principal | Ozone Manager spnego principal | > | ozone.om.http.kerberos.keytab.file | Ozone Manager spnego keytab file | > | hdds.datanode.http.kerberos.keytab | Datanode spnego keytab file | > | hdds.datanode.http.kerberos.principal | Datanode spnego principal | > | dfs.datanode.kerberos.principal | Datanode service principal | > | dfs.datanode.keytab.file | Datanode service keytab file | > The prefix are very different for each of the datanode configuration. It > would be nice to have some consistency for datanode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14973) Balancer getBlocks RPC dispersal does not function properly
[ https://issues.apache.org/jira/browse/HDFS-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973892#comment-16973892 ] Konstantin Shvachko commented on HDFS-14973: (1) Good point [~xkrogen] . I think Balancer, Mover, and SPS should all be limited to the same rate of getBlocks to NameNode. This is probably another good reason to track it inside {{NameNodeConnector}}. Would be good to reflect this in the description in {{hdfs-site.xml}} (2) Sure let's leave refactoring of TestBalancer for the next time. Nits: # {{dispatchBlockMoves()}} does not need {{dSec}} anymore. # Seems would be good for {{dispatchBlockMoves()}} to still log {{concurrentThreads}} in debug mode. > Balancer getBlocks RPC dispersal does not function properly > --- > > Key: HDFS-14973 > URL: https://issues.apache.org/jira/browse/HDFS-14973 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer mover >Affects Versions: 2.9.0, 2.7.4, 2.8.2, 3.0.0 >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14973.000.patch, HDFS-14973.001.patch, > HDFS-14973.002.patch, HDFS-14973.test.patch > > > In HDFS-11384, a mechanism was added to make the {{getBlocks}} RPC calls > issued by the Balancer/Mover more dispersed, to alleviate load on the > NameNode, since {{getBlocks}} can be very expensive and the Balancer should > not impact normal cluster operation. > Unfortunately, this functionality does not function as expected, especially > when the dispatcher thread count is low. The primary issue is that the delay > is applied only to the first N threads that are submitted to the dispatcher's > executor, where N is the size of the dispatcher's threadpool, but *not* to > the first R threads, where R is the number of allowed {{getBlocks}} QPS > (currently hardcoded to 20). For example, if the threadpool size is 100 (the > default), threads 0-19 have no delay, 20-99 have increased levels of delay, > and 100+ have no delay. 
As I understand it, the intent of the logic was that > the delay applied to the first 100 threads would force the dispatcher > executor's threads to all be consumed, thus blocking subsequent (non-delayed) > threads until the delay period has expired. However, threads 0-19 can finish > very quickly (their work can often be fulfilled in the time it takes to > execute a single {{getBlocks}} RPC, on the order of tens of milliseconds), > thus opening up 20 new slots in the executor, which are then consumed by > non-delayed threads 100-119, and so on. So, although 80 threads have had a > delay applied, the non-delay threads rush through in the 20 non-delay slots. > This problem gets even worse when the dispatcher threadpool size is less than > the max {{getBlocks}} QPS. For example, if the threadpool size is 10, _no > threads ever have a delay applied_, and the feature is not enabled at all. > This problem wasn't surfaced in the original JIRA because the test > incorrectly measured the period across which {{getBlocks}} RPCs were > distributed. The variables {{startGetBlocksTime}} and {{endGetBlocksTime}} > were used to track the time over which the {{getBlocks}} calls were made. > However, {{startGetBlocksTime}} was initialized at the time of creation of > the {{FSNameystem}} spy, which is before the mock DataNodes are started. Even > worse, the Balancer in this test takes 2 iterations to complete balancing the > cluster, so the time period {{endGetBlocksTime - startGetBlocksTime}} > actually represents: > {code} > (time to submit getBlocks RPCs) + (DataNode startup time) + (time for the > Dispatcher to complete an iteration of moving blocks) > {code} > Thus, the RPC QPS reported by the test is much lower than the RPC QPS seen > during the period of initial block fetching. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
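The rate limiting suggested above (capping getBlocks calls at a shared QPS inside NameNodeConnector rather than delaying a prefix of dispatcher threads) can be illustrated with a fixed-window counter. This is a hedged sketch of the idea only, not the actual Balancer patch; the 20 QPS figure is the hardcoded limit mentioned in the comment:

```java
// Hypothetical centralized limiter: every getBlocks caller asks how long to
// wait; once the per-second budget is spent, callers are pushed to the next
// window. Uses an injected clock value so behavior is deterministic.
public class GetBlocksRateLimiter {
    private final int maxQps;
    private long windowStartMs;
    private int usedInWindow;

    GetBlocksRateLimiter(int maxQps, long startMs) {
        this.maxQps = maxQps;
        this.windowStartMs = startMs;
    }

    // Returns the delay (ms) the caller should sleep before issuing getBlocks.
    synchronized long acquireDelayMs(long nowMs) {
        if (nowMs - windowStartMs >= 1000) {
            windowStartMs = nowMs;   // start a fresh one-second window
            usedInWindow = 0;
        }
        if (usedInWindow < maxQps) {
            usedInWindow++;
            return 0;                // budget available, go immediately
        }
        return windowStartMs + 1000 - nowMs; // wait for the next window
    }

    public static void main(String[] args) {
        GetBlocksRateLimiter limiter = new GetBlocksRateLimiter(20, 0L);
        int immediate = 0;
        for (int i = 0; i < 25; i++) {
            if (limiter.acquireDelayMs(0L) == 0) {
                immediate++;
            }
        }
        System.out.println(immediate); // 20 of the 25 calls proceed at once
    }
}
```

Because every dispatcher thread consults the same limiter, the cap holds regardless of the threadpool size, which avoids the prefix-delay flaw described above.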
[jira] [Resolved] (HDDS-2383) Closing open container via SCMCli throws exception
[ https://issues.apache.org/jira/browse/HDDS-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar resolved HDDS-2383. --- Resolution: Duplicate > Closing open container via SCMCli throws exception > -- > > Key: HDDS-2383 > URL: https://issues.apache.org/jira/browse/HDDS-2383 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Rajesh Balamohan >Assignee: Nanda kumar >Priority: Major > > This was observed in apache master branch. > Closing the container via {{SCMCli}} throws the following exception, though > the container ends up getting closed eventually. > {noformat} > 2019-10-30 02:44:41,794 INFO > org.apache.hadoop.hdds.scm.block.SCMBlockDeletingService: Block deletion > txnID mismatch in datanode 79626ba3-1957-46e5-a8b0-32d7f47fb801 for > containerID 6. Datanode delete txnID: 0, SCM txnID: 1004 > 2019-10-30 02:44:41,810 INFO > org.apache.hadoop.hdds.scm.container.IncrementalContainerReportHandler: > Moving container #4 to CLOSED state, datanode > 8885d4ba-228a-4fd2-bf5a-831f01594c6c{ip: 10.17.234.37, host: > vd1327.halxg.cloudera.com, networkLocation: /default-rack, certSerialId: > null} reported CLOSED replica. > 2019-10-30 02:44:41,826 INFO > org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer: Object type > container id 4 op close new stage complete > 2019-10-30 02:44:41,826 ERROR > org.apache.hadoop.hdds.scm.container.ContainerStateManager: Failed to update > container state #4, reason: invalid state transition from state: CLOSED upon > event: CLOSE. > 2019-10-30 02:44:41,826 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 6 on 9860, call Call#3 Retry#0 > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocol.submitRequest > from 10.17.234.32:45926 > org.apache.hadoop.hdds.scm.exceptions.SCMException: Failed to update > container state #4, reason: invalid state transition from state: CLOSED upon > event: CLOSE. 
> at > org.apache.hadoop.hdds.scm.container.ContainerStateManager.updateContainerState(ContainerStateManager.java:338) > at > org.apache.hadoop.hdds.scm.container.SCMContainerManager.updateContainerState(SCMContainerManager.java:326) > at > org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer.notifyObjectStageChange(SCMClientProtocolServer.java:388) > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB.notifyObjectStageChange(StorageContainerLocationProtocolServerSideTranslatorPB.java:303) > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB.processRequest(StorageContainerLocationProtocolServerSideTranslatorPB.java:158) > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB$$Lambda$152/2036820231.apply(Unknown > Source) > at > org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72) > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB.submitRequest(StorageContainerLocationProtocolServerSideTranslatorPB.java:112) > at > org.apache.hadoop.hdds.protocol.proto.StorageContainerLocationProtocolProtos$StorageContainerLocationProtocolService$2.callBlockingMethod(StorageContainerLocationProtocolProtos.java:30454) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > {noformat} -- This message was sent by Atlassian Jira 
(v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
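The "invalid state transition from state: CLOSED upon event: CLOSE" error above (the same symptom tracked in HDDS-1940 below) is a classic state-machine idempotency gap: the container report handler already moved the container to CLOSED before the client-driven CLOSE notification arrived, and the second CLOSE then throws. A minimal sketch of the fix idea — treating CLOSE on an already-CLOSED container as a harmless no-op — is below; the enum names and `ContainerStateMachine` class are a hypothetical simplification, not the actual `ContainerStateManager` code.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch: a container state machine that tolerates a
// duplicate CLOSE event instead of throwing on CLOSED -> CLOSE.
enum State { OPEN, CLOSING, CLOSED }
enum Event { FINALIZE, CLOSE }

class ContainerStateMachine {
    private static final Map<State, Map<Event, State>> TRANSITIONS =
        new EnumMap<>(State.class);
    static {
        TRANSITIONS.put(State.OPEN,
            Map.of(Event.FINALIZE, State.CLOSING, Event.CLOSE, State.CLOSED));
        TRANSITIONS.put(State.CLOSING, Map.of(Event.CLOSE, State.CLOSED));
        TRANSITIONS.put(State.CLOSED, Map.of()); // no outgoing transitions
    }

    State apply(State current, Event event) {
        State next = TRANSITIONS.get(current).get(event);
        if (next != null) {
            return next;
        }
        // Idempotency: the report handler may have raced the CLI and
        // already closed the container; succeed silently rather than
        // surfacing an SCMException for a duplicate CLOSE.
        if (current == State.CLOSED && event == Event.CLOSE) {
            return State.CLOSED;
        }
        throw new IllegalStateException(
            "invalid state transition from state: " + current
            + " upon event: " + event);
    }
}
```

With this shape, the scmcli-driven CLOSE that loses the race still returns success, matching the observation in both tickets that the container "ends up getting closed eventually" anyway.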
[jira] [Created] (HDFS-14988) HDFS should avoid read/write data from slow disks.
yimeng created HDFS-14988: - Summary: HDFS should avoid read/write data from slow disks. Key: HDFS-14988 URL: https://issues.apache.org/jira/browse/HDFS-14988 Project: Hadoop HDFS Issue Type: Improvement Components: block placement, datanode Affects Versions: 3.2.1, 3.1.1 Reporter: yimeng A slow disk causes real-time services (such as HBase) to slow down. Slow disk detection was added in HDFS-11461, but detected slow disks are only recorded as metrics. I hope to go further and act on the detected slow disks. In my view, slow disks can be factored into the read policy: if a block replica sits on a slow disk of a DataNode, the replica on another DataNode is selected instead. For writes, slow disks can be factored into the write policy: we can remove the slow disks from the candidate set and then select a disk to write to based on dfs.datanode.fsdataset.volume.choosing.policy.
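The write-path half of the proposal can be sketched as a thin filter in front of whatever volume choosing policy is configured. This is a hedged illustration only: `Volume`, `SlowDiskAwareChooser`, and the `BiFunction`-shaped delegate are simplified stand-ins, not Hadoop's real `dfs.datanode.fsdataset.volume.choosing.policy` plugin interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.BiFunction;

// Simplified stand-in for a DataNode storage volume.
class Volume {
    final String path;
    Volume(String path) { this.path = path; }
}

// Hypothetical wrapper: drop slow disks before delegating to the
// configured policy (round-robin, available-space, etc.).
class SlowDiskAwareChooser {
    private final Set<String> slowDiskPaths; // fed by slow-disk detection (cf. HDFS-11461 metrics)
    private final BiFunction<List<Volume>, Long, Volume> delegate;

    SlowDiskAwareChooser(Set<String> slowDiskPaths,
                         BiFunction<List<Volume>, Long, Volume> delegate) {
        this.slowDiskPaths = slowDiskPaths;
        this.delegate = delegate;
    }

    Volume chooseVolume(List<Volume> volumes, long blockSize) {
        List<Volume> healthy = new ArrayList<>();
        for (Volume v : volumes) {
            if (!slowDiskPaths.contains(v.path)) {
                healthy.add(v);
            }
        }
        // Fall back to the full list rather than failing the write
        // when every disk is currently flagged as slow.
        return delegate.apply(healthy.isEmpty() ? volumes : healthy, blockSize);
    }
}
```

The fallback branch matters: slow-disk flags are transient, so a datanode where all volumes are momentarily flagged should degrade to normal placement instead of rejecting writes.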
[jira] [Commented] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
[ https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973862#comment-16973862 ] Íñigo Goiri commented on HDFS-14983: [~aajisaka], true, that's missing. > RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option > --- > > Key: HDFS-14983 > URL: https://issues.apache.org/jira/browse/HDFS-14983 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Akira Ajisaka >Priority: Minor > > NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration > without restarting but DFSRouter cannot. It would be better for DFSRouter to > have such functionality to be compatible with NameNode.
[jira] [Updated] (HDDS-1940) Closing open container via scmcli gives false error message
[ https://issues.apache.org/jira/browse/HDDS-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDDS-1940: --- Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) [~adoroszlai] Thanks for review and testing the patch. [~nanda] Thanks for the contribution. I have committed this patch to the master branch. > Closing open container via scmcli gives false error message > --- > > Key: HDDS-1940 > URL: https://issues.apache.org/jira/browse/HDDS-1940 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Attila Doroszlai >Assignee: Nanda kumar >Priority: Minor > Labels: incompatibleChange, pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {{scmcli close}} prints an error message about invalid state transition after > it had successfully closed the container. > {code:title=CLI} > $ ozone scmcli info 2 > ... > Container State: OPEN > ... > $ ozone scmcli close 2 > ... > client-09830A377AA9->f27bf787-8711-41d4-b0fd-3ef50b5c076f: receive > RaftClientReply:client-09830A377AA9->f27bf787-8711-41d4-b0fd-3ef50b5c076f@group-7831D6F2EF1B, > cid=0, SUCCESS, logIndex=11, > commits[f27bf787-8711-41d4-b0fd-3ef50b5c076f:c12, > 37ba33fe-c9ed-4ac2-a6e5-57ce658168b4:c11, > feb68ba4-0a8a-4eda-9915-7dc090e5f46c:c11] > Failed to update container state #2, reason: invalid state transition from > state: CLOSED upon event: CLOSE. > $ ozone scmcli info 2 > ... > Container State: CLOSED > ... > {code} > {code:title=logs} > scm_1 | 2019-08-09 15:15:01 [IPC Server handler 1 on 9860] INFO > SCMClientProtocolServer:366 - Object type container id 1 op close new stage > begin > dn3_1 | 2019-08-09 15:15:02 [RatisApplyTransactionExecutor 1] INFO > Container:356 - Container 1 is closed with bcsId 3. > dn1_1 | 2019-08-09 15:15:02 [RatisApplyTransactionExecutor 1] INFO > Container:356 - Container 1 is closed with bcsId 3. 
> scm_1 | 2019-08-09 15:15:02 > [EventQueue-IncrementalContainerReportForIncrementalContainerReportHandler] > INFO IncrementalContainerReportHandler:176 - Moving container #1 to CLOSED > state, datanode feb68ba4-0a8a-4eda-9915-7dc090e5f46c{ip: 10.5.1.6, host: > ozone-static_dn3_1.ozone-static_net, networkLocation: /default-rack, > certSerialId: null} reported CLOSED replica. > dn2_1 | 2019-08-09 15:15:02 [RatisApplyTransactionExecutor 1] INFO > Container:356 - Container 1 is closed with bcsId 3. > scm_1 | 2019-08-09 15:15:02 [IPC Server handler 3 on 9860] INFO > SCMClientProtocolServer:366 - Object type container id 1 op close new stage > complete > scm_1 | 2019-08-09 15:15:02 [IPC Server handler 3 on 9860] ERROR > ContainerStateManager:335 - Failed to update container state #1, reason: > invalid state transition from state: CLOSED upon event: CLOSE. > scm_1 | 2019-08-09 15:15:02 [IPC Server handler 3 on 9860] INFO Server:2726 > - IPC Server handler 3 on 9860, call Call#3 Retry#0 > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocol.notifyObjectStageChange > from 10.5.0.71:57746 > scm_1 | org.apache.hadoop.hdds.scm.exceptions.SCMException: Failed to update > container state #1, reason: invalid state transition from state: CLOSED upon > event: CLOSE. 
> scm_1 | at > org.apache.hadoop.hdds.scm.container.ContainerStateManager.updateContainerState(ContainerStateManager.java:336) > scm_1 | at > org.apache.hadoop.hdds.scm.container.SCMContainerManager.updateContainerState(SCMContainerManager.java:312) > scm_1 | at > org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer.notifyObjectStageChange(SCMClientProtocolServer.java:379) > scm_1 | at > org.apache.hadoop.ozone.protocolPB.StorageContainerLocationProtocolServerSideTranslatorPB.notifyObjectStageChange(StorageContainerLocationProtocolServerSideTranslatorPB.java:219) > scm_1 | at > org.apache.hadoop.hdds.protocol.proto.StorageContainerLocationProtocolProtos$StorageContainerLocationProtocolService$2.callBlockingMethod(StorageContainerLocationProtocolProtos.java:16398) > scm_1 | at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > scm_1 | at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > scm_1 | at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > scm_1 | at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > scm_1 | at java.base/java.security.AccessController.doPrivileged(Native > Method) > scm_1 | at java.base/javax.security.auth.Subject.doAs(Subject.java:423) > scm_1 | at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > scm_1 | at
[jira] [Work logged] (HDDS-1940) Closing open container via scmcli gives false error message
[ https://issues.apache.org/jira/browse/HDDS-1940?focusedWorklogId=343059=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343059 ] ASF GitHub Bot logged work on HDDS-1940: Author: ASF GitHub Bot Created on: 14/Nov/19 01:26 Start Date: 14/Nov/19 01:26 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #153: HDDS-1940. Closing open container via scmcli gives false error message. URL: https://github.com/apache/hadoop-ozone/pull/153 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343059) Time Spent: 20m (was: 10m) > Closing open container via scmcli gives false error message > --- > > Key: HDDS-1940 > URL: https://issues.apache.org/jira/browse/HDDS-1940 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Attila Doroszlai >Assignee: Nanda kumar >Priority: Minor > Labels: incompatibleChange, pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > {{scmcli close}} prints an error message about invalid state transition after > it had successfully closed the container. > {code:title=CLI} > $ ozone scmcli info 2 > ... > Container State: OPEN > ... > $ ozone scmcli close 2 > ... > client-09830A377AA9->f27bf787-8711-41d4-b0fd-3ef50b5c076f: receive > RaftClientReply:client-09830A377AA9->f27bf787-8711-41d4-b0fd-3ef50b5c076f@group-7831D6F2EF1B, > cid=0, SUCCESS, logIndex=11, > commits[f27bf787-8711-41d4-b0fd-3ef50b5c076f:c12, > 37ba33fe-c9ed-4ac2-a6e5-57ce658168b4:c11, > feb68ba4-0a8a-4eda-9915-7dc090e5f46c:c11] > Failed to update container state #2, reason: invalid state transition from > state: CLOSED upon event: CLOSE. > $ ozone scmcli info 2 > ... > Container State: CLOSED > ... 
> {code} > {code:title=logs} > scm_1 | 2019-08-09 15:15:01 [IPC Server handler 1 on 9860] INFO > SCMClientProtocolServer:366 - Object type container id 1 op close new stage > begin > dn3_1 | 2019-08-09 15:15:02 [RatisApplyTransactionExecutor 1] INFO > Container:356 - Container 1 is closed with bcsId 3. > dn1_1 | 2019-08-09 15:15:02 [RatisApplyTransactionExecutor 1] INFO > Container:356 - Container 1 is closed with bcsId 3. > scm_1 | 2019-08-09 15:15:02 > [EventQueue-IncrementalContainerReportForIncrementalContainerReportHandler] > INFO IncrementalContainerReportHandler:176 - Moving container #1 to CLOSED > state, datanode feb68ba4-0a8a-4eda-9915-7dc090e5f46c{ip: 10.5.1.6, host: > ozone-static_dn3_1.ozone-static_net, networkLocation: /default-rack, > certSerialId: null} reported CLOSED replica. > dn2_1 | 2019-08-09 15:15:02 [RatisApplyTransactionExecutor 1] INFO > Container:356 - Container 1 is closed with bcsId 3. > scm_1 | 2019-08-09 15:15:02 [IPC Server handler 3 on 9860] INFO > SCMClientProtocolServer:366 - Object type container id 1 op close new stage > complete > scm_1 | 2019-08-09 15:15:02 [IPC Server handler 3 on 9860] ERROR > ContainerStateManager:335 - Failed to update container state #1, reason: > invalid state transition from state: CLOSED upon event: CLOSE. > scm_1 | 2019-08-09 15:15:02 [IPC Server handler 3 on 9860] INFO Server:2726 > - IPC Server handler 3 on 9860, call Call#3 Retry#0 > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocol.notifyObjectStageChange > from 10.5.0.71:57746 > scm_1 | org.apache.hadoop.hdds.scm.exceptions.SCMException: Failed to update > container state #1, reason: invalid state transition from state: CLOSED upon > event: CLOSE. 
> scm_1 | at > org.apache.hadoop.hdds.scm.container.ContainerStateManager.updateContainerState(ContainerStateManager.java:336) > scm_1 | at > org.apache.hadoop.hdds.scm.container.SCMContainerManager.updateContainerState(SCMContainerManager.java:312) > scm_1 | at > org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer.notifyObjectStageChange(SCMClientProtocolServer.java:379) > scm_1 | at > org.apache.hadoop.ozone.protocolPB.StorageContainerLocationProtocolServerSideTranslatorPB.notifyObjectStageChange(StorageContainerLocationProtocolServerSideTranslatorPB.java:219) > scm_1 | at > org.apache.hadoop.hdds.protocol.proto.StorageContainerLocationProtocolProtos$StorageContainerLocationProtocolService$2.callBlockingMethod(StorageContainerLocationProtocolProtos.java:16398) > scm_1 | at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
[jira] [Commented] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
[ https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973855#comment-16973855 ] Akira Ajisaka commented on HDFS-14983: -- HDFS-14545 is to refresh the configs of the NameNodes via DFSRouter. I'd like to refresh the configs of the DFSRouter itself. > RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option > --- > > Key: HDFS-14983 > URL: https://issues.apache.org/jira/browse/HDFS-14983 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Akira Ajisaka >Priority: Minor > > NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration > without restarting but DFSRouter cannot. It would be better for DFSRouter to > have such functionality to be compatible with NameNode.
[jira] [Work logged] (HDDS-2308) Switch to centos with the apache/ozone-build docker image
[ https://issues.apache.org/jira/browse/HDDS-2308?focusedWorklogId=343040=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343040 ] ASF GitHub Bot logged work on HDDS-2308: Author: ASF GitHub Bot Created on: 14/Nov/19 00:57 Start Date: 14/Nov/19 00:57 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #9: HDDS-2308. Switch to centos with the apache/ozone-build docker image URL: https://github.com/apache/hadoop-docker-ozone/pull/9 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343040) Time Spent: 20m (was: 10m) > Switch to centos with the apache/ozone-build docker image > - > > Key: HDDS-2308 > URL: https://issues.apache.org/jira/browse/HDDS-2308 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Attachments: hs_err_pid16346.log > > Time Spent: 20m > Remaining Estimate: 0h > > I realized multiple JVM crashes in the daily builds: > > {code:java} > ERROR] ExecutionException The forked VM terminated without properly saying > goodbye. VM crash or System.exit called? 
> > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter9018689154779946208.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire7569723928289175829tmp > surefire_947955725320624341206tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 > > > [ERROR] Crashed tests: > > > [ERROR] org.apache.hadoop.fs.ozone.contract.ITestOzoneContractRename > > > [ERROR] ExecutionException The forked VM terminated without properly > saying goodbye. VM crash or System.exit called? > > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter5429192218879128313.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire7227403571189445391tmp > surefire_1011197392458143645283tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 > > > [ERROR] Crashed tests: > > > [ERROR] org.apache.hadoop.fs.ozone.contract.ITestOzoneContractDistCp > > > [ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException: > ExecutionException The forked VM terminated without properly saying goodbye. > VM crash or System.exit called? 
> > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter1355604543311368443.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire3938612864214747736tmp > surefire_933162535733309260236tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 > > > [ERROR] ExecutionException The forked VM terminated without properly > saying goodbye. VM crash or System.exit called? > > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter9018689154779946208.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire7569723928289175829tmp > surefire_947955725320624341206tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 {code} > > Based on the crash log (uploaded) it's related to the
[jira] [Resolved] (HDDS-2308) Switch to centos with the apache/ozone-build docker image
[ https://issues.apache.org/jira/browse/HDDS-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2308. Fix Version/s: 0.5.0 Resolution: Fixed Committed to the build branch. > Switch to centos with the apache/ozone-build docker image > - > > Key: HDDS-2308 > URL: https://issues.apache.org/jira/browse/HDDS-2308 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Attachments: hs_err_pid16346.log > > Time Spent: 20m > Remaining Estimate: 0h > > I realized multiple JVM crashes in the daily builds: > > {code:java} > ERROR] ExecutionException The forked VM terminated without properly saying > goodbye. VM crash or System.exit called? > > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter9018689154779946208.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire7569723928289175829tmp > surefire_947955725320624341206tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 > > > [ERROR] Crashed tests: > > > [ERROR] org.apache.hadoop.fs.ozone.contract.ITestOzoneContractRename > > > [ERROR] ExecutionException The forked VM terminated without properly > saying goodbye. VM crash or System.exit called? 
> > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter5429192218879128313.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire7227403571189445391tmp > surefire_1011197392458143645283tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 > > > [ERROR] Crashed tests: > > > [ERROR] org.apache.hadoop.fs.ozone.contract.ITestOzoneContractDistCp > > > [ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException: > ExecutionException The forked VM terminated without properly saying goodbye. > VM crash or System.exit called? > > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter1355604543311368443.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire3938612864214747736tmp > surefire_933162535733309260236tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 > > > [ERROR] ExecutionException The forked VM terminated without properly > saying goodbye. VM crash or System.exit called? 
> > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter9018689154779946208.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire7569723928289175829tmp > surefire_947955725320624341206tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 {code} > > Based on the crash log (uploaded) it's related to the rocksdb JNI interface. > In the current ozone-build docker image (which provides the environment for > build) we use alpine where musl libc is used instead of the main glibc. I > think it would be more safe to use the same glibc what is used in production. > I tested with centos based docker image and it seems to be more stable. > Didn't see any more JVM crashes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14973) Balancer getBlocks RPC dispersal does not function properly
[ https://issues.apache.org/jira/browse/HDFS-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973838#comment-16973838 ] Hadoop QA commented on HDFS-14973: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 56s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 42s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 50s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 825 unchanged - 1 fixed = 826 total (was 826) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 23s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}100m 55s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}164m 15s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | | | hadoop.hdfs.tools.TestDFSZKFailoverController | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-14973 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985792/HDFS-14973.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux c687d780da15 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 73a386a | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28309/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | unit |
[jira] [Commented] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline
[ https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973836#comment-16973836 ] Bharat Viswanadham commented on HDDS-2356: -- Hi [~timmylicheng] Thanks for sharing the logs. I see completeMultipartUpload is called with 286 parts, and OM is throwing an InvalidPart error, but from the audit log I was not able to tell which part is missing in OM. (And I see 286 successful commit-multipart-upload calls for the key.) I think there is a chance we are hitting the scenario in HDDS-2477 here. (Not completely sure; this is my analysis after looking at the logs.) I have opened a couple of Jiras, HDDS-2477, HDDS-2471 and HDDS-2470, which will help in analyzing/debugging this issue. (Let's see whether HDDS-2477 fixes it or not.) > Multipart upload report errors while writing to ozone Ratis pipeline > > > Key: HDDS-2356 > URL: https://issues.apache.org/jira/browse/HDDS-2356 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.4.1 > Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM > on a separate VM >Reporter: Li Cheng >Assignee: Bharat Viswanadham >Priority: Blocker > Fix For: 0.5.0 > > Attachments: 2019-11-06_18_13_57_422_ERROR, hs_err_pid9340.log, > image-2019-10-31-18-56-56-177.png, om-audit-VM_50_210_centos.log, > om_audit_log_plc_1570863541668_9278.txt > > > Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say > it's VM0. > I use goofys as a fuse and enable the ozone S3 gateway to mount ozone to a path > on VM0, reading data from VM0's local disk and writing to the mount path. The > dataset has files of various sizes, from 0 bytes to GB-level, and it has > around 50,000 files. > The writing is slow (1GB in ~10 mins) and it stops after around 4GB. As I > look at the hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors > related to multipart upload. 
This error eventually causes the writing to > terminate and OM to be closed. > > Updated on 11/06/2019: > See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs > are in the attachment. > 2019-11-05 18:12:37,766 ERROR > org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest: > MultipartUpload Commit is failed for Key:./2 > 0191012/plc_1570863541668_9278 in Volume/Bucket > s325d55ad283aa400af464c76d713c07ad/ozone-test > NO_SUCH_MULTIPART_UPLOAD_ERROR > org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload > is with specified uploadId fcda8608-b431-48b7-8386- > 0a332f1a709a-103084683261641950 > at > org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:1 > 56) > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB. > java:217) > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132) > at > org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72) > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100) > at > org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > > Updated on 10/28/2019: > See MISMATCH_MULTIPART_LIST error. > > 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete > Multipart Upload Request for bucket: ozone-test, key: > 20191012/plc_1570863541668_9278 > MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: > Complete Multipart Upload Failed: volume: > s3c89e813c80ffcea9543004d57b2a1239 bucket: > ozone-test key: 20191012/plc_1570863541668_9278 > at > org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732) > at >
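The diagnosability gap Bharat mentions above (OM reports InvalidPart, but the logs do not say which part is missing) can be sketched as a validation step that names the offending part number in the error. This is a minimal illustration, with hypothetical names, not the actual Ozone Manager code:

```java
import java.util.List;
import java.util.Map;

// Sketch: complete-multipart validation that names the offending part in the
// error message instead of a bare InvalidPart. Illustrative names only.
public class PartCheckDemo {
  // committedParts: part number -> part name recorded at commit time.
  static String findInvalidPart(Map<Integer, String> committedParts,
                                List<Integer> requestedParts) {
    for (Integer partNumber : requestedParts) {
      if (!committedParts.containsKey(partNumber)) {
        // Naming the part here would let an operator answer "which part is
        // missing" directly from the audit log.
        return "INVALID_PART: part " + partNumber + " was never committed";
      }
    }
    return null; // every requested part was committed
  }

  public static void main(String[] args) {
    Map<Integer, String> committed = Map.of(1, "p1", 2, "p2");
    System.out.println(findInvalidPart(committed, List.of(1, 2, 3)));
  }
}
```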
[jira] [Comment Edited] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline
[ https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973836#comment-16973836 ] Bharat Viswanadham edited comment on HDDS-2356 at 11/14/19 12:41 AM: - Hi [~timmylicheng] Thanks for sharing the logs. I see completeMultipartUpload is called with 286 parts, and OM is throwing an error InvalidPart, but from an audit log, I was not able to know which part is missing in OM (because we don't print any such info in log/exception message). (And I see 286 success commit Multipart upload for the key). I think there might be a chance of the scenario HDDS-2477 we are hitting here. (Not completely sure, this is my analysis after looking up logs) I have opened couple of Jira's HDDS-2477 HDDS-2471 and HDDS-2470 which will help in analyzing/debugging this issue. (Let's see HDDS-2477 will fix it or not)
[jira] [Updated] (HDDS-2477) TableCache cleanup issue for OM non-HA
[ https://issues.apache.org/jira/browse/HDDS-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HDDS-2477: - Component/s: Ozone Manager > TableCache cleanup issue for OM non-HA > -- > > Key: HDDS-2477 > URL: https://issues.apache.org/jira/browse/HDDS-2477 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager > Reporter: Bharat Viswanadham > Assignee: Bharat Viswanadham > Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > In the OM non-HA case, the ratisTransactionLogIndex is generated by > OmProtocolServersideTranslatorPB.java, and validateAndUpdateCache is called from > multiple handler threads. So consider a case where one thread with index 10 has > added to the doubleBuffer (indices 0-9 have not been added yet). The DoubleBuffer > flush thread flushes and calls cleanup, which evicts all cache entries with epoch > index up to 10. It should not evict entries that were put into the cache later > (under smaller indices) and are still in the process of being flushed to DB. This > causes inconsistency for a few OM requests. > > > Example: > 4 threads committing 4 parts. > 1st thread - part 1 - ratis index - 3 > 2nd thread - part 2 - ratis index - 2 > 3rd thread - part 3 - ratis index - 1 > > The first thread got the lock and put OmMultipartInfo (with part 1) into the > doubleBuffer and cache, and cleanup is called to evict all cache entries with index > up to 3. In the meantime the 2nd and 3rd threads put parts 2 and 3 into > OmMultipartInfo in the cache and doubleBuffer, but the first thread's cleanup may > evict those entries, since it is called with index 3. > > Now when the 4th part upload comes: at commit Multipart upload time, when it gets > the multipartinfo it sees only part 1 in OmMultipartInfo, as the OmMultipartInfo > with parts 1, 2, 3 is still in the process of being committed to DB. So after the > 4th part upload completes, in DB and cache we will have parts 1 and 4 only; the > part 2 and 3 information is lost. > > So for the non-HA case, cleanup will instead be called with the list of epochs > that need to be cleaned up. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
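The cleanup race described above can be illustrated with a minimal sketch. The class and method names here are illustrative, not the actual Ozone TableCache API: `cleanupUpTo` models the range-based eviction that causes the bug, and `cleanupEpochs` models the proposed fix of passing the exact list of flushed epochs.

```java
import java.util.List;
import java.util.NavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Minimal sketch of the TableCache cleanup race. Illustrative names only.
public class EpochCacheDemo {
  static final class EpochCache {
    // epoch (ratis index) -> cached value, e.g. a committed part name
    private final NavigableMap<Long, String> entries = new ConcurrentSkipListMap<>();

    void put(long epoch, String value) { entries.put(epoch, value); }

    // Buggy cleanup: evicts every entry with epoch <= flushedEpoch, including
    // entries added by threads holding smaller indices whose requests are
    // still being flushed to DB.
    void cleanupUpTo(long flushedEpoch) { entries.headMap(flushedEpoch, true).clear(); }

    // Proposed fix: cleanup receives the exact list of flushed epochs and
    // removes only those entries.
    void cleanupEpochs(List<Long> flushedEpochs) { flushedEpochs.forEach(entries::remove); }

    int size() { return entries.size(); }
  }

  public static void main(String[] args) {
    // The thread at ratis index 3 (part 1) flushes first; parts at indices 1
    // and 2 are cached but not yet flushed.
    EpochCache buggy = new EpochCache();
    buggy.put(1, "part3"); buggy.put(2, "part2"); buggy.put(3, "part1");
    buggy.cleanupUpTo(3); // wipes the unflushed entries for parts 2 and 3 too

    EpochCache fixed = new EpochCache();
    fixed.put(1, "part3"); fixed.put(2, "part2"); fixed.put(3, "part1");
    fixed.cleanupEpochs(List.of(3L)); // only the flushed epoch is evicted

    System.out.println("range cleanup left: " + buggy.size()
        + ", epoch-list cleanup left: " + fixed.size());
  }
}
```

With the range-based cleanup all three entries vanish; with the epoch-list cleanup the two unflushed entries survive, which is the behavior the JIRA description calls for.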
[jira] [Updated] (HDDS-2477) TableCache cleanup issue for OM non-HA
[ https://issues.apache.org/jira/browse/HDDS-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HDDS-2477: - Status: Patch Available (was: Open)
[jira] [Work logged] (HDDS-2477) TableCache cleanup issue for OM non-HA
[ https://issues.apache.org/jira/browse/HDDS-2477?focusedWorklogId=343028=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343028 ] ASF GitHub Bot logged work on HDDS-2477: Author: ASF GitHub Bot Created on: 14/Nov/19 00:24 Start Date: 14/Nov/19 00:24 Worklog Time Spent: 10m Work Description: bharatviswa504 commented on pull request #159: HDDS-2477. TableCache cleanup issue for OM non-HA.: URL: https://github.com/apache/hadoop-ozone/pull/159 ## What changes were proposed in this pull request? In OM in non-HA case, the ratisTransactionLogIndex is generated by OmProtocolServersideTranslatorPB.java. And in OM non-HA validateAndUpdateCache is called from multipleHandler threads. So think of a case where one thread which has an index - 10 has added to doubleBuffer. (0-9 still have not added). DoubleBuffer flush thread flushes and call cleanup. (So, now cleanup will go and cleanup all cache entries with less than 10 epoch) This should not have cleanup those which might have put in to cache later and which are in process of flush to DB. This will cause inconsitency for few OM requests. Example: 4 threads Committing 4 parts. 1st thread - part 1 - ratis Index - 3 2nd thread - part 2 - ratis index - 2 3rd thread - part3 - ratis index - 1 First thread got lock, and put in to doubleBuffer and cache with OmMultipartInfo (with part1). And cleanup is called to cleanup all entries in cache with less than 3. In the mean time 2nd thread and 1st thread put 2,3 parts in to OmMultipartInfo in to Cache and doubleBuffer. But first thread might cleanup those entries, as it is called with index 3 for cleanup. Now when the 4th part upload came -> when it is commit Multipart upload when it gets multipartinfo it get Only part1 in OmMultipartInfo, as the OmMultipartInfo (with 1,2,3 is still in process of committing to DB). So now after 4th part upload is complete in DB and Cache we will have 1,4 parts only. We will miss part2,3 information. 
So for non-HA case cleanup will be called with list of epochs that need to be cleanedup. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2477 ## How was this patch tested? Added UT. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343028) Remaining Estimate: 0h Time Spent: 10m
[jira] [Updated] (HDDS-2477) TableCache cleanup issue for OM non-HA
[ https://issues.apache.org/jira/browse/HDDS-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2477: - Labels: pull-request-available (was: )
[jira] [Created] (HDDS-2477) TableCache cleanup issue for OM non-HA
Bharat Viswanadham created HDDS-2477: Summary: TableCache cleanup issue for OM non-HA Key: HDDS-2477 URL: https://issues.apache.org/jira/browse/HDDS-2477 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Bharat Viswanadham Assignee: Bharat Viswanadham
[jira] [Updated] (HDDS-2469) Avoid changing client-side key metadata
[ https://issues.apache.org/jira/browse/HDDS-2469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Chitlangia updated HDDS-2469: Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) Thanks [~adoroszlai] for the contribution, [~aengineer] for review. > Avoid changing client-side key metadata > --- > > Key: HDDS-2469 > URL: https://issues.apache.org/jira/browse/HDDS-2469 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Ozone RPC client should not change input map from client while creating keys. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
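The one-line description above ("Ozone RPC client should not change input map from client while creating keys") corresponds to the defensive-copy pattern: copy the caller's map before adding client-side entries. A minimal sketch with illustrative names, not the actual Ozone client code:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the defensive-copy fix: never mutate the metadata map the caller
// passed in. Names are illustrative, not the Ozone RPC client API.
public class KeyMetadataDemo {
  static Map<String, String> buildKeyMetadata(Map<String, String> userMetadata) {
    // Copy first, then add client-side entries to the copy only.
    Map<String, String> merged = new HashMap<>(userMetadata);
    merged.put("client-side-flag", "true"); // hypothetical client-added entry
    return merged;
  }

  public static void main(String[] args) {
    Map<String, String> user = new HashMap<>();
    user.put("owner", "alice");
    Map<String, String> merged = buildKeyMetadata(user);
    // The caller's map is untouched; only the copy grew.
    System.out.println(user.size() + " -> " + merged.size());
  }
}
```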
[jira] [Work logged] (HDDS-2469) Avoid changing client-side key metadata
[ https://issues.apache.org/jira/browse/HDDS-2469?focusedWorklogId=343012=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343012 ] ASF GitHub Bot logged work on HDDS-2469: Author: ASF GitHub Bot Created on: 13/Nov/19 23:47 Start Date: 13/Nov/19 23:47 Worklog Time Spent: 10m Work Description: dineshchitlangia commented on pull request #154: HDDS-2469. Avoid changing client-side key metadata URL: https://github.com/apache/hadoop-ozone/pull/154 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343012) Time Spent: 20m (was: 10m) > Avoid changing client-side key metadata > --- > > Key: HDDS-2469 > URL: https://issues.apache.org/jira/browse/HDDS-2469 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Ozone RPC client should not change input map from client while creating keys. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1889) Add support for verifying multiline log entry
[ https://issues.apache.org/jira/browse/HDDS-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Pogde updated HDDS-1889: - Attachment: image.png > Add support for verifying multiline log entry > - > > Key: HDDS-1889 > URL: https://issues.apache.org/jira/browse/HDDS-1889 > Project: Hadoop Distributed Data Store > Issue Type: Test > Components: test >Reporter: Dinesh Chitlangia >Priority: Major > Labels: newbie > Attachments: image.png > > > This jira aims to test the failure scenario where a multi-line stack trace > will be added in the audit log. Currently, for test assumes that even in > failure scenario we don't have multi-line log entry. > Example: > {code:java} > private static final AuditMessage READ_FAIL_MSG = > new AuditMessage.Builder() > .setUser("john") > .atIp("192.168.0.1") > .forOperation(DummyAction.READ_VOLUME.name()) > .withParams(PARAMS) > .withResult(FAILURE) > .withException(null).build(); > {code} > Therefore in verifyLog() we only compare for first line of the log file with > the expected message. > The test would fail if in future someone were to create a scenario with > multi-line log entry. > 1. Update READ_FAIL_MSG so that it has multiple lines of Exception stack > trace. > This is what multi-line log entry could look like: > {code:java} > ERROR | OMAudit | user=dchitlangia | ip=127.0.0.1 | op=GET_ACL > {volume=volume80100, bucket=bucket83878, key=null, aclType=CREATE, > resourceType=volume, storeType=ozone} | ret=FAILURE > org.apache.hadoop.ozone.om.exceptions.OMException: User dchitlangia doesn't > have CREATE permission to access volume > at org.apache.hadoop.ozone.om.OzoneManager.checkAcls(OzoneManager.java:1809) > ~[classes/:?] > at org.apache.hadoop.ozone.om.OzoneManager.checkAcls(OzoneManager.java:1769) > ~[classes/:?] > at > org.apache.hadoop.ozone.om.OzoneManager.createBucket(OzoneManager.java:2092) > ~[classes/:?] 
> at > org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.createBucket(OzoneManagerRequestHandler.java:526) > ~[classes/:?] > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handle(OzoneManagerRequestHandler.java:185) > ~[classes/:?] > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.java:192) > ~[classes/:?] > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:110) > ~[classes/:?] > at > org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java) > ~[classes/:?] > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > ~[hadoop-common-3.2.0.jar:?] > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > ~[hadoop-common-3.2.0.jar:?] > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > ~[hadoop-common-3.2.0.jar:?] > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > ~[hadoop-common-3.2.0.jar:?] > at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_144] > at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_144] > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > ~[hadoop-common-3.2.0.jar:?] > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > ~[hadoop-common-3.2.0.jar:?] > {code} > 2. Update verifyLog method to accept variable number of arguments. > 3. Update the assertion so that it compares beyond the first line when the > expected is a multi-line log entry. 
> {code:java} > assertTrue(expected.equalsIgnoreCase(lines.get(0))); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
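Points 2 and 3 above (a varargs verifyLog that compares beyond the first line) could look like the following sketch. The signature is illustrative, not the actual test utility in the Ozone codebase:

```java
import java.util.List;

// Sketch: verifyLog accepts a variable number of expected lines and compares
// each captured line, instead of only asserting on lines.get(0).
public class MultilineVerifyDemo {
  static boolean verifyLog(List<String> actualLines, String... expected) {
    if (actualLines.size() < expected.length) {
      return false; // fewer captured lines than expected
    }
    for (int i = 0; i < expected.length; i++) {
      // Compare line-by-line, so a multi-line stack trace entry is checked in full.
      if (!expected[i].equalsIgnoreCase(actualLines.get(i))) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<String> captured = List.of(
        "ERROR | OMAudit | user=dchitlangia | op=GET_ACL | ret=FAILURE",
        "org.apache.hadoop.ozone.om.exceptions.OMException: ...");
    System.out.println(verifyLog(captured,
        "ERROR | OMAudit | user=dchitlangia | op=GET_ACL | ret=FAILURE",
        "org.apache.hadoop.ozone.om.exceptions.OMException: ..."));
  }
}
```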
[jira] [Work logged] (HDDS-2472) Use try-with-resources while creating FlushOptions in RDBStore.
[ https://issues.apache.org/jira/browse/HDDS-2472?focusedWorklogId=342996=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342996 ] ASF GitHub Bot logged work on HDDS-2472: Author: ASF GitHub Bot Created on: 13/Nov/19 23:27 Start Date: 13/Nov/19 23:27 Worklog Time Spent: 10m Work Description: avijayanhwx commented on pull request #158: HDDS-2472. Use try-with-resources while creating FlushOptions in RDBS… URL: https://github.com/apache/hadoop-ozone/pull/158 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 342996) Time Spent: 20m (was: 10m) > Use try-with-resources while creating FlushOptions in RDBStore. > --- > > Key: HDDS-2472 > URL: https://issues.apache.org/jira/browse/HDDS-2472 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Link to the sonar issue flag - > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-zwKcVY8lQ4ZsJ4=AW5md-zwKcVY8lQ4ZsJ4. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2472) Use try-with-resources while creating FlushOptions in RDBStore.
[ https://issues.apache.org/jira/browse/HDDS-2472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2472: - Labels: pull-request-available (was: ) > Use try-with-resources while creating FlushOptions in RDBStore. > --- > > Key: HDDS-2472 > URL: https://issues.apache.org/jira/browse/HDDS-2472 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > > Link to the sonar issue flag - > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-zwKcVY8lQ4ZsJ4=AW5md-zwKcVY8lQ4ZsJ4. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2472) Use try-with-resources while creating FlushOptions in RDBStore.
[ https://issues.apache.org/jira/browse/HDDS-2472?focusedWorklogId=342995=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342995 ] ASF GitHub Bot logged work on HDDS-2472: Author: ASF GitHub Bot Created on: 13/Nov/19 23:26 Start Date: 13/Nov/19 23:26 Worklog Time Spent: 10m Work Description: avijayanhwx commented on pull request #158: HDDS-2472. Use try-with-resources while creating FlushOptions in RDBS… URL: https://github.com/apache/hadoop-ozone/pull/158 …tore. ## What changes were proposed in this pull request? Use try-with-resources while creating FlushOptions in RDBStore class. Remove code duplication in getCheckpoint method. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2472 ## How was this patch tested? Unit tested. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 342995) Remaining Estimate: 0h Time Spent: 10m > Use try-with-resources while creating FlushOptions in RDBStore. > --- > > Key: HDDS-2472 > URL: https://issues.apache.org/jira/browse/HDDS-2472 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Link to the sonar issue flag - > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-zwKcVY8lQ4ZsJ4=AW5md-zwKcVY8lQ4ZsJ4. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
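The pattern this patch applies is standard try-with-resources on the native-backed FlushOptions object. The sketch below uses a stand-in AutoCloseable so it runs without the RocksDB dependency; it is not the actual org.rocksdb.FlushOptions or RDBStore code:

```java
// Sketch of the try-with-resources change: the native-backed options object
// must be released after use. Stand-in class, not org.rocksdb.FlushOptions.
public class FlushOptionsDemo {
  static boolean closed = false;

  static final class FlushOptions implements AutoCloseable {
    FlushOptions setWaitForFlush(boolean wait) { return this; }
    @Override public void close() { closed = true; }
  }

  static void flushDb() {
    // The resource is closed automatically, even if the flush body throws.
    try (FlushOptions opts = new FlushOptions().setWaitForFlush(true)) {
      // db.flush(opts) would go here in the real RDBStore code.
    }
  }

  public static void main(String[] args) {
    flushDb();
    System.out.println("FlushOptions closed: " + closed);
  }
}
```

Without the try-with-resources block, an early return or exception in the flush path would leak the native handle, which is what the Sonar flag linked above points at.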
[jira] [Updated] (HDDS-2474) Remove OzoneClient exception Precondition check
[ https://issues.apache.org/jira/browse/HDDS-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanisha Koneru updated HDDS-2474: - Status: Patch Available (was: Open) > Remove OzoneClient exception Precondition check > --- > > Key: HDDS-2474 > URL: https://issues.apache.org/jira/browse/HDDS-2474 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > If RaftCleintReply encounters an exception other than NotLeaderException, > NotReplicatedException, StateMachineException or LeaderNotReady, then it sets > success to false but there is no exception set. This causes a Precondition > check failure in XceiverClientRatis which expects that there should be an > exception if success=false. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
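The failure mode described above (success=false with no exception set) suggests handling the null-exception case explicitly rather than asserting it cannot happen. A hedged sketch with illustrative names, not the actual XceiverClientRatis code:

```java
import java.io.IOException;

// Sketch: handle a failed reply whose exception may be null, instead of a
// Precondition that assumes a failed reply always carries an exception.
public class ReplyCheckDemo {
  static void checkReply(boolean success, Throwable cause) throws IOException {
    if (success) {
      return;
    }
    if (cause != null) {
      throw new IOException("Ratis request failed", cause);
    }
    // Previously a Precondition here assumed cause != null and crashed when
    // Ratis set success=false without setting an exception.
    throw new IOException("Ratis request failed with no exception set");
  }

  public static void main(String[] args) {
    try {
      checkReply(false, null); // the case that used to trip the Precondition
    } catch (IOException e) {
      System.out.println("handled: " + e.getMessage());
    }
  }
}
```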
[jira] [Updated] (HDDS-2474) Remove OzoneClient exception Precondition check
[ https://issues.apache.org/jira/browse/HDDS-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2474: - Labels: pull-request-available (was: ) > Remove OzoneClient exception Precondition check > --- > > Key: HDDS-2474 > URL: https://issues.apache.org/jira/browse/HDDS-2474 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Major > Labels: pull-request-available > > If RaftCleintReply encounters an exception other than NotLeaderException, > NotReplicatedException, StateMachineException or LeaderNotReady, then it sets > success to false but there is no exception set. This causes a Precondition > check failure in XceiverClientRatis which expects that there should be an > exception if success=false. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2474) Remove OzoneClient exception Precondition check
[ https://issues.apache.org/jira/browse/HDDS-2474?focusedWorklogId=342994=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342994 ] ASF GitHub Bot logged work on HDDS-2474: Author: ASF GitHub Bot Created on: 13/Nov/19 23:24 Start Date: 13/Nov/19 23:24 Worklog Time Spent: 10m Work Description: hanishakoneru commented on pull request #157: HDDS-2474. Remove OzoneClient exception Precondition check. URL: https://github.com/apache/hadoop-ozone/pull/157 ## What changes were proposed in this pull request? If RaftClientReply encounters an exception other than NotLeaderException, NotReplicatedException, StateMachineException or LeaderNotReady, then it sets success to false but there is no exception set. This causes a Precondition check failure in XceiverClientRatis which expects that there should be an exception if success=false. This Jira proposes to remove the Precondition check. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2474 ## How was this patch tested? This patch does not require a unit test. We are just removing a Precondition check. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 342994) Remaining Estimate: 0h Time Spent: 10m > Remove OzoneClient exception Precondition check > --- > > Key: HDDS-2474 > URL: https://issues.apache.org/jira/browse/HDDS-2474 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > If RaftClientReply encounters an exception other than NotLeaderException, > NotReplicatedException, StateMachineException or LeaderNotReady, then it sets > success to false but there is no exception set. 
This causes a Precondition > check failure in XceiverClientRatis which expects that there should be an > exception if success=false.
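The failure mode described in HDDS-2474 can be sketched with a toy model. All names below are hypothetical stand-ins (not the actual Ratis or Ozone classes): a reply may carry success=false with no exception attached, so a strict Precondition-style check fails, whereas the tolerant variant substitutes a generic cause.

```java
// Toy model of the described failure mode. ReplySketch stands in for a
// RaftClientReply-like object; the real Ratis/Ozone classes differ.
final class ReplySketch {
    final boolean success;
    final Exception exception; // may be null even when success == false

    ReplySketch(boolean success, Exception exception) {
        this.success = success;
        this.exception = exception;
    }

    // Old behavior: a Precondition-style check assumes that !success
    // implies a non-null exception, which breaks for "other" exception types.
    static Exception strictCheck(ReplySketch r) {
        if (!r.success && r.exception == null) {
            throw new IllegalStateException("success=false but no exception set");
        }
        return r.exception;
    }

    // After removing the check: fall back to a generic cause instead.
    static Exception tolerant(ReplySketch r) {
        if (!r.success && r.exception == null) {
            return new java.io.IOException("request failed without a reported cause");
        }
        return r.exception;
    }
}
```

This only illustrates why removing the check is safe; the patch itself simply deletes the assertion.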
[jira] [Created] (HDDS-2476) Share more code between metadata and data scanners
Attila Doroszlai created HDDS-2476: -- Summary: Share more code between metadata and data scanners Key: HDDS-2476 URL: https://issues.apache.org/jira/browse/HDDS-2476 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Datanode Reporter: Attila Doroszlai There are several duplicated / similar pieces of code in metadata and data scanners. More code should be reused. Examples: # ContainerDataScrubberMetrics and ContainerMetadataScrubberMetrics have 3 common metrics # lifecycle of ContainerMetadataScanner and ContainerDataScanner (main loop, iteration, metrics processing, shutdown)
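One way to realize the proposed sharing is an abstract base class that owns the common lifecycle (main loop, per-iteration metrics update, shutdown hook) and leaves only the scan step abstract. This is a sketch under assumed names, not the actual Ozone scanner classes:

```java
// Sketch of factoring the shared scanner lifecycle into a base class.
// Names are hypothetical; ContainerMetadataScanner and ContainerDataScanner
// would each subclass something like this.
abstract class ScannerBaseSketch {
    private volatile boolean stopRequested = false;
    int iterations = 0; // stands in for the shared metrics counters

    // Shared main loop: iterate, record metrics, honor shutdown requests.
    public final void runLoop(int maxIterations) {
        while (!stopRequested && iterations < maxIterations) {
            scanIteration(); // metadata- vs. data-specific work lives here
            iterations++;    // shared metrics update
        }
        shutdown();          // shared cleanup hook
    }

    public void stop() { stopRequested = true; }

    protected abstract void scanIteration();

    protected void shutdown() {}
}
```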
[jira] [Created] (HDDS-2475) Unregister ContainerMetadataScrubberMetrics on thread exit
Attila Doroszlai created HDDS-2475: -- Summary: Unregister ContainerMetadataScrubberMetrics on thread exit Key: HDDS-2475 URL: https://issues.apache.org/jira/browse/HDDS-2475 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Datanode Reporter: Attila Doroszlai {{ContainerMetadataScanner}} thread should call {{ContainerMetadataScrubberMetrics#unregister}} before exiting.
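A minimal sketch of this kind of fix (hypothetical names, not the actual Hadoop metrics API): wrap the thread body in try/finally so unregistration runs on every exit path, including an exception thrown from the scan loop.

```java
// Sketch: guarantee metrics unregistration on thread exit.
final class MetricsLifecycleSketch {
    static boolean registered = false;

    static void register()   { registered = true;  }
    static void unregister() { registered = false; }

    // The scanner thread's body: the finally block runs even if the
    // loop exits via an exception, so the metrics are never leaked.
    static void runScanner(Runnable loop) {
        register();
        try {
            loop.run();
        } finally {
            unregister();
        }
    }
}
```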
[jira] [Commented] (HDFS-14283) DFSInputStream to prefer cached replica
[ https://issues.apache.org/jira/browse/HDFS-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973771#comment-16973771 ] Siyao Meng commented on HDFS-14283: --- [~leosun08] Code looks good. Would you add test cases in {{TestDFSInputStream}} to demonstrate: 1) when client sets {{dfs.client.read.use.cache.priority}} to true, a cached block will be used; 2) when the config is set to false, it won't affect the current behavior of the client without the patch. Thanks! > DFSInputStream to prefer cached replica > --- > > Key: HDFS-14283 > URL: https://issues.apache.org/jira/browse/HDFS-14283 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.6.0 > Environment: HDFS Caching >Reporter: Wei-Chiu Chuang >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14283.001.patch, HDFS-14283.002.patch, > HDFS-14283.003.patch, HDFS-14283.004.patch, HDFS-14283.005.patch > > > HDFS Caching offers performance benefits. However, currently NameNode does > not treat cached replica with higher priority, so HDFS caching is only useful > when cache replication = 3, that is to say, all replicas are cached in > memory, so that a client doesn't randomly pick an uncached replica. > HDFS-6846 proposed to let NameNode give higher priority to cached replica. > Changing a logic in NameNode is always tricky so that didn't get much > traction. Here I propose a different approach: let client (DFSInputStream) > prefer cached replica. > A {{LocatedBlock}} object already contains cached replica location so a > client has the needed information. I think we can change > {{DFSInputStream#getBestNodeDNAddrPair()}} for this purpose.
[jira] [Comment Edited] (HDFS-14283) DFSInputStream to prefer cached replica
[ https://issues.apache.org/jira/browse/HDFS-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973771#comment-16973771 ] Siyao Meng edited comment on HDFS-14283 at 11/13/19 10:51 PM: -- [~leosun08] Thanks! Code looks good. Would you add test cases in {{TestDFSInputStream}} to demonstrate: 1) when client sets {{dfs.client.read.use.cache.priority}} to true, a cached block will be used; 2) when the config is set to false, it won't affect the current behavior of the client without the patch. And would you address the checkstyle? was (Author: smeng): [~leosun08] Code looks good. Would you add test cases in {{TestDFSInputStream}} to demonstrate: 1) when client sets {{dfs.client.read.use.cache.priority}} to true, a cached block will be used; 2) when the config is set to false, it won't affect the current behavior of the client without the patch. Thanks! > DFSInputStream to prefer cached replica > --- > > Key: HDFS-14283 > URL: https://issues.apache.org/jira/browse/HDFS-14283 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.6.0 > Environment: HDFS Caching >Reporter: Wei-Chiu Chuang >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14283.001.patch, HDFS-14283.002.patch, > HDFS-14283.003.patch, HDFS-14283.004.patch, HDFS-14283.005.patch > > > HDFS Caching offers performance benefits. However, currently NameNode does > not treat cached replica with higher priority, so HDFS caching is only useful > when cache replication = 3, that is to say, all replicas are cached in > memory, so that a client doesn't randomly pick an uncached replica. > HDFS-6846 proposed to let NameNode give higher priority to cached replica. > Changing a logic in NameNode is always tricky so that didn't get much > traction. Here I propose a different approach: let client (DFSInputStream) > prefer cached replica. > A {{LocatedBlock}} object already contains cached replica location so a > client has the needed information. 
I think we can change > {{DFSInputStream#getBestNodeDNAddrPair()}} for this purpose.
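The client-side selection being proposed can be sketched in simplified form. This is a toy model (plain strings instead of DatanodeInfo, a boolean standing in for the dfs.client.read.use.cache.priority config), not the actual DFSInputStream change:

```java
import java.util.List;

// Sketch of the proposed selection: when preferCached is on, scan the
// replica list for one that is also in the cached-locations list (which a
// LocatedBlock already carries); otherwise keep the existing behavior of
// taking the first usable replica.
final class ReplicaPickSketch {
    static String pick(List<String> replicas, List<String> cached, boolean preferCached) {
        if (preferCached) {
            for (String dn : replicas) {
                if (cached.contains(dn)) {
                    return dn; // cached replica wins
                }
            }
        }
        return replicas.isEmpty() ? null : replicas.get(0); // unchanged fallback
    }
}
```

With the flag off, behavior is identical to the fallback path, which matches the reviewer's second requested test case.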
[jira] [Created] (HDDS-2474) Remove OzoneClient exception Precondition check
Hanisha Koneru created HDDS-2474: Summary: Remove OzoneClient exception Precondition check Key: HDDS-2474 URL: https://issues.apache.org/jira/browse/HDDS-2474 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Hanisha Koneru Assignee: Hanisha Koneru If RaftClientReply encounters an exception other than NotLeaderException, NotReplicatedException, StateMachineException or LeaderNotReady, then it sets success to false but there is no exception set. This causes a Precondition check failure in XceiverClientRatis which expects that there should be an exception if success=false.
[jira] [Resolved] (HDDS-1847) Datanode Kerberos principal and keytab config key looks inconsistent
[ https://issues.apache.org/jira/browse/HDDS-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-1847. Fix Version/s: 0.5.0 Resolution: Fixed [~chris.t...@gmail.com] Thanks for the contribution. [~elek] Thanks for retesting this patch. I have committed this change to the master branch. > Datanode Kerberos principal and keytab config key looks inconsistent > > > Key: HDDS-1847 > URL: https://issues.apache.org/jira/browse/HDDS-1847 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Eric Yang >Assignee: Chris Teoh >Priority: Major > Labels: newbie, pull-request-available > Fix For: 0.5.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Ozone Kerberos configuration can be very confusing: > | config name | Description | > | hdds.scm.kerberos.principal | SCM service principal | > | hdds.scm.kerberos.keytab.file | SCM service keytab file | > | ozone.om.kerberos.principal | Ozone Manager service principal | > | ozone.om.kerberos.keytab.file | Ozone Manager keytab file | > | hdds.scm.http.kerberos.principal | SCM service spnego principal | > | hdds.scm.http.kerberos.keytab.file | SCM service spnego keytab file | > | ozone.om.http.kerberos.principal | Ozone Manager spnego principal | > | ozone.om.http.kerberos.keytab.file | Ozone Manager spnego keytab file | > | hdds.datanode.http.kerberos.keytab | Datanode spnego keytab file | > | hdds.datanode.http.kerberos.principal | Datanode spnego principal | > | dfs.datanode.kerberos.principal | Datanode service principal | > | dfs.datanode.keytab.file | Datanode service keytab file | > The prefixes are very different for each of the datanode configurations. It > would be nice to have some consistency for datanode.
[jira] [Updated] (HDDS-2473) Fix code reliability issues found by Sonar in Ozone Recon module.
[ https://issues.apache.org/jira/browse/HDDS-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2473: Description: sonarcloud.io has flagged a number of code reliability issues in Ozone recon (https://sonarcloud.io/code?id=hadoop-ozone=hadoop-ozone%3Ahadoop-ozone%2Frecon%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fozone%2Frecon). Following issues will be triaged / fixed. * Double Brace Initialization should not be used * Resources should be closed * InterruptedException should not be ignored > Fix code reliability issues found by Sonar in Ozone Recon module. > - > > Key: HDDS-2473 > URL: https://issues.apache.org/jira/browse/HDDS-2473 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Recon >Affects Versions: 0.5.0 >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Fix For: 0.5.0 > > > sonarcloud.io has flagged a number of code reliability issues in Ozone recon > (https://sonarcloud.io/code?id=hadoop-ozone=hadoop-ozone%3Ahadoop-ozone%2Frecon%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fozone%2Frecon). > Following issues will be triaged / fixed. > * Double Brace Initialization should not be used > * Resources should be closed > * InterruptedException should not be ignored
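The three Sonar rules listed for HDDS-2473 can each be illustrated with a generic before/after pattern. These are standalone examples of the rules, not the actual Recon code paths:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

// Generic illustrations of the three Sonar reliability rules.
final class SonarFixSketch {
    // 1. Avoid double-brace initialization (it creates an anonymous
    //    subclass holding an outer-class reference); build explicitly.
    static Map<String, Integer> buildMap() {
        Map<String, Integer> m = new HashMap<>();
        m.put("containers", 1);
        return m;
    }

    // 2. Close resources deterministically with try-with-resources.
    static int firstChar(String s) {
        try (StringReader r = new StringReader(s)) {
            return r.read();
        } catch (IOException e) {
            return -1;
        }
    }

    // 3. Never swallow InterruptedException; restore the interrupt flag
    //    so callers up the stack can still observe the interruption.
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```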
[jira] [Work logged] (HDDS-1847) Datanode Kerberos principal and keytab config key looks inconsistent
[ https://issues.apache.org/jira/browse/HDDS-1847?focusedWorklogId=342947=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342947 ] ASF GitHub Bot logged work on HDDS-1847: Author: ASF GitHub Bot Created on: 13/Nov/19 22:01 Start Date: 13/Nov/19 22:01 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #115: HDDS-1847: Datanode Kerberos principal and keytab config key looks inconsistent URL: https://github.com/apache/hadoop-ozone/pull/115 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 342947) Time Spent: 1h 40m (was: 1.5h) > Datanode Kerberos principal and keytab config key looks inconsistent > > > Key: HDDS-1847 > URL: https://issues.apache.org/jira/browse/HDDS-1847 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Eric Yang >Assignee: Chris Teoh >Priority: Major > Labels: newbie, pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > Ozone Kerberos configuration can be very confusing: > | config name | Description | > | hdds.scm.kerberos.principal | SCM service principal | > | hdds.scm.kerberos.keytab.file | SCM service keytab file | > | ozone.om.kerberos.principal | Ozone Manager service principal | > | ozone.om.kerberos.keytab.file | Ozone Manager keytab file | > | hdds.scm.http.kerberos.principal | SCM service spnego principal | > | hdds.scm.http.kerberos.keytab.file | SCM service spnego keytab file | > | ozone.om.http.kerberos.principal | Ozone Manager spnego principal | > | ozone.om.http.kerberos.keytab.file | Ozone Manager spnego keytab file | > | hdds.datanode.http.kerberos.keytab | Datanode spnego keytab file | > | hdds.datanode.http.kerberos.principal | Datanode spnego principal | > | dfs.datanode.kerberos.principal | 
Datanode service principal | > | dfs.datanode.keytab.file | Datanode service keytab file | > The prefixes are very different for each of the datanode configurations. It > would be nice to have some consistency for datanode.
[jira] [Work logged] (HDDS-2364) Add a OM metrics to find the false positive rate for the keyMayExist
[ https://issues.apache.org/jira/browse/HDDS-2364?focusedWorklogId=342941=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342941 ] ASF GitHub Bot logged work on HDDS-2364: Author: ASF GitHub Bot Created on: 13/Nov/19 21:59 Start Date: 13/Nov/19 21:59 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #101: HDDS-2364. Add OM metrics to find the false positive rate for the keyMayExist. URL: https://github.com/apache/hadoop-ozone/pull/101 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 342941) Time Spent: 20m (was: 10m) > Add a OM metrics to find the false positive rate for the keyMayExist > > > Key: HDDS-2364 > URL: https://issues.apache.org/jira/browse/HDDS-2364 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Mukul Kumar Singh >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Add a OM metrics to find the false positive rate for the keyMayExist.
[jira] [Resolved] (HDDS-2364) Add a OM metrics to find the false positive rate for the keyMayExist
[ https://issues.apache.org/jira/browse/HDDS-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2364. Fix Version/s: 0.5.0 Resolution: Fixed [~avijayan] Thanks for the contribution. [~bharat] Thanks for the reviews. I have committed this to the master branch. > Add a OM metrics to find the false positive rate for the keyMayExist > > > Key: HDDS-2364 > URL: https://issues.apache.org/jira/browse/HDDS-2364 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Mukul Kumar Singh >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Add a OM metrics to find the false positive rate for the keyMayExist.
[jira] [Created] (HDDS-2473) Fix code reliability issues found by Sonar in Ozone Recon module.
Aravindan Vijayan created HDDS-2473: --- Summary: Fix code reliability issues found by Sonar in Ozone Recon module. Key: HDDS-2473 URL: https://issues.apache.org/jira/browse/HDDS-2473 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Recon Affects Versions: 0.5.0 Reporter: Aravindan Vijayan Assignee: Aravindan Vijayan Fix For: 0.5.0
[jira] [Comment Edited] (HDFS-14973) Balancer getBlocks RPC dispersal does not function properly
[ https://issues.apache.org/jira/browse/HDFS-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973732#comment-16973732 ] Erik Krogen edited comment on HDFS-14973 at 11/13/19 9:55 PM: -- For (1), if we do this it will force the {{ExternalStoragePolicySatisfier}}, {{Mover}}, and {{Balancer}} all to use the same rate limiting (as they all make use of the {{NameNodeConnector}}). I'm not sure if this is desirable or not, any thoughts on that? This is why I passed it in externally. For (2), I took a look, as I generally am in agreement with you that not recreating miniclusters is a good thing. I don't see a good way to do it here, since {{doTest}} makes use of the ability to configure the exact capacities of the DataNodes it starts up in the new cluster. I tried refactoring {{TestBalancer}} a bit to make it so that there is a {{setupMiniCluster}} step followed by a {{doTestWithoutClusterSetup}} step, but then it becomes very difficult to properly fill the cluster _only_ on the original DataNodes and not the new DataNode, leaving a single one empty. Let me know if you have really strong feelings about this and I can try some more, but for now it seems like it would require some substantial refactoring of {{TestBalancer}}. For running against the default parameter of 20, it wasn't possible with the existing code since it asserts that the number of getBlocks calls is _higher_ than the rate limit. I changed it to greater-than-or-equal-to instead of a strict greater-than to get it to work with the default of 20. Uploaded v002 patch with this change. was (Author: xkrogen): For (1), if we do this it will force the {{ExternalStoragePolicySatisfier}}, {{Mover}}, and {{Balancer}} all to use the same rate limiting (as they all make use of the {{NameNodeConnector}}. I'm not sure if this is desirable or not, any thoughts on that? This is why I passed it in externally. 
For (2), I took a look, as I generally am in agreement with you that not recreating miniclusters is a good thing. I don't see a good way to do it here, since {{doTest}} makes use of the ability to configure the exact capacities of the DataNodes it starts up in the new cluster. I tried refactoring {{TestBalancer}} a bit to make it so that there is a {{setupMiniCluster}} step followed by a {{doTestWithoutClusterSetup}} step, but then it becomes very difficult to properly fill the cluster _only_ on the original DataNodes and not the new DataNode, leaving a single one empty. Let me know if you have really strong feelings about this and I can try some more, but for now it seems like it would require some substantial refactoring of {{TestBalancer}}. For running against the default parameter of 20, it wasn't possible with the existing code since it asserts that the number of getBlocks calls is _higher_ than the rate limit. I changed it to greater-than-or-equal-to instead of a strict greater-than to get it to work with the default of 20. Uploaded v002 patch with this change. > Balancer getBlocks RPC dispersal does not function properly > --- > > Key: HDFS-14973 > URL: https://issues.apache.org/jira/browse/HDFS-14973 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer mover >Affects Versions: 2.9.0, 2.7.4, 2.8.2, 3.0.0 >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14973.000.patch, HDFS-14973.001.patch, > HDFS-14973.002.patch, HDFS-14973.test.patch > > > In HDFS-11384, a mechanism was added to make the {{getBlocks}} RPC calls > issued by the Balancer/Mover more dispersed, to alleviate load on the > NameNode, since {{getBlocks}} can be very expensive and the Balancer should > not impact normal cluster operation. > Unfortunately, this functionality does not function as expected, especially > when the dispatcher thread count is low. 
The primary issue is that the delay > is applied only to the first N threads that are submitted to the dispatcher's > executor, where N is the size of the dispatcher's threadpool, but *not* to > the first R threads, where R is the number of allowed {{getBlocks}} QPS > (currently hardcoded to 20). For example, if the threadpool size is 100 (the > default), threads 0-19 have no delay, 20-99 have increased levels of delay, > and 100+ have no delay. As I understand it, the intent of the logic was that > the delay applied to the first 100 threads would force the dispatcher > executor's threads to all be consumed, thus blocking subsequent (non-delayed) > threads until the delay period has expired. However, threads 0-19 can finish > very quickly (their work can often be fulfilled in the time it takes to > execute a single {{getBlocks}} RPC, on the order of tens of milliseconds), >
[jira] [Commented] (HDFS-14973) Balancer getBlocks RPC dispersal does not function properly
[ https://issues.apache.org/jira/browse/HDFS-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973732#comment-16973732 ] Erik Krogen commented on HDFS-14973: For (1), if we do this it will force the {{ExternalStoragePolicySatisfier}}, {{Mover}}, and {{Balancer}} all to use the same rate limiting (as they all make use of the {{NameNodeConnector}}). I'm not sure if this is desirable or not, any thoughts on that? This is why I passed it in externally. For (2), I took a look, as I generally am in agreement with you that not recreating miniclusters is a good thing. I don't see a good way to do it here, since {{doTest}} makes use of the ability to configure the exact capacities of the DataNodes it starts up in the new cluster. I tried refactoring {{TestBalancer}} a bit to make it so that there is a {{setupMiniCluster}} step followed by a {{doTestWithoutClusterSetup}} step, but then it becomes very difficult to properly fill the cluster _only_ on the original DataNodes and not the new DataNode, leaving a single one empty. Let me know if you have really strong feelings about this and I can try some more, but for now it seems like it would require some substantial refactoring of {{TestBalancer}}. For running against the default parameter of 20, it wasn't possible with the existing code since it asserts that the number of getBlocks calls is _higher_ than the rate limit. I changed it to greater-than-or-equal-to instead of a strict greater-than to get it to work with the default of 20. Uploaded v002 patch with this change. 
> Balancer getBlocks RPC dispersal does not function properly > --- > > Key: HDFS-14973 > URL: https://issues.apache.org/jira/browse/HDFS-14973 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer mover >Affects Versions: 2.9.0, 2.7.4, 2.8.2, 3.0.0 >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14973.000.patch, HDFS-14973.001.patch, > HDFS-14973.002.patch, HDFS-14973.test.patch > > > In HDFS-11384, a mechanism was added to make the {{getBlocks}} RPC calls > issued by the Balancer/Mover more dispersed, to alleviate load on the > NameNode, since {{getBlocks}} can be very expensive and the Balancer should > not impact normal cluster operation. > Unfortunately, this functionality does not function as expected, especially > when the dispatcher thread count is low. The primary issue is that the delay > is applied only to the first N threads that are submitted to the dispatcher's > executor, where N is the size of the dispatcher's threadpool, but *not* to > the first R threads, where R is the number of allowed {{getBlocks}} QPS > (currently hardcoded to 20). For example, if the threadpool size is 100 (the > default), threads 0-19 have no delay, 20-99 have increased levels of delay, > and 100+ have no delay. As I understand it, the intent of the logic was that > the delay applied to the first 100 threads would force the dispatcher > executor's threads to all be consumed, thus blocking subsequent (non-delayed) > threads until the delay period has expired. However, threads 0-19 can finish > very quickly (their work can often be fulfilled in the time it takes to > execute a single {{getBlocks}} RPC, on the order of tens of milliseconds), > thus opening up 20 new slots in the executor, which are then consumed by > non-delayed threads 100-119, and so on. So, although 80 threads have had a > delay applied, the non-delay threads rush through in the 20 non-delay slots. 
> This problem gets even worse when the dispatcher threadpool size is less than > the max {{getBlocks}} QPS. For example, if the threadpool size is 10, _no > threads ever have a delay applied_, and the feature is not enabled at all. > This problem wasn't surfaced in the original JIRA because the test > incorrectly measured the period across which {{getBlocks}} RPCs were > distributed. The variables {{startGetBlocksTime}} and {{endGetBlocksTime}} > were used to track the time over which the {{getBlocks}} calls were made. > However, {{startGetBlocksTime}} was initialized at the time of creation of > the {{FSNamesystem}} spy, which is before the mock DataNodes are started. Even > worse, the Balancer in this test takes 2 iterations to complete balancing the > cluster, so the time period {{endGetBlocksTime - startGetBlocksTime}} > actually represents: > {code} > (time to submit getBlocks RPCs) + (DataNode startup time) + (time for the > Dispatcher to complete an iteration of moving blocks) > {code} > Thus, the RPC QPS reported by the test is much lower than the RPC QPS seen > during the period of initial block fetching.
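The indexing flaw described in this report can be modeled in a few lines. This is a toy model of the delay assignment as described above (with pool size P and rate limit R, only submission indices in [R, P) receive a delay), not the actual Dispatcher code:

```java
// Toy model of the Balancer delay assignment described above: with
// threadpool size P and getBlocks rate limit R (hardcoded to 20), only
// submission indices in [R, P) are delayed, so the first R tasks and
// every submission past P proceed immediately.
final class DelaySketch {
    static long delayFor(int submissionIndex, int poolSize, int rateLimit, long delayMs) {
        if (submissionIndex >= rateLimit && submissionIndex < poolSize) {
            return delayMs; // the "delayed band"
        }
        return 0L; // indices 0..R-1 and P.. get no delay
    }
}
```

Plugging in the defaults reproduces both symptoms: with P=100 and R=20, indices 0-19 and 100+ are undelayed, and with P=10 (smaller than R) no submission is ever delayed at all.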
[jira] [Updated] (HDFS-14973) Balancer getBlocks RPC dispersal does not function properly
[ https://issues.apache.org/jira/browse/HDFS-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14973: --- Attachment: HDFS-14973.002.patch > Balancer getBlocks RPC dispersal does not function properly > --- > > Key: HDFS-14973 > URL: https://issues.apache.org/jira/browse/HDFS-14973 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer mover >Affects Versions: 2.9.0, 2.7.4, 2.8.2, 3.0.0 >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14973.000.patch, HDFS-14973.001.patch, > HDFS-14973.002.patch, HDFS-14973.test.patch > > > In HDFS-11384, a mechanism was added to make the {{getBlocks}} RPC calls > issued by the Balancer/Mover more dispersed, to alleviate load on the > NameNode, since {{getBlocks}} can be very expensive and the Balancer should > not impact normal cluster operation. > Unfortunately, this functionality does not function as expected, especially > when the dispatcher thread count is low. The primary issue is that the delay > is applied only to the first N threads that are submitted to the dispatcher's > executor, where N is the size of the dispatcher's threadpool, but *not* to > the first R threads, where R is the number of allowed {{getBlocks}} QPS > (currently hardcoded to 20). For example, if the threadpool size is 100 (the > default), threads 0-19 have no delay, 20-99 have increased levels of delay, > and 100+ have no delay. As I understand it, the intent of the logic was that > the delay applied to the first 100 threads would force the dispatcher > executor's threads to all be consumed, thus blocking subsequent (non-delayed) > threads until the delay period has expired. 
However, threads 0-19 can finish > very quickly (their work can often be fulfilled in the time it takes to > execute a single {{getBlocks}} RPC, on the order of tens of milliseconds), > thus opening up 20 new slots in the executor, which are then consumed by > non-delayed threads 100-119, and so on. So, although 80 threads have had a > delay applied, the non-delay threads rush through in the 20 non-delay slots. > This problem gets even worse when the dispatcher threadpool size is less than > the max {{getBlocks}} QPS. For example, if the threadpool size is 10, _no > threads ever have a delay applied_, and the feature is not enabled at all. > This problem wasn't surfaced in the original JIRA because the test > incorrectly measured the period across which {{getBlocks}} RPCs were > distributed. The variables {{startGetBlocksTime}} and {{endGetBlocksTime}} > were used to track the time over which the {{getBlocks}} calls were made. > However, {{startGetBlocksTime}} was initialized at the time of creation of > the {{FSNamesystem}} spy, which is before the mock DataNodes are started. Even > worse, the Balancer in this test takes 2 iterations to complete balancing the > cluster, so the time period {{endGetBlocksTime - startGetBlocksTime}} > actually represents: > {code} > (time to submit getBlocks RPCs) + (DataNode startup time) + (time for the > Dispatcher to complete an iteration of moving blocks) > {code} > Thus, the RPC QPS reported by the test is much lower than the RPC QPS seen > during the period of initial block fetching.
[jira] [Created] (HDDS-2472) Use try-with-resources while creating FlushOptions in RDBStore.
Aravindan Vijayan created HDDS-2472: --- Summary: Use try-with-resources while creating FlushOptions in RDBStore. Key: HDDS-2472 URL: https://issues.apache.org/jira/browse/HDDS-2472 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Manager Affects Versions: 0.5.0 Reporter: Aravindan Vijayan Assignee: Aravindan Vijayan Fix For: 0.5.0 Link to the sonar issue flag - https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-zwKcVY8lQ4ZsJ4=AW5md-zwKcVY8lQ4ZsJ4.
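The shape of the fix can be sketched as follows. RocksDB's FlushOptions wraps a native handle and must be close()d; here a stand-in FakeFlushOptions is used so the example is self-contained, and the actual RDBStore code would of course use the real class:

```java
// Sketch of the fix: manage a native-handle object (like RocksDB's
// FlushOptions) with try-with-resources so it is released on every path.
final class FlushSketch {
    static boolean closed = false;

    // Stand-in for org.rocksdb.FlushOptions, which is AutoCloseable.
    static final class FakeFlushOptions implements AutoCloseable {
        FakeFlushOptions setWaitForFlush(boolean wait) { return this; }
        @Override
        public void close() { closed = true; } // releases the native handle
    }

    static void flush() {
        try (FakeFlushOptions opts = new FakeFlushOptions().setWaitForFlush(true)) {
            // db.flush(opts) would go here in the real code
        } // close() runs on every exit path, including exceptions
    }
}
```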
[jira] [Resolved] (HDDS-2412) Define description/topics/merge strategy for the github repository with .asf.yaml
[ https://issues.apache.org/jira/browse/HDDS-2412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2412. Fix Version/s: 0.5.0 Resolution: Fixed Thanks, I have committed this patch to the master. [~elek] Thanks for the contribution. [~adoroszlai] Thanks for the reviews. > Define description/topics/merge strategy for the github repository with > .asf.yaml > - > > Key: HDDS-2412 > URL: https://issues.apache.org/jira/browse/HDDS-2412 > Project: Hadoop Distributed Data Store > Issue Type: Task >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > .asf.yaml helps to set different parameters on github repositories without > admin privileges: > [https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories] > This basic .asf.yaml defines description/url/topics and the allowed merge > buttons.
[jira] [Work logged] (HDDS-2412) Define description/topics/merge strategy for the github repository with .asf.yaml
[ https://issues.apache.org/jira/browse/HDDS-2412?focusedWorklogId=342931=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342931 ] ASF GitHub Bot logged work on HDDS-2412: Author: ASF GitHub Bot Created on: 13/Nov/19 21:48 Start Date: 13/Nov/19 21:48 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #125: HDDS-2412. Define description/topics/merge strategy for the github repository URL: https://github.com/apache/hadoop-ozone/pull/125 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 342931) Time Spent: 20m (was: 10m)
[jira] [Resolved] (HDDS-2400) Enable github actions based builds for Ozone
[ https://issues.apache.org/jira/browse/HDDS-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2400. Fix Version/s: 0.5.0 Resolution: Fixed Thanks, Committed to the master. > Enable github actions based builds for Ozone > > > Key: HDDS-2400 > URL: https://issues.apache.org/jira/browse/HDDS-2400 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: build >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Current PR checks are executed in a private branch based on the scripts in > [https://github.com/elek/argo-ozone] > but the results are stored in public repositories: > [https://github.com/elek/ozone-ci-q4|https://github.com/elek/ozone-ci-q3] > [https://github.com/elek/ozone-ci-03] > > As we discussed during the community calls, it would be great to use github > actions (or any other cloud based build) to make all the build definitions > more accessible for the community. > [~vivekratnavel] checked CircleCI which has better reporting capabilities. > But INFRA has concerns about the permission model of circle-ci: > {quote}it is highly unlikely we will allow a bot to be able to commit code > (whether or not that is the intention, allowing circle-ci will make this > possible, and is a complete no) > {quote} > See: > https://issues.apache.org/jira/browse/INFRA-18131 > [https://lists.apache.org/thread.html/af52e2a3e865c01596d46374e8b294f2740587dbd59d85e132429b6c@%3Cbuilds.apache.org%3E] > > Fortunately we have a clear contract. Our build scripts are stored under > _hadoop-ozone/dev-support/checks_ (return codes show the result, details are > printed out to the console output). It's very easy to experiment with > different build systems. > > GitHub Actions seems to be an obvious choice: it's integrated well with GitHub > and it has more generous resource limitations. 
> > With this Jira I propose to enable github actions based PR checks for a few > tests (author, rat, unit, acceptance, checkstyle, findbugs) as an experiment.
[jira] [Work logged] (HDDS-2400) Enable github actions based builds for Ozone
[ https://issues.apache.org/jira/browse/HDDS-2400?focusedWorklogId=342929=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342929 ] ASF GitHub Bot logged work on HDDS-2400: Author: ASF GitHub Bot Created on: 13/Nov/19 21:45 Start Date: 13/Nov/19 21:45 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #122: HDDS-2400. Enable github actions based builds for Ozone URL: https://github.com/apache/hadoop-ozone/pull/122 Issue Time Tracking --- Worklog Id: (was: 342929) Time Spent: 20m (was: 10m)
[jira] [Comment Edited] (HDDS-2392) Fix TestScmSafeMode#testSCMSafeModeRestrictedOp
[ https://issues.apache.org/jira/browse/HDDS-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973716#comment-16973716 ] Hanisha Koneru edited comment on HDDS-2392 at 11/13/19 9:30 PM: Thank you [~avijayan]. This issue is fixed by RATIS-747 was (Author: hanishakoneru): Fixed by RATIS-747 > Fix TestScmSafeMode#testSCMSafeModeRestrictedOp > --- > > Key: HDDS-2392 > URL: https://issues.apache.org/jira/browse/HDDS-2392 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Blocker > > After ratis upgrade (HDDS-2340), TestScmSafeMode#testSCMSafeModeRestrictedOp > fails as the DNs fail to restart XceiverServerRatis. > RaftServer#start() fails with following exception: > {code:java} > java.io.IOException: java.lang.IllegalStateException: Not started > at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:54) > at org.apache.ratis.util.IOUtils.toIOException(IOUtils.java:61) > at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:70) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpls(RaftServerProxy.java:284) > at > org.apache.ratis.server.impl.RaftServerProxy.start(RaftServerProxy.java:296) > at > org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.start(XceiverServerRatis.java:421) > at > org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.start(OzoneContainer.java:215) > at > org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:110) > at > org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:42) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.IllegalStateException: Not started > at > org.apache.ratis.thirdparty.com.google.common.base.Preconditions.checkState(Preconditions.java:504) > at > org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl.getPort(ServerImpl.java:176) > at > org.apache.ratis.grpc.server.GrpcService.lambda$new$2(GrpcService.java:143) > at org.apache.ratis.util.MemoizedSupplier.get(MemoizedSupplier.java:62) > at > org.apache.ratis.grpc.server.GrpcService.getInetSocketAddress(GrpcService.java:182) > at > org.apache.ratis.server.impl.RaftServerImpl.lambda$new$0(RaftServerImpl.java:84) > at org.apache.ratis.util.MemoizedSupplier.get(MemoizedSupplier.java:62) > at > org.apache.ratis.server.impl.RaftServerImpl.getPeer(RaftServerImpl.java:136) > at > org.apache.ratis.server.impl.RaftServerMetrics.(RaftServerMetrics.java:70) > at > org.apache.ratis.server.impl.RaftServerMetrics.getRaftServerMetrics(RaftServerMetrics.java:62) > at > org.apache.ratis.server.impl.RaftServerImpl.(RaftServerImpl.java:119) > at > org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$2(RaftServerProxy.java:208) > at > java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590) > {code}
[jira] [Resolved] (HDDS-2392) Fix TestScmSafeMode#testSCMSafeModeRestrictedOp
[ https://issues.apache.org/jira/browse/HDDS-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanisha Koneru resolved HDDS-2392. -- Resolution: Fixed Fixed by RATIS-747
[jira] [Assigned] (HDDS-2471) Improve exception message for CompleteMultipartUpload
[ https://issues.apache.org/jira/browse/HDDS-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham reassigned HDDS-2471: Assignee: Bharat Viswanadham > Improve exception message for CompleteMultipartUpload > - > > Key: HDDS-2471 > URL: https://issues.apache.org/jira/browse/HDDS-2471 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > When the InvalidPart error occurs, the exception message does not include any > information about partName and partNumber; it would be good to have this > information.
[jira] [Updated] (HDDS-2471) Improve exception message for CompleteMultipartUpload
[ https://issues.apache.org/jira/browse/HDDS-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HDDS-2471: - Status: Patch Available (was: Open)
[jira] [Work logged] (HDDS-2471) Improve exception message for CompleteMultipartUpload
[ https://issues.apache.org/jira/browse/HDDS-2471?focusedWorklogId=342876=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342876 ] ASF GitHub Bot logged work on HDDS-2471: Author: ASF GitHub Bot Created on: 13/Nov/19 20:54 Start Date: 13/Nov/19 20:54 Worklog Time Spent: 10m Work Description: bharatviswa504 commented on pull request #156: HDDS-2471. Improve exception message for CompleteMultipartUpload. URL: https://github.com/apache/hadoop-ozone/pull/156 ## What changes were proposed in this pull request? Add partName and partNumber to the exception message when the InvalidPart error occurs; this will help when debugging issues from logs. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2471 ## How was this patch tested? Just added more information to the exception message. Issue Time Tracking --- Worklog Id: (was: 342876) Remaining Estimate: 0h Time Spent: 10m
[jira] [Updated] (HDDS-2471) Improve exception message for CompleteMultipartUpload
[ https://issues.apache.org/jira/browse/HDDS-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2471: - Labels: pull-request-available (was: )
[jira] [Created] (HDDS-2471) Improve exception message for CompleteMultipartUpload
Bharat Viswanadham created HDDS-2471: Summary: Improve exception message for CompleteMultipartUpload Key: HDDS-2471 URL: https://issues.apache.org/jira/browse/HDDS-2471 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Bharat Viswanadham When the InvalidPart error occurs, the exception message does not include any information about partName and partNumber; it would be good to have this information.
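A hedged sketch of the kind of change proposed for HDDS-2471: include the part name and part number in the InvalidPart error text. The method name and message format below are illustrative only, not the actual OzoneManager code:

```java
// Illustrative only: building an InvalidPart error message that carries
// the offending partName and partNumber, so a log line alone is enough
// to identify which part of the multipart upload was rejected.
public class InvalidPartMessageDemo {
    static String invalidPartMessage(String key, String partName, int partNumber) {
        return "Complete Multipart Upload failed for key " + key
            + ": invalid part (partNumber=" + partNumber
            + ", partName=" + partName + ")";
    }

    public static void main(String[] args) {
        System.out.println(invalidPartMessage("vol/bucket/key1", "part-0002", 2));
    }
}
```

Without the part details, an InvalidPart failure in a many-part upload forces the operator to cross-reference client-side state; with them, the failing part is identifiable from the server log alone.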
[jira] [Commented] (HDFS-14985) FSCK for a block of EC Files doesn't display status at the end
[ https://issues.apache.org/jira/browse/HDFS-14985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973661#comment-16973661 ] Surendra Singh Lilhore commented on HDFS-14985: --- [~Sushma_28], it is a duplicate of HDFS-14987. You can work on HDFS-14987. > FSCK for a block of EC Files doesn't display status at the end > - > > Key: HDFS-14985 > URL: https://issues.apache.org/jira/browse/HDFS-14985 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ravuri Sushma sree >Assignee: Ravuri Sushma sree >Priority: Major > > *Environment* Cluster of 2 Namenodes and 5 Datanodes and ec policy enabled > fsck -blockId of a block associated with an EC File does not print status at > the end and displays null instead > {color:#de350b}*Result :*{color} > ./hdfs fsck -blockId blk_-x > Connecting to namenode via > FSCK started by root (auth:SIMPLE) from /x.x.x.x at Wed Nov 13 19:37:02 CST > 2019 > Block Id: blk_-x > Block belongs to: /ecdir/f2 > No. of Expected Replica: 3 > No. of live Replica: 3 > No. of excess Replica: 0 > No. of stale Replica: 2 > No. of decommissioned Replica: 0 > No. of decommissioning Replica: 0 > No. of corrupted Replica: 0 > null > {color:#de350b}*Expected :*{color} > ./hdfs fsck -blockId blk_-x > Connecting to namenode via > FSCK started by root (auth:SIMPLE) from /x.x.x.x at Wed Nov 13 19:37:02 CST > 2019 > Block Id: blk_-x > Block belongs to: /ecdir/f2 > No. of Expected Replica: 3 > No. of live Replica: 3 > No. of excess Replica: 0 > No. of stale Replica: 2 > No. of decommissioned Replica: 0 > No. of decommissioning Replica: 0 > No. of corrupted Replica: 0 > Block replica on datanode/rack: vm10/default-rack is HEALTHY
[jira] [Updated] (HDFS-14979) [Observer Node] Balancer should submit getBlocks to Observer Node when possible
[ https://issues.apache.org/jira/browse/HDFS-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14979: --- Resolution: Fixed Status: Resolved (was: Patch Available) > [Observer Node] Balancer should submit getBlocks to Observer Node when > possible > --- > > Key: HDFS-14979 > URL: https://issues.apache.org/jira/browse/HDFS-14979 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer mover, hdfs >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1, 2.11.0 > > Attachments: HDFS-14979.000.patch > > > In HDFS-14162, we made it so that the Balancer could function when > {{ObserverReadProxyProvider}} was in use. However, the Balancer would still > read from the active NameNode, because {{getBlocks}} wasn't annotated as > {{@ReadOnly}}. This task is to enable the Balancer to actually read from the > Observer Node to alleviate load from the active NameNode.
[jira] [Comment Edited] (HDFS-14979) [Observer Node] Balancer should submit getBlocks to Observer Node when possible
[ https://issues.apache.org/jira/browse/HDFS-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973656#comment-16973656 ] Erik Krogen edited comment on HDFS-14979 at 11/13/19 8:10 PM: -- Committed this down to branch-2.10 (trunk, branch-3.2, branch-3.1, branch-2, branch-2.10). Wasn't sure if I should do branch-2 given that I think we've decided 2.10 is the last 2.x release, but figured it couldn't hurt. Thanks for the review [~shv]! was (Author: xkrogen): Committed this down to branch-2.10 (trunk, branch-3.2, branch-3.1, branch-2, branch-2.10). Wasn't sure if I should do branch-2 given that I think we've decided 2.10 is the last 2.x release, but figured it couldn't hurt.
[jira] [Updated] (HDFS-14979) [Observer Node] Balancer should submit getBlocks to Observer Node when possible
[ https://issues.apache.org/jira/browse/HDFS-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14979: --- Fix Version/s: 2.11.0 2.10.1 3.2.2 3.1.4 3.3.0
[jira] [Commented] (HDFS-14979) [Observer Node] Balancer should submit getBlocks to Observer Node when possible
[ https://issues.apache.org/jira/browse/HDFS-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973656#comment-16973656 ] Erik Krogen commented on HDFS-14979: Committed this down to branch-2.10 (trunk, branch-3.2, branch-3.1, branch-2, branch-2.10). Wasn't sure if I should do branch-2 given that I think we've decided 2.10 is the last 2.x release, but figured it couldn't hurt.
[jira] [Commented] (HDFS-14655) [SBN Read] Namenode crashes if one of The JN is down
[ https://issues.apache.org/jira/browse/HDFS-14655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973655#comment-16973655 ] Chen Liang commented on HDFS-14655: --- [~xkrogen] seems like a different message, looks to me that this one happens when a {{Future}} instance got cancelled. > [SBN Read] Namenode crashes if one of The JN is down > > > Key: HDFS-14655 > URL: https://issues.apache.org/jira/browse/HDFS-14655 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: Harshakiran Reddy >Assignee: Ayush Saxena >Priority: Critical > Fix For: 2.10.0, 3.3.0, 3.1.4, 3.2.2 > > Attachments: HDFS-14655-01.patch, HDFS-14655-02.patch, > HDFS-14655-03.patch, HDFS-14655-04.patch, HDFS-14655-05.patch, > HDFS-14655-06.patch, HDFS-14655-07.patch, HDFS-14655-08.patch, > HDFS-14655-branch-2-01.patch, HDFS-14655-branch-2-02.patch, > HDFS-14655.poc.patch > > > {noformat} > 2019-07-04 17:35:54,064 | INFO | Logger channel (from parallel executor) to > XXX/XXX | Retrying connect to server: XXX/XXX. Already tried > 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, > sleepTime=1000 MILLISECONDS) | Client.java:975 > 2019-07-04 17:35:54,087 | FATAL | Edit log tailer | Unknown error encountered > while tailing edits. Shutting down standby NN. 
| EditLogTailer.java:474 > java.lang.OutOfMemoryError: unable to create new native thread > at java.lang.Thread.start0(Native Method) > at java.lang.Thread.start(Thread.java:717) > at > java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957) > at > java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1378) > at > com.google.common.util.concurrent.MoreExecutors$ListeningDecorator.execute(MoreExecutors.java:440) > at > com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:56) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.getJournaledEdits(IPCLoggerChannel.java:565) > at > org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.getJournaledEdits(AsyncLoggerSet.java:272) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectRpcInputStreams(QuorumJournalManager.java:533) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectInputStreams(QuorumJournalManager.java:508) > at > org.apache.hadoop.hdfs.server.namenode.JournalSet.selectInputStreams(JournalSet.java:275) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1681) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1714) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:307) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:410) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:360) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709) > at > 
org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:483) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:423) > 2019-07-04 17:35:54,112 | INFO | Edit log tailer | Exiting with status 1: > java.lang.OutOfMemoryError: unable to create new native thread | > ExitUtil.java:210 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
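The OOM above fires when {{getJournaledEdits}} keeps submitting work to an executor while a JournalNode is unreachable: if thread creation is unbounded, the piled-up retries eventually make {{Thread.start0}} fail with "unable to create new native thread". A minimal stdlib sketch of the mitigation follows — the pool sizes and class name are illustrative, not the actual IPCLoggerChannel code:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedLoggerExecutor {
    // Illustrative cap: bounding the per-logger executor means repeated RPC
    // submissions to a down JournalNode cannot keep spawning native threads.
    static final int MAX_THREADS = 4;

    static ThreadPoolExecutor newBoundedExecutor() {
        return new ThreadPoolExecutor(
            1, MAX_THREADS,
            60L, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(100),              // bounded backlog of pending RPCs
            new ThreadPoolExecutor.CallerRunsPolicy()); // push back on the caller instead of growing
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = newBoundedExecutor();
        for (int i = 0; i < 1000; i++) {
            pool.execute(() -> { });  // stand-in for a getJournaledEdits RPC
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        // Even after 1000 submissions, at most MAX_THREADS worker threads ever existed.
        System.out.println("largest pool size: " + pool.getLargestPoolSize());
    }
}
```

With an unbounded pool (e.g. `Executors.newCachedThreadPool()`), the same loop could create one native thread per stalled RPC, which matches the crash in the log.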
[jira] [Commented] (HDFS-14924) RenameSnapshot not updating new modification time
[ https://issues.apache.org/jira/browse/HDFS-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973653#comment-16973653 ] Hadoop QA commented on HDFS-14924: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 2m 22s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 37s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 47s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 622 unchanged - 0 fixed = 623 total (was 622) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 22s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 20s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 1m 13s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 99m 36s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}164m 17s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-14924 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985768/HDFS-14924.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 08f5ffcb26dc 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / df6b316 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28308/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | javadoc | https://builds.apache.org/job/PreCommit-HDFS-Build/28308/artifact/out/diff-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs.txt | | unit |
[jira] [Commented] (HDDS-2468) scmcli close pipeline command not working
[ https://issues.apache.org/jira/browse/HDDS-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973647#comment-16973647 ] Hanisha Koneru commented on HDDS-2468: -- Deactivate Pipeline is also failing with the same error. > scmcli close pipeline command not working > - > > Key: HDDS-2468 > URL: https://issues.apache.org/jira/browse/HDDS-2468 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Rajesh Balamohan >Assignee: Nanda kumar >Priority: Major > > Close pipeline command is failing with the following exception > {noformat} > java.lang.IllegalArgumentException: Unknown command type: ClosePipeline > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB.processRequest(StorageContainerLocationProtocolServerSideTranslatorPB.java:219) > at > org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72) > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB.submitRequest(StorageContainerLocationProtocolServerSideTranslatorPB.java:112) > at > org.apache.hadoop.hdds.protocol.proto.StorageContainerLocationProtocolProtos$StorageContainerLocationProtocolService$2.callBlockingMethod(StorageContainerLocationProtocolProtos.java:29883) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, 
e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
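The stack trace shows {{processRequest}} rejecting the call before any handler runs, i.e. the server-side dispatch simply has no case for the pipeline commands — which is also why Deactivate Pipeline fails identically. A toy model of that dispatch (the enum and return strings are hypothetical, not the actual protobuf types):

```java
public class DispatchSketch {
    enum CmdType { GetContainer, ClosePipeline, DeactivatePipeline }

    // Hypothetical mirror of processRequest(): any command type not wired
    // into the switch is rejected with the exact error seen above.
    static String dispatch(CmdType type) {
        switch (type) {
            case GetContainer:
                return "handled GetContainer";
            case ClosePipeline:          // the missing wiring behind HDDS-2468
                return "handled ClosePipeline";
            case DeactivatePipeline:     // fails the same way until wired in
                return "handled DeactivatePipeline";
            default:
                throw new IllegalArgumentException("Unknown command type: " + type);
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch(CmdType.ClosePipeline));
    }
}
```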
[jira] [Commented] (HDFS-14979) [Observer Node] Balancer should submit getBlocks to Observer Node when possible
[ https://issues.apache.org/jira/browse/HDFS-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973648#comment-16973648 ] Hudson commented on HDFS-14979: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17638 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17638/]) HDFS-14979 Allow Balancer to submit getBlocks calls to Observer Nodes (xkrogen: rev 586defe7113ed246ed0275bb3833882a3d873d70) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamenodeProtocol.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancerWithHANameNodes.java > [Observer Node] Balancer should submit getBlocks to Observer Node when > possible > --- > > Key: HDFS-14979 > URL: https://issues.apache.org/jira/browse/HDFS-14979 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer mover, hdfs >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14979.000.patch > > > In HDFS-14162, we made it so that the Balancer could function when > {{ObserverReadProxyProvider}} was in use. However, the Balancer would still > read from the active NameNode, because {{getBlocks}} wasn't annotated as > {{@ReadOnly}}. This task is to enable the Balancer to actually read from the > Observer Node to alleviate load from the active NameNode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14979) [Observer Node] Balancer should submit getBlocks to Observer Node when possible
[ https://issues.apache.org/jira/browse/HDFS-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973625#comment-16973625 ] Konstantin Shvachko commented on HDFS-14979: +1 LGTM > [Observer Node] Balancer should submit getBlocks to Observer Node when > possible > --- > > Key: HDFS-14979 > URL: https://issues.apache.org/jira/browse/HDFS-14979 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer mover, hdfs >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14979.000.patch > > > In HDFS-14162, we made it so that the Balancer could function when > {{ObserverReadProxyProvider}} was in use. However, the Balancer would still > read from the active NameNode, because {{getBlocks}} wasn't annotated as > {{@ReadOnly}}. This task is to enable the Balancer to actually read from the > Observer Node to alleviate load from the active NameNode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
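The routing mechanism described above can be sketched with a stand-in annotation: the proxy provider only sends a call to an Observer when the protocol method is marked read-only, so leaving the annotation off {{getBlocks}} pinned the Balancer to the active NameNode. All names here are hypothetical stand-ins for Hadoop's real annotation and proxy classes:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class ReadOnlyRoutingSketch {
    // Stand-in for Hadoop's @ReadOnly marker annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface ReadOnly { }

    interface NamenodeProtocol {
        @ReadOnly                 // HDFS-14979: without this, the call goes to the active
        String getBlocks();
        String rollEditLog();     // mutating call, must stay on the active NameNode
    }

    // Sketch of the routing decision an ObserverReadProxyProvider-style proxy makes.
    static String routeTo(String methodName) {
        try {
            Method m = NamenodeProtocol.class.getMethod(methodName);
            return m.isAnnotationPresent(ReadOnly.class) ? "observer" : "active";
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("getBlocks -> " + routeTo("getBlocks"));       // observer
        System.out.println("rollEditLog -> " + routeTo("rollEditLog"));   // active
    }
}
```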
[jira] [Updated] (HDDS-2470) Add partName, partNumber for CommitMultipartUpload
[ https://issues.apache.org/jira/browse/HDDS-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HDDS-2470: - Description: Right now when complete Multipart Upload is not printing partName and partNumber into the audit log. This will help in analyzing audit logs for MPU. 2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=xx.xx.xx.xx | op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[blockID { containerBlockID { containerID: 2 localID: 103129366531867089 } blockCommitSequenceId: 4978 } offset: 0 length: 5242880 createVersion: 0 pipeline { leaderID: "" members { uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" ipAddress: "xx.xx.xx.xx" hostName: "xx.xx.xx.xx" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551" networkLocation: "/default-rack" } members { uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d" ipAddress: "9.134.51.25" hostName: "9.134.51.25" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d" networkLocation: "/default-rack" } members { uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" ipAddress: "9.134.51.215" hostName: "9.134.51.215" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" networkLocation: "/default-rack" } state: PIPELINE_OPEN type: RATIS factor: THREE id { id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8" } } ]} | ret=SUCCESS | was: Right now when complete Multipart Upload is not printing partName and partNumber into the audit log. This will help in analyzing audit logs for MPU. 
2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=9.134.50.210 | op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[blockID { containerBlockID { containerID: 2 localID: 103129366531867089 } blockCommitSequenceId: 4978 } offset: 0 length: 5242880 createVersion: 0 pipeline { leaderID: "" members { uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" ipAddress: "9.134.51.232" hostName: "9.134.51.232" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551" networkLocation: "/default-rack" } members { uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d" ipAddress: "9.134.51.25" hostName: "9.134.51.25" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d" networkLocation: "/default-rack" } members { uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" ipAddress: "9.134.51.215" hostName: "9.134.51.215" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" networkLocation: "/default-rack" } state: PIPELINE_OPEN type: RATIS factor: THREE id { id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8" } } ]} | ret=SUCCESS | > Add partName, partNumber for CommitMultipartUpload > -- > > Key: HDDS-2470 > URL: https://issues.apache.org/jira/browse/HDDS-2470 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Right now when complete Multipart Upload is not printing partName and > partNumber into the audit log. This will help in analyzing audit logs for MPU. 
> > > 2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=xx.xx.xx.xx | > op=COMMIT_MULTIPART_UPLOAD_PARTKEY > {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, > key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, > replicationFactor=ONE, keyLocationInfo=[blockID { > containerBlockID > { containerID: 2 localID: 103129366531867089 } > blockCommitSequenceId: 4978 > } > offset: 0 > length: 5242880 > createVersion: 0 > pipeline
[jira] [Updated] (HDDS-2470) Add partName, partNumber for CommitMultipartUpload
[ https://issues.apache.org/jira/browse/HDDS-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HDDS-2470: - Status: Patch Available (was: Open) > Add partName, partNumber for CommitMultipartUpload > -- > > Key: HDDS-2470 > URL: https://issues.apache.org/jira/browse/HDDS-2470 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Right now when complete Multipart Upload is not printing partName and > partNumber into the audit log. This will help in analyzing audit logs for MPU. > > > 2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=9.134.50.210 | > op=COMMIT_MULTIPART_UPLOAD_PARTKEY > {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, > key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, > replicationFactor=ONE, keyLocationInfo=[blockID { > containerBlockID > { containerID: 2 localID: 103129366531867089 } > blockCommitSequenceId: 4978 > } > offset: 0 > length: 5242880 > createVersion: 0 > pipeline { > leaderID: "" > members { > uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" > ipAddress: "9.134.51.232" > hostName: "9.134.51.232" > ports > { name: "RATIS" value: 9858 } > ports > { name: "STANDALONE" value: 9859 } > networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551" > networkLocation: "/default-rack" > } > members { > uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d" > ipAddress: "9.134.51.25" > hostName: "9.134.51.25" > ports > { name: "RATIS" value: 9858 } > ports > { name: "STANDALONE" value: 9859 } > networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d" > networkLocation: "/default-rack" > } > members { > uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" > ipAddress: "9.134.51.215" > hostName: "9.134.51.215" > ports > { name: "RATIS" value: 9858 } > ports > { name: "STANDALONE" value: 9859 } > networkName: 
"79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" > networkLocation: "/default-rack" > } > state: PIPELINE_OPEN > type: RATIS > factor: THREE > id > { id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8" } > } > ]} | ret=SUCCESS | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2470) Add partName, partNumber for CommitMultipartUpload
[ https://issues.apache.org/jira/browse/HDDS-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2470: - Labels: pull-request-available (was: ) > Add partName, partNumber for CommitMultipartUpload > -- > > Key: HDDS-2470 > URL: https://issues.apache.org/jira/browse/HDDS-2470 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > > Right now when complete Multipart Upload is not printing partName and > partNumber into the audit log. This will help in analyzing audit logs for MPU. > > > 2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=9.134.50.210 | > op=COMMIT_MULTIPART_UPLOAD_PARTKEY > {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, > key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, > replicationFactor=ONE, keyLocationInfo=[blockID { > containerBlockID > { containerID: 2 localID: 103129366531867089 } > blockCommitSequenceId: 4978 > } > offset: 0 > length: 5242880 > createVersion: 0 > pipeline { > leaderID: "" > members { > uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" > ipAddress: "9.134.51.232" > hostName: "9.134.51.232" > ports > { name: "RATIS" value: 9858 } > ports > { name: "STANDALONE" value: 9859 } > networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551" > networkLocation: "/default-rack" > } > members { > uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d" > ipAddress: "9.134.51.25" > hostName: "9.134.51.25" > ports > { name: "RATIS" value: 9858 } > ports > { name: "STANDALONE" value: 9859 } > networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d" > networkLocation: "/default-rack" > } > members { > uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" > ipAddress: "9.134.51.215" > hostName: "9.134.51.215" > ports > { name: "RATIS" value: 9858 } > ports > { name: "STANDALONE" value: 9859 } > networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" > networkLocation: 
"/default-rack" > } > state: PIPELINE_OPEN > type: RATIS > factor: THREE > id > { id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8" } > } > ]} | ret=SUCCESS | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2470) Add partName, partNumber for CommitMultipartUpload
[ https://issues.apache.org/jira/browse/HDDS-2470?focusedWorklogId=342808=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342808 ] ASF GitHub Bot logged work on HDDS-2470: Author: ASF GitHub Bot Created on: 13/Nov/19 19:20 Start Date: 13/Nov/19 19:20 Worklog Time Spent: 10m Work Description: bharatviswa504 commented on pull request #155: HDDS-2470. Add partName, partNumber for CommitMultipartUpload. URL: https://github.com/apache/hadoop-ozone/pull/155 ## What changes were proposed in this pull request? Add partName and partName to audit log when logging commitMultipartUpload request. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2470 ## How was this patch tested? Just a log change. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 342808) Remaining Estimate: 0h Time Spent: 10m > Add partName, partNumber for CommitMultipartUpload > -- > > Key: HDDS-2470 > URL: https://issues.apache.org/jira/browse/HDDS-2470 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Right now when complete Multipart Upload is not printing partName and > partNumber into the audit log. This will help in analyzing audit logs for MPU. 
> > > 2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=9.134.50.210 | > op=COMMIT_MULTIPART_UPLOAD_PARTKEY > {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, > key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, > replicationFactor=ONE, keyLocationInfo=[blockID { > containerBlockID > { containerID: 2 localID: 103129366531867089 } > blockCommitSequenceId: 4978 > } > offset: 0 > length: 5242880 > createVersion: 0 > pipeline { > leaderID: "" > members { > uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" > ipAddress: "9.134.51.232" > hostName: "9.134.51.232" > ports > { name: "RATIS" value: 9858 } > ports > { name: "STANDALONE" value: 9859 } > networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551" > networkLocation: "/default-rack" > } > members { > uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d" > ipAddress: "9.134.51.25" > hostName: "9.134.51.25" > ports > { name: "RATIS" value: 9858 } > ports > { name: "STANDALONE" value: 9859 } > networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d" > networkLocation: "/default-rack" > } > members { > uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" > ipAddress: "9.134.51.215" > hostName: "9.134.51.215" > ports > { name: "RATIS" value: 9858 } > ports > { name: "STANDALONE" value: 9859 } > networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" > networkLocation: "/default-rack" > } > state: PIPELINE_OPEN > type: RATIS > factor: THREE > id > { id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8" } > } > ]} | ret=SUCCESS | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
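The change itself is small: the audit map built for COMMIT_MULTIPART_UPLOAD_PARTKEY just needs two more entries. A hedged sketch — the method and field names are illustrative, not the actual OzoneManager code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MpuAuditSketch {
    // Hypothetical stand-in for the audit map assembled when a multipart
    // upload part is committed; the fix carries partName/partNumber
    // alongside the existing volume/bucket/key fields.
    static Map<String, String> toAuditMap(String volume, String bucket, String key,
                                          String partName, int partNumber) {
        Map<String, String> audit = new LinkedHashMap<>();
        audit.put("volume", volume);
        audit.put("bucket", bucket);
        audit.put("key", key);
        audit.put("partName", partName);                     // new field (HDDS-2470)
        audit.put("partNumber", String.valueOf(partNumber)); // new field (HDDS-2470)
        return audit;
    }
}
```

With these two fields present, a log-analysis job can correlate each commit line back to the part that produced it.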
[jira] [Commented] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
[ https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973613#comment-16973613 ] Íñigo Goiri commented on HDFS-14983: It might be missing in the dfsrouteradmin though. You could use dfsadmin pointing to the router but it might be good to have it in dfsrouteradmin itself. > RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option > --- > > Key: HDFS-14983 > URL: https://issues.apache.org/jira/browse/HDFS-14983 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Akira Ajisaka >Priority: Minor > > NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration > without restarting but DFSRouter cannot. It would be better for DFSRouter to > have such functionality to be compatible with NameNode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
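For reference, the NameNode-side refresh is issued as below; pointing the same admin command at the Router's RPC address is the dfsadmin workaround mentioned above until dfsrouteradmin grows its own option. The router address is a placeholder, not a value from this thread:

```shell
# Refresh proxyuser (superuser group) config on the NameNode without a restart
hdfs dfsadmin -refreshSuperUserGroupsConfiguration

# Workaround: aim the same refresh at a DFSRouter's RPC endpoint
# (router-host:8888 is a placeholder for your Router's rpc-address)
hdfs dfsadmin -fs hdfs://router-host:8888 -refreshSuperUserGroupsConfiguration
```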
[jira] [Commented] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
[ https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973612#comment-16973612 ] Íñigo Goiri commented on HDFS-14983: Isn't this done by HDFS-14545? > RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option > --- > > Key: HDFS-14983 > URL: https://issues.apache.org/jira/browse/HDFS-14983 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Akira Ajisaka >Priority: Minor > > NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration > without restarting but DFSRouter cannot. It would be better for DFSRouter to > have such functionality to be compatible with NameNode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14955) RBF: getQuotaUsage() on mount point should return global quota.
[ https://issues.apache.org/jira/browse/HDFS-14955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973609#comment-16973609 ] Íñigo Goiri commented on HDFS-14955: The failed unit test is unrelated and will be fixed by HDFS-14974. +1 on [^HDFS-14955.002.patch]. [~ayushtkn] do you mind taking a look? > RBF: getQuotaUsage() on mount point should return global quota. > --- > > Key: HDFS-14955 > URL: https://issues.apache.org/jira/browse/HDFS-14955 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Minor > Attachments: HDFS-14955.001.patch, HDFS-14955.002.patch > > > When getQuotaUsage() is called on a mount point path, the quota part should be the > global quota. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14974) RBF: Make tests use free ports
[ https://issues.apache.org/jira/browse/HDFS-14974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973608#comment-16973608 ] Íñigo Goiri commented on HDFS-14974: HDFS-14955 failed in TestRBFMetrics because of an issue that would be resolved with [^HDFS-14974.000.patch]. Any concerns with the current approach? > RBF: Make tests use free ports > -- > > Key: HDFS-14974 > URL: https://issues.apache.org/jira/browse/HDFS-14974 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-14974.000.patch > > > Currently, {{TestRouterSecurityManager#testCreateCredentials}} creates a > Router with the default ports. However, these ports might already be in use. We should > set it to :0 for it to be assigned dynamically. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
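The ":0" convention works because binding to port 0 asks the OS for any free port; the test then reads back whatever port was actually assigned. A minimal stdlib illustration of that mechanism:

```java
import java.io.IOException;
import java.net.ServerSocket;

public class FreePortSketch {
    // Binding to port 0 delegates port selection to the OS, which is what the
    // RBF tests should rely on instead of hard-coding the Router defaults.
    static int bindEphemeral() {
        try (ServerSocket s = new ServerSocket(0)) {
            return s.getLocalPort(); // the dynamically assigned port (> 0)
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("assigned port: " + bindEphemeral());
    }
}
```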
[jira] [Commented] (HDFS-14924) RenameSnapshot not updating new modification time
[ https://issues.apache.org/jira/browse/HDFS-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973606#comment-16973606 ] Íñigo Goiri commented on HDFS-14924: This seems familiar. Can you link the other JIRA? > RenameSnapshot not updating new modification time > - > > Key: HDFS-14924 > URL: https://issues.apache.org/jira/browse/HDFS-14924 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14924.001.patch > > > RenameSnapshot doesn't update the modification time -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
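A minimal model of the reported bug and its fix (all names hypothetical, far simpler than the real snapshot manager): the rename bookkeeping itself succeeds, but nothing records a new modification time — that last line is the update the patch adds.

```java
import java.util.HashMap;
import java.util.Map;

public class RenameSnapshotSketch {
    // Toy stand-in for a snapshottable directory's state.
    final Map<String, String> snapshots = new HashMap<>();
    long modificationTime;

    void renameSnapshot(String oldName, String newName, long now) {
        String snapshot = snapshots.remove(oldName);
        snapshots.put(newName, snapshot);
        modificationTime = now; // HDFS-14924: this update was missing
    }
}
```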
[jira] [Updated] (HDDS-2470) Add partName, partNumber for CommitMultipartUpload
[ https://issues.apache.org/jira/browse/HDDS-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HDDS-2470: - Description: Right now when complete Multipart Upload is not printing partName and partNumber into the audit log. This will help in analyzing audit logs for MPU. 2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=9.134.50.210 | op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[blockID { containerBlockID { containerID: 2 localID: 103129366531867089 } blockCommitSequenceId: 4978 } offset: 0 length: 5242880 createVersion: 0 pipeline { leaderID: "" members { uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" ipAddress: "9.134.51.232" hostName: "9.134.51.232" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551" networkLocation: "/default-rack" } members { uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d" ipAddress: "9.134.51.25" hostName: "9.134.51.25" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d" networkLocation: "/default-rack" } members { uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" ipAddress: "9.134.51.215" hostName: "9.134.51.215" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" networkLocation: "/default-rack" } state: PIPELINE_OPEN type: RATIS factor: THREE id { id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8" } } ]} | ret=SUCCESS | was: Right now when complete Multipart Upload is not printing partName and partNumber into the audit log. 
2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=9.134.50.210 | op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[blockID { containerBlockID { containerID: 2 localID: 103129366531867089 } blockCommitSequenceId: 4978 } offset: 0 length: 5242880 createVersion: 0 pipeline { leaderID: "" members { uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" ipAddress: "9.134.51.232" hostName: "9.134.51.232" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551" networkLocation: "/default-rack" } members { uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d" ipAddress: "9.134.51.25" hostName: "9.134.51.25" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d" networkLocation: "/default-rack" } members { uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" ipAddress: "9.134.51.215" hostName: "9.134.51.215" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" networkLocation: "/default-rack" } state: PIPELINE_OPEN type: RATIS factor: THREE id { id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8" } } ]} | ret=SUCCESS | > Add partName, partNumber for CommitMultipartUpload > -- > > Key: HDDS-2470 > URL: https://issues.apache.org/jira/browse/HDDS-2470 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > > Right now when complete Multipart Upload is not printing partName and > partNumber into the audit log. This will help in analyzing audit logs for MPU. 
> > > 2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=9.134.50.210 | > op=COMMIT_MULTIPART_UPLOAD_PARTKEY > {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, > key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, > replicationFactor=ONE, keyLocationInfo=[blockID { > containerBlockID > { containerID: 2 localID: 103129366531867089 } > blockCommitSequenceId: 4978 > } > offset: 0 > length: 5242880 > createVersion: 0 > pipeline { > leaderID: "" > members { > uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" > ipAddress: "9.134.51.232" >
[jira] [Created] (HDDS-2470) Add partName, partNumber for CommitMultipartUpload
Bharat Viswanadham created HDDS-2470: Summary: Add partName, partNumber for CommitMultipartUpload Key: HDDS-2470 URL: https://issues.apache.org/jira/browse/HDDS-2470 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Bharat Viswanadham Assignee: Bharat Viswanadham Right now, Complete Multipart Upload does not print partName and partNumber in the audit log. 2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=9.134.50.210 | op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[blockID { containerBlockID { containerID: 2 localID: 103129366531867089 } blockCommitSequenceId: 4978 } offset: 0 length: 5242880 createVersion: 0 pipeline { leaderID: "" members { uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" ipAddress: "9.134.51.232" hostName: "9.134.51.232" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551" networkLocation: "/default-rack" } members { uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d" ipAddress: "9.134.51.25" hostName: "9.134.51.25" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d" networkLocation: "/default-rack" } members { uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" ipAddress: "9.134.51.215" hostName: "9.134.51.215" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" networkLocation: "/default-rack" } state: PIPELINE_OPEN type: RATIS factor: THREE id { id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8" } } ]} | ret=SUCCESS | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2469) Avoid changing client-side key metadata
[ https://issues.apache.org/jira/browse/HDDS-2469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-2469: --- Status: Patch Available (was: Open) > Avoid changing client-side key metadata > --- > > Key: HDDS-2469 > URL: https://issues.apache.org/jira/browse/HDDS-2469 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The Ozone RPC client should not modify the metadata map supplied by the caller while creating keys.
[jira] [Updated] (HDDS-2469) Avoid changing client-side key metadata
[ https://issues.apache.org/jira/browse/HDDS-2469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2469: - Labels: pull-request-available (was: ) > Avoid changing client-side key metadata > --- > > Key: HDDS-2469 > URL: https://issues.apache.org/jira/browse/HDDS-2469 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > > The Ozone RPC client should not modify the metadata map supplied by the caller while creating keys.
[jira] [Work logged] (HDDS-2469) Avoid changing client-side key metadata
[ https://issues.apache.org/jira/browse/HDDS-2469?focusedWorklogId=342778=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342778 ] ASF GitHub Bot logged work on HDDS-2469: Author: ASF GitHub Bot Created on: 13/Nov/19 18:44 Start Date: 13/Nov/19 18:44 Worklog Time Spent: 10m Work Description: adoroszlai commented on pull request #154: HDDS-2469. Avoid changing client-side key metadata URL: https://github.com/apache/hadoop-ozone/pull/154 ## What changes were proposed in this pull request? Let OM `RpcClient` add extra metadata keys only into the request to OM, not to the input from client. https://issues.apache.org/jira/browse/HDDS-2469 ## How was this patch tested? Tweaked integration tests: ``` Tests run: 69, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 26.154 s - in org.apache.hadoop.ozone.client.rpc.TestOzoneRpcClient Tests run: 70, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 27.976 s - in org.apache.hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.035 s - in org.apache.hadoop.ozone.client.rpc.TestOzoneRpcClientForAclAuditLog ``` Also ran `ozone` smoketest, which includes GDPR test: ``` ozone-gdpr :: Smoketest Ozone GDPR Feature == Test GDPR disabled| PASS | -- Test GDPR --enforcegdpr=true | PASS | -- Test GDPR -g=true | PASS | -- Test GDPR -g=false| PASS | -- ozone-gdpr :: Smoketest Ozone GDPR Feature| PASS | 4 critical tests, 4 passed, 0 failed 4 tests total, 4 passed, 0 failed ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 342778) Remaining Estimate: 0h Time Spent: 10m > Avoid changing client-side key metadata > --- > > Key: HDDS-2469 > URL: https://issues.apache.org/jira/browse/HDDS-2469 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The Ozone RPC client should not modify the metadata map supplied by the caller while creating keys.
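The fix described in the pull request above (add the extra metadata keys only to the request sent to OM, never to the caller's map) can be sketched with plain JDK collections. This is a minimal sketch of the idea, not the actual `RpcClient` code; the class, method, and key names are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

public class MetadataCopySketch {
  // Illustrative stand-in for the client-side key-creation path: any
  // server-oriented entries (e.g. a GDPR secret) go into a defensive copy
  // that becomes the request, leaving the caller's map untouched.
  static Map<String, String> buildRequestMetadata(Map<String, String> clientMetadata) {
    Map<String, String> requestMetadata = new HashMap<>(clientMetadata); // defensive copy
    requestMetadata.put("gdprSecret", "generated-value"); // hypothetical extra key
    return requestMetadata;
  }
}
```

The observable contract is simply that the caller's map is unchanged after key creation, which is presumably what the tweaked integration tests above check.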
[jira] [Created] (HDDS-2469) Avoid changing client-side key metadata
Attila Doroszlai created HDDS-2469: -- Summary: Avoid changing client-side key metadata Key: HDDS-2469 URL: https://issues.apache.org/jira/browse/HDDS-2469 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Client Reporter: Attila Doroszlai Assignee: Attila Doroszlai The Ozone RPC client should not modify the metadata map supplied by the caller while creating keys.
[jira] [Updated] (HDDS-2467) Allow running Freon validators with limited memory
[ https://issues.apache.org/jira/browse/HDDS-2467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-2467: --- Status: Patch Available (was: In Progress) > Allow running Freon validators with limited memory > -- > > Key: HDDS-2467 > URL: https://issues.apache.org/jira/browse/HDDS-2467 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: freon >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Freon validators read each item to be validated completely into a {{byte[]}} > buffer. This allows timing only the read (and buffer allocation), but not > the subsequent digest calculation. However, it also means that memory > required for running the validators is proportional to key size. > I propose to add a command-line flag to allow calculating the digest while > reading the input stream. This changes timing results a bit, since values > will include the time required for digest calculation. On the other hand, > Freon will be able to validate huge keys with limited memory.
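The streaming-digest mode proposed above can be sketched with the JDK's `java.security.DigestInputStream`, which updates the digest as bytes pass through a small fixed buffer. This is a sketch of the technique only; the actual Freon flag name and validator wiring may differ:

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class StreamingDigestSketch {
  // Digest while reading: memory use is a fixed 4 KB buffer regardless of
  // key size, at the cost of folding digest time into the measured read.
  static byte[] digestWhileReading(InputStream in)
      throws IOException, NoSuchAlgorithmException {
    MessageDigest md = MessageDigest.getInstance("MD5");
    try (DigestInputStream dis = new DigestInputStream(in, md)) {
      byte[] buf = new byte[4096];
      while (dis.read(buf) != -1) {
        // bytes are discarded; the digest is updated as a side effect of read()
      }
    }
    return md.digest();
  }
}
```

The result is identical to digesting a fully-buffered `byte[]`; only the timing breakdown and peak memory change, which is exactly the trade-off the description calls out.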
[jira] [Work logged] (HDDS-2458) Avoid list copy in ChecksumData
[ https://issues.apache.org/jira/browse/HDDS-2458?focusedWorklogId=342730=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342730 ] ASF GitHub Bot logged work on HDDS-2458: Author: ASF GitHub Bot Created on: 13/Nov/19 17:19 Start Date: 13/Nov/19 17:19 Worklog Time Spent: 10m Work Description: nandakumar131 commented on pull request #141: HDDS-2458. Avoid list copy in ChecksumData URL: https://github.com/apache/hadoop-ozone/pull/141 Issue Time Tracking --- Worklog Id: (was: 342730) Time Spent: 20m (was: 10m) > Avoid list copy in ChecksumData > --- > > Key: HDDS-2458 > URL: https://issues.apache.org/jira/browse/HDDS-2458 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {{ChecksumData}} is initially created with an empty list of checksums, then it > is updated with computed checksums, copying the list. The computed list can > be set directly.
[jira] [Updated] (HDDS-2458) Avoid list copy in ChecksumData
[ https://issues.apache.org/jira/browse/HDDS-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-2458: -- Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Avoid list copy in ChecksumData > --- > > Key: HDDS-2458 > URL: https://issues.apache.org/jira/browse/HDDS-2458 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {{ChecksumData}} is initially created with an empty list of checksums, then it > is updated with computed checksums, copying the list. The computed list can > be set directly.
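The change can be illustrated with a tiny stand-in class (a sketch of the idea, not the real `ChecksumData`): hand the computed list to the object once, instead of constructing with an empty list and copying every element into it afterwards:

```java
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.List;

public class ChecksumDataSketch {
  private final List<ByteBuffer> checksums;

  // Take ownership of the computed list directly; no empty-list
  // construction followed by an element-by-element copy.
  ChecksumDataSketch(List<ByteBuffer> computedChecksums) {
    this.checksums = computedChecksums;
  }

  List<ByteBuffer> getChecksums() {
    return Collections.unmodifiableList(checksums);
  }
}
```

Exposing the list through an unmodifiable view keeps the no-copy constructor safe for callers that should only read the checksums.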
[jira] [Commented] (HDFS-14969) Fix HDFS client unnecessary failover log printing
[ https://issues.apache.org/jira/browse/HDFS-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973543#comment-16973543 ] Erik Krogen commented on HDFS-14969: I agree that adjusting the logging depending on how many NNs are configured is the better approach. Your proposal seems reasonable to me. > Fix HDFS client unnecessary failover log printing > - > > Key: HDFS-14969 > URL: https://issues.apache.org/jira/browse/HDFS-14969 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 3.1.3 >Reporter: Xudong Cao >Assignee: Xudong Cao >Priority: Minor > > In a multi-NameNode scenario, suppose there are 3 NNs and the 3rd is the ANN. > When a client starts an RPC against the 1st NN, it is silent when failing over > from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 3rd > NN it prints some unnecessary logs; in some scenarios these logs can be > very numerous: > {code:java} > 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): > Operation category READ is not supported in state standby. Visit > https://s.apache.org/sbnn-error > at > org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98) > at > org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459) > ...{code}
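The approach being agreed to here, reduced to the level-selection decision, can be sketched as follows. This is a hypothetical shape only: the real client would thread the decision through `RetryInvocationHandler`, and the names, levels, and threshold are illustrative, not the actual patch:

```java
public class FailoverLogSketch {
  // Hypothetical policy: while there are still configured NameNodes left to
  // try, a standby response is expected, so log quietly; once every NN has
  // been tried, the situation is worth surfacing.
  static String levelFor(int failoverCount, int configuredNameNodes) {
    return failoverCount < configuredNameNodes - 1 ? "DEBUG" : "WARN";
  }
}
```

With 3 NNs configured, the first two failovers stay quiet and only a failover past the last configured NN produces a loud log line, which matches the behavior the description asks for.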
[jira] [Updated] (HDDS-2464) Avoid unnecessary allocations for FileChannel.open call
[ https://issues.apache.org/jira/browse/HDDS-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-2464: -- Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Avoid unnecessary allocations for FileChannel.open call > --- > > Key: HDDS-2464 > URL: https://issues.apache.org/jira/browse/HDDS-2464 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {{ChunkUtils}} calls {{FileChannel#open(Path, OpenOption...)}}. Vararg array > elements are then added to a new {{HashSet}} to call {{FileChannel#open(Path, > Set, FileAttribute...)}}. We can call the latter > directly instead.
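The JDK exposes the `Set`-based overload directly, so the shape of the fix can be sketched with a shared, immutable option set built once (the actual `ChunkUtils` change may differ in detail; the class and method names here are illustrative):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

public class OpenChannelSketch {
  // One immutable set reused for every call, instead of a fresh vararg
  // array plus a new HashSet on each FileChannel.open(Path, OpenOption...).
  private static final Set<StandardOpenOption> READ_OPTIONS =
      Collections.unmodifiableSet(EnumSet.of(StandardOpenOption.READ));

  static FileChannel openForRead(Path path) throws IOException {
    // Set-based overload: FileChannel.open(Path, Set, FileAttribute...)
    return FileChannel.open(path, READ_OPTIONS);
  }
}
```

Since datanodes open a channel per chunk read, hoisting the option set out of the call removes two small allocations from a hot path without changing behavior.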
[jira] [Work logged] (HDDS-2464) Avoid unnecessary allocations for FileChannel.open call
[ https://issues.apache.org/jira/browse/HDDS-2464?focusedWorklogId=342728=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342728 ] ASF GitHub Bot logged work on HDDS-2464: Author: ASF GitHub Bot Created on: 13/Nov/19 17:11 Start Date: 13/Nov/19 17:11 Worklog Time Spent: 10m Work Description: nandakumar131 commented on pull request #147: HDDS-2464. Avoid unnecessary allocations for FileChannel.open call URL: https://github.com/apache/hadoop-ozone/pull/147 Issue Time Tracking --- Worklog Id: (was: 342728) Time Spent: 20m (was: 10m) > Avoid unnecessary allocations for FileChannel.open call > --- > > Key: HDDS-2464 > URL: https://issues.apache.org/jira/browse/HDDS-2464 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > {{ChunkUtils}} calls {{FileChannel#open(Path, OpenOption...)}}. Vararg array > elements are then added to a new {{HashSet}} to call {{FileChannel#open(Path, > Set, FileAttribute...)}}. We can call the latter > directly instead.
[jira] [Updated] (HDFS-14924) RenameSnapshot not updating new modification time
[ https://issues.apache.org/jira/browse/HDFS-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hemanthboyina updated HDFS-14924: - Attachment: HDFS-14924.001.patch > RenameSnapshot not updating new modification time > - > > Key: HDFS-14924 > URL: https://issues.apache.org/jira/browse/HDFS-14924 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14924.001.patch > > > RenameSnapshot does not update the modification time.
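The missing step can be shown with a tiny stand-in. This is a hypothetical shape of the fix, not the attached patch (which touches the actual snapshot-manager code): renaming a snapshot should also record a new modification time, as other mutations of a snapshottable directory do:

```java
public class RenameSnapshotSketch {
  // Hypothetical stand-in for a snapshottable directory's state.
  long modificationTime;

  // Renaming a snapshot should record the mutation on the directory;
  // setting the modification time is the step this issue reports missing.
  void renameSnapshot(String oldName, String newName, long now) {
    // ... rename bookkeeping elided ...
    modificationTime = now;
  }
}
```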