[GitHub] [hadoop-ozone] bshashikant commented on pull request #1005: HDDS-3350. Ozone Retry Policy Improvements.
bshashikant commented on pull request #1005: URL: https://github.com/apache/hadoop-ozone/pull/1005#issuecomment-637972830 @lokeshj1703 , can you please update the patch? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] vinayakumarb commented on pull request #987: HDDS-3678. Remove usage of DFSUtil.addPBProtocol method
vinayakumarb commented on pull request #987: URL: https://github.com/apache/hadoop-ozone/pull/987#issuecomment-637969319 > > some code changes are already suggested in #933 > > I think the only overlap is `shuffle`. I'm fine with reverting that part of my change here to avoid conflict. I think this version of shuffle is fine. I will update #933 once this is in.
[jira] [Resolved] (HDDS-3477) Disable partial chunk write during flush() call in ozone client by default
[ https://issues.apache.org/jira/browse/HDDS-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDDS-3477. --- Resolution: Fixed > Disable partial chunk write during flush() call in ozone client by default > -- > > Key: HDDS-3477 > URL: https://issues.apache.org/jira/browse/HDDS-3477 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Shashikant Banerjee >Assignee: mingchao zhao >Priority: Major > Labels: Triaged, pull-request-available > Fix For: 0.6.0 > > > Currently, the Ozone client flushes partial chunks as well during a flush() > call by default. > [https://github.com/apache/hadoop-ozone/pull/716] proposes to add a > configuration to disallow partial chunk flush during a flush() call. This Jira > aims to enable that config by default to mimic the default HDFS flush() > behaviour and fix any failing unit tests associated with the change. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
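The flush() semantics described in HDDS-3477 can be illustrated with a small sketch. This is a hypothetical stand-in, not the actual Ozone client code: `ChunkedWriter` and its `flushPartialChunks` flag are illustrative names for the behaviour the config toggles. With partial-chunk flush disabled, flush() persists only whole chunks and leaves the partial tail buffered until close(), mimicking default HDFS behaviour.

```java
// Hypothetical sketch (not the real Ozone client): models the flush()
// behaviour described above. flushPartialChunks=false means flush() only
// persists data up to the last full chunk boundary; the partial tail stays
// buffered until close().
import java.util.ArrayList;
import java.util.List;

public class ChunkedWriter {
    private final int chunkSize;
    private final boolean flushPartialChunks; // cf. the config proposed in PR #716
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> persistedChunks = new ArrayList<>();

    public ChunkedWriter(int chunkSize, boolean flushPartialChunks) {
        this.chunkSize = chunkSize;
        this.flushPartialChunks = flushPartialChunks;
    }

    public void write(String data) { buffer.append(data); }

    public void flush() {
        int limit = flushPartialChunks
            ? buffer.length()                               // flush everything, partial tail included
            : (buffer.length() / chunkSize) * chunkSize;    // only whole chunks
        persist(limit);
    }

    public void close() { persist(buffer.length()); }       // close always drains the buffer

    private void persist(int limit) {
        int off = 0;
        while (off < limit) {
            int end = Math.min(off + chunkSize, limit);
            persistedChunks.add(buffer.substring(off, end));
            off = end;
        }
        buffer.delete(0, limit);                            // drop what was persisted
    }

    public int persistedBytes() {
        int n = 0;
        for (String c : persistedChunks) n += c.length();
        return n;
    }
}
```

With a 4-byte chunk size and 10 bytes written, flush() persists 8 bytes when partial flush is disabled and all 10 when it is enabled; close() drains the remainder either way.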
[GitHub] [hadoop-ozone] bshashikant merged pull request #957: HDDS-3477. Disable partial chunk write during flush() call in ozone client by default.
bshashikant merged pull request #957: URL: https://github.com/apache/hadoop-ozone/pull/957
[GitHub] [hadoop-ozone] bshashikant commented on pull request #957: HDDS-3477. Disable partial chunk write during flush() call in ozone client by default.
bshashikant commented on pull request #957: URL: https://github.com/apache/hadoop-ozone/pull/957#issuecomment-63795 Thanks @captainzmc for the contribution. I have committed this to the master branch.
[jira] [Commented] (HDDS-3678) Remove usage of DFSUtil.addPBProtocol method
[ https://issues.apache.org/jira/browse/HDDS-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124587#comment-17124587 ] Attila Doroszlai commented on HDDS-3678: [~seanlau], I don't think HADOOP-17046 needs to wait for this. Ozone currently depends on Hadoop 3.2.1, and will need an explicit change to upgrade to 3.3. > Remove usage of DFSUtil.addPBProtocol method > > > Key: HDDS-3678 > URL: https://issues.apache.org/jira/browse/HDDS-3678 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: build >Reporter: Akira Ajisaka >Assignee: Attila Doroszlai >Priority: Major > Labels: Triaged, pull-request-available > > Hadoop 3.3.0 upgraded protocol buffers to 3.7.1 and the RPC code has been > changed. This change will cause a compile failure in Ozone. > Vinayakumar is fixing this on the Hadoop side (HADOOP-17046), but it would be > better for Ozone to avoid the usage of Hadoop {{@Private}} classes to make > Ozone a separate project from Hadoop.
[jira] [Resolved] (HDDS-3693) Switch to PipelineStateManagerV2 and put PipelineFactory in PipelineManager
[ https://issues.apache.org/jira/browse/HDDS-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Cheng resolved HDDS-3693. Release Note: PR is merged Resolution: Fixed > Switch to PipelineStateManagerV2 and put PipelineFactory in PipelineManager > --- > > Key: HDDS-3693 > URL: https://issues.apache.org/jira/browse/HDDS-3693 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Li Cheng >Assignee: Li Cheng >Priority: Major > Labels: pull-request-available >
[GitHub] [hadoop-ozone] timmylicheng commented on pull request #1007: HDDS-3693 Switch to new StateManager interface.
timmylicheng commented on pull request #1007: URL: https://github.com/apache/hadoop-ozone/pull/1007#issuecomment-637932167 Thanks @xiaoyuyao for the review. I will merge it in.
[GitHub] [hadoop-ozone] timmylicheng merged pull request #1007: HDDS-3693 Switch to new StateManager interface.
timmylicheng merged pull request #1007: URL: https://github.com/apache/hadoop-ozone/pull/1007
[jira] [Updated] (HDDS-3667) If we gracefully stop datanode it would be better to notify scm and recon to unregister
[ https://issues.apache.org/jira/browse/HDDS-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDDS-3667: - Description: If you execute `bin/ozone --daemon stop datanode`, it would be better if the datanode handled the signal and sent an unregister request to SCM; SCM would then remove this datanode and reply to the datanode to shut down. It would also be better to provide an admin CLI tool to support removing a datanode manually. {code:bash} ozone admin datanode remove {code} was: If you execute `bin/ozone --daemon stop datanode`, it would be better if the datanode handled the signal and sent an unregister request to SCM; SCM would then remove this datanode and reply to the datanode to shut down. It would also be better to provide an admin CLI tool to support removing a datanode manually. {code:shell} ozone admin datanode remove {code} > If we gracefully stop datanode it would be better to notify scm and recon to > unregister > --- > > Key: HDDS-3667 > URL: https://issues.apache.org/jira/browse/HDDS-3667 > Project: Hadoop Distributed Data Store > Issue Type: New Feature > Components: Ozone Datanode, Ozone Recon, SCM >Affects Versions: 0.6.0 >Reporter: maobaolong >Assignee: Lisheng Sun >Priority: Minor > > If you execute `bin/ozone --daemon stop datanode`, it would be better > if the datanode handled the signal and sent an unregister request to SCM; > SCM would then remove this datanode and reply to the datanode to shut down. > It would also be better to provide an admin CLI tool to support removing a > datanode manually. > {code:bash} > ozone admin datanode remove > {code} >
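A minimal sketch of the graceful-stop behaviour proposed in HDDS-3667, under stated assumptions: the `ScmClient` interface and its `unregister` call are hypothetical placeholders, not the real Ozone API (the actual protocol is what this Jira would define). The idea is that `bin/ozone --daemon stop datanode` delivers SIGTERM, which triggers JVM shutdown hooks, giving the datanode a chance to unregister from SCM before the process exits.

```java
// Hypothetical sketch of the proposed behaviour: a JVM shutdown hook that
// sends an unregister request to SCM before the datanode process exits.
// ScmClient and unregister() are illustrative assumptions, not the real API.
public class GracefulStop {
    interface ScmClient {
        void unregister(String datanodeId); // assumed RPC; real protocol TBD in HDDS-3667
    }

    public static Thread installUnregisterHook(ScmClient scm, String datanodeId) {
        Thread hook = new Thread(() -> scm.unregister(datanodeId));
        // SIGTERM from `bin/ozone --daemon stop datanode` triggers JVM
        // shutdown hooks, so the unregister runs before the process dies.
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

A real implementation would also need a timeout and a fallback, since SCM may be unreachable while the datanode is shutting down.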
[jira] [Commented] (HDDS-2562) Handle InterruptedException in DatanodeStateMachine
[ https://issues.apache.org/jira/browse/HDDS-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124560#comment-17124560 ] Dinesh Chitlangia commented on HDDS-2562: - Reopened as the PR was reverted > Handle InterruptedException in DatanodeStateMachine > --- > > Key: HDDS-2562 > URL: https://issues.apache.org/jira/browse/HDDS-2562 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Dinesh Chitlangia >Assignee: Dinesh Chitlangia >Priority: Major > Labels: newbie, sonar > > Fix 2 instances: > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-7fKcVY8lQ4ZsRv=AW5md-7fKcVY8lQ4ZsRv] > > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-7fKcVY8lQ4ZsRx=AW5md-7fKcVY8lQ4ZsRx] >
[jira] [Updated] (HDDS-2562) Handle InterruptedException in DatanodeStateMachine
[ https://issues.apache.org/jira/browse/HDDS-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Chitlangia updated HDDS-2562: Fix Version/s: (was: 0.6.0) > Handle InterruptedException in DatanodeStateMachine > --- > > Key: HDDS-2562 > URL: https://issues.apache.org/jira/browse/HDDS-2562 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Dinesh Chitlangia >Assignee: Dinesh Chitlangia >Priority: Major > Labels: newbie, pull-request-available, sonar > > Fix 2 instances: > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-7fKcVY8lQ4ZsRv=AW5md-7fKcVY8lQ4ZsRv] > > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-7fKcVY8lQ4ZsRx=AW5md-7fKcVY8lQ4ZsRx] >
[jira] [Updated] (HDDS-2562) Handle InterruptedException in DatanodeStateMachine
[ https://issues.apache.org/jira/browse/HDDS-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Chitlangia updated HDDS-2562: Labels: newbie sonar (was: newbie pull-request-available sonar) > Handle InterruptedException in DatanodeStateMachine > --- > > Key: HDDS-2562 > URL: https://issues.apache.org/jira/browse/HDDS-2562 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Dinesh Chitlangia >Assignee: Dinesh Chitlangia >Priority: Major > Labels: newbie, sonar > > Fix 2 instances: > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-7fKcVY8lQ4ZsRv=AW5md-7fKcVY8lQ4ZsRv] > > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-7fKcVY8lQ4ZsRx=AW5md-7fKcVY8lQ4ZsRx] >
[jira] [Reopened] (HDDS-2562) Handle InterruptedException in DatanodeStateMachine
[ https://issues.apache.org/jira/browse/HDDS-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Chitlangia reopened HDDS-2562: - > Handle InterruptedException in DatanodeStateMachine > --- > > Key: HDDS-2562 > URL: https://issues.apache.org/jira/browse/HDDS-2562 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Dinesh Chitlangia >Assignee: Dinesh Chitlangia >Priority: Major > Labels: newbie, pull-request-available, sonar > Fix For: 0.6.0 > > > Fix 2 instances: > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-7fKcVY8lQ4ZsRv=AW5md-7fKcVY8lQ4ZsRv] > > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-7fKcVY8lQ4ZsRx=AW5md-7fKcVY8lQ4ZsRx] >
[GitHub] [hadoop-ozone] dineshchitlangia merged pull request #1011: Revert "HDDS-2562. Handle InterruptedException in DatanodeStateMachine"
dineshchitlangia merged pull request #1011: URL: https://github.com/apache/hadoop-ozone/pull/1011
[jira] [Updated] (HDDS-3567) Make replicationfactor can be configurable to any number
[ https://issues.apache.org/jira/browse/HDDS-3567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDDS-3567: - Description: As there is a feature request that Ozone should support any replication number of a file, we have the following subtask to do. The following is a simple design document. https://docs.google.com/document/d/1JFjfTb21qqaIVr18FgYNwMEEKBeuklZCRHN7TTq_nBo/edit?usp=sharing https://docs.qq.com/doc/DV2N6bWdCcnJVc3Rk was: As there is a feature request that Ozone should support any replication number of a file, we have the following subtask to do. The following is a simple design document. https://docs.qq.com/doc/DV2N6bWdCcnJVc3Rk > Make replicationfactor can be configurable to any number > > > Key: HDDS-3567 > URL: https://issues.apache.org/jira/browse/HDDS-3567 > Project: Hadoop Distributed Data Store > Issue Type: New Feature > Components: om, Ozone CLI, Ozone Datanode, Ozone Manager, SCM >Affects Versions: 0.6.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > > As there is a feature request that Ozone should support any replication > number of a file, we have the following subtask to do. > The following is a simple design document. > https://docs.google.com/document/d/1JFjfTb21qqaIVr18FgYNwMEEKBeuklZCRHN7TTq_nBo/edit?usp=sharing > https://docs.qq.com/doc/DV2N6bWdCcnJVc3Rk
[GitHub] [hadoop-ozone] captainzmc commented on pull request #814: HDDS-3286. BasicOzoneFileSystem support batchDelete.
captainzmc commented on pull request #814: URL: https://github.com/apache/hadoop-ozone/pull/814#issuecomment-637919589 Fixed the issues with coarse lock granularity. The PR is ready for further review.
[GitHub] [hadoop-ozone] captainzmc commented on pull request #957: HDDS-3477. Disable partial chunk write during flush() call in ozone client by default.
captainzmc commented on pull request #957: URL: https://github.com/apache/hadoop-ozone/pull/957#issuecomment-637919086 Set the flush config to false in the original UT, and rewrote the same test with the flush configuration set to true. cc @bshashikant
[GitHub] [hadoop-ozone] prashantpogde commented on a change in pull request #1004: HDDS-3639. Maintain FileHandle Information in OMMetadataManager.
prashantpogde commented on a change in pull request #1004: URL: https://github.com/apache/hadoop-ozone/pull/1004#discussion_r434276036 ## File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/response/key/OMKeyRenameResponse.java ##
@@ -88,11 +88,22 @@ public void addToDBBatch(OMMetadataManager omMetadataManager,
           omMetadataManager.getOzoneKey(volumeName, bucketName, fromKeyName));
     } else if (createToKeyAndDeleteFromKey()) { // If both from and toKeyName are equal do nothing
-      omMetadataManager.getKeyTable().deleteWithBatch(batchOperation,
-          omMetadataManager.getOzoneKey(volumeName, bucketName, fromKeyName));
-      omMetadataManager.getKeyTable().putWithBatch(batchOperation,
-          omMetadataManager.getOzoneKey(volumeName, bucketName, toKeyName),
-          newKeyInfo);
+      if (!toKeyName.equals(fromKeyName)) {
+        omMetadataManager.getKeyTable().deleteWithBatch(batchOperation,
+            omMetadataManager.getOzoneKey(volumeName, bucketName, fromKeyName));
+        omMetadataManager.getKeyTable().putWithBatch(batchOperation,
+            omMetadataManager.getOzoneKey(volumeName, bucketName, toKeyName),
+            newKeyInfo);
+        // At this point we can also update the KeyIdTable.
+        if (newKeyInfo.getFileHandleInfo() != 0) {
Review comment: This means we never created the file handle for this key. The check is there to take care of this case. Eventually we should have a file handle for all keys.
[GitHub] [hadoop-ozone] prashantpogde commented on a change in pull request #1004: HDDS-3639. Maintain FileHandle Information in OMMetadataManager.
prashantpogde commented on a change in pull request #1004: URL: https://github.com/apache/hadoop-ozone/pull/1004#discussion_r434276194 ## File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/response/key/OMKeyRenameResponse.java ##
@@ -88,11 +88,22 @@ public void addToDBBatch(OMMetadataManager omMetadataManager,
           omMetadataManager.getOzoneKey(volumeName, bucketName, fromKeyName));
     } else if (createToKeyAndDeleteFromKey()) { // If both from and toKeyName are equal do nothing
-      omMetadataManager.getKeyTable().deleteWithBatch(batchOperation,
-          omMetadataManager.getOzoneKey(volumeName, bucketName, fromKeyName));
-      omMetadataManager.getKeyTable().putWithBatch(batchOperation,
-          omMetadataManager.getOzoneKey(volumeName, bucketName, toKeyName),
-          newKeyInfo);
+      if (!toKeyName.equals(fromKeyName)) {
Review comment: Yes, the check is there in createToKeyAndDeleteFromKey() and it is redundant. I will remove the check from here and upload again.
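The rename logic under review can be reduced to a toy sketch: a rename is a delete of the source key plus a put of the destination key in one batch, skipped entirely when the two names are equal (the guard the reviewers agree belongs in `createToKeyAndDeleteFromKey()`). A `HashMap` stands in for the RocksDB-backed key table here; the names are illustrative, not the real OM API.

```java
// Toy sketch of the rename-response logic discussed in PR #1004:
// delete old key + put new key, skipped when source == destination.
// The Map stands in for the RocksDB-backed OM key table.
import java.util.Map;

public class RenameBatch {
    public static boolean rename(Map<String, String> keyTable,
                                 String fromKey, String toKey) {
        if (toKey.equals(fromKey)) {
            return false; // no-op, mirrors the equality guard under discussion
        }
        String info = keyTable.remove(fromKey); // deleteWithBatch analogue
        keyTable.put(toKey, info);              // putWithBatch analogue
        return true;
    }
}
```

The point of the review thread is that this equality guard should live in one place; duplicating it at both the caller and the batch-apply site is redundant.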
[jira] [Updated] (HDDS-3658) Stop to persist container related pipeline info of each key into OM DB to reduce DB size
[ https://issues.apache.org/jira/browse/HDDS-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-3658: - Labels: pull-request-available (was: ) > Stop to persist container related pipeline info of each key into OM DB to > reduce DB size > > > Key: HDDS-3658 > URL: https://issues.apache.org/jira/browse/HDDS-3658 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Sammi Chen >Assignee: Sammi Chen >Priority: Major > Labels: pull-request-available > > An investigation result of serialized key size, RATIS with three replicas. > Following examples are quoted from the output of the "ozone sh key info" > command, which doesn't show related pipeline information for each key location > element. > 1. empty key, serialized size 113 bytes > hadoop/bucket/user/root/terasort/10G-input-7/_SUCCESS > { > "volumeName" : "hadoop", > "bucketName" : "bucket", > "name" : "user/root/terasort/10G-input-7/_SUCCESS", > "dataSize" : 0, > "creationTime" : "2019-11-21T13:53:11.330Z", > "modificationTime" : "2019-11-21T13:53:11.361Z", > "replicationType" : "RATIS", > "replicationFactor" : 3, > "ozoneKeyLocations" : [ ], > "metadata" : { }, > "fileEncryptionInfo" : null > } > 2. key with one chunk data, serialized size 661 bytes > hadoop/bucket/user/root/terasort/10G-input-6/part-m-00037 > { > "volumeName" : "hadoop", > "bucketName" : "bucket", > "name" : "user/root/terasort/10G-input-6/part-m-00037", > "dataSize" : 223696200, > "creationTime" : "2019-11-18T07:47:58.254Z", > "modificationTime" : "2019-11-18T07:53:52.066Z", > "replicationType" : "RATIS", > "replicationFactor" : 3, > "ozoneKeyLocations" : [ { > "containerID" : 7, > "localID" : 103157811003588713, > "length" : 223696200, > "offset" : 0 > } ], > "metadata" : { }, > "fileEncryptionInfo" : null > } > 3. key with two chunk data, serialized size 1205 bytes > ozone sh key info hadoop/bucket/user/root/terasort/10G-input-7/part-m-00027 > { > "volumeName" : "hadoop", > "bucketName" : "bucket", > "name" : "user/root/terasort/10G-input-7/part-m-00027", > "dataSize" : 223696200, > "creationTime" : "2019-11-21T13:47:07.653Z", > "modificationTime" : "2019-11-21T13:53:07.964Z", > "replicationType" : "RATIS", > "replicationFactor" : 3, > "ozoneKeyLocations" : [ { > "containerID" : 221, > "localID" : 103176210196201501, > "length" : 134217728, > "offset" : 0 > }, { > "containerID" : 222, > "localID" : 103176231767375926, > "length" : 89478472, > "offset" : 0 > } ], > "metadata" : { }, > "fileEncryptionInfo" : null > } > When a client reads a key, there is a "refreshPipeline" option to control whether > to get the up-to-date container location info from SCM. > Currently, this option is always set to true, which makes the saved container > location info in OM DB useless. > Another motivation is when using Nanda's tool for the OM performance test, > with 1000 million (1 billion) keys, each key with 1 replica and 2 chunk > metadata, the total RocksDB directory size is 65.5GB. One of our customer > clusters has the requirement to save 10 billion objects. In this case, the DB > size is approximately (65.5GB * 10 / 2 * 3) ≈ 1TB. > The goal of this task is to discard the container location info when > persisting keys to the OM DB to save DB space.
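The DB-size projection in the HDDS-3658 description is simple scaling arithmetic, though the expression in the original email is garbled. One plausible reading (an assumption, not confirmed by the source): 65.5 GB was measured for 1 billion keys at 1 replica with 2 chunk entries each, then scaled to 10 billion keys with 3 replica locations, giving 65.5 × 10 / 2 × 3 ≈ 982.5 GB, roughly 1 TB. The helper below is only a sanity check of that reading.

```java
// Sanity check of one plausible reading of the HDDS-3658 size estimate.
// All parameter names are illustrative; the scaling model is an assumption.
public class DbSizeEstimate {
    public static double estimateGb(double measuredGb, double keyScale,
                                    double measuredLocations, double targetLocations) {
        // Scale measured DB size by key count and by per-key location entries.
        return measuredGb * keyScale / measuredLocations * targetLocations;
    }
}
```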
[GitHub] [hadoop-ozone] ChenSammi opened a new pull request #1012: HDDS-3658. Stop to persist container related pipeline info of each ke…
ChenSammi opened a new pull request #1012: URL: https://github.com/apache/hadoop-ozone/pull/1012 https://issues.apache.org/jira/browse/HDDS-3658
[jira] [Assigned] (HDDS-3428) Enable TestOzoneRpcClientWithRatis test cases
[ https://issues.apache.org/jira/browse/HDDS-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Pogde reassigned HDDS-3428: Assignee: Prashant Pogde > Enable TestOzoneRpcClientWithRatis test cases > - > > Key: HDDS-3428 > URL: https://issues.apache.org/jira/browse/HDDS-3428 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: test >Affects Versions: 0.5.0 >Reporter: Nanda kumar >Assignee: Prashant Pogde >Priority: Major > > Fix and enable TestOzoneRpcClientWithRatis test cases >
[jira] [Updated] (HDDS-3658) Stop to persist container related pipeline info of each key into OM DB to reduce DB size
[ https://issues.apache.org/jira/browse/HDDS-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sammi Chen updated HDDS-3658: - Summary: Stop to persist container related pipeline info of each key into OM DB to reduce DB size (was: Stop persist container related pipeline info of each key into OM DB to reduce DB size) > Stop to persist container related pipeline info of each key into OM DB to > reduce DB size > > > Key: HDDS-3658 > URL: https://issues.apache.org/jira/browse/HDDS-3658 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Sammi Chen >Assignee: Sammi Chen >Priority: Major >
[jira] [Updated] (HDDS-3658) Stop persist container related pipeline info of each key into OM DB to reduce DB size
[ https://issues.apache.org/jira/browse/HDDS-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sammi Chen updated HDDS-3658: - Summary: Stop persist container related pipeline info of each key into OM DB to reduce DB size (was: Remove container location information when persist key info into OM DB to reduce meta data db size) > Stop persist container related pipeline info of each key into OM DB to reduce > DB size > - > > Key: HDDS-3658 > URL: https://issues.apache.org/jira/browse/HDDS-3658 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Sammi Chen >Assignee: Sammi Chen >Priority: Major >
[jira] [Updated] (HDDS-3658) Remove container location information when persist key info into OM DB to reduce meta data db size
[ https://issues.apache.org/jira/browse/HDDS-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sammi Chen updated HDDS-3658: - Description: updated to note that the examples are quoted from the output of the "ozone sh key info" command (which doesn't show related pipeline information for each key location element) and to clarify the goal; the full text is otherwise identical to the description quoted in the preceding summary-update notification.
[jira] [Commented] (HDDS-3658) Remove container location information when persist key info into OM DB to reduce meta data db size
[ https://issues.apache.org/jira/browse/HDDS-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124522#comment-17124522 ] Sammi Chen commented on HDDS-3658: -- Hi [~elek], the example is copied from the "ozone sh key info" command output. It actually does not show the pipeline information for each key location so far. I will add more description to clarify the goal, which you understand 100% correctly. > Remove container location information when persist key info into OM DB to > reduce meta data db size > -- > > Key: HDDS-3658 > URL: https://issues.apache.org/jira/browse/HDDS-3658 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Sammi Chen >Assignee: Sammi Chen >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] sonarcloud[bot] commented on pull request #1011: Revert "HDDS-2562. Handle InterruptedException in DatanodeStateMachine"
sonarcloud[bot] commented on pull request #1011: URL: https://github.com/apache/hadoop-ozone/pull/1011#issuecomment-637911957 SonarCloud Quality Gate failed: 2 Bugs, 0 Vulnerabilities (and 0 Security Hotspots to review), 3 Code Smells, 0.0% Coverage, 0.0% Duplication. The version of Java (1.8.0_232) used to run this analysis is deprecated and will stop being accepted from October 2020. Please update to at least Java 11. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] ChenSammi commented on pull request #976: HDDS-3672. Ozone fs failed to list intermediate directory.
ChenSammi commented on pull request #976: URL: https://github.com/apache/hadoop-ozone/pull/976#issuecomment-637911979 Thanks @elek and @rakeshadr for the review. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] dineshchitlangia opened a new pull request #1011: Revert "HDDS-2562. Handle InterruptedException in DatanodeStateMachine"
dineshchitlangia opened a new pull request #1011: URL: https://github.com/apache/hadoop-ozone/pull/1011 Reverts apache/hadoop-ozone#969 After this change was merged, acceptance and integration tests started timing out. Reverting this to ensure other PRs are not blocked while I conduct more investigation. Apologies for the inconvenience. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-3678) Remove usage of DFSUtil.addPBProtocol method
[ https://issues.apache.org/jira/browse/HDDS-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124500#comment-17124500 ] liusheng commented on HDDS-3678: Hi, does HADOOP-17046 need to wait for this to be merged first? > Remove usage of DFSUtil.addPBProtocol method > > > Key: HDDS-3678 > URL: https://issues.apache.org/jira/browse/HDDS-3678 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: build >Reporter: Akira Ajisaka >Assignee: Attila Doroszlai >Priority: Major > Labels: Triaged, pull-request-available > > Hadoop 3.3.0 upgraded protocol buffers to 3.7.1 and the RPC code has been > changed. This change will cause a compile failure in Ozone. > Vinayakumar is fixing this on the Hadoop side (HADOOP-17046), but it would be > better for Ozone to avoid the usage of Hadoop {{@Private}} classes to make > Ozone a separate project from Hadoop. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] iamabug commented on pull request #870: HDDS-2765. security/SecureOzone.md translation
iamabug commented on pull request #870: URL: https://github.com/apache/hadoop-ozone/pull/870#issuecomment-637889210 > Thanks @iamabug for the translation. > Overall LGTM, a minor title could be consistent with the latest release. Sorry for taking so long, a new commit has been made This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-3350) Ozone Retry Policy Improvements
[ https://issues.apache.org/jira/browse/HDDS-3350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3350: --- Priority: Blocker (was: Major) > Ozone Retry Policy Improvements > --- > > Key: HDDS-3350 > URL: https://issues.apache.org/jira/browse/HDDS-3350 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Client >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Blocker > Labels: Triaged, pull-request-available > Attachments: Retry Behaviour in Ozone Client.pdf, Retry Behaviour in > Ozone Client_Updated.pdf, Retry Behaviour in Ozone Client_Updated_2.pdf, > Retry Policy Results - Teragen 100GB.pdf > > > Currently any ozone client request can spend a huge amount of time in retries > and ozone client can retry its requests very aggressively. The waiting time > can thus be very high before a client request fails. Further aggressive > retries by ratis client used by ozone can bog down a ratis pipeline leader. > The Jira aims to make changes to the current retry behavior in Ozone client. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
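The concern in HDDS-3350 above, that clients retry aggressively and can wait a very long time before a request fails, is commonly addressed by bounding both the number of attempts and the per-attempt backoff. The sketch below illustrates that general idea only; it is not Ozone's or Ratis's actual retry policy, and the limits are illustrative assumptions.

```python
# Illustrative bounded retry policy: exponential backoff with a cap on the
# per-attempt sleep and on the total attempt count, so a failing request
# neither retries forever nor hammers the pipeline leader. Generic sketch,
# not Ozone's real implementation.
def backoff_schedule(max_attempts=5, base_ms=100, cap_ms=2000):
    """Return the sleep (in ms) to apply before each retry attempt."""
    sleeps = []
    for attempt in range(max_attempts):
        # Double the wait each attempt, but never exceed the cap.
        sleeps.append(min(cap_ms, base_ms * (2 ** attempt)))
    return sleeps

print(backoff_schedule())  # [100, 200, 400, 800, 1600]
```

The total wall-clock time a client can spend retrying is then bounded by the sum of the schedule, which is the kind of predictable failure window the Jira asks for.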
[jira] [Commented] (HDDS-3402) Use proper acls for sub directories created during CreateDirectory operation
[ https://issues.apache.org/jira/browse/HDDS-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124463#comment-17124463 ] Jitendra Nath Pandey commented on HDDS-3402: cc [~xyao] > Use proper acls for sub directories created during CreateDirectory operation > > > Key: HDDS-3402 > URL: https://issues.apache.org/jira/browse/HDDS-3402 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Bharat Viswanadham >Assignee: Rakesh Radhakrishnan >Priority: Blocker > Labels: TriagePending > > Use proper ACLS for subdirectories created during create directory operation. > All subdirectories/missing directories should inherit the ACLS from the > bucket if ancestors are not present in key table. If present should inherit > the ACLS from its ancestor. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-3402) Use proper acls for sub directories created during CreateDirectory operation
[ https://issues.apache.org/jira/browse/HDDS-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3402: --- Priority: Blocker (was: Major) > Use proper acls for sub directories created during CreateDirectory operation > > > Key: HDDS-3402 > URL: https://issues.apache.org/jira/browse/HDDS-3402 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Bharat Viswanadham >Assignee: Rakesh Radhakrishnan >Priority: Blocker > Labels: TriagePending > > Use proper ACLS for subdirectories created during create directory operation. > All subdirectories/missing directories should inherit the ACLS from the > bucket if ancestors are not present in key table. If present should inherit > the ACLS from its ancestor. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #990: HDDS-3653. Add documentation for Copy key command
xiaoyuyao commented on a change in pull request #990: URL: https://github.com/apache/hadoop-ozone/pull/990#discussion_r434231116 ## File path: hadoop-hdds/docs/content/shell/KeyCommands.md ## @@ -155,3 +156,22 @@ The `key cat` command displays the contents of a specific Ozone key to standard ozone sh key cat /hive/jan/hello.txt {{< /highlight >}} Displays the contents of the key hello.txt from the _/hive/jan_ bucket to standard output. + +### Cp + +The `key cp` command copies a key to another one in the specified bucket. + +***Params:*** + +| Arguments | Comment| +||-| +| Uri | The name of the bucket in **/volume/bucket** format. +| FromKey | The existing key to be copied +| ToKey | The name of the new key +| -r, \-\-replication| Optional, Number of copies, ONE or THREE are the options. Picks up the default from cluster configuration. +| -t, \-\-type | Optional, replication type of the new key. RATIS and STAND_ALONE are the options. Picks up the default from cluster configuration. + +{{< highlight bash >}} +ozone sh key cp /hive/jan sales.orc new_one.orc Review comment: The parameters are a bit different from the hadoop cp command. My understanding is that the uri is to help with OM HA, but here we seem to be restricted to the same URI. Does it take full uris as the from and to key parameters? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-3436) Enable TestOzoneContainerRatis test cases
[ https://issues.apache.org/jira/browse/HDDS-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3436: --- Priority: Critical (was: Major) > Enable TestOzoneContainerRatis test cases > - > > Key: HDDS-3436 > URL: https://issues.apache.org/jira/browse/HDDS-3436 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: test >Affects Versions: 0.5.0 >Reporter: Nanda kumar >Priority: Critical > > Fix and enable TestOzoneContainerRatis test cases -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-3439) Enable TestSecureContainerServer test cases
[ https://issues.apache.org/jira/browse/HDDS-3439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3439: --- Priority: Blocker (was: Major) > Enable TestSecureContainerServer test cases > --- > > Key: HDDS-3439 > URL: https://issues.apache.org/jira/browse/HDDS-3439 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: test >Affects Versions: 0.5.0 >Reporter: Nanda kumar >Priority: Blocker > > Fix and enable TestSecureContainerServer test cases -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] xiaoyuyao commented on pull request #992: HDDS-3627. Remove FilteredClassloader and replace with maven based hadoop2/hadoop3 ozonefs generation
xiaoyuyao commented on pull request #992: URL: https://github.com/apache/hadoop-ozone/pull/992#issuecomment-637860876 @elek Agree. If you could check and confirm the CI is clean, I think we should merge this one given the size of the patch. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-3458) Support Hadoop 2.x with build-time classpath separation instead of isolated classloader
[ https://issues.apache.org/jira/browse/HDDS-3458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3458: --- Priority: Blocker (was: Major) > Support Hadoop 2.x with build-time classpath separation instead of isolated > classloader > --- > > Key: HDDS-3458 > URL: https://issues.apache.org/jira/browse/HDDS-3458 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: build >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Blocker > Labels: Triaged > Attachments: classpath.pdf > > > Apache Hadoop Ozone is a Hadoop subproject. It depends on the released Hadoop > 3.2. But as Hadoop 3.2 is very rare in production, older versions should be > supported to make it possible to work together with Spark, Hive, HBase and > older clusters. > Our current approach is using classloader-based separation (the ozonefs "legacy" > jar), which has multiple problems: > 1. It's quite complex and hard to debug > 2. It can't work together with security > This issue proposes a different approach: > 1. Reduce the dependency on Hadoop (including the replacement of hadoop > metrics and cleanup of the usage of configuration) > 2. Create multiple versions of ozonefs-client with different compile-time > dependencies. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-3476) Use persisted transaction info during OM startup in OM StateMachine
[ https://issues.apache.org/jira/browse/HDDS-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3476: --- Priority: Critical (was: Major) > Use persisted transaction info during OM startup in OM StateMachine > --- > > Key: HDDS-3476 > URL: https://issues.apache.org/jira/browse/HDDS-3476 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Critical > Labels: Triaged, pull-request-available > > HDDS-3475 persisted transaction info into DB. This Jira is to use > transactionInfo persisted to DB during OM startup. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-3481) SCM ask 31 datanodes to replicate the same container
[ https://issues.apache.org/jira/browse/HDDS-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3481: --- Priority: Critical (was: Major) > SCM ask 31 datanodes to replicate the same container > > > Key: HDDS-3481 > URL: https://issues.apache.org/jira/browse/HDDS-3481 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Critical > Labels: TriagePending > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, > screenshot-4.png > > > *What's the problem ?* > As the image shows, SCM asked 31 datanodes to replicate container 2037 every > 10 minutes from 2020-04-17 23:38:51. And at 2020-04-18 08:58:52 SCM found the > replica count of container 2037 was 12, so it asked 11 datanodes to delete > container 2037. > !screenshot-1.png! > !screenshot-2.png! > *What's the reason ?* > SCM checks whether (container replica count + > inflightReplication.get(containerId).size() - > inflightDeletion.get(containerId).size()) is less than 3. If less than 3, it > asks some datanode to replicate the container and adds the action into > inflightReplication.get(containerId). The replicate action timeout is 10 > minutes; if the action times out, SCM deletes the action from > inflightReplication.get(containerId), as the image shows. Then (container > replica count + inflightReplication.get(containerId).size() - > inflightDeletion.get(containerId).size()) is less than 3 again, and SCM asks > another datanode to replicate the container. > Because replicating a container takes a long time and sometimes cannot finish in > 10 minutes, 31 datanodes ended up replicating the container every 10 > minutes. 19 of the 31 datanodes replicated the container from the same source > datanode, which also puts heavy pressure on the source datanode and makes > replication even slower. It actually took 4 hours to finish the > first replication. > !screenshot-4.png! 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
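The feedback loop described in HDDS-3481 comes from the replica-count check restated in the report: when an in-flight replicate command times out, it is dropped from the in-flight set, so the count falls below the target again and another command is issued. A minimal model of that accounting (a simplified sketch, not SCM's actual implementation):

```python
# Simplified model of the check described in HDDS-3481: a container counts
# as under-replicated when actual replicas plus in-flight additions, minus
# in-flight deletions, fall below the replication target. If a slow
# in-flight replicate is dropped on timeout before it completes, the same
# container triggers yet another replicate command. Not SCM's real code.
def needs_replication(replicas, inflight_adds, inflight_deletes, target=3):
    return replicas + inflight_adds - inflight_deletes < target

# One replica survives; a replicate command is issued (in-flight becomes 1).
assert needs_replication(replicas=1, inflight_adds=0, inflight_deletes=0)
# While the command is tracked, the container looks handled.
assert not needs_replication(replicas=1, inflight_adds=2, inflight_deletes=0)
# After a 10-minute timeout the command is dropped from the in-flight set,
# so the container is under-replicated again and SCM asks another datanode,
# even though the first copy may still be in progress.
assert needs_replication(replicas=1, inflight_adds=0, inflight_deletes=0)
```

This is why the report sees a new replicate command every 10 minutes: each timed-out command simply vanishes from the accounting without the copy being cancelled.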
[jira] [Comment Edited] (HDDS-3508) container replicas are replicated to all available datanodes
[ https://issues.apache.org/jira/browse/HDDS-3508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124309#comment-17124309 ] Jitendra Nath Pandey edited comment on HDDS-3508 at 6/2/20, 11:24 PM: -- Duplicate of HDDS-3481? was (Author: arpitagarwal): Duplicate of HDFS-3481? > container replicas are replicated to all available datanodes > > > Key: HDDS-3508 > URL: https://issues.apache.org/jira/browse/HDDS-3508 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Nilotpal Nandi >Assignee: Nanda kumar >Priority: Major > Labels: TriagePending > > steps taken : > --- > 1. Write data > 2. Deleted hdds datanode dir from one of the container replica node (this > node is the leader). > 3. Wait for few hours. > 4. Stopped datanode where hdds datanode dir was deleted. > > Container got replicated on all available DNs > {noformat} > ozone admin container info 25 | egrep 'Container|Datanodes' > Wed Apr 29 07:28:13 UTC 2020 > Container id: 25 > Container State: CLOSED > Datanodes: > [quasar-xotthq-6.quasar-xotthq.root.hwx.site,quasar-xotthq-7.quasar-xotthq.root.hwx.site,quasar-xotthq-4.quasar-xotthq.root.hwx.site,quasar-xotthq-8.quasar-xotthq.root.hwx.site,quasar-xotthq-3.quasar-xotthq.root.hwx.site,quasar-xotthq-5.quasar-xotthq.root.hwx.site,quasar-xotthq-2.quasar-xotthq.root.hwx.site]{noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-3512) s3g multi-upload saved content incorrect when client uses aws java sdk 1.11.* jar
[ https://issues.apache.org/jira/browse/HDDS-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3512: --- Priority: Blocker (was: Major) > s3g multi-upload saved content incorrect when client uses aws java sdk 1.11.* > jar > - > > Key: HDDS-3512 > URL: https://issues.apache.org/jira/browse/HDDS-3512 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: s3g >Reporter: Sammi Chen >Priority: Blocker > Labels: TriagePending > > The default multi-part size is 5MB, which is 5242880 bytes, while all the > chunks saved by s3g are 5246566 bytes, which is greater than 5MB. > By looking into ObjectEndpoint.java, it seems the chunk size is retrieved > from the "Content-Length" header. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-3554) Multipart Upload Failed because partName mismatch
[ https://issues.apache.org/jira/browse/HDDS-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3554: --- Priority: Critical (was: Major) > Multipart Upload Failed because partName mismatch > - > > Key: HDDS-3554 > URL: https://issues.apache.org/jira/browse/HDDS-3554 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: s3g >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Critical > Labels: TriagePending > Attachments: screenshot-1.png > > > !screenshot-1.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-3562) Datanodes should send ICR when a container replica deletion is successful
[ https://issues.apache.org/jira/browse/HDDS-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3562: --- Priority: Blocker (was: Major) > Datanodes should send ICR when a container replica deletion is successful > - > > Key: HDDS-3562 > URL: https://issues.apache.org/jira/browse/HDDS-3562 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.5.0 >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Blocker > Labels: Triaged, pull-request-available > > Whenever a datanode executes the delete container command and deletes the > container replica, it has to immediately send an ICR to update the container > replica state in SCM. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-3554) Multipart Upload Failed because partName mismatch
[ https://issues.apache.org/jira/browse/HDDS-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124457#comment-17124457 ] Jitendra Nath Pandey commented on HDDS-3554: cc [~bharat] [~elek] > Multipart Upload Failed because partName mismatch > - > > Key: HDDS-3554 > URL: https://issues.apache.org/jira/browse/HDDS-3554 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: s3g >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Critical > Labels: TriagePending > Attachments: screenshot-1.png > > > !screenshot-1.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-3594) ManagedChannels are leaked in XceiverClientGrpc manager
[ https://issues.apache.org/jira/browse/HDDS-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124455#comment-17124455 ] Jitendra Nath Pandey commented on HDDS-3594:

Is this a duplicate of HDDS-3600?

ManagedChannels are leaked in XceiverClientGrpc manager

Key: HDDS-3594
URL: https://issues.apache.org/jira/browse/HDDS-3594
Project: Hadoop Distributed Data Store
Issue Type: Bug
Components: Ozone Client
Affects Versions: 0.6.0
Reporter: Rakesh Radhakrishnan
Priority: Major
Labels: TriagePending

XceiverClientGrpc ManagedChannels are leaked when running the {{Hadoop Synthetic Load Generator}} pointing to OzoneFS.

*Stacktrace:*
{code:java}
SEVERE: *~*~*~ Channel ManagedChannelImpl{logId=99, target=10.17.248.31:9859} was not shutdown properly!!! ~*~*~*
    Make sure to call shutdown()/shutdownNow() and wait until awaitTermination() returns true.
java.lang.RuntimeException: ManagedChannel allocation site
    at org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference.<init>(ManagedChannelOrphanWrapper.java:94)
    at org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper.<init>(ManagedChannelOrphanWrapper.java:52)
    at org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper.<init>(ManagedChannelOrphanWrapper.java:43)
    at org.apache.ratis.thirdparty.io.grpc.internal.AbstractManagedChannelImplBuilder.build(AbstractManagedChannelImplBuilder.java:518)
    at org.apache.hadoop.hdds.scm.XceiverClientGrpc.connectToDatanode(XceiverClientGrpc.java:191)
    at org.apache.hadoop.hdds.scm.XceiverClientGrpc.connect(XceiverClientGrpc.java:140)
    at org.apache.hadoop.hdds.scm.XceiverClientManager$2.call(XceiverClientManager.java:244)
    at org.apache.hadoop.hdds.scm.XceiverClientManager$2.call(XceiverClientManager.java:228)
    at org.apache.hadoop.ozone.shaded.com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4876)
    at org.apache.hadoop.ozone.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3529)
    at org.apache.hadoop.ozone.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2278)
    at org.apache.hadoop.ozone.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2155)
    at org.apache.hadoop.ozone.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2045)
    at org.apache.hadoop.ozone.shaded.com.google.common.cache.LocalCache.get(LocalCache.java:3951)
    at org.apache.hadoop.ozone.shaded.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4871)
    at org.apache.hadoop.hdds.scm.XceiverClientManager.getClient(XceiverClientManager.java:228)
    at org.apache.hadoop.hdds.scm.XceiverClientManager.acquireClient(XceiverClientManager.java:174)
    at org.apache.hadoop.hdds.scm.XceiverClientManager.acquireClientForReadData(XceiverClientManager.java:164)
    at org.apache.hadoop.hdds.scm.storage.BlockInputStream.getChunkInfos(BlockInputStream.java:184)
    at org.apache.hadoop.hdds.scm.storage.BlockInputStream.initialize(BlockInputStream.java:133)
    at org.apache.hadoop.hdds.scm.storage.BlockInputStream.read(BlockInputStream.java:254)
    at org.apache.hadoop.ozone.client.io.KeyInputStream.read(KeyInputStream.java:199)
    at org.apache.hadoop.fs.ozone.OzoneFSInputStream.read(OzoneFSInputStream.java:63)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.fs.loadGenerator.LoadGenerator$DFSClientThread.read(LoadGenerator.java:284)
    at org.apache.hadoop.fs.loadGenerator.LoadGenerator$DFSClientThread.nextOp(LoadGenerator.java:268)
    at org.apache.hadoop.fs.loadGenerator.LoadGenerator$DFSClientThread.run(LoadGenerator.java:235)
{code}
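The leak pattern in the stack trace above — channels created by a loading cache but never shut down — can be modeled without gRPC. The sketch below is illustrative only (the class and method names are hypothetical, not Ozone's or gRPC's API): a bounded client cache that shuts each "channel" down at eviction time, so every channel is either live in the cache or already closed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for a network channel that must be closed explicitly,
// like the ManagedChannels flagged in the stack trace above.
class Channel {
    boolean open = true;
    void shutdown() { open = false; }
}

// A cache that closes entries as it evicts them. Without the shutdown()
// call in removeEldestEntry, evicted channels would linger exactly like
// the un-shutdown ManagedChannels reported by the orphan detector.
class ClosingChannelCache extends LinkedHashMap<String, Channel> {
    private final int maxSize;

    ClosingChannelCache(int maxSize) {
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, Channel> eldest) {
        if (size() > maxSize) {
            eldest.getValue().shutdown(); // close before dropping the reference
            return true;
        }
        return false;
    }
}
```

In the real client the cache is a Guava `LocalCache` (visible in the trace); the analogous fix there would be a removal listener that shuts the channel down, rather than a `LinkedHashMap` hook.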
[jira] [Updated] (HDDS-3596) Clean up unused code after HDDS-2940 and HDDS-2942
[ https://issues.apache.org/jira/browse/HDDS-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3596:
Priority: Blocker (was: Major)

Clean up unused code after HDDS-2940 and HDDS-2942

Key: HDDS-3596
URL: https://issues.apache.org/jira/browse/HDDS-3596
Project: Hadoop Distributed Data Store
Issue Type: Sub-task
Reporter: Siyao Meng
Assignee: Siyao Meng
Priority: Blocker
Labels: Triaged, pull-request-available

Some snippets of code should be removed now that HDDS-2940 is committed.
Update: Pending HDDS-2942 commit before this can be committed.
For example [this|https://github.com/apache/hadoop-ozone/blob/ffb340e32460ccaa2eae557f0bb71fb90d7ebc7a/hadoop-ozone/ozonefs/src/main/java/org/apache/hadoop/fs/ozone/BasicOzoneFileSystem.java#L495-L499]:
{code:java|title=BasicOzoneFileSystem#delete}
if (result) {
  // If this delete operation removes all files/directories from the
  // parent directory, then an empty parent directory must be created.
  createFakeParentDirectory(f);
}
{code}
(Found at https://github.com/apache/hadoop-ozone/pull/906#discussion_r424873030)
[jira] [Updated] (HDDS-3599) Implement ofs://: Add contract test for HA
[ https://issues.apache.org/jira/browse/HDDS-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3599:
Priority: Blocker (was: Major)

Implement ofs://: Add contract test for HA

Key: HDDS-3599
URL: https://issues.apache.org/jira/browse/HDDS-3599
Project: Hadoop Distributed Data Store
Issue Type: Sub-task
Reporter: Siyao Meng
Assignee: Siyao Meng
Priority: Blocker
Labels: Triaged, pull-request-available

Add contract tests for HA as well. Since adding HA contract tests means another ~10 new classes, [~xyao] and I decided to put the HA OFS contract tests in a separate jira.
[jira] [Updated] (HDDS-3600) ManagedChannels leaked on ratis pipeline when there are many connection retries
[ https://issues.apache.org/jira/browse/HDDS-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3600:
Priority: Critical (was: Major)

ManagedChannels leaked on ratis pipeline when there are many connection retries

Key: HDDS-3600
URL: https://issues.apache.org/jira/browse/HDDS-3600
Project: Hadoop Distributed Data Store
Issue Type: Bug
Components: Ozone Client
Affects Versions: 0.6.0
Reporter: Rakesh Radhakrishnan
Priority: Critical
Labels: TriagePending
Attachments: HeapHistogram-Snapshot-ManagedChannel-Leaked-001.png, outloggenerator-ozonefs-003.log

ManagedChannels are leaked on the ratis pipeline when there are many connection retries. Observed that too many ManagedChannels were opened while running the Synthetic Hadoop load generator. Ran the benchmark with only one pipeline in the cluster, and again with only two pipelines. Both runs failed with "too many open files", and many TCP connections stayed open for a long time, suggesting channel leaks.

More details below:

*1)* Execute NNloadGenerator
{code:java}
[rakeshr@ve1320 loadOutput]$ ps -ef | grep load
hdfs 362822 1 19 05:24 pts/0 00:03:16 /usr/java/jdk1.8.0_232-cloudera/bin/java -Dproc_jar -Xmx825955249 -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Stack=true -Dyarn.log.dir=/var/log/hadoop-yarn -Dyarn.log.file=hadoop.log -Dyarn.home.dir=/opt/cloudera/parcels/CDH-7.2.0-1.cdh7.2.0.p0.2982244/lib/hadoop/libexec/../../hadoop-yarn -Dyarn.root.logger=INFO,console -Djava.library.path=/opt/cloudera/parcels/CDH-7.2.0-1.cdh7.2.0.p0.2982244/lib/hadoop/lib/native -Dhadoop.log.dir=/var/log/hadoop-yarn -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-7.2.0-1.cdh7.2.0.p0.2982244/lib/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /opt/cloudera/parcels/CDH-7.2.0-1.cdh7.2.0.p0.2982244/jars/hadoop-mapreduce-client-jobclient-3.1.1.7.2.0.0-141-tests.jar NNloadGenerator -root o3fs://bucket2.vol2/
rakeshr 368739 354174 0 05:41 pts/0 00:00:00 grep --color=auto load
{code}

*2)* Thousands of active TCP connections to port 9858 (the default ratis pipeline port) during the run.
{code:java}
[rakeshr@ve1320 loadOutput]$ sudo lsof -a -p 362822 | grep "9858" | wc
   3229   32290  494080
[rakeshr@ve1320 loadOutput]$ vi tcp_log

java 440633 hdfs 4090u IPv4 271141987 0t0 TCP ve1320.halxg.cloudera.com:35190->ve1323.halxg.cloudera.com:9858 (ESTABLISHED)
java 440633 hdfs 4091u IPv4 271127918 0t0 TCP ve1320.halxg.cloudera.com:35192->ve1323.halxg.cloudera.com:9858 (ESTABLISHED)
java 440633 hdfs 4092u IPv4 271038583 0t0 TCP ve1320.halxg.cloudera.com:59116->ve1323.halxg.cloudera.com:9858 (ESTABLISHED)
java 440633 hdfs 4093u IPv4 271038584 0t0 TCP ve1320.halxg.cloudera.com:59118->ve1323.halxg.cloudera.com:9858 (ESTABLISHED)
java 440633 hdfs 4095u IPv4 271127920 0t0 TCP ve1320.halxg.cloudera.com:35196->ve1323.halxg.cloudera.com:9858 (ESTABLISHED)
[rakeshr@ve1320 loadOutput]$ ^C
{code}

*3)* The heap dump shows 9571 ManagedChannel objects. The heap dump itself is quite large, so a snapshot is attached to this jira.

*4)* Attached the output and thread dump of the SyntheticLoadGenerator benchmark client process, showing the exceptions printed to the console. FYI, this file was quite large, so a few repeated exception traces have been trimmed.
[jira] [Updated] (HDDS-3611) Ozone client should not consider closed container error as failure
[ https://issues.apache.org/jira/browse/HDDS-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3611:
Priority: Critical (was: Major)

Ozone client should not consider closed container error as failure

Key: HDDS-3611
URL: https://issues.apache.org/jira/browse/HDDS-3611
Project: Hadoop Distributed Data Store
Issue Type: Bug
Components: Ozone Client
Reporter: Lokesh Jain
Assignee: Lokesh Jain
Priority: Critical
Labels: TriagePending

A ContainerNotOpenException is thrown by the datanode when a client writes to a container that is not open. Currently the ozone client treats this as a failure and increments its retry count; once the client reaches the configured retry count, it fails the write. Map reduce jobs were seen failing due to this error with the default retry count of 5.

The idea is to not count errors caused by closed containers toward the retry count. This would make sure that ozone client writes do not fail due to closed container exceptions.

{code:java}
2020-05-15 02:20:28,375 ERROR [main] org.apache.hadoop.ozone.client.io.KeyOutputStream: Retry request failed. retries get failed due to exceeded maximum allowed retries number: 5
java.io.IOException: Unexpected Storage Container Exception: java.util.concurrent.CompletionException: java.util.concurrent.CompletionException: org.apache.ratis.protocol.StateMachineException: org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException from Server e2eec12f-02c5-46e2-9c23-14d6445db219@group-A3BF3ABDC307: Container 15 in CLOSED state
    at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.setIoException(BlockOutputStream.java:551)
    at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$3(BlockOutputStream.java:638)
    at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:884)
    at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:866)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
    at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
    at org.apache.ratis.client.impl.OrderedAsync$PendingOrderedRequest.setReply(OrderedAsync.java:99)
    at org.apache.ratis.client.impl.OrderedAsync$PendingOrderedRequest.setReply(OrderedAsync.java:60)
    at org.apache.ratis.util.SlidingWindow$RequestMap.setReply(SlidingWindow.java:143)
    at org.apache.ratis.util.SlidingWindow$Client.receiveReply(SlidingWindow.java:314)
    at org.apache.ratis.client.impl.OrderedAsync.lambda$sendRequest$9(OrderedAsync.java:242)
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
    at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.lambda$onNext$0(GrpcClientProtocolClient.java:284)
    at java.util.Optional.ifPresent(Optional.java:159)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.handleReplyFuture(GrpcClientProtocolClient.java:340)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.access$100(GrpcClientProtocolClient.java:264)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:284)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:267)
    at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:436)
    at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInternal(ClientCallImpl.java:658)
    ...
{code}
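The proposed accounting above can be sketched in a few lines. This is an illustrative model only, not the actual `KeyOutputStream` retry code: failures caused by a closed container still trigger a retry, but they do not consume the bounded retry budget, so a write cannot fail merely because containers closed underneath it.

```java
// Sketch of retry accounting that exempts closed-container errors
// (hypothetical class; the real logic lives in the Ozone client).
class RetryBudget {
    private final int maxRetries;
    private int countedRetries = 0;

    RetryBudget(int maxRetries) {
        this.maxRetries = maxRetries;
    }

    /** Returns true if the caller may retry after this failure. */
    boolean allowRetry(boolean containerClosed) {
        if (containerClosed) {
            return true; // excluded from the retry count entirely
        }
        return ++countedRetries <= maxRetries; // other failures are counted
    }
}
```

With the default budget of 5, a burst of `ContainerNotOpenException`s would no longer exhaust the budget the way the log above shows.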
[jira] [Updated] (HDDS-3612) Allow mounting bucket under other volume
[ https://issues.apache.org/jira/browse/HDDS-3612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3612:
Priority: Critical (was: Major)

Allow mounting bucket under other volume

Key: HDDS-3612
URL: https://issues.apache.org/jira/browse/HDDS-3612
Project: Hadoop Distributed Data Store
Issue Type: New Feature
Components: Ozone Manager
Reporter: Attila Doroszlai
Assignee: Attila Doroszlai
Priority: Critical
Labels: Triaged, pull-request-available

Step 2 from the S3 [volume mapping design doc|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/docs/content/design/ozone-volume-management.md#solving-the-mapping-problem-2-4-from-the-problem-listing]: Implement a bind mount mechanic which makes it possible to mount any volume/buckets to the specific "s3" volume.
[jira] [Updated] (HDDS-3619) OzoneManager fails with IllegalArgumentException for cmdType RenameKey
[ https://issues.apache.org/jira/browse/HDDS-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3619:
Priority: Critical (was: Major)

OzoneManager fails with IllegalArgumentException for cmdType RenameKey

Key: HDDS-3619
URL: https://issues.apache.org/jira/browse/HDDS-3619
Project: Hadoop Distributed Data Store
Issue Type: Bug
Components: HA, Ozone Manager
Reporter: Lokesh Jain
Priority: Critical
Labels: Triaged

All Ozone Manager instances fail on startup with an IllegalArgumentException for command type RenameKey.
{code:java}
2020-05-19 01:26:32,406 WARN org.apache.ratis.grpc.server.GrpcServerProtocolService: om1: installSnapshot onError, lastRequest: om2->om1#4-t34, previous=(t:34, i:44118), leaderCommit=44118, initializing? false, entries: size=1, first=(t:34, i:44119), METADATAENTRY(c:44118): org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: CANCELLED: cancelled before receiving half close
2020-05-19 01:26:33,521 ERROR org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine: Terminating with exit status 1: Request cmdType: RenameKey
traceID: ""
clientId: "client-E7949F1158CC"
userInfo {
  userName: "h...@halxg.cloudera.com"
  remoteAddress: "10.17.200.43"
  hostName: "vb0933.halxg.cloudera.com"
}
renameKeyRequest {
  keyArgs {
    volumeName: "vol1"
    bucketName: "bucket1"
    keyName: "teragen/100G-terasort-input/"
    dataSize: 0
    modificationTime: 1589872757030
  }
  toKeyName: "user/ljain/.Trash/Current/teragen/100G-terasort-input/"
}
failed with exception
java.lang.IllegalArgumentException: Trying to set updateID to 35984 which is not greater than the current value of 42661 for OMKeyInfo{volume='vol1', bucket='bucket1', key='teragen/100G-terasort-input/', dataSize='0', creationTime='1589876037688', type='RATIS', factor='ONE'}
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:142)
    at org.apache.hadoop.ozone.om.helpers.WithObjectID.setUpdateID(WithObjectID.java:107)
    at org.apache.hadoop.ozone.om.request.key.OMKeyRenameRequest.validateAndUpdateCache(OMKeyRenameRequest.java:213)
    at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequest(OzoneManagerRequestHandler.java:226)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:428)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$applyTransaction$1(OzoneManagerStateMachine.java:242)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2020-05-19 01:26:33,526 INFO org.apache.hadoop.ozone.om.OzoneManagerStarter: SHUTDOWN_MSG:
/
{code}
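The invariant behind the crash above is that an object's updateID must only move forward: applying an older transaction ID (here, 35984 on top of 42661) is rejected, which is what `WithObjectID.setUpdateID` enforces via `Preconditions.checkArgument`. A minimal stand-in model of that check (illustrative class name, not the actual OMKeyInfo code):

```java
// Models the monotonic-updateID invariant that the OM state machine
// relies on. Replaying a stale transaction, as during the failed
// installSnapshot above, trips the IllegalArgumentException.
class Versioned {
    private long updateID = -1;

    void setUpdateID(long id) {
        if (id <= updateID) {
            throw new IllegalArgumentException(
                "Trying to set updateID to " + id
                + " which is not greater than the current value of " + updateID);
        }
        updateID = id;
    }

    long getUpdateID() {
        return updateID;
    }
}
```

The check itself is correct; the bug report is about how the OMs ended up replaying a transaction with an updateID older than the state they already hold.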
[jira] [Created] (HDDS-3709) Rebase OFS branch - 3
Siyao Meng created HDDS-3709:

Summary: Rebase OFS branch - 3
Key: HDDS-3709
URL: https://issues.apache.org/jira/browse/HDDS-3709
Project: Hadoop Distributed Data Store
Issue Type: Sub-task
Reporter: Siyao Meng
[jira] [Assigned] (HDDS-3709) Rebase OFS branch - 3
[ https://issues.apache.org/jira/browse/HDDS-3709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siyao Meng reassigned HDDS-3709:
Assignee: Siyao Meng

Rebase OFS branch - 3

Key: HDDS-3709
URL: https://issues.apache.org/jira/browse/HDDS-3709
Project: Hadoop Distributed Data Store
Issue Type: Sub-task
Reporter: Siyao Meng
Assignee: Siyao Meng
Priority: Major
[jira] [Updated] (HDDS-3632) HddsDatanodeService cannot be started if HDFS datanode running in same machine with same user.
[ https://issues.apache.org/jira/browse/HDDS-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3632:
Priority: Critical (was: Major)

HddsDatanodeService cannot be started if HDFS datanode running in same machine with same user.

Key: HDDS-3632
URL: https://issues.apache.org/jira/browse/HDDS-3632
Project: Hadoop Distributed Data Store
Issue Type: Bug
Components: Ozone Datanode
Affects Versions: 0.5.0
Reporter: Uma Maheswara Rao G
Priority: Critical
Labels: Triaged, newbie

Since the service names are the same and both services refer to the same location for pid files, we cannot start both at once. The workaround is to export HADOOP_PID_DIR to a different location after starting one service, then start the other. It would be better to have different pid file names.

{noformat}
Umas-MacBook-Pro ozone-0.5.0-beta % bin/ozone --daemon start datanode
datanode is running as process 25167. Stop it first.
{noformat}
[jira] [Updated] (HDDS-3632) HddsDatanodeService cannot be started if HDFS datanode running in same machine with same user.
[ https://issues.apache.org/jira/browse/HDDS-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3632:
Priority: Blocker (was: Critical)

HddsDatanodeService cannot be started if HDFS datanode running in same machine with same user.

Key: HDDS-3632
URL: https://issues.apache.org/jira/browse/HDDS-3632
Project: Hadoop Distributed Data Store
Issue Type: Bug
Components: Ozone Datanode
Affects Versions: 0.5.0
Reporter: Uma Maheswara Rao G
Priority: Blocker
Labels: Triaged, newbie

Since the service names are the same and both services refer to the same location for pid files, we cannot start both at once. The workaround is to export HADOOP_PID_DIR to a different location after starting one service, then start the other. It would be better to have different pid file names.

{noformat}
Umas-MacBook-Pro ozone-0.5.0-beta % bin/ozone --daemon start datanode
datanode is running as process 25167. Stop it first.
{noformat}
[jira] [Updated] (HDDS-3627) Remove FilteredClassloader and replace with maven based hadoop2/hadoop3 ozonefs generation
[ https://issues.apache.org/jira/browse/HDDS-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3627:
Priority: Blocker (was: Major)

Remove FilteredClassloader and replace with maven based hadoop2/hadoop3 ozonefs generation

Key: HDDS-3627
URL: https://issues.apache.org/jira/browse/HDDS-3627
Project: Hadoop Distributed Data Store
Issue Type: Sub-task
Reporter: Marton Elek
Assignee: Marton Elek
Priority: Blocker
Labels: pull-request-available

As described in the parent issue, the final step is to create a Hadoop-independent shaded client and separate hadoop2/hadoop3 client jars.
[jira] [Resolved] (HDDS-1939) Owner/group information for a file should be returned from OzoneFileStatus
[ https://issues.apache.org/jira/browse/HDDS-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siyao Meng resolved HDDS-1939.
Resolution: Not A Problem

Turns out this problem goes away as a side effect of HDDS-3501 by [~elek]: https://github.com/apache/hadoop-ozone/commit/6fb2b3e3726284a41a20b5bc91387457f9625f33#diff-69498beacfd2dce93095ab6cd74e0571R459-R460

This was a problem because, before HDDS-3501, OzoneFileStatus extended FileStatus but did not actually store owner and group information. As a result, {{status.getOwner()}} retrieved [{{null}}|https://github.com/apache/hadoop/blob/fb8932a727f757b2e9c1c61a18145878d0eb77bd/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileStatus.java#L54-L55]. As of HDDS-3501, the owner and group fields of FileStatus from o3fs are [populated|https://github.com/apache/hadoop-ozone/blob/ffb340e32460ccaa2eae557f0bb71fb90d7ebc7a/hadoop-ozone/ozonefs/src/main/java/org/apache/hadoop/fs/ozone/BasicOzoneFileSystem.java#L171-L176] with the current userName. Resolving this jira.

Owner/group information for a file should be returned from OzoneFileStatus

Key: HDDS-1939
URL: https://issues.apache.org/jira/browse/HDDS-1939
Project: Hadoop Distributed Data Store
Issue Type: Bug
Components: Ozone Filesystem, Security
Affects Versions: 0.4.0
Reporter: Mukul Kumar Singh
Assignee: Siyao Meng
Priority: Critical
Labels: Triaged

BasicOzoneFilesystem returns the file's user/group information as the current user/group. This should default to the information read from the acl's for the file.
cc [~xyao]
[jira] [Updated] (HDDS-3694) Reduce dn-audit log
[ https://issues.apache.org/jira/browse/HDDS-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDDS-3694:
Priority: Critical (was: Minor)

Reduce dn-audit log

Key: HDDS-3694
URL: https://issues.apache.org/jira/browse/HDDS-3694
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Reporter: Rajesh Balamohan
Assignee: Dinesh Chitlangia
Priority: Critical
Labels: Triaged, performance, pull-request-available
Attachments: write_to_dn_audit_causing_high_disk_util.png

Do we really need such fine-grained audit logging? It ends up creating too many entries for chunks.
{noformat}
2020-05-31 23:31:48,477 | INFO | DNAudit | user=null | ip=null | op=WRITE_CHUNK {blockData=conID: 165 locID: 104267324230275483 bcsId: 93943} | ret=SUCCESS |
2020-05-31 23:31:48,482 | INFO | DNAudit | user=null | ip=null | op=WRITE_CHUNK {blockData=conID: 165 locID: 104267323565871437 bcsId: 93940} | ret=SUCCESS |
2020-05-31 23:31:48,487 | INFO | DNAudit | user=null | ip=null | op=WRITE_CHUNK {blockData=conID: 165 locID: 104267324230275483 bcsId: 93943} | ret=SUCCESS |
2020-05-31 23:31:48,497 | INFO | DNAudit | user=null | ip=null | op=WRITE_CHUNK {blockData=conID: 166 locID: 104267324172472725 bcsId: 93934} | ret=SUCCESS |
2020-05-31 23:31:48,501 | INFO | DNAudit | user=null | ip=null | op=WRITE_CHUNK {blockData=conID: 165 locID: 104267323675906396 bcsId: 93958} | ret=SUCCESS |
2020-05-31 23:31:48,504 | INFO | DNAudit | user=null | ip=null | op=WRITE_CHUNK {blockData=conID: 165 locID: 104267324230275483 bcsId: 93943} | ret=SUCCESS |
2020-05-31 23:31:48,509 | INFO | DNAudit | user=null | ip=null | op=WRITE_CHUNK {blockData=conID: 166 locID: 104267323685343583 bcsId: 93974} | ret=SUCCESS |
2020-05-31 23:31:48,512 | INFO | DNAudit | user=null | ip=null | op=WRITE_CHUNK {blockData=conID: 166 locID: 104267324172472725 bcsId: 93934} | ret=SUCCESS |
2020-05-31 23:31:48,516 | INFO | DNAudit | user=null | ip=null | op=WRITE_CHUNK {blockData=conID: 165 locID: 104267324332380586 bcsId: 0} | ret=SUCCESS |
2020-05-31 23:31:48,726 | INFO | DNAudit | user=null | ip=null | op=WRITE_CHUNK {blockData=conID: 166 locID: 104267324232634780 bcsId: 93964} | ret=SUCCESS |
2020-05-31 23:31:48,733 | INFO | DNAudit | user=null | ip=null | op=WRITE_CHUNK {blockData=conID: 166 locID: 104267323976323460 bcsId: 93967} | ret=SUCCESS |
2020-05-31 23:31:48,740 | INFO | DNAudit | user=null | ip=null | op=WRITE_CHUNK {blockData=conID: 165 locID: 104267324131512723 bcsId: 93952} | ret=SUCCESS |
2020-05-31 23:31:48,752 | INFO | DNAudit | user=null | ip=null | op=WRITE_CHUNK {blockData=conID: 165 locID: 104267324230275483 bcsId: 93943} | ret=SUCCESS |
2020-05-31 23:31:48,760 | INFO | DNAudit | user=null | ip=null | op=WRITE_CHUNK {blockData=conID: 165 locID: 104267323675906396 bcsId: 93958} | ret=SUCCESS |
2020-05-31 23:31:48,772 | INFO | DNAudit | user=null | ip=null | op=WRITE_CHUNK {blockData=conID: 166 locID: 104267323685343583 bcsId: 93974} | ret=SUCCESS |
2020-05-31 23:31:48,780 | INFO | DNAudit | user=null | ip=null | op=WRITE_CHUNK {blockData=conID: 164 locID: 104267324304724389 bcsId: 0} | ret=SUCCESS |
2020-05-31 23:31:48,787 | INFO | DNAudit | user=null | ip=null | op=WRITE_CHUNK {blockData=conID: 164 locID: 104267323991724421 bcsId: 93970} | ret=SUCCESS |
2020-05-31 23:31:48,794 | INFO | DNAudit | user=null | ip=null | op=WRITE_CHUNK {blockData=conID: 164 locID: 104267323725189479 bcsId: 93963} | ret=SUCCESS |
{noformat}
This also chokes disk utilization, lowering write throughput (MB/sec). Refer to the attached chart: 100+ writes at 0.52 MB/sec, saturating the entire disk.

!write_to_dn_audit_causing_high_disk_util.png|width=726,height=300!

Also, the username and IP are currently set to null; they should be populated with the details from grpc.
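One possible mitigation, sketched abstractly (hypothetical class, not the Ozone AuditLogger API): filter per-chunk operations out of the audit stream unless verbose auditing is explicitly enabled, so only block- and container-level operations reach the audit log by default.

```java
import java.util.Set;

// Illustrative audit filter: drops high-volume per-chunk ops like the
// WRITE_CHUNK entries shown above unless verbose auditing is turned on.
class AuditFilter {
    private static final Set<String> FINE_GRAINED_OPS =
        Set.of("WRITE_CHUNK", "READ_CHUNK");

    private final boolean verbose;

    AuditFilter(boolean verbose) {
        this.verbose = verbose;
    }

    /** Returns true if this operation should be written to the audit log. */
    boolean shouldAudit(String op) {
        return verbose || !FINE_GRAINED_OPS.contains(op);
    }
}
```

Whether chunk ops are dropped, demoted to DEBUG, or sampled is a design choice for the jira; the point is that the filtering decision is cheap and made before any I/O happens.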
[jira] [Updated] (HDDS-3566) Acceptance test ozone-mr and ozonesecure-mr do not succeed when run locally
[ https://issues.apache.org/jira/browse/HDDS-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-3566:
Target Version/s: 0.7.0 (was: 0.6.0)

Acceptance test ozone-mr and ozonesecure-mr do not succeed when run locally

Key: HDDS-3566
URL: https://issues.apache.org/jira/browse/HDDS-3566
Project: Hadoop Distributed Data Store
Issue Type: Bug
Components: test
Reporter: Lokesh Jain
Priority: Major

Map reduce related acceptance tests like ozone-mr and ozonesecure-mr do not succeed when run locally. The issue seems to be low system resources, as the tests pass when run on a machine with more CPUs and RAM.
[jira] [Updated] (HDDS-3376) Exclude unwanted jars from Ozone Filesystem jar
[ https://issues.apache.org/jira/browse/HDDS-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-3376:
Target Version/s: 0.7.0 (was: 0.6.0)

Exclude unwanted jars from Ozone Filesystem jar

Key: HDDS-3376
URL: https://issues.apache.org/jira/browse/HDDS-3376
Project: Hadoop Distributed Data Store
Issue Type: Bug
Components: Ozone Filesystem
Affects Versions: 0.5.0
Reporter: Vivek Ratnavel Subramanian
Priority: Major
Labels: TriagePending

This is a followup Jira to HDDS-3368 to clean up unwanted jars like jackson being packaged with the Ozone Filesystem jar.
[jira] [Commented] (HDDS-3567) Make replicationfactor can be configurable to any number
[ https://issues.apache.org/jira/browse/HDDS-3567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124447#comment-17124447 ] Arpit Agarwal commented on HDDS-3567:

[~maobaolong] it would be good to have a design discussion about this sometime. We can do that on the ozone-dev community mailing list. The write pipeline is the most complex piece of Ozone, so we should understand the changes carefully before proceeding. I also think we need a more detailed design proposal given the potential complexity of this task.

Make replicationfactor can be configurable to any number

Key: HDDS-3567
URL: https://issues.apache.org/jira/browse/HDDS-3567
Project: Hadoop Distributed Data Store
Issue Type: New Feature
Components: om, Ozone CLI, Ozone Datanode, Ozone Manager, SCM
Affects Versions: 0.6.0
Reporter: maobaolong
Assignee: maobaolong
Priority: Major

As there is a feature request for Ozone to support an arbitrary replication number for a file, we have the following subtasks to do.
The following is a simple design document: https://docs.qq.com/doc/DV2N6bWdCcnJVc3Rk
[jira] [Updated] (HDDS-1409) TestOzoneClientRetriesOnException is flaky
[ https://issues.apache.org/jira/browse/HDDS-1409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-1409: Target Version/s: 0.6.0 Labels: TriagePending ozone-flaky-test (was: ozone-flaky-test) > TestOzoneClientRetriesOnException is flaky > -- > > Key: HDDS-1409 > URL: https://issues.apache.org/jira/browse/HDDS-1409 > Project: Hadoop Distributed Data Store > Issue Type: Test > Components: test >Reporter: Nanda kumar >Priority: Major > Labels: TriagePending, ozone-flaky-test > > TestOzoneClientRetriesOnException is flaky, we get the below exception when > it fails. > {noformat} > [ERROR] > testMaxRetriesByOzoneClient(org.apache.hadoop.ozone.client.rpc.TestOzoneClientRetriesOnException) > Time elapsed: 16.227 s <<< FAILURE! > java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.ozone.client.rpc.TestOzoneClientRetriesOnException.testMaxRetriesByOzoneClient(TestOzoneClientRetriesOnException.java:197) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > {noformat}
[jira] [Updated] (HDDS-2708) Translate docs to Chinese
[ https://issues.apache.org/jira/browse/HDDS-2708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-2708: Target Version/s: 0.7.0 > Translate docs to Chinese > - > > Key: HDDS-2708 > URL: https://issues.apache.org/jira/browse/HDDS-2708 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: documentation, upgrade >Reporter: Xiang Zhang >Assignee: Xiang Zhang >Priority: Major > Labels: Triaged > > According to > [https://cwiki.apache.org/confluence/display/HADOOP/Ozone+project+ideas+for+new+contributors], > I understand that Chinese docs are needed. I am interested in this, could > somebody give me some advice to get started ?
[jira] [Updated] (HDDS-3689) Add various profiles to MiniOzoneChaosCluster to run different modes
[ https://issues.apache.org/jira/browse/HDDS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-3689: Target Version/s: 0.7.0 > Add various profiles to MiniOzoneChaosCluster to run different modes > > > Key: HDDS-3689 > URL: https://issues.apache.org/jira/browse/HDDS-3689 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > > Add various profiles to MiniOzoneChaosCluster to run different modes. This > will help in running different modes easily from MiniOzoneChaosCluster shell > script
[jira] [Updated] (HDDS-3601) Refactor TestOzoneManagerHA.java into multiple tests to avoid frequent timeout issues
[ https://issues.apache.org/jira/browse/HDDS-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-3601: Target Version/s: 0.6.0 Labels: TriagePending pull-request-available (was: pull-request-available) > Refactor TestOzoneManagerHA.java into multiple tests to avoid frequent > timeout issues > - > > Key: HDDS-3601 > URL: https://issues.apache.org/jira/browse/HDDS-3601 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Labels: TriagePending, pull-request-available > > Refactor TestOzoneManagerHA.java into multiple tests to avoid frequent > timeout issues
[jira] [Updated] (HDDS-3542) ozone chunkinfo CLI cannot connect to OM when run from NON OM leader client node
[ https://issues.apache.org/jira/browse/HDDS-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-3542: Target Version/s: 0.7.0 > ozone chunkinfo CLI cannot connect to OM when run from NON OM leader client > node > > > Key: HDDS-3542 > URL: https://issues.apache.org/jira/browse/HDDS-3542 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.6.0 >Reporter: Sadanand Shenoy >Assignee: Sadanand Shenoy >Priority: Major > Labels: TriagePending, pull-request-available > > When ozone debug chunkinfo command is run from a client node where OM leader > is not present, the operation fails. > When run from the client node where OM leader is present, it works fine > {quote}/opt/cloudera/parcels/CDH/bin/ozone debug chunkinfo > o3://ozone1/vol1/buck1/file1 20/04/21 11:04:02 INFO ipc.Client: Retrying > connect to server: 0.0.0.0/0.0.0.0:9862. Already tried 0 time(s); retry > policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 > MILLISECONDS) 20/04/21 11:04:03 INFO ipc.Client: Retrying connect to server: > 0.0.0.0/0.0.0.0:9862. Already tried 1 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 > MILLISECONDS) 20/04/21 11:04:04 INFO ipc.Client: Retrying connect to server: > 0.0.0.0/0.0.0.0:9862. Already tried 2 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 > MILLISECONDS) 20/04/21 11:04:05 INFO ipc.Client: Retrying connect to server: > 0.0.0.0/0.0.0.0:9862. Already tried 3 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 > MILLISECONDS) 20/04/21 11:04:06 INFO ipc.Client: Retrying connect to server: > 0.0.0.0/0.0.0.0:9862. 
Already tried 4 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 > MILLISECONDS) > {quote}
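The retry behaviour visible in the log above (a fixed sleep between attempts, capped at a maximum attempt count) can be sketched in plain Java. This is an illustrative simplification with no Hadoop dependency, not the actual Hadoop IPC `RetryUpToMaximumCountWithFixedSleep` implementation; the class and method names below are invented for the example.

```java
import java.util.concurrent.Callable;

// Simplified sketch of a fixed-sleep, max-count retry policy, modelled on the
// RetryUpToMaximumCountWithFixedSleep behaviour shown in the log above.
// Illustrative only; not the Hadoop IPC client code.
public class FixedSleepRetry {
    private final int maxRetries;
    private final long sleepMillis;

    public FixedSleepRetry(int maxRetries, long sleepMillis) {
        this.maxRetries = maxRetries;
        this.sleepMillis = sleepMillis;
    }

    // Run the action, retrying up to maxRetries times with a fixed sleep
    // between attempts; rethrow the last failure when attempts are exhausted.
    public <T> T run(Callable<T> action) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxRetries) {
                    Thread.sleep(sleepMillis); // fixed sleep, e.g. 1000 ms in the log
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // 50 retries / 1 ms sleep keeps the demo fast; the log used 1000 ms.
        FixedSleepRetry policy = new FixedSleepRetry(50, 1);
        String result = policy.run(() -> {
            if (++calls[0] < 3) {
                throw new java.io.IOException("connection refused");
            }
            return "connected";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In the bug above, the retries never succeed because the client falls back to the default OM address `0.0.0.0:9862` instead of resolving the HA leader, so the policy simply burns through all 50 attempts.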
[jira] [Updated] (HDDS-3378) OzoneManager group init failed because of incorrect snapshot directory location
[ https://issues.apache.org/jira/browse/HDDS-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-3378: Target Version/s: 0.7.0 Affects Version/s: (was: 0.7.0) > OzoneManager group init failed because of incorrect snapshot directory > location > --- > > Key: HDDS-3378 > URL: https://issues.apache.org/jira/browse/HDDS-3378 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager, test >Reporter: Mukul Kumar Singh >Assignee: YiSheng Lien >Priority: Major > Labels: MiniOzoneChaosCluster > > OzoneManager group init failed because of incorrect snapshot directory > location > {code} > 2020-04-11 20:07:57,180 [pool-59-thread-1] INFO server.RaftServerConfigKeys > (ConfUtils.java:logGet(44)) - raft.server.storage.dir = > [/tmp/chaos-2020-04-11-20-05-25-IST/MiniOzoneClusterImpl-80aafc97-1b12-4bc0-9baf-7f42185b0995/omNode-3/ratis] > (custom) > 2020-04-11 20:07:57,180 [pool-59-thread-1] INFO impl.RaftServerProxy > (RaftServerProxy.java:lambda$null$0(191)) - omNode-3: found a subdirectory > /tmp/chaos-2020-04-11-20-05-25-IST/MiniOzoneClusterImpl-80aafc97-1b12-4bc0-9baf-7f42185b0995/omNode-3/ratis/snapshot > 2020-04-11 20:07:57,181 [pool-59-thread-1] WARN impl.RaftServerProxy > (RaftServerProxy.java:lambda$null$0(197)) - omNode-3: Failed to initialize > the group directory > /tmp/chaos-2020-04-11-20-05-25-IST/MiniOzoneClusterImpl-80aafc97-1b12-4bc0-9baf-7f42185b0995/omNode-3/ratis/snapshot. 
> Ignoring it > java.lang.IllegalArgumentException: Invalid UUID string: snapshot > at java.util.UUID.fromString(UUID.java:194) > at > org.apache.ratis.server.impl.RaftServerProxy.lambda$null$0(RaftServerProxy.java:192) > at > java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184) > at > java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175) > at > java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948) > at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) > at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) > at > java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151) > at > java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174) > at > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > at > java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418) > at > org.apache.ratis.server.impl.RaftServerProxy.lambda$initGroups$1(RaftServerProxy.java:189) > at > java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184) > at > java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382) > at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) > at > java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:291) > at > java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731) > at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) > at java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:401) > at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:734) > at > java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160) > at > java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174) > at > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233) > at > java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418) > at > 
java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:583) > at > org.apache.ratis.server.impl.RaftServerProxy.initGroups(RaftServerProxy.java:186) > at > org.apache.ratis.server.impl.ServerImplUtils.newRaftServer(ServerImplUtils.java:41) > at > org.apache.ratis.server.RaftServer$Builder.build(RaftServer.java:76) > at > org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.(OzoneManagerRatisServer.java:277) > at > org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.newOMRatisServer(OzoneManagerRatisServer.java:328) > at > org.apache.hadoop.ozone.om.OzoneManager.initializeRatisServer(OzoneManager.java:1249) > at > org.apache.hadoop.ozone.om.OzoneManager.restart(OzoneManager.java:1190) > at > org.apache.hadoop.ozone.MiniOzoneHAClusterImpl.restartOzoneManager(MiniOzoneHAClusterImpl.java:229) > at > org.apache.hadoop.ozone.failure.Failures$OzoneManagerRestartFailure.lambda$fail$0(Failures.java:112) > at >
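The root cause in the stack trace above is that Ratis scans every subdirectory of the storage dir as a raft group whose name must be a UUID, so a stray `snapshot` directory makes `UUID.fromString("snapshot")` throw `IllegalArgumentException`. A defensive scan would validate the name first; the helper below is a hypothetical sketch of that check, not the actual Ratis code.

```java
import java.util.UUID;

// Sketch of the failure mode: group directories are named by UUID, so any
// non-UUID subdirectory (like "snapshot") breaks UUID.fromString(). A scan
// can skip such names instead of failing. Illustrative only.
public class GroupDirScan {
    static boolean isGroupDir(String name) {
        try {
            UUID.fromString(name); // throws IllegalArgumentException for "snapshot"
            return true;
        } catch (IllegalArgumentException e) {
            return false; // not a raft group directory; ignore it
        }
    }

    public static void main(String[] args) {
        System.out.println(isGroupDir("snapshot"));
        System.out.println(isGroupDir("0ff487a6-5734-4ec6-babd-156a65d321dc"));
    }
}
```

The fix tracked by this issue is to place the snapshot directory outside the raft storage dir so it never enters the group scan in the first place.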
[jira] [Updated] (HDDS-3381) OzoneManager starts 2 OzoneManagerDoubleBuffer for HA clusters
[ https://issues.apache.org/jira/browse/HDDS-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-3381: Target Version/s: 0.6.0 Labels: MiniOzoneChaosCluster Triaged pull-request-available (was: MiniOzoneChaosCluster pull-request-available) > OzoneManager starts 2 OzoneManagerDoubleBuffer for HA clusters > -- > > Key: HDDS-3381 > URL: https://issues.apache.org/jira/browse/HDDS-3381 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager, test >Reporter: Mukul Kumar Singh >Assignee: Bharat Viswanadham >Priority: Major > Labels: MiniOzoneChaosCluster, Triaged, pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > OzoneManager starts 2 OzoneManagerDoubleBuffer for HA clusters. In the > following example for 3 OM HA instances, 6 OzoneManagerDoubleBuffer instances > were created. > {code} > ➜ chaos-2020-04-12-20-21-11-IST grep canFlush stack1 > at > org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.canFlush(OzoneManagerDoubleBuffer.java:344) > at > org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.canFlush(OzoneManagerDoubleBuffer.java:344) > at > org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.canFlush(OzoneManagerDoubleBuffer.java:344) > at > org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.canFlush(OzoneManagerDoubleBuffer.java:344) > at > org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.canFlush(OzoneManagerDoubleBuffer.java:344) > at > org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.canFlush(OzoneManagerDoubleBuffer.java:344) > {code}
[jira] [Updated] (HDDS-3378) OzoneManager group init failed because of incorrect snapshot directory location
[ https://issues.apache.org/jira/browse/HDDS-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-3378: Affects Version/s: (was: 0.6.0) 0.7.0 > OzoneManager group init failed because of incorrect snapshot directory > location > --- > > Key: HDDS-3378 > URL: https://issues.apache.org/jira/browse/HDDS-3378 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager, test >Affects Versions: 0.7.0 >Reporter: Mukul Kumar Singh >Assignee: YiSheng Lien >Priority: Major > Labels: MiniOzoneChaosCluster > > OzoneManager group init failed because of incorrect snapshot directory > location > {code} > 2020-04-11 20:07:57,180 [pool-59-thread-1] INFO server.RaftServerConfigKeys > (ConfUtils.java:logGet(44)) - raft.server.storage.dir = > [/tmp/chaos-2020-04-11-20-05-25-IST/MiniOzoneClusterImpl-80aafc97-1b12-4bc0-9baf-7f42185b0995/omNode-3/ratis] > (custom) > 2020-04-11 20:07:57,180 [pool-59-thread-1] INFO impl.RaftServerProxy > (RaftServerProxy.java:lambda$null$0(191)) - omNode-3: found a subdirectory > /tmp/chaos-2020-04-11-20-05-25-IST/MiniOzoneClusterImpl-80aafc97-1b12-4bc0-9baf-7f42185b0995/omNode-3/ratis/snapshot > 2020-04-11 20:07:57,181 [pool-59-thread-1] WARN impl.RaftServerProxy > (RaftServerProxy.java:lambda$null$0(197)) - omNode-3: Failed to initialize > the group directory > /tmp/chaos-2020-04-11-20-05-25-IST/MiniOzoneClusterImpl-80aafc97-1b12-4bc0-9baf-7f42185b0995/omNode-3/ratis/snapshot. 
> Ignoring it > java.lang.IllegalArgumentException: Invalid UUID string: snapshot > at java.util.UUID.fromString(UUID.java:194) > at > org.apache.ratis.server.impl.RaftServerProxy.lambda$null$0(RaftServerProxy.java:192) > at > java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184) > at > java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175) > at > java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948) > at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) > at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) > at > java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151) > at > java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174) > at > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > at > java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418) > at > org.apache.ratis.server.impl.RaftServerProxy.lambda$initGroups$1(RaftServerProxy.java:189) > at > java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184) > at > java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382) > at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) > at > java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:291) > at > java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731) > at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) > at java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:401) > at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:734) > at > java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160) > at > java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174) > at > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233) > at > java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418) > at > 
java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:583) > at > org.apache.ratis.server.impl.RaftServerProxy.initGroups(RaftServerProxy.java:186) > at > org.apache.ratis.server.impl.ServerImplUtils.newRaftServer(ServerImplUtils.java:41) > at > org.apache.ratis.server.RaftServer$Builder.build(RaftServer.java:76) > at > org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.(OzoneManagerRatisServer.java:277) > at > org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.newOMRatisServer(OzoneManagerRatisServer.java:328) > at > org.apache.hadoop.ozone.om.OzoneManager.initializeRatisServer(OzoneManager.java:1249) > at > org.apache.hadoop.ozone.om.OzoneManager.restart(OzoneManager.java:1190) > at > org.apache.hadoop.ozone.MiniOzoneHAClusterImpl.restartOzoneManager(MiniOzoneHAClusterImpl.java:229) > at > org.apache.hadoop.ozone.failure.Failures$OzoneManagerRestartFailure.lambda$fail$0(Failures.java:112) >
[jira] [Updated] (HDDS-3330) TestDeleteWithSlowFollower is still flaky
[ https://issues.apache.org/jira/browse/HDDS-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-3330: Target Version/s: 0.6.0 Labels: TriagePending ozone-flaky-test (was: ) > TestDeleteWithSlowFollower is still flaky > - > > Key: HDDS-3330 > URL: https://issues.apache.org/jira/browse/HDDS-3330 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: test >Reporter: Marton Elek >Assignee: Shashikant Banerjee >Priority: Major > Labels: TriagePending, ozone-flaky-test > > {code} > Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 666.209 s <<< > FAILURE! - in org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower > testDeleteKeyWithSlowFollower(org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower) > Time elapsed: 640.745 s <<< ERROR! > java.io.IOException: INTERNAL_ERROR > org.apache.hadoop.ozone.om.exceptions.OMException: Allocated 0 blocks. > Requested 1 blocks > at > org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:229) > at > org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:402) > at > org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:347) > at > org.apache.hadoop.ozone.client.io.KeyOutputStream.handleFlushOrClose(KeyOutputStream.java:458) > at > org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:509) > at > org.apache.hadoop.ozone.client.io.OzoneOutputStream.close(OzoneOutputStream.java:60) > at > org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:225) > {code} > I learned this from [~shashikant] > bq. we kill a datanode after some IO, SCM is out of safe mode by then . SCM > takes time to destroy a pipeline and form a new one > bq. 
With only minimal set of dn in cluster, if we want to write again, we > need to wait for a new pipeline to open up before writing again > Will turn off this test until the fix.
[jira] [Updated] (HDDS-3276) Multiple leasemanager timeout exception while running MiniOzoneChaosCluster
[ https://issues.apache.org/jira/browse/HDDS-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-3276: Target Version/s: 0.6.0 Labels: MiniOzoneChaosCluster TriagePending (was: MiniOzoneChaosCluster) > Multiple leasemanager timeout exception while running MiniOzoneChaosCluster > --- > > Key: HDDS-3276 > URL: https://issues.apache.org/jira/browse/HDDS-3276 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Mukul Kumar Singh >Priority: Major > Labels: MiniOzoneChaosCluster, TriagePending > > Able to see the following while running the mini ozone chaos cluster. > {code} > 2020-03-25 15:44:50,101 [CommandWatcher-LeaseManager#LeaseMonitor] ERROR > lease.LeaseManager (LeaseManager.java:run(238)) - Execution was interrupted > java.lang.InterruptedException: sleep interrupted > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.ozone.lease.LeaseManager$LeaseMonitor.run(LeaseManager.java:234) > at java.lang.Thread.run(Thread.java:748) > 2020-03-25 15:44:50,120 [CommandWatcher-LeaseManager#LeaseMonitor] ERROR > lease.LeaseManager (LeaseManager.java:run(238)) - Execution was interrupted > java.lang.InterruptedException: sleep interrupted > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.ozone.lease.LeaseManager$LeaseMonitor.run(LeaseManager.java:234) > at java.lang.Thread.run(Thread.java:748) > 2020-03-25 15:44:51,106 [CommandWatcher-LeaseManager#LeaseMonitor] ERROR > lease.LeaseManager (LeaseManager.java:run(238)) - Execution was interrupted > java.lang.InterruptedException: sleep interrupted > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.ozone.lease.LeaseManager$LeaseMonitor.run(LeaseManager.java:234) > at java.lang.Thread.run(Thread.java:748) > {code}
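The repeated ERROR entries above come from a monitor thread that logs every `InterruptedException` raised during its sleep, even when the interrupt is just a normal shutdown signal. A common remedy is to treat the interrupt as "stop requested", restore the interrupt flag, and exit quietly. The class below is a hypothetical sketch of that pattern, not the actual `LeaseManager` code.

```java
// Sketch of an interruptible monitor loop that treats interruption as a
// shutdown signal rather than an error worth logging. Illustrative only;
// not the Ozone LeaseManager implementation.
public class LeaseMonitor implements Runnable {
    private volatile boolean running = true;

    public void shutdown() {
        running = false;
    }

    @Override
    public void run() {
        while (running) {
            // ... check leases for timeout here ...
            try {
                Thread.sleep(1000); // poll interval between lease checks
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore interrupt status
                break; // interrupt means shutdown, so exit without an ERROR log
            }
        }
    }

    public static void main(String[] args) throws Exception {
        LeaseMonitor monitor = new LeaseMonitor();
        Thread t = new Thread(monitor);
        t.start();
        monitor.shutdown();
        t.interrupt(); // wake the sleeping thread so it can exit promptly
        t.join(5000);
        System.out.println("monitor stopped: " + !t.isAlive());
    }
}
```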
[jira] [Updated] (HDDS-3216) Revisit all the flags OzoneContract.xml tests to make sure all the contract options are covered
[ https://issues.apache.org/jira/browse/HDDS-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-3216: Target Version/s: 0.7.0 > Revisit all the flags OzoneContract.xml tests to make sure all the contract > options are covered > --- > > Key: HDDS-3216 > URL: https://issues.apache.org/jira/browse/HDDS-3216 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Mukul Kumar Singh >Priority: Major > > Revisit all the flags OzoneContract tests xml at > https://github.com/apache/hadoop-ozone/blob/master/hadoop-ozone/integration-test/src/test/resources/contract/ozone.xml. > We need to ensure that all the options in Contract tests are covered > https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/ContractOptions.java
[jira] [Updated] (HDDS-3195) Make number of threads in BlockDeletingService configurable
[ https://issues.apache.org/jira/browse/HDDS-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-3195: Component/s: (was: test) > Make number of threads in BlockDeletingService configurable > --- > > Key: HDDS-3195 > URL: https://issues.apache.org/jira/browse/HDDS-3195 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > > Number of threads in BlockDeletingService is set to 10. This jira proposes to > make this value configurable.
[jira] [Updated] (HDDS-3195) Make number of threads in BlockDeletingService configurable
[ https://issues.apache.org/jira/browse/HDDS-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-3195: Target Version/s: 0.7.0 > Make number of threads in BlockDeletingService configurable > --- > > Key: HDDS-3195 > URL: https://issues.apache.org/jira/browse/HDDS-3195 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > > Number of threads in BlockDeletingService is set to 10. This jira proposes to > make this value configurable.
[jira] [Updated] (HDDS-2998) Improve test coverage of audit logging
[ https://issues.apache.org/jira/browse/HDDS-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-2998: Target Version/s: 0.7.0 > Improve test coverage of audit logging > -- > > Key: HDDS-2998 > URL: https://issues.apache.org/jira/browse/HDDS-2998 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: test >Reporter: Istvan Fajth >Assignee: Istvan Fajth >Priority: Major > Labels: Triaged, newbie++ > > Review audit logging tests, and add assertions about the different audit log > contents we expect to have in the audit log. > A good place to start with is TestOMKeyRequest where we create an audit > logger mock, via that one most likely the assertions can be done for all the > requests. > This is a follow up on HDDS-2946.
[jira] [Updated] (HDDS-2794) Failed to close QUASI_CLOSED container
[ https://issues.apache.org/jira/browse/HDDS-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-2794: Target Version/s: 0.6.0 > Failed to close QUASI_CLOSED container > -- > > Key: HDDS-2794 > URL: https://issues.apache.org/jira/browse/HDDS-2794 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Sammi Chen >Priority: Critical > Labels: TriagePending > > 2019-12-24 20:19:55,154 INFO > org.apache.hadoop.hdds.scm.container.ContainerReportHandler: Process > replica:ContainerReplica{containerID=#283, > datanodeDetails=ed90869c-317e-4303-8922-9fa83a3983cb{ip: 10.120.113.172, > host: host172, networkLocation: /rack2, certSerialId: null}, > placeOfBirth=ed90869c-317e-4303-8922-9fa83a3983cb, sequenceId=2342, > state=QUASI_CLOSED} > 2019-12-24 20:20:02,258 INFO > org.apache.hadoop.hdds.scm.container.ContainerReportHandler: Process > replica:ContainerReplica{containerID=#283, > datanodeDetails=1da74a1d-f64d-4ad4-b04c-85f26687e683{ip: 10.121.124.44, host: > host044, networkLocation: /rack2, certSerialId: null}, > placeOfBirth=1da74a1d-f64d-4ad4-b04c-85f26687e683, sequenceId=2209, > state=UNHEALTHY} > 2019-12-24 20:20:03,167 INFO > org.apache.hadoop.hdds.scm.container.ContainerReportHandler: Process > replica:ContainerReplica{containerID=#283, > datanodeDetails=b65b0b6c-b0bb-429f-a23d-467c72d4b85c{ip: 10.120.139.111, > host: host111, networkLocation: /rack1, certSerialId: null}, > placeOfBirth=b65b0b6c-b0bb-429f-a23d-467c72d4b85c, sequenceId=2209, > state=UNHEALTHY} > 2019-12-24 20:20:03,168 INFO > org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler: Close > container Event triggered for container : #283 > 2019-12-24 20:20:03,169 WARN > org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler: Cannot close > container #283, which is in QUASI_CLOSED state. 
> ozone scmcli container list -s=283 > { > "state" : "QUASI_CLOSED", > "replicationFactor" : "THREE", > "replicationType" : "RATIS", > "usedBytes" : 872715244, > "numberOfKeys" : 9, > "lastUsed" : 14385015083, > "stateEnterTime" : 14313955037, > "owner" : "d0e31665-ba27-45ad-b576-67cd1bccc50b", > "containerID" : 283, > "deleteTransactionId" : 0, > "sequenceId" : 0, > "open" : false > } > ozone scmcli container info 283 > Loaded properties from hadoop-metrics2.properties > Scheduled Metric snapshot period at 10 second(s). > XceiverClientMetrics metrics system started > Container id: 283 > Container State: CLOSED > Container Path: > /data5/hdds/df508c61-3ae7-413f-ab9d-e00d9125de70/current/containerDir0/283/metadata > Container Metadata: > Datanodes: [host172]
[jira] [Updated] (HDDS-2807) Fail Unit Test: TestMiniChaosOzoneCluster
[ https://issues.apache.org/jira/browse/HDDS-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-2807: Target Version/s: 0.7.0 > Fail Unit Test: TestMiniChaosOzoneCluster > - > > Key: HDDS-2807 > URL: https://issues.apache.org/jira/browse/HDDS-2807 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major > Attachments: image-2019-12-25-21-16-25-372.png > > Time Spent: 10m > Remaining Estimate: 0h > > I run the unit test in docker: elek/ozone-build:20191106-1 on my machine. But > the unit test TestMiniChaosOzoneCluster cannot pass. The related message in > hadoop-ozone/fault-injection-test/mini-chaos-tests/target/surefire-reports/org.apache.hadoop.ozone.TestMiniChaosOzoneCluster-output.txt > are as follows: > 2019-12-25 15:38:20,747 [pool-244-thread-5] WARN io.KeyOutputStream > (KeyOutputStream.java:handleException(280)) - Encountered exception > java.io.IOException: Unexpected Storage Container Exception: > java.util.concurrent.CompletionException: > java.util.concurrent.CompletionException: > org.apache.ratis.protocol.AlreadyClosedException: > SlidingWindow$Client:client-C946713E1023->RAFT is closed. on the pipeline > Pipeline[ Id: 0ff487a6-5734-4ec6-babd-156a65d321dc, Nodes: > 4dbb8a5a-3a9a-42d5-bbf9-9c65f4703da2\{ip: 10.10.10.10, host: 10.10.10.10, > networkLocation: /default-rack, certSerialId: > null}36059332-e77c-4d4c-a133-ad28b3db004b\{ip: 10.10.10.10, host: > 10.10.10.10, networkLocation: /default-rack, certSerialId: > null}5c95288c-1710-49a2-a896-55f5568462e2\{ip: 10.10.10.10, host: > 10.10.10.10, networkLocation: /default-rack, certSerialId: null}, Type:RATIS, > Factor:THREE, State:OPEN, leaderId:36059332-e77c-4d4c-a133-ad28b3db004b ]. > The last committed block length is 0, uncommitted data length is 8192 retry > count 0 > > !image-2019-12-25-21-16-25-372.png! 
[jira] [Updated] (HDDS-2964) Fix @Ignore-d integration tests
[ https://issues.apache.org/jira/browse/HDDS-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-2964: Target Version/s: 0.6.0 Labels: TriagePending (was: ) > Fix @Ignore-d integration tests > --- > > Key: HDDS-2964 > URL: https://issues.apache.org/jira/browse/HDDS-2964 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: test >Reporter: Marton Elek >Priority: Major > Labels: TriagePending > > We marked all the intermittent unit tests with @Ignore to get reliable > feedback from CI builds. > Before HDDS-2833 we had 21 @Ignore annotations; HDDS-2833 introduced 34 new > ones. > We need to review all of these tests and either fix, delete, or convert > them into real unit tests. > The current list of ignored tests: > {code:java} > hadoop-hdds/server-scm > org/apache/hadoop/hdds/scm/node/TestContainerPlacement.java: @Ignore > hadoop-hdds/server-scm > org/apache/hadoop/hdds/scm/node/TestDeadNodeHandler.java: @Ignore("Tracked > by HDDS-2508.") > hadoop-hdds/server-scm > org/apache/hadoop/hdds/scm/node/TestSCMNodeManager.java: @Ignore > hadoop-hdds/server-scm > org/apache/hadoop/hdds/scm/node/TestSCMNodeManager.java: @Ignore > hadoop-ozone/integration-test > org/apache/hadoop/hdds/scm/container/TestContainerStateManagerIntegration.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/hdds/scm/container/TestContainerStateManagerIntegration.java: > @Ignore("TODO:HDDS-1159") > hadoop-ozone/integration-test > org/apache/hadoop/hdds/scm/pipeline/TestNodeFailure.java: @Ignore > hadoop-ozone/integration-test > org/apache/hadoop/hdds/scm/pipeline/TestNodeFailure.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/hdds/scm/pipeline/TestRatisPipelineCreateAndDestroy.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/hdds/scm/safemode/TestSCMSafeModeWithPipelineRules.java:@Ignore > hadoop-ozone/integration-test > 
org/apache/hadoop/ozone/client/rpc/Test2WayCommitInRatis.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestBlockOutputStreamWithFailures.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestCloseContainerHandlingByClient.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestCloseContainerHandlingByClient.java: > @Ignore // test needs to be fixed after close container is handled for > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestCommitWatcher.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestContainerReplicationEndToEnd.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestContainerStateMachineFailures.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestContainerStateMachine.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestDeleteWithSlowFollower.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestFailureHandlingByClient.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestMultiBlockWritesWithDnFailures.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestOzoneAtRestEncryption.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestOzoneClientRetriesOnException.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestOzoneRpcClientAbstract.java: @Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestOzoneRpcClientAbstract.java: > @Ignore("Debug Jenkins Timeout") > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestOzoneRpcClientForAclAuditLog.java:@Ignore("Fix > this after adding audit support for HA Acl code. 
This will be " + > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestOzoneRpcClientWithRatis.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestSecureOzoneRpcClient.java: > @Ignore("Needs to be moved out of this class as client setup is static") > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestWatchForCommit.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestBlockDeletion.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestCloseContainerByPipeline.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/container/common/transport/server/ratis/TestCSMMetrics.java:@Ignore > hadoop-ozone/integration-test >
[jira] [Updated] (HDDS-2088) Different components in MiniOzoneChaosCluster should log to different files
[ https://issues.apache.org/jira/browse/HDDS-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-2088: Target Version/s: 0.7.0 (was: 0.5.0) > Different components in MiniOzoneChaosCluster should log to different files > --- > > Key: HDDS-2088 > URL: https://issues.apache.org/jira/browse/HDDS-2088 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Affects Versions: 0.4.0 >Reporter: Shashikant Banerjee >Priority: Major > > Different components/nodes in MiniOzoneChaosCluster should log to different > log files. > Thanks [~shashikant] for suggesting this.
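One way to implement the per-component log routing requested above is with per-logger file appenders. The fragment below is a hypothetical log4j.properties sketch; the logger names, appender names, and file paths are illustrative assumptions, not the actual MiniOzoneChaosCluster configuration:

```properties
# Hypothetical sketch: route OM and SCM loggers to separate files.
# All names and paths below are illustrative, not the real chaos-cluster config.
log4j.logger.org.apache.hadoop.ozone.om=INFO, omFile
log4j.additivity.org.apache.hadoop.ozone.om=false
log4j.appender.omFile=org.apache.log4j.FileAppender
log4j.appender.omFile.File=target/chaos-logs/om.log
log4j.appender.omFile.layout=org.apache.log4j.PatternLayout
log4j.appender.omFile.layout.ConversionPattern=%d{ISO8601} [%t] %-5p %c{2} - %m%n

log4j.logger.org.apache.hadoop.hdds.scm=INFO, scmFile
log4j.additivity.org.apache.hadoop.hdds.scm=false
log4j.appender.scmFile=org.apache.log4j.FileAppender
log4j.appender.scmFile.File=target/chaos-logs/scm.log
log4j.appender.scmFile.layout=org.apache.log4j.PatternLayout
log4j.appender.scmFile.layout.ConversionPattern=%d{ISO8601} [%t] %-5p %c{2} - %m%n
```

Setting additivity to false keeps each component's messages out of the shared root appender, so a single node's log stays readable in isolation.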
[jira] [Updated] (HDDS-2246) Reduce runtime of TestBlockOutputStreamWithFailures
[ https://issues.apache.org/jira/browse/HDDS-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-2246: Target Version/s: 0.7.0 (was: 0.5.0) > Reduce runtime of TestBlockOutputStreamWithFailures > --- > > Key: HDDS-2246 > URL: https://issues.apache.org/jira/browse/HDDS-2246 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Nanda kumar >Priority: Major > > {{TestBlockOutputStreamWithFailures}} takes 10 minutes to run; we should > reduce its runtime.
[jira] [Updated] (HDDS-2760) Intermittent timeout in TestCloseContainerEventHandler
[ https://issues.apache.org/jira/browse/HDDS-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-2760: Target Version/s: 0.6.0 Labels: TriagePending ozone-flaky-test (was: ) > Intermittent timeout in TestCloseContainerEventHandler > -- > > Key: HDDS-2760 > URL: https://issues.apache.org/jira/browse/HDDS-2760 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Priority: Minor > Labels: TriagePending, ozone-flaky-test > > TestCloseContainerEventHandler depends on the wall clock and fails intermittently: > {code} > 2019-12-17T11:29:56.1873334Z [INFO] Running > org.apache.hadoop.hdds.scm.container.TestCloseContainerEventHandler > 2019-12-17T11:31:10.0593259Z [ERROR] Tests run: 4, Failures: 1, Errors: 0, > Skipped: 0, Time elapsed: 71.343 s <<< FAILURE! - in > org.apache.hadoop.hdds.scm.container.TestCloseContainerEventHandler > 2019-12-17T11:31:10.0604096Z [ERROR] > testCloseContainerEventWithRatis(org.apache.hadoop.hdds.scm.container.TestCloseContainerEventHandler) > Time elapsed: 66.214 s <<< FAILURE! > 2019-12-17T11:31:10.0604347Z java.lang.AssertionError: Messages are not > processed in the given timeframe. Queued: 5 Processed: 0 > 2019-12-17T11:31:10.0614937Z at > org.apache.hadoop.hdds.server.events.EventQueue.processAll(EventQueue.java:238) > 2019-12-17T11:31:10.0616610Z at > org.apache.hadoop.hdds.scm.container.TestCloseContainerEventHandler.testCloseContainerEventWithRatis(TestCloseContainerEventHandler.java:149) > {code}
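The wall-clock dependence behind this flakiness can be sketched as follows. This is a hypothetical simplification, not the real `EventQueue.processAll`: the wait loop compares against a `System.currentTimeMillis()` deadline, so on a loaded CI machine the deadline can expire before the handler thread has drained the queue, producing exactly the "Messages are not processed in the given timeframe" assertion above.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class Main {
    // Hypothetical sketch of a processAll-style wait: poll until the queue is
    // empty or a wall-clock deadline passes. Timing-sensitive by construction.
    static boolean processAll(AtomicInteger queued, long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (queued.get() > 0) {
            if (System.currentTimeMillis() > deadline) {
                // the flaky path: queue not drained in the given timeframe
                return false;
            }
            Thread.sleep(10);
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger queued = new AtomicInteger(3);
        // A "handler" thread drains the queue concurrently with the wait.
        Thread handler = new Thread(() -> {
            while (queued.get() > 0) {
                queued.decrementAndGet();
            }
        });
        handler.start();
        System.out.println(processAll(queued, 1000) ? "drained" : "timeout");
    }
}
```

A common fix for such tests is to inject a fake clock (or await on a condition signaled by the handler) instead of polling real time.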
[jira] [Updated] (HDDS-2384) Large chunks during write can have memory pressure on DN with multiple clients
[ https://issues.apache.org/jira/browse/HDDS-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-2384: Target Version/s: 0.6.0 > Large chunks during write can have memory pressure on DN with multiple clients > -- > > Key: HDDS-2384 > URL: https://issues.apache.org/jira/browse/HDDS-2384 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Shashikant Banerjee >Priority: Major > Labels: Triaged, performance > > During large file writes, it ends up writing {{16 MB}} chunks. > https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java#L691 > In large clusters, hundreds of clients may connect to a DN. In such cases, > depending on the incoming write workload, memory load on the DN can increase > significantly.
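The memory-pressure concern above lends itself to a quick back-of-envelope estimate. The sketch below is a hypothetical worked example, assuming each connected writer can have one full 16 MB chunk buffered on the datanode at a time (the client count is an illustrative assumption):

```java
public class Main {
    public static void main(String[] args) {
        // Hypothetical estimate: if each connected client holds one 16 MB
        // chunk buffer on the DN, buffer memory grows linearly with clients.
        final long chunkSizeBytes = 16L * 1024 * 1024; // 16 MB chunk size
        final int clients = 200;                        // assumed concurrent writers
        long totalBytes = chunkSizeBytes * clients;     // 16 MB * 200 = 3200 MB
        System.out.println(totalBytes / (1024 * 1024) + " MB");
    }
}
```

Even this rough figure (several GB of transient buffers for a few hundred writers) shows why smaller chunk buffers, or bounding the number of in-flight chunks, matters on a busy DN.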
[jira] [Updated] (HDDS-2083) Fix TestQueryNode#testStaleNodesCount
[ https://issues.apache.org/jira/browse/HDDS-2083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-2083: Target Version/s: 0.6.0 Labels: TriagePending (was: ) > Fix TestQueryNode#testStaleNodesCount > - > > Key: HDDS-2083 > URL: https://issues.apache.org/jira/browse/HDDS-2083 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Dinesh Chitlangia >Priority: Major > Labels: TriagePending > Attachments: stacktrace.rtf > > > It appears this test is failing due to several threads in a waiting state. > The complete stack trace is attached.
[jira] [Updated] (HDDS-2082) Fix flaky TestContainerStateMachineFailures#testApplyTransactionFailure
[ https://issues.apache.org/jira/browse/HDDS-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-2082: Target Version/s: 0.6.0 Description: {code:java} --- Test set: org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailures --- Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 102.615 s <<< FAILURE! - in org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailures testApplyTransactionFailure(org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailures) Time elapsed: 15.677 s <<< FAILURE! java.lang.AssertionError at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailures.testApplyTransactionFailure(TestContainerStateMachineFailures.java:349) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.run(ParentRunner.java:309) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) {code}
[jira] [Updated] (HDDS-2085) TestBlockManager#testMultipleBlockAllocationWithClosedContainer timed out
[ https://issues.apache.org/jira/browse/HDDS-2085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-2085: Target Version/s: 0.6.0 Description: {code:java} --- Test set: org.apache.hadoop.hdds.scm.block.TestBlockManager --- Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 7.697 s <<< FAILURE! - in org.apache.hadoop.hdds.scm.block.TestBlockManager testMultipleBlockAllocationWithClosedContainer(org.apache.hadoop.hdds.scm.block.TestBlockManager) Time elapsed: 3.619 s <<< ERROR! java.util.concurrent.TimeoutException: Timed out waiting for condition. Thread diagnostics: Timestamp: 2019-09-03 08:46:46,870 "Socket Reader #1 for port 32840" prio=5 tid=14 runnable java.lang.Thread.State: RUNNABLE at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269) at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101) at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:1097) at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:1076) "Socket Reader #1 for port 43576" prio=5 tid=22 runnable java.lang.Thread.State: RUNNABLE at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269) at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101) at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:1097) at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:1076) "surefire-forkedjvm-command-thread" daemon prio=5 tid=8 runnable java.lang.Thread.State: RUNNABLE at 
java.io.FileInputStream.readBytes(Native Method) at java.io.FileInputStream.read(FileInputStream.java:255) at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) at java.io.BufferedInputStream.read(BufferedInputStream.java:265) at java.io.DataInputStream.readInt(DataInputStream.java:387) at org.apache.maven.surefire.booter.MasterProcessCommand.decode(MasterProcessCommand.java:115) at org.apache.maven.surefire.booter.CommandReader$CommandRunnable.run(CommandReader.java:390) at java.lang.Thread.run(Thread.java:748) "surefire-forkedjvm-ping-30s" daemon prio=5 tid=9 timed_waiting java.lang.Thread.State: TIMED_WAITING at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) "Thread-15" daemon prio=5 tid=30 timed_waiting java.lang.Thread.State: TIMED_WAITING at java.lang.Thread.sleep(Native Method) at org.apache.hadoop.hdds.scm.safemode.SafeModeHandler.lambda$onMessage$0(SafeModeHandler.java:114) at org.apache.hadoop.hdds.scm.safemode.SafeModeHandler$$Lambda$33/1541519391.run(Unknown Source) at java.lang.Thread.run(Thread.java:748) "process reaper" daemon prio=10 tid=10 timed_waiting java.lang.Thread.State: TIMED_WAITING at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at 
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362) at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) at
[jira] [Updated] (HDDS-1967) TestBlockOutputStreamWithFailures is flaky
[ https://issues.apache.org/jira/browse/HDDS-1967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-1967: Target Version/s: 0.6.0 Labels: TriagePending ozone-flaky-test (was: ozone-flaky-test) > TestBlockOutputStreamWithFailures is flaky > -- > > Key: HDDS-1967 > URL: https://issues.apache.org/jira/browse/HDDS-1967 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Nanda kumar >Priority: Major > Labels: TriagePending, ozone-flaky-test > Attachments: > TEST-org.apache.hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures.xml, > > org.apache.hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures-output.txt, > org.apache.hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures.txt > > > {{TestBlockOutputStreamWithFailures}} is flaky. > {noformat} > [ERROR] > test2DatanodesFailure(org.apache.hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures) > Time elapsed: 23.816 s <<< FAILURE! > java.lang.AssertionError: expected:<4> but was:<8> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures.test2DatanodesFailure(TestBlockOutputStreamWithFailures.java:425) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > {noformat} > {noformat} > [ERROR] > testWatchForCommitDatanodeFailure(org.apache.hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures) > Time elapsed: 30.895 s <<< FAILURE! 
> java.lang.AssertionError: expected:<2> but was:<3> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures.testWatchForCommitDatanodeFailure(TestBlockOutputStreamWithFailures.java:366) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at
[jira] [Updated] (HDDS-1967) TestBlockOutputStreamWithFailures is flaky
[ https://issues.apache.org/jira/browse/HDDS-1967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-1967: Labels: ozone-flaky-test (was: )
[jira] [Updated] (HDDS-1681) TestNodeReportHandler failing because of NPE
[ https://issues.apache.org/jira/browse/HDDS-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-1681: Labels: TriagePending (was: ) > TestNodeReportHandler failing because of NPE > > > Key: HDDS-1681 > URL: https://issues.apache.org/jira/browse/HDDS-1681 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Mukul Kumar Singh >Assignee: Nanda kumar >Priority: Major > Labels: TriagePending > > {code} > [INFO] Running org.apache.hadoop.hdds.scm.node.TestNodeReportHandler > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.469 > s <<< FAILURE! - in org.apache.hadoop.hdds.scm.node.TestNodeReportHandler > [ERROR] testNodeReport(org.apache.hadoop.hdds.scm.node.TestNodeReportHandler) > Time elapsed: 0.31 s <<< ERROR! > java.lang.NullPointerException > at > org.apache.hadoop.hdds.scm.node.SCMNodeManager.(SCMNodeManager.java:122) > at > org.apache.hadoop.hdds.scm.node.TestNodeReportHandler.resetEventCollector(TestNodeReportHandler.java:53) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > {code}
[jira] [Updated] (HDDS-1889) Add support for verifying multiline log entry
[ https://issues.apache.org/jira/browse/HDDS-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-1889: Target Version/s: 0.7.0 > Add support for verifying multiline log entry > - > > Key: HDDS-1889 > URL: https://issues.apache.org/jira/browse/HDDS-1889 > Project: Hadoop Distributed Data Store > Issue Type: Test > Components: test >Reporter: Dinesh Chitlangia >Priority: Major > Labels: newbie > Attachments: image.png > > > This jira aims to test the failure scenario where a multi-line stack trace > is added to the audit log. Currently, the test assumes that even in a > failure scenario there is no multi-line log entry. > Example: > {code:java} > private static final AuditMessage READ_FAIL_MSG = > new AuditMessage.Builder() > .setUser("john") > .atIp("192.168.0.1") > .forOperation(DummyAction.READ_VOLUME.name()) > .withParams(PARAMS) > .withResult(FAILURE) > .withException(null).build(); > {code} > Therefore, verifyLog() only compares the first line of the log file with > the expected message. > The test would fail if, in the future, someone were to create a scenario with > a multi-line log entry. > 1. Update READ_FAIL_MSG so that it has multiple lines of Exception stack > trace. > This is what a multi-line log entry could look like: > {code:java} > ERROR | OMAudit | user=dchitlangia | ip=127.0.0.1 | op=GET_ACL > {volume=volume80100, bucket=bucket83878, key=null, aclType=CREATE, > resourceType=volume, storeType=ozone} | ret=FAILURE > org.apache.hadoop.ozone.om.exceptions.OMException: User dchitlangia doesn't > have CREATE permission to access volume > at org.apache.hadoop.ozone.om.OzoneManager.checkAcls(OzoneManager.java:1809) > ~[classes/:?] > at org.apache.hadoop.ozone.om.OzoneManager.checkAcls(OzoneManager.java:1769) > ~[classes/:?] > at > org.apache.hadoop.ozone.om.OzoneManager.createBucket(OzoneManager.java:2092) > ~[classes/:?] 
> at > org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.createBucket(OzoneManagerRequestHandler.java:526) > ~[classes/:?] > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handle(OzoneManagerRequestHandler.java:185) > ~[classes/:?] > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.java:192) > ~[classes/:?] > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:110) > ~[classes/:?] > at > org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java) > ~[classes/:?] > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > ~[hadoop-common-3.2.0.jar:?] > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > ~[hadoop-common-3.2.0.jar:?] > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > ~[hadoop-common-3.2.0.jar:?] > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > ~[hadoop-common-3.2.0.jar:?] > at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_144] > at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_144] > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > ~[hadoop-common-3.2.0.jar:?] > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > ~[hadoop-common-3.2.0.jar:?] > {code} > 2. Update verifyLog method to accept variable number of arguments. > 3. Update the assertion so that it compares beyond the first line when the > expected is a multi-line log entry. 
> {code:java} > assertTrue(expected.equalsIgnoreCase(lines.get(0))); > {code}
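Step 2's varargs idea could be sketched roughly as follows (the class and method shape here are illustrative assumptions, not the actual audit-logger test code):

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of a varargs verifyLog: instead of asserting only on
// lines.get(0), compare every expected line against the captured log lines.
public class MultiLineLogCheck {

  // Case-insensitive comparison of each expected line, mirroring the
  // existing equalsIgnoreCase assertion but extended beyond the first line.
  public static boolean verifyLog(List<String> lines, String... expected) {
    if (lines.size() < expected.length) {
      return false;
    }
    for (int i = 0; i < expected.length; i++) {
      if (!expected[i].equalsIgnoreCase(lines.get(i))) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<String> captured = Arrays.asList(
        "ERROR | OMAudit | user=john | ret=FAILURE",
        "org.apache.hadoop.ozone.om.exceptions.OMException: ...");
    // Both lines of the simulated multi-line entry are compared.
    System.out.println(verifyLog(captured,
        "error | omaudit | user=john | ret=FAILURE",
        "org.apache.hadoop.ozone.om.exceptions.OMException: ..."));
  }
}
```

A single-line expectation still works unchanged, since only the supplied lines are compared.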
[jira] [Updated] (HDDS-1681) TestNodeReportHandler failing because of NPE
[ https://issues.apache.org/jira/browse/HDDS-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-1681: Target Version/s: 0.6.0 (was: 0.5.0) > TestNodeReportHandler failing because of NPE > > > Key: HDDS-1681 > URL: https://issues.apache.org/jira/browse/HDDS-1681 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Mukul Kumar Singh >Assignee: Nanda kumar >Priority: Major > > [stack trace identical to the one quoted in the earlier HDDS-1681 notification above]
[jira] [Updated] (HDDS-1617) Restructure the code layout for Ozone Manager
[ https://issues.apache.org/jira/browse/HDDS-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-1617: Labels: Triaged backlog (was: Triaged) > Restructure the code layout for Ozone Manager > - > > Key: HDDS-1617 > URL: https://issues.apache.org/jira/browse/HDDS-1617 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Manager >Reporter: Anu Engineer >Assignee: Anu Engineer >Priority: Major > Labels: Triaged, backlog > Time Spent: 2.5h > Remaining Estimate: 0h > > The Ozone Manager has a flat structure that mixes a lot of specific > functions. This Jira proposes to refactor the Ozone Manager code base and move > functionality into function-specific packages.
[jira] [Updated] (HDDS-1427) Differentiate log messages by service instance in test output
[ https://issues.apache.org/jira/browse/HDDS-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-1427: Target Version/s: 0.7.0 (was: 0.5.0) > Differentiate log messages by service instance in test output > - > > Key: HDDS-1427 > URL: https://issues.apache.org/jira/browse/HDDS-1427 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: test >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal >Priority: Major > > When running tests, the log output from multiple services is interleaved. > This makes it very hard to follow the sequence of events. > This is especially seen with MiniOzoneChaosCluster which starts 20 DataNodes > in the same process. > One way we can do this is by using [Log4j > NDC|https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/NDC.html] > or [slf4j MDC|https://www.slf4j.org/api/org/slf4j/MDC.html] to print the PID > and thread name/thread ID. It probably won't be a simple change.
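The MDC idea above can be sketched in plain Java as a per-thread context whose values a pattern layout such as %X{service} would prepend to each line. The key name "service" and the formatting are assumptions for illustration; real code would use org.slf4j.MDC:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for slf4j's MDC: a per-thread key/value context that a
// log layout could prepend to every message, so interleaved output from the
// 20 in-process DataNodes of MiniOzoneChaosCluster can be told apart.
public class LogContext {

  private static final ThreadLocal<Map<String, String>> CTX =
      ThreadLocal.withInitial(HashMap::new);

  public static void put(String key, String value) {
    CTX.get().put(key, value);
  }

  // Clearing the key matters with pooled threads, which outlive one service.
  public static void remove(String key) {
    CTX.get().remove(key);
  }

  // Roughly what a layout containing %X{service} would render.
  public static String format(String message) {
    String service = CTX.get().getOrDefault("service", "?");
    return "[" + service + "] " + message;
  }

  public static void main(String[] args) {
    put("service", "datanode-7");
    System.out.println(format("heartbeat sent")); // prints: [datanode-7] heartbeat sent
    remove("service");
  }
}
```

Each service instance would call put() once on its threads; no per-call-site changes are needed, which is the main appeal of the NDC/MDC approach.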
[jira] [Updated] (HDDS-1361) Add new failure modes to MiniOzoneChaosCluster
[ https://issues.apache.org/jira/browse/HDDS-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-1361: Target Version/s: 0.7.0 (was: 0.5.0) > Add new failure modes to MiniOzoneChaosCluster > -- > > Key: HDDS-1361 > URL: https://issues.apache.org/jira/browse/HDDS-1361 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Affects Versions: 0.5.0 >Reporter: Mukul Kumar Singh >Priority: Major > > This jira proposes to add new failure modes to MiniOzoneChaosCluster, the > framework used to simulate a number of failure conditions in an Ozone cluster: > 1) Pipeline destroy > 2) Ratis group directory removal
[jira] [Updated] (HDDS-851) Provide official apache docker image for Ozone
[ https://issues.apache.org/jira/browse/HDDS-851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-851: --- Target Version/s: 0.7.0 > Provide official apache docker image for Ozone > -- > > Key: HDDS-851 > URL: https://issues.apache.org/jira/browse/HDDS-851 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: Triaged > Attachments: docker-ozone-latest.tar.gz, ozonedocker.png > > > Similar to the apache/hadoop:2 and apache/hadoop:3 images I propose to > provide apache/ozone docker images which includes the voted release binaries. > The image can follow all the conventions from HADOOP-14898 > 1. BRANCHING > I propose to create new docker branches: > docker-ozone-0.3.0-alpha > docker-ozone-latest > And ask INFRA to register docker-ozone-(.*) in the dockerhub to create > apache/ozone: images > 2. RUNNING > I propose to create a default runner script which starts om + scm + datanode > + s3g all together. With this approach you can start a full ozone cluster as > easy as > {code} > docker run -p 9878:9878 -p 9876:9876 -p 9874:9874 -d apache/ozone > {code} > That's all. This is an all-in-one docker image which is ready to try out. > 3. RUNNING with compose > I propose to include a default docker-compose + config file in the image. To > start a multi-node pseudo cluster it will be enough to execute: > {code} > docker run apache/ozone cat docker-compose.yaml > docker-compose.yaml > docker run apache/ozone cat docker-config > docker-config > docker-compose up -d > {code} > That's all, and you have a multi-(pseudo)node ozone cluster which could be > scaled up and down with ozone. > 4. 
k8s > Later we can also provide k8s resource files with the same approach: > {code} > docker run apache/ozone cat k8s.yaml | kubectl apply -f - > {code}
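The multi-node pseudo cluster in point 3 might look roughly like the following compose file. This is a simplifying sketch only: the service commands and the omission of the docker-config env wiring are assumptions, not the file actually shipped in the image:

```yaml
# Illustrative pseudo-cluster in the spirit of point 3 (not the shipped file).
version: "3"
services:
  scm:
    image: apache/ozone
    ports:
      - 9876:9876        # SCM web UI, matching the ports in point 2
    command: ["ozone", "scm"]
  om:
    image: apache/ozone
    ports:
      - 9874:9874        # OM web UI
    command: ["ozone", "om"]
  datanode:
    image: apache/ozone   # scale with: docker-compose up -d --scale datanode=3
    command: ["ozone", "datanode"]
  s3g:
    image: apache/ozone
    ports:
      - 9878:9878        # S3 gateway
    command: ["ozone", "s3g"]
```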
[jira] [Resolved] (HDDS-3676) Display datanode uuid into the printTopology command output
[ https://issues.apache.org/jira/browse/HDDS-3676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao resolved HDDS-3676. -- Fix Version/s: 0.6.0 Resolution: Fixed Thanks [~maobaolong] for the contribution and all for the reviews. PR has been merged. > Display datanode uuid into the printTopology command output > --- > > Key: HDDS-3676 > URL: https://issues.apache.org/jira/browse/HDDS-3676 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone CLI >Affects Versions: 0.6.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Labels: Triaged, pull-request-available > Fix For: 0.6.0 > > > Apologies for following up on the previous change HDDS-3606; it is useful, but I > still need a way to distinguish the datanodes running on the same node, and > displaying the uuid of each datanode makes this clearer.
[GitHub] [hadoop-ozone] xiaoyuyao merged pull request #981: HDDS-3676. Display datanode uuid into the printTopology command output
xiaoyuyao merged pull request #981: URL: https://github.com/apache/hadoop-ozone/pull/981 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (HDDS-750) Write Security audit entry to track activities related to Private Keys and certificates
[ https://issues.apache.org/jira/browse/HDDS-750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDDS-750: --- Target Version/s: (was: 0.5.0) Labels: backlog (was: ) > Write Security audit entry to track activities related to Private Keys and > certificates > > > Key: HDDS-750 > URL: https://issues.apache.org/jira/browse/HDDS-750 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Ajay Kumar >Assignee: Dinesh Chitlangia >Priority: Major > Labels: backlog > > Write Security Audit entry to track security tasks performed on SCM, OM and > DN. > Tasks: > * Private Keys: bootstrap/rotation > * Certificates: CSR submission, rotation