[jira] [Updated] (HDDS-4189) Use a unified Cli syntax for both getting OM and SCM status

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-4189:
-
Labels: pull-request-available  (was: )

> Use a unified Cli syntax for both getting OM and SCM status
> ---
>
> Key: HDDS-4189
> URL: https://issues.apache.org/jira/browse/HDDS-4189
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Labels: pull-request-available
>
> https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481380452 
> suggests unifying the CLI syntax for getting OM and SCM status.
> https://github.com/apache/hadoop-ozone/pull/1346 was updated for the SCM case.
> This JIRA proposes to change 
> ozone admin om getserviceroles
> to 
> ozone admin om status



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] cxorm edited a comment on pull request #1233: HDDS-3725. Ozone sh volume client support quota option.

2020-09-01 Thread GitBox


cxorm edited a comment on pull request #1233:
URL: https://github.com/apache/hadoop-ozone/pull/1233#issuecomment-685179614


   Thank you @captainzmc for updating the PR. 
   Could you rebase it onto the latest master branch?
   
   This PR looks good to me, +1. (Would you be so kind as to take a look at this 
PR, @ChenSammi?)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[GitHub] [hadoop-ozone] amaliujia opened a new pull request #1375: HDDS-4189. Change `ozone admin om getserviceroles` to `ozone admin om status`

2020-09-01 Thread GitBox


amaliujia opened a new pull request #1375:
URL: https://github.com/apache/hadoop-ozone/pull/1375


   
   ## What changes were proposed in this pull request?
   
   Change `ozone admin om getserviceroles` to `ozone admin om status`
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-4189
   
   ## How was this patch tested?
   
   Unit Test






[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #1375: HDDS-4189. Change `ozone admin om getserviceroles` to `ozone admin om status`

2020-09-01 Thread GitBox


adoroszlai commented on a change in pull request #1375:
URL: https://github.com/apache/hadoop-ozone/pull/1375#discussion_r481521954



##
File path: 
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/om/GetServiceRolesSubcommand.java
##
@@ -30,10 +30,10 @@
 import java.util.concurrent.Callable;
 
 /**
- * Handler of om get-service-roles command.
+ * Handler of om status command.
  */
 @CommandLine.Command(
-name = "getserviceroles",
+name = "status",

Review comment:
   I think we should keep `getserviceroles` as an alias for compatibility.
   
   ```suggestion
   name = "status", aliases = "getserviceroles",
   ```
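   
   For context, a minimal self-contained picocli sketch of how such an alias keeps the old name working (illustrative only; the class and command wiring here is made up, not the actual Ozone subcommand):
   
   ```java
   import picocli.CommandLine;
   
   public class AliasDemo {
     // "status" becomes the primary name; "getserviceroles" stays as an
     // alias, so existing scripts keep working after the rename.
     @CommandLine.Command(name = "status", aliases = "getserviceroles",
         description = "Print role status")
     static class StatusCommand implements Runnable {
       @Override
       public void run() {
         System.out.println("printing roles...");
       }
     }
   
     @CommandLine.Command(name = "om", subcommands = StatusCommand.class)
     static class OmCommand {
     }
   
     public static void main(String[] args) {
       // Both invocations dispatch to the same subcommand.
       new CommandLine(new OmCommand()).execute("status");
       new CommandLine(new OmCommand()).execute("getserviceroles");
     }
   }
   ```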








[jira] [Assigned] (HDDS-3762) Intermittent failure in TestDeleteWithSlowFollower

2020-09-01 Thread Attila Doroszlai (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Doroszlai reassigned HDDS-3762:
--

Assignee: Attila Doroszlai

> Intermittent failure in TestDeleteWithSlowFollower
> --
>
> Key: HDDS-3762
> URL: https://issues.apache.org/jira/browse/HDDS-3762
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 1.0.0
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Major
>
> TestDeleteWithSlowFollower failed soon after it was re-enabled in HDDS-3330.
> {code:title=https://github.com/apache/hadoop-ozone/runs/753363338}
> [INFO] Running org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower
> [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 28.647 s <<< FAILURE! - in 
> org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower
> [ERROR] 
> testDeleteKeyWithSlowFollower(org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower)
>   Time elapsed: 0.163 s  <<< FAILURE!
> java.lang.AssertionError
>   ...
>   at org.junit.Assert.assertNotNull(Assert.java:631)
>   at 
> org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:225)
> {code}
> CC [~shashikant] [~elek]






[GitHub] [hadoop-ozone] maobaolong commented on pull request #1369: HDDS-4104. Provide a way to get the default value and key of java-based-configuration easily

2020-09-01 Thread GitBox


maobaolong commented on pull request #1369:
URL: https://github.com/apache/hadoop-ozone/pull/1369#issuecomment-684589260


   @elek Thanks for creating the Java-based configuration framework; it is better 
than the Hadoop configuration. But sometimes I want to get the default value or 
key name of a config variable, so I submitted this PR, hoping it supplies a 
convenient way to use the framework.
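   
   A minimal sketch of the intended usage, reusing the helper and test names from this PR (`ConfigurationExample` and its `waitTime` field come from the PR's test code; the expected values are the ones asserted there):
   
   ```java
   import java.util.Optional;
   
   import org.apache.hadoop.hdds.conf.ConfigType;
   import org.apache.hadoop.hdds.conf.ConfigurationExample;
   import org.apache.hadoop.hdds.conf.ConfigurationReflectionUtil;
   
   public class ConfigLookupDemo {
     public static void main(String[] args) {
       // Read metadata of the @Config-annotated "waitTime" field without
       // instantiating the configuration object.
       Optional<String> key = ConfigurationReflectionUtil.getKey(
           ConfigurationExample.class, "waitTime");      // "ozone.scm.client.wait"
       Optional<String> defaultValue = ConfigurationReflectionUtil.getDefaultValue(
           ConfigurationExample.class, "waitTime");      // "30m"
       Optional<ConfigType> type = ConfigurationReflectionUtil.getType(
           ConfigurationExample.class, "waitTime");      // ConfigType.TIME
   
       key.ifPresent(k -> System.out.println("key = " + k));
       defaultValue.ifPresent(v -> System.out.println("default = " + v));
       type.ifPresent(t -> System.out.println("type = " + t));
     }
   }
   ```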






[GitHub] [hadoop-ozone] flirmnave commented on pull request #1368: HDDS-4156. add hierarchical layout to Chinese doc

2020-09-01 Thread GitBox


flirmnave commented on pull request #1368:
URL: https://github.com/apache/hadoop-ozone/pull/1368#issuecomment-684447642


   > Thanks for the work @flirmnave .
   > 
   > According to what I see in localhost, it seems there are some subpages 
left.
   > 
   > 
![](https://camo.githubusercontent.com/0f4b4bf694439930b0ac7d53e0cb5247ee5b6405/68747470733a2f2f747661312e73696e61696d672e636e2f6c617267652f30303753385a496c6c7931676961777364656565786a3330673231316f7135752e6a7067)
   > 
   > Also, as for `Features` menu, I think it is both okay to finish it in a 
separate PR or in this one.
   
   
   Thanks @iamabug for the review and comments.
   Some of the documents have not been translated into Chinese yet,
   so some subpages are left.
   
   For the `Features` menu, I have created a Jira issue:
   https://issues.apache.org/jira/browse/HDDS-4184






[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1370: HDDS-4176. Fix failed UT: test2WayCommitForTimeoutException

2020-09-01 Thread GitBox


amaliujia commented on a change in pull request #1370:
URL: https://github.com/apache/hadoop-ozone/pull/1370#discussion_r480881190



##
File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestWatchForCommit.java
##
@@ -297,9 +298,11 @@ public void test2WayCommitForTimeoutException() throws Exception {
         xceiverClient.getPipeline()));
     reply.getResponse().get();
     Assert.assertEquals(3, ratisClient.getCommitInfoMap().size());
+    List<DatanodeDetails> datanodeDetails = pipeline.getNodes();
     for (HddsDatanodeService dn : cluster.getHddsDatanodes()) {
       // shutdown the ratis follower
-      if (ContainerTestHelper.isRatisFollower(dn, pipeline)) {
+      if (datanodeDetails.contains(dn.getDatanodeDetails())
+          && ContainerTestHelper.isRatisFollower(dn, pipeline)) {

Review comment:
   Oops. You are right. I was looking at a wrong file.








[jira] [Assigned] (HDDS-4104) Provide a way to get the default value and key of java-based-configuration easily

2020-09-01 Thread maobaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maobaolong reassigned HDDS-4104:


Assignee: maobaolong

> Provide a way to get the default value and key of java-based-configuration 
> easily
> -
>
> Key: HDDS-4104
> URL: https://issues.apache.org/jira/browse/HDDS-4104
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Affects Versions: 0.6.0
>Reporter: maobaolong
>Assignee: maobaolong
>Priority: Minor
>  Labels: pull-request-available
>
> - getDefaultValue
> - getKeyName






[jira] [Updated] (HDDS-4104) Provide a way to get the default value and key of java-based-configuration easily

2020-09-01 Thread maobaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maobaolong updated HDDS-4104:
-
Issue Type: New Feature  (was: Improvement)

> Provide a way to get the default value and key of java-based-configuration 
> easily
> -
>
> Key: HDDS-4104
> URL: https://issues.apache.org/jira/browse/HDDS-4104
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Affects Versions: 0.6.0
>Reporter: maobaolong
>Assignee: maobaolong
>Priority: Minor
>  Labels: pull-request-available
>
> - getDefaultValue
> - getKeyName






[GitHub] [hadoop-ozone] adoroszlai commented on pull request #1329: HDDS-4077. Incomplete OzoneFileSystem statistics

2020-09-01 Thread GitBox


adoroszlai commented on pull request #1329:
URL: https://github.com/apache/hadoop-ozone/pull/1329#issuecomment-684631539


   Thanks @bshashikant for reviewing and merging this.






[jira] [Created] (HDDS-4184) Add Features menu for Chinese document.

2020-09-01 Thread Zheng Huang-Mu (Jira)
Zheng Huang-Mu created HDDS-4184:


 Summary: Add Features menu for Chinese document.
 Key: HDDS-4184
 URL: https://issues.apache.org/jira/browse/HDDS-4184
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
 Environment:  !image-2020-09-01-14-24-44-703.png! 
Reporter: Zheng Huang-Mu
 Attachments: image-2020-09-01-14-24-44-703.png

In the English documentation there is a *Features* menu, and *GDPR* is a *Features* 
submenu.
 So we can add a *Features* menu to the Chinese documentation and move *GDPR* 
under it.






[jira] [Updated] (HDDS-4184) Add Features menu for Chinese document.

2020-09-01 Thread Zheng Huang-Mu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Huang-Mu updated HDDS-4184:
-
Environment: (was: !image-2020-09-01-14-24-44-703.png!)

> Add Features menu for Chinese document.
> ---
>
> Key: HDDS-4184
> URL: https://issues.apache.org/jira/browse/HDDS-4184
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Zheng Huang-Mu
>Priority: Minor
> Attachments: image-2020-09-01-14-24-44-703.png
>
>
> In the English documentation there is a *Features* menu, and *GDPR* is a 
> *Features* submenu.
>  So we can add a *Features* menu to the Chinese documentation and move *GDPR* 
> under it.






[jira] [Updated] (HDDS-4184) Add Features menu for Chinese document.

2020-09-01 Thread Zheng Huang-Mu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Huang-Mu updated HDDS-4184:
-
Environment: !image-2020-09-01-14-24-44-703.png!  (was:  
!image-2020-09-01-14-24-44-703.png! )

> Add Features menu for Chinese document.
> ---
>
> Key: HDDS-4184
> URL: https://issues.apache.org/jira/browse/HDDS-4184
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
> Environment: !image-2020-09-01-14-24-44-703.png!
>Reporter: Zheng Huang-Mu
>Priority: Minor
> Attachments: image-2020-09-01-14-24-44-703.png
>
>
> In the English documentation there is a *Features* menu, and *GDPR* is a 
> *Features* submenu.
>  So we can add a *Features* menu to the Chinese documentation and move *GDPR* 
> under it.






[jira] [Created] (HDDS-4185) Remove IncrementalByteBuffer from Ozone client

2020-09-01 Thread Marton Elek (Jira)
Marton Elek created HDDS-4185:
-

 Summary: Remove IncrementalByteBuffer from Ozone client
 Key: HDDS-4185
 URL: https://issues.apache.org/jira/browse/HDDS-4185
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Marton Elek
Assignee: Marton Elek


During the teragen test, the IncrementalByteBuffer was identified as one of the 
biggest bottlenecks. 

In the PR for HDDS-4119, a long conversation started about whether it can be 
removed or whether another optimization is needed.

This jira is opened to continue the discussion and either remove or optimize 
the IncrementalByteBuffer.






[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #1369: HDDS-4104. Provide a way to get the default value and key of java-based-configuration easily

2020-09-01 Thread GitBox


adoroszlai commented on a change in pull request #1369:
URL: https://github.com/apache/hadoop-ozone/pull/1369#discussion_r480984328



##
File path: 
hadoop-hdds/config/src/test/java/org/apache/hadoop/hdds/conf/TestConfigurationReflectionUtil.java
##
@@ -0,0 +1,71 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hdds.conf;
+
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.util.Optional;
+
+/**
+ * Test the configuration reflection utility class.
+ */
+public class TestConfigurationReflectionUtil {
+
+  @Test
+  public void testClassWithConfigGroup() {
+    Optional<ConfigType> actualType =
+        ConfigurationReflectionUtil.getType(
+            ConfigurationExample.class, "waitTime");
+    Assert.assertTrue(actualType.isPresent());
+    Assert.assertEquals(ConfigType.TIME, actualType.get());
+
+    Optional<String> actualKey =
+        ConfigurationReflectionUtil.getKey(
+            ConfigurationExample.class, "waitTime");
+    Assert.assertTrue(actualKey.isPresent());
+    Assert.assertEquals("ozone.scm.client.wait", actualKey.get());
+
+    Optional<String> actualDefaultValue =
+        ConfigurationReflectionUtil.getDefaultValue(
+            ConfigurationExample.class, "waitTime");
+    Assert.assertTrue(actualKey.isPresent());
+    Assert.assertEquals("30m", actualDefaultValue.get());
+  }
+
+  @Test
+  public void testClassWithoutConfigGroup() {
+    Optional<ConfigType> actualType =
+        ConfigurationReflectionUtil.getType(
+            ConfigurationExampleGrandParent.class, "number");
+    Assert.assertTrue(actualType.isPresent());
+    Assert.assertEquals(ConfigType.AUTO, actualType.get());
+
+    Optional<String> actualKey =
+        ConfigurationReflectionUtil.getKey(
+            ConfigurationExampleGrandParent.class, "number");
+    Assert.assertTrue(actualKey.isPresent());
+    Assert.assertEquals("number", actualKey.get());
+
+    Optional<String> actualDefaultValue =
+        ConfigurationReflectionUtil.getDefaultValue(
+            ConfigurationExampleGrandParent.class, "number");
+    Assert.assertTrue(actualKey.isPresent());

Review comment:
   ```suggestion
   Assert.assertTrue(actualDefaultValue.isPresent());
   ```

##
File path: 
hadoop-hdds/config/src/main/java/org/apache/hadoop/hdds/conf/ConfigurationReflectionUtil.java
##
@@ -240,4 +242,52 @@ private static ConfigType detectConfigType(Class<?> parameterType,
       }
     }
   }
+
+  public static Optional<String> getDefaultValue(Class<?> configClass,
+      String fieldName) {
+    Config annotation = findFieldConfigAnnotationByName(configClass,
+        fieldName);
+    if (annotation != null) {
+      return Optional.of(annotation.defaultValue());
+    }
+    return Optional.empty();
+  }
+
+  public static Optional<String> getKey(Class<?> configClass,
+      String fieldName) {
+    ConfigGroup configGroup =
+        configClass.getAnnotation(ConfigGroup.class);
+
+    Config annotation = findFieldConfigAnnotationByName(configClass,
+        fieldName);
+    if (annotation != null) {
+      String key = annotation.key();
+      if (configGroup != null) {
+        key = configGroup.prefix() + "." + annotation.key();
+      }
+      return Optional.of(key);
+    }
+    return Optional.empty();
+  }
+
+  public static Optional<ConfigType> getType(Class<?> configClass,
+      String fieldName) {
+    Config config = findFieldConfigAnnotationByName(configClass,
+        fieldName);
+    if (config != null) {
+      return Optional.of(config.type());
+    }
+    return Optional.empty();
+  }
+
+  private static Config findFieldConfigAnnotationByName(Class<?> configClass,
+      String fieldName) {
+    Optional<Field> field = Stream.of(configClass.getDeclaredFields())
+        .filter(f -> f.getName().equals(fieldName))
+        .findFirst();
+    if (field.isPresent()) {
+      return field.get().getAnnotation(Config.class);
+    }
+    return null;

Review comment:
   `findFieldConfigAnnotationByName` could return `Optional<Config>` 
instead of `Config` or `null`:
   
   ```java
   .findFirst()
   .map(field -> field.getAnnotation(Config.class));
   ```
   
   This would let the other new methods (`getType`, etc.) be simplified to:
   
   ```java
   return findFieldConfigAnnotationByName(configClass, fieldName)
       .map(Config::type);
   ```

[jira] [Updated] (HDDS-4168) Remove reference to Skaffold in the README in dist/

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-4168:
-
Labels: pull-request-available  (was: )

> Remove reference to Skaffold in the README in dist/
> ---
>
> Key: HDDS-4168
> URL: https://issues.apache.org/jira/browse/HDDS-4168
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Alex Scammon
>Assignee: Alex Scammon
>Priority: Trivial
>  Labels: pull-request-available
>
> I got sidetracked when I ran into the README in the dist folder because it 
> referenced Skaffold, which hasn't been used in a while.
> So that others don't get confused as I did, I created a PR to fix the 
> wayward reference:
>  * [https://github.com/apache/hadoop-ozone/pull/1360]
> This issue is merely to track the PR.






[jira] [Updated] (HDDS-3457) When ACL enable, use ozonefs put key will get OMException: Key not found, checkAccess failed.

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-3457:
-
Labels: pull-request-available  (was: )

> When ACL enable, use ozonefs put key will get OMException: Key not found, 
> checkAccess failed. 
> --
>
> Key: HDDS-3457
> URL: https://issues.apache.org/jira/browse/HDDS-3457
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem, Ozone Manager
>Reporter: mingchao zhao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.6.0
>
> Attachments: image-2020-04-20-15-37-18-655.png
>
>
> When ACL is enabled, I get OMException: Key not found, checkAccess failed, when 
> I use ozonefs to put a file.
> !image-2020-04-20-15-37-18-655.png|width=820,height=47!
> When ACL is enabled, the server side needs to call checkAccess to verify ACL 
> permissions. This will get 
> [FileStatus|https://github.com/apache/hadoop-ozone/blob/56def9f0b8c89588a8008e21e299047e3cbeb37a/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java#L1604]
>  first, and when the key does not exist, a "Key Not Found" exception will 
> appear.
> As currently implemented, we would never be able to put a file through ozoneFS 
> once ACL is enabled, unless the file already exists and we overwrite it.
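> A rough, hypothetical sketch of the failing path described above (all names are 
> made up; this is not the actual KeyManagerImpl code):
> {code:java}
> import java.io.FileNotFoundException;
> import java.util.HashMap;
> import java.util.Map;
> 
> public class CheckAccessSketch {
>   static Map<String, String> keyTable = new HashMap<>();
> 
>   static String getFileStatus(String key) throws FileNotFoundException {
>     String status = keyTable.get(key);
>     if (status == null) {
>       throw new FileNotFoundException("KEY_NOT_FOUND: " + key);
>     }
>     return status;
>   }
> 
>   static boolean checkAccess(String key) throws FileNotFoundException {
>     // Bug pattern: the status lookup runs even when the key is only about
>     // to be created, so putting a brand-new key always fails here.
>     return getFileStatus(key) != null;
>   }
> 
>   public static void main(String[] args) throws Exception {
>     checkAccess("newKey"); // throws, although we only wanted to create the key
>   }
> }
> {code}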






[GitHub] [hadoop-ozone] sodonnel commented on pull request #1339: HDDS-4131. Container report should update container key count and bytes used if they differ in SCM

2020-09-01 Thread GitBox


sodonnel commented on pull request #1339:
URL: https://github.com/apache/hadoop-ozone/pull/1339#issuecomment-684505875


   I think this change is good to commit now? @adoroszlai gave a thumbs up a 
few days back and I have addressed the only concern @ChenSammi raised.
   
   I will commit tomorrow unless anyone objects before then.






[jira] [Commented] (HDDS-4181) Add acceptance tests for upgrade, finalization and downgrade

2020-09-01 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188244#comment-17188244
 ] 

Marton Elek commented on HDDS-4181:
---

Related: HDDS-3855 which shows an initial attempt.

> Add acceptance tests for upgrade, finalization and downgrade
> 
>
> Key: HDDS-4181
> URL: https://issues.apache.org/jira/browse/HDDS-4181
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Aravindan Vijayan
>Priority: Major
> Fix For: 1.1.0
>
>







[jira] [Comment Edited] (HDDS-4181) Add acceptance tests for upgrade, finalization and downgrade

2020-09-01 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188244#comment-17188244
 ] 

Marton Elek edited comment on HDDS-4181 at 9/1/20, 8:10 AM:


Related: HDDS-3855 which shows an initial version of upgrade acceptance test.


was (Author: elek):
Related: HDDS-3855 which shows an initial attempt.

> Add acceptance tests for upgrade, finalization and downgrade
> 
>
> Key: HDDS-4181
> URL: https://issues.apache.org/jira/browse/HDDS-4181
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Aravindan Vijayan
>Priority: Major
> Fix For: 1.1.0
>
>







[jira] [Assigned] (HDDS-4168) Remove reference to Skaffold in the README in dist/

2020-09-01 Thread Attila Doroszlai (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Doroszlai reassigned HDDS-4168:
--

Assignee: Alex Scammon

> Remove reference to Skaffold in the README in dist/
> ---
>
> Key: HDDS-4168
> URL: https://issues.apache.org/jira/browse/HDDS-4168
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Alex Scammon
>Assignee: Alex Scammon
>Priority: Trivial
>
> I got sidetracked when I ran into the README in the dist folder because it 
> referenced Skaffold, which hasn't been used in a while.
> So that others don't get confused as I did, I created a PR to fix the 
> wayward reference:
>  * [https://github.com/apache/hadoop-ozone/pull/1360]
> This issue is merely to track the PR.






[GitHub] [hadoop-ozone] iamabug commented on a change in pull request #1368: HDDS-4156. add hierarchical layout to Chinese doc

2020-09-01 Thread GitBox


iamabug commented on a change in pull request #1368:
URL: https://github.com/apache/hadoop-ozone/pull/1368#discussion_r480992755



##
File path: hadoop-hdds/docs/content/concept/_index.zh.md
##
@@ -2,7 +2,7 @@
 title: 概念
 date: "2017-10-10"
 menu: main
-weight: 6
+weight: 3

Review comment:
   No, it is the right thing to do, sorry for not noticing this.








[GitHub] [hadoop-ozone] adoroszlai commented on pull request #1360: HDDS-4168. Remove archaic reference to Skaffold

2020-09-01 Thread GitBox


adoroszlai commented on pull request #1360:
URL: https://github.com/apache/hadoop-ozone/pull/1360#issuecomment-684638663


   @elek would you like to take a look at this doc improvement?






[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.

2020-09-01 Thread GitBox


amaliujia commented on a change in pull request #1346:
URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481469131



##
File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/protocol/StorageContainerLocationProtocol.java
##
@@ -230,4 +230,5 @@ Pipeline createReplicationPipeline(HddsProtos.ReplicationType type,
    */
   boolean getReplicationManagerStatus() throws IOException;
 
+  List getScmRatisStatus() throws IOException;

Review comment:
   Pushed a new commit to merge this logic into `GetScmInfo`








[GitHub] [hadoop-ozone] cxorm commented on pull request #1233: HDDS-3725. Ozone sh volume client support quota option.

2020-09-01 Thread GitBox


cxorm commented on pull request #1233:
URL: https://github.com/apache/hadoop-ozone/pull/1233#issuecomment-685179614


   Thank you @captainzmc for updating the PR. 
   Could you rebase it onto the latest master branch?
   
   This PR looks good to me, +1. (cc @ChenSammi)






[GitHub] [hadoop-ozone] adoroszlai merged pull request #1370: HDDS-4176. Fix failed UT: test2WayCommitForTimeoutException

2020-09-01 Thread GitBox


adoroszlai merged pull request #1370:
URL: https://github.com/apache/hadoop-ozone/pull/1370


   






[GitHub] [hadoop-ozone] adoroszlai commented on pull request #1370: HDDS-4176. Fix failed UT: test2WayCommitForTimeoutException

2020-09-01 Thread GitBox


adoroszlai commented on pull request #1370:
URL: https://github.com/apache/hadoop-ozone/pull/1370#issuecomment-685243200


   Thanks @runzhiwang for the fix and @amaliujia for the review.






[jira] [Updated] (HDDS-4176) Fix failed UT: test2WayCommitForTimeoutException

2020-09-01 Thread Attila Doroszlai (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Doroszlai updated HDDS-4176:
---
Component/s: test

> Fix failed UT: test2WayCommitForTimeoutException
> 
>
> Key: HDDS-4176
> URL: https://issues.apache.org/jira/browse/HDDS-4176
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
> Fix For: 1.1.0
>
>
> org.apache.ratis.protocol.GroupMismatchException: 
> 6f2b1ee5-bc2b-491c-bff4-ab0f4ce64709: group-2D066F5AFBD0 not found.
> at 
> org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:127)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:274)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:283)
> at 
> org.apache.hadoop.ozone.container.ContainerTestHelper.getRaftServerImpl(ContainerTestHelper.java:593)
> at 
> org.apache.hadoop.ozone.container.ContainerTestHelper.isRatisFollower(ContainerTestHelper.java:608)
> at 
> org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.test2WayCommitForTimeoutException(TestWatchForCommit.java:302)






[GitHub] [hadoop-ozone] timmylicheng commented on a change in pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.

2020-09-01 Thread GitBox


timmylicheng commented on a change in pull request #1346:
URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481557217



##
File path: 
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/scm/GetScmRatisStatusSubcommand.java
##
@@ -0,0 +1,26 @@
+package org.apache.hadoop.ozone.admin.scm;
+
+import java.util.List;
+import java.util.concurrent.Callable;
+import org.apache.hadoop.hdds.cli.HddsVersionProvider;
+import org.apache.hadoop.hdds.scm.client.ScmClient;
+import picocli.CommandLine;
+
+@CommandLine.Command(
+name = "listratisstatus",

Review comment:
   ozone admin om getserviceroles -id=<>
   
   This is what OM does. I have my +1 on `ozone admin (om|scm) roles`. Status is 
more like a health check.








[jira] [Updated] (HDDS-4176) Fix failed UT: test2WayCommitForTimeoutException

2020-09-01 Thread Attila Doroszlai (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Doroszlai updated HDDS-4176:
---
Labels:   (was: pull-request-available)

> Fix failed UT: test2WayCommitForTimeoutException
> 
>
> Key: HDDS-4176
> URL: https://issues.apache.org/jira/browse/HDDS-4176
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
> Fix For: 1.1.0
>
>
> org.apache.ratis.protocol.GroupMismatchException: 
> 6f2b1ee5-bc2b-491c-bff4-ab0f4ce64709: group-2D066F5AFBD0 not found.
> at 
> org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:127)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:274)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:283)
> at 
> org.apache.hadoop.ozone.container.ContainerTestHelper.getRaftServerImpl(ContainerTestHelper.java:593)
> at 
> org.apache.hadoop.ozone.container.ContainerTestHelper.isRatisFollower(ContainerTestHelper.java:608)
> at 
> org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.test2WayCommitForTimeoutException(TestWatchForCommit.java:302)






[jira] [Resolved] (HDDS-4176) Fix failed UT: test2WayCommitForTimeoutException

2020-09-01 Thread Attila Doroszlai (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Doroszlai resolved HDDS-4176.

Fix Version/s: 1.1.0
   Resolution: Fixed

> Fix failed UT: test2WayCommitForTimeoutException
> 
>
> Key: HDDS-4176
> URL: https://issues.apache.org/jira/browse/HDDS-4176
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.1.0
>
>
> org.apache.ratis.protocol.GroupMismatchException: 
> 6f2b1ee5-bc2b-491c-bff4-ab0f4ce64709: group-2D066F5AFBD0 not found.
> at 
> org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:127)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:274)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:283)
> at 
> org.apache.hadoop.ozone.container.ContainerTestHelper.getRaftServerImpl(ContainerTestHelper.java:593)
> at 
> org.apache.hadoop.ozone.container.ContainerTestHelper.isRatisFollower(ContainerTestHelper.java:608)
> at 
> org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.test2WayCommitForTimeoutException(TestWatchForCommit.java:302)






[GitHub] [hadoop-ozone] adoroszlai opened a new pull request #1376: HDDS-3762. Intermittent failure in TestDeleteWithSlowFollower

2020-09-01 Thread GitBox


adoroszlai opened a new pull request #1376:
URL: https://github.com/apache/hadoop-ozone/pull/1376


   ## What changes were proposed in this pull request?
   
   Intermittent failure in `testDeleteKeyWithSlowFollower` seems to be caused 
by:
   
   * `DeleteBlocksCommandHandler` increments `invocationCount` near the 
beginning of `handle()`, and only updates `deleteTransactionId` later
   * `TestDeleteWithSlowFollower` waits for `invocationCount >= 1`, then 
asserts `deleteTransactionId` also increased
   
   The test is fixed by changing the order in the handler: only increment 
`invocationCount` at the end, when `deleteTransactionId` is already updated.
   
   I think `invocationCount` should be updated together with `totalTime` to 
provide (slightly more) correct average run time (`getAverageRunTime`).
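   
   A tiny sketch of the ordering idea (field and method names mirror the description but are otherwise illustrative, not the exact handler code):
   
   ```java
   // Update the observable state first and bump the completion counter last,
   // so a poller that waits for invocationCount >= 1 always sees the already
   // updated deleteTransactionId, and totalTime stays consistent with
   // invocationCount for the average-run-time metric.
   class HandlerSketch {
     private volatile long deleteTransactionId;
     private volatile long totalTime;
     private volatile int invocationCount;
   
     void handle(long txId, long elapsedMillis) {
       deleteTransactionId = txId;   // state change first
       totalTime += elapsedMillis;   // metric input next
       invocationCount++;            // signal completion last
     }
   }
   ```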
   
   https://issues.apache.org/jira/browse/HDDS-3762
   
   ## How was this patch tested?
   
   Passed 50x:
   https://github.com/adoroszlai/hadoop-ozone/runs/1059570465#step:5:4
   
   Regular CI:
   https://github.com/adoroszlai/hadoop-ozone/runs/1059560880






[jira] [Updated] (HDDS-3762) Intermittent failure in TestDeleteWithSlowFollower

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-3762:
-
Labels: pull-request-available  (was: )

> Intermittent failure in TestDeleteWithSlowFollower
> --
>
> Key: HDDS-3762
> URL: https://issues.apache.org/jira/browse/HDDS-3762
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 1.0.0
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Major
>  Labels: pull-request-available
>
> TestDeleteWithSlowFollower failed soon after it was re-enabled in HDDS-3330.
> {code:title=https://github.com/apache/hadoop-ozone/runs/753363338}
> [INFO] Running org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower
> [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 28.647 s <<< FAILURE! - in 
> org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower
> [ERROR] 
> testDeleteKeyWithSlowFollower(org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower)
>   Time elapsed: 0.163 s  <<< FAILURE!
> java.lang.AssertionError
>   ...
>   at org.junit.Assert.assertNotNull(Assert.java:631)
>   at 
> org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:225)
> {code}
> CC [~shashikant] [~elek]






[jira] [Created] (HDDS-4189) Use a unified Cli syntax for both getting OM and SCM status

2020-09-01 Thread Rui Wang (Jira)
Rui Wang created HDDS-4189:
--

 Summary: Use a unified Cli syntax for both getting OM and SCM 
status
 Key: HDDS-4189
 URL: https://issues.apache.org/jira/browse/HDDS-4189
 Project: Hadoop Distributed Data Store
  Issue Type: Task
Reporter: Rui Wang
Assignee: Rui Wang


https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481380452 suggests 
unifying the CLI syntax for getting OM and SCM status.

https://github.com/apache/hadoop-ozone/pull/1346 was updated for the SCM case.

This JIRA proposes to change 

ozone admin om getserviceroles
to 
ozone admin om status






[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.

2020-09-01 Thread GitBox


amaliujia commented on a change in pull request #1346:
URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481469412



##
File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/shell/TestScmAdminHA.java
##
@@ -0,0 +1,66 @@
+package org.apache.hadoop.ozone.shell;

Review comment:
   ack








[GitHub] [hadoop-ozone] amaliujia commented on pull request #1375: HDDS-4189. Change `ozone admin om getserviceroles` to `ozone admin om status`

2020-09-01 Thread GitBox


amaliujia commented on pull request #1375:
URL: https://github.com/apache/hadoop-ozone/pull/1375#issuecomment-685181181


   R: @adoroszlai @timmylicheng 






[jira] [Updated] (HDDS-4176) Fix failed UT: test2WayCommitForTimeoutException

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-4176:
-
Labels: pull-request-available  (was: )

> Fix failed UT: test2WayCommitForTimeoutException
> 
>
> Key: HDDS-4176
> URL: https://issues.apache.org/jira/browse/HDDS-4176
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.1.0
>
>
> org.apache.ratis.protocol.GroupMismatchException: 
> 6f2b1ee5-bc2b-491c-bff4-ab0f4ce64709: group-2D066F5AFBD0 not found.
> at 
> org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:127)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:274)
> at 
> org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:283)
> at 
> org.apache.hadoop.ozone.container.ContainerTestHelper.getRaftServerImpl(ContainerTestHelper.java:593)
> at 
> org.apache.hadoop.ozone.container.ContainerTestHelper.isRatisFollower(ContainerTestHelper.java:608)
> at 
> org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.test2WayCommitForTimeoutException(TestWatchForCommit.java:302)






[GitHub] [hadoop-ozone] vivekratnavel commented on pull request #1364: HDDS-4165. GitHub Actions cache does not work outside of workspace

2020-09-01 Thread GitBox


vivekratnavel commented on pull request #1364:
URL: https://github.com/apache/hadoop-ozone/pull/1364#issuecomment-685272723


   @adoroszlai Thanks for fixing this! +1 LGTM






[GitHub] [hadoop-ozone] amaliujia commented on pull request #1375: HDDS-4189. Change `ozone admin om getserviceroles` to `ozone admin om status`

2020-09-01 Thread GitBox


amaliujia commented on pull request #1375:
URL: https://github.com/apache/hadoop-ozone/pull/1375#issuecomment-685211998


   Thank you for your review @cxorm!
   
   ```
   [ERROR] Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
4.417 s <<< FAILURE! - in 
org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest
   [ERROR] 
testValidateAndUpdateCache(org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest)
  Time elapsed: 0.114 s  <<< FAILURE!
   java.lang.AssertionError: Values should be different. Actual: 1599003025036
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failEquals(Assert.java:185)
at org.junit.Assert.assertNotEquals(Assert.java:161)
at org.junit.Assert.assertNotEquals(Assert.java:198)
at org.junit.Assert.assertNotEquals(Assert.java:209)
at 
org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest.testValidateAndUpdateCache(TestOMAllocateBlockRequest.java:100)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
   
   ```
   
   The failed UT seems unrelated to this change.






[jira] [Created] (HDDS-4190) Intermittent failure in TestOMAllocateBlockRequest#testValidateAndUpdateCache

2020-09-01 Thread Attila Doroszlai (Jira)
Attila Doroszlai created HDDS-4190:
--

 Summary: Intermittent failure in 
TestOMAllocateBlockRequest#testValidateAndUpdateCache
 Key: HDDS-4190
 URL: https://issues.apache.org/jira/browse/HDDS-4190
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: test
Reporter: Attila Doroszlai


{code:title=https://github.com/elek/ozone-build-results/blob/master/2020/09/01/2686/unit/hadoop-ozone/ozone-manager/org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest.txt}
Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.214 s <<< 
FAILURE! - in org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest
testValidateAndUpdateCache(org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest)
  Time elapsed: 0.129 s  <<< FAILURE!
java.lang.AssertionError: Values should be different. Actual: 1598964934681
  ...
  at org.junit.Assert.assertNotEquals(Assert.java:209)
  at 
org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest.testValidateAndUpdateCache(TestOMAllocateBlockRequest.java:100)
{code}
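
One plausible (unverified) reading: the asserted value looks like an epoch-millisecond timestamp, and an assertNotEquals on two millisecond timestamps is inherently flaky when both updates land in the same clock tick. A minimal illustration of that general pattern, not the actual test:

{code:java}
public class SameMillisDemo {
  public static void main(String[] args) {
    long first = System.currentTimeMillis();
    long second = System.currentTimeMillis();
    // On a fast machine both calls often return the same value, so an
    // assertNotEquals(first, second) would fail intermittently.
    System.out.println(first == second ? "same millisecond" : "different");
  }
}
{code}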






[GitHub] [hadoop-ozone] cxorm commented on pull request #1367: HDDS-4169. Fix some minor errors in StorageContainerManager.md

2020-09-01 Thread GitBox


cxorm commented on pull request #1367:
URL: https://github.com/apache/hadoop-ozone/pull/1367#issuecomment-685254966


   The failed `build-branch / unit (pull_request)` check seems unrelated to the patch.
   Let's commit it again. 






[GitHub] [hadoop-ozone] amaliujia commented on pull request #1340: HDDS-3188 Enable SCM group with failover proxy for SCM block location.

2020-09-01 Thread GitBox


amaliujia commented on pull request #1340:
URL: https://github.com/apache/hadoop-ozone/pull/1340#issuecomment-685271344


   This PR overall LGTM






[GitHub] [hadoop-ozone] amaliujia commented on pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.

2020-09-01 Thread GitBox


amaliujia commented on pull request #1346:
URL: https://github.com/apache/hadoop-ozone/pull/1346#issuecomment-685173208


   Addressed the following comments:
   1. Merged `getRatisStatus` into `GetScmInfo`
   2. Adopted the command syntax `ozone admin scm status`
   3. Added an acceptance test
   
   @timmylicheng I am not sure how to run an acceptance test. Can you share a 
way to run it locally?






[GitHub] [hadoop-ozone] amaliujia commented on pull request #1371: HDDS-2922. Balance ratis leader distribution in datanodes

2020-09-01 Thread GitBox


amaliujia commented on pull request #1371:
URL: https://github.com/apache/hadoop-ozone/pull/1371#issuecomment-685181887


   Thanks @runzhiwang. This is awesome work! I will also try to help review 
this PR.






[GitHub] [hadoop-ozone] cxorm commented on pull request #1075: HDDS-3369. Cleanup old write-path of volume in OM

2020-09-01 Thread GitBox


cxorm commented on pull request #1075:
URL: https://github.com/apache/hadoop-ozone/pull/1075#issuecomment-685191223


   Thank you @adoroszlai for the advice.
   
   Splitting the `OzoneManagerProtocol` interface is huge work IMHO.
   
   I think it is improper to change the interface after the GA release, 
   so I propose closing this PR.






[jira] [Commented] (HDDS-3762) Intermittent failure in TestDeleteWithSlowFollower

2020-09-01 Thread Attila Doroszlai (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188941#comment-17188941
 ] 

Attila Doroszlai commented on HDDS-3762:


I think the bug from the description was fixed by HDDS-3964 (committed on Jul 17; 
no new failures since Jul 14):

{noformat}
2020/06/09/997/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt:
   at 
org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:225)
2020/06/16/1047/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt:
  at 
org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:225)
2020/06/22/1077/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt:
  at 
org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:225)
2020/06/23/1122/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt:
  at 
org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:225)
2020/06/25/1172/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt:
  at 
org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:225)
2020/06/26/1215/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt:
  at 
org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:225)
2020/07/02/1386/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt:
  at 
org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:216)
2020/07/14/1658/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt:
  at 
org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:216)
{noformat}

However, let me hijack this bug for another intermittent failure of 
TestDeleteWithSlowFollower:

{noformat}
2020/06/29/1299/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt:
  at 
org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:279)
2020/07/25/2012/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt:
  at 
org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:272)
2020/08/02/2191/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt:
  at 
org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:272)
2020/08/03/2211/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt:
  at 
org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:272)
2020/08/03/2214/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt:
  at 
org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:272)
2020/08/16/2452/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt:
  at 
org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:272)
2020/08/28/2646/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt:
  at 
org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:272)
{noformat}

that is:

{code:title=https://github.com/apache/hadoop-ozone/blob/9cef3f63384d643ca8d25ea70d87f5415f92bc88/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestDeleteWithSlowFollower.java#L272}
Assert.assertTrue(containerData.getDeleteTransactionId() > delTrxId);
{code}
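
A possible hardening, purely as a sketch (it assumes the surrounding test 
context, i.e. the {{containerData}} and {{delTrxId}} locals): poll until the 
delete transaction id advances instead of asserting immediately.

{code:java}
// Hypothetical fix sketch: wait up to 30 s, checking every 500 ms,
// for the delete transaction id to move past the recorded value.
GenericTestUtils.waitFor(
    () -> containerData.getDeleteTransactionId() > delTrxId,
    500, 30000);
{code}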

> Intermittent failure in TestDeleteWithSlowFollower
> --
>
> Key: HDDS-3762
> URL: https://issues.apache.org/jira/browse/HDDS-3762
> Project: Hadoop Distributed Data Store
>  

[GitHub] [hadoop-ozone] cxorm commented on a change in pull request #1233: HDDS-3725. Ozone sh volume client support quota option.

2020-09-01 Thread GitBox


cxorm commented on a change in pull request #1233:
URL: https://github.com/apache/hadoop-ozone/pull/1233#discussion_r481477852



##
File path: 
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/protocol/ClientProtocol.java
##
@@ -101,10 +100,11 @@ void createVolume(String volumeName, VolumeArgs args)
   /**
* Set Volume Quota.
* @param volumeName Name of the Volume
-   * @param quota Quota to be set for the Volume
+   * @param quotaInBytes The maximum size this volume may use.
+   * @param quotaInCounts The maximum number of buckets in this volume.
* @throws IOException
*/
-  void setVolumeQuota(String volumeName, OzoneQuota quota)
+  void setVolumeQuota(String volumeName, long quotaInBytes, long quotaInCounts)

Review comment:
   > The previous Set quota feature of Ozone was virtual, and this 
interface had no actual effect. The feature was incomplete, so no one 
will have used it before.
   
   I am fine with it. 
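   
   For illustration, a possible caller-side use of the proposed signature (the 
client variable and the values are assumptions, not from this PR):
   
   ```
   // Hypothetical: cap volume "vol1" at 5 GB of space and 100 buckets.
   clientProtocol.setVolumeQuota("vol1", 5L * 1024 * 1024 * 1024, 100L);
   ```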
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] github-actions[bot] commented on pull request #912: WIP Patch - HDDS-2949: store dir/key entries in separate tables - first patch onl…

2020-09-01 Thread GitBox


github-actions[bot] commented on pull request #912:
URL: https://github.com/apache/hadoop-ozone/pull/912#issuecomment-685206902


   Thank you very much for the patch. I am closing this PR __temporarily__ as 
there was no activity recently and it is waiting for response from its author.
   
   It doesn't mean that this PR is not important or ignored: feel free to 
reopen the PR at any time.
   
   It only means that attention of committers is not required. We prefer to 
keep the review queue clean. This ensures PRs in need of review are more 
visible, which results in faster feedback for all PRs.
   
   If you need ANY help to finish this PR, please [contact the 
community](https://github.com/apache/hadoop-ozone#contact) on the mailing list 
or the slack channel.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] github-actions[bot] closed pull request #1173: HDDS-3880. Improve OM HA Robot test

2020-09-01 Thread GitBox


github-actions[bot] closed pull request #1173:
URL: https://github.com/apache/hadoop-ozone/pull/1173


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] github-actions[bot] commented on pull request #1173: HDDS-3880. Improve OM HA Robot test

2020-09-01 Thread GitBox


github-actions[bot] commented on pull request #1173:
URL: https://github.com/apache/hadoop-ozone/pull/1173#issuecomment-685206895


   Thank you very much for the patch. I am closing this PR __temporarily__ as 
there was no activity recently and it is waiting for response from its author.
   
   It doesn't mean that this PR is not important or ignored: feel free to 
reopen the PR at any time.
   
   It only means that attention of committers is not required. We prefer to 
keep the review queue clean. This ensures PRs in need of review are more 
visible, which results in faster feedback for all PRs.
   
   If you need ANY help to finish this PR, please [contact the 
community](https://github.com/apache/hadoop-ozone#contact) on the mailing list 
or the slack channel.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] github-actions[bot] closed pull request #912: WIP Patch - HDDS-2949: store dir/key entries in separate tables - first patch onl…

2020-09-01 Thread GitBox


github-actions[bot] closed pull request #912:
URL: https://github.com/apache/hadoop-ozone/pull/912


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] captainzmc commented on pull request #868: HDDS-3457. Fix ozonefs put and mkdir KEY_NOT_FOUND issue when ACL enable

2020-09-01 Thread GitBox


captainzmc commented on pull request #868:
URL: https://github.com/apache/hadoop-ozone/pull/868#issuecomment-685244606


   > Hi @captainzmc,
   > I enabled ACL and performed mkdir using o3fs, but 'checkAccess' function 
is never reached. I have put some System.out.println statements in 
'checkAccess' function, but nothing is being printed. Can you please help!
   
   Hi @aryangupta1998, [is that the 
method](https://github.com/apache/hadoop-ozone/blob/34ee8311b0d0a37878fe1fd2e5d8c1b91aa8cc8f/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java#L1633)?
 I suggest printing logs with LOG.xxx or enabling debugging in IDEA.
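   
   For illustration, a minimal sketch of such a log statement (the message and 
the variable are assumptions; it presumes the class already defines an SLF4J 
logger named `LOG`, as most Ozone classes do):
   
   ```
   // Hypothetical debug line near the top of checkAccess():
   LOG.debug("checkAccess reached for {}", ozObject);
   ```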



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] cxorm commented on a change in pull request #1375: HDDS-4189. Change `ozone admin om getserviceroles` to `ozone admin om status`

2020-09-01 Thread GitBox


cxorm commented on a change in pull request #1375:
URL: https://github.com/apache/hadoop-ozone/pull/1375#discussion_r481576976



##
File path: 
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/om/GetServiceRolesSubcommand.java
##
@@ -30,10 +30,10 @@
 import java.util.concurrent.Callable;
 
 /**
- * Handler of om get-service-roles command.
+ * Handler of om status command.
  */
 @CommandLine.Command(
-name = "getserviceroles",
+name = "status",

Review comment:
   Thanks @adoroszlai for the idea, I agree with it.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.

2020-09-01 Thread GitBox


adoroszlai commented on a change in pull request #1346:
URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481709338



##
File path: 
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/scm/GetScmRatisStatusSubcommand.java
##
@@ -0,0 +1,26 @@
+package org.apache.hadoop.ozone.admin.scm;
+
+import java.util.List;
+import java.util.concurrent.Callable;
+import org.apache.hadoop.hdds.cli.HddsVersionProvider;
+import org.apache.hadoop.hdds.scm.client.ScmClient;
+import picocli.CommandLine;
+
+@CommandLine.Command(
+name = "listratisstatus",

Review comment:
   I think it depends on whether you want to keep this command specific to 
roles, or to extend the same command in the future with other status info.  
Probably "roles" is better now, and "status" can be either a separate command 
or another alias later.
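   
   For illustration, picocli's `aliases` attribute would allow exactly that; a 
minimal sketch (the description text here is an assumption):
   
   ```
   @CommandLine.Command(
       name = "roles",
       aliases = {"getserviceroles"},  // old spelling keeps working
       description = "Print the Ratis roles of the OM instances",
       versionProvider = HddsVersionProvider.class)
   public class GetServiceRolesSubcommand implements Callable<Void> {
     // ...
   }
   ```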





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.

2020-09-01 Thread GitBox


amaliujia commented on a change in pull request #1346:
URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481414123



##
File path: 
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/scm/GetScmRatisStatusSubcommand.java
##
@@ -0,0 +1,26 @@
+package org.apache.hadoop.ozone.admin.scm;
+
+import java.util.List;
+import java.util.concurrent.Callable;
+import org.apache.hadoop.hdds.cli.HddsVersionProvider;
+import org.apache.hadoop.hdds.scm.client.ScmClient;
+import picocli.CommandLine;
+
+@CommandLine.Command(
+name = "listratisstatus",

Review comment:
   Thanks. I will adopt `ozone admin scm status` in this PR and I will send 
another PR for `ozone admin om status`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] runzhiwang commented on a change in pull request #1370: HDDS-4176. Fix failed UT: test2WayCommitForTimeoutException

2020-09-01 Thread GitBox


runzhiwang commented on a change in pull request #1370:
URL: https://github.com/apache/hadoop-ozone/pull/1370#discussion_r481493152



##
File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestWatchForCommit.java
##
@@ -297,9 +298,11 @@ public void test2WayCommitForTimeoutException() throws 
Exception {
 xceiverClient.getPipeline()));
 reply.getResponse().get();
 Assert.assertEquals(3, ratisClient.getCommitInfoMap().size());
+List<DatanodeDetails> datanodeDetails = pipeline.getNodes();

Review comment:
   @adoroszlai Thanks for review. I have updated the patch.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4190) Intermittent failure in TestOMAllocateBlockRequest#testValidateAndUpdateCache

2020-09-01 Thread Attila Doroszlai (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188923#comment-17188923
 ] 

Attila Doroszlai commented on HDDS-4190:


Similar assertions (failing intermittently when run locally); a possible 
hardening sketch follows the list:

* 
TestOMVolumeSetQuotaRequest.testValidateAndUpdateCacheSuccess(TestOMVolumeSetQuotaRequest.java:101)
* 
TestOMVolumeSetOwnerRequest.testValidateAndUpdateCacheSuccess(TestOMVolumeSetOwnerRequest.java:100)
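
One way to make such timestamp comparisons deterministic, purely as a sketch 
(the variable names are assumptions based on the test pattern, not the actual 
code):

{code:java}
// Hypothetical: force the wall clock to tick between the two
// modification times being compared, at the cost of a short sleep.
long before = omKeyInfo.getModificationTime();
Thread.sleep(1);
// ... re-run the request under test, then:
Assert.assertNotEquals(before, omKeyInfo.getModificationTime());
{code}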

> Intermittent failure in TestOMAllocateBlockRequest#testValidateAndUpdateCache
> -
>
> Key: HDDS-4190
> URL: https://issues.apache.org/jira/browse/HDDS-4190
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Reporter: Attila Doroszlai
>Priority: Major
>
> {code:title=https://github.com/elek/ozone-build-results/blob/master/2020/09/01/2686/unit/hadoop-ozone/ozone-manager/org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest.txt}
> Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.214 s <<< 
> FAILURE! - in 
> org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest
> testValidateAndUpdateCache(org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest)
>   Time elapsed: 0.129 s  <<< FAILURE!
> java.lang.AssertionError: Values should be different. Actual: 1598964934681
>   ...
>   at org.junit.Assert.assertNotEquals(Assert.java:209)
>   at 
> org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest.testValidateAndUpdateCache(TestOMAllocateBlockRequest.java:100)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1375: HDDS-4189. Change `ozone admin om getserviceroles` to `ozone admin om status`

2020-09-01 Thread GitBox


amaliujia commented on a change in pull request #1375:
URL: https://github.com/apache/hadoop-ozone/pull/1375#discussion_r481610070



##
File path: 
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/om/GetServiceRolesSubcommand.java
##
@@ -30,10 +30,10 @@
 import java.util.concurrent.Callable;
 
 /**
- * Handler of om get-service-roles command.
+ * Handler of om status command.
  */
 @CommandLine.Command(
-name = "getserviceroles",
+name = "status",

Review comment:
   That makes sense. Suggestion applied.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3103) Have multi-raft pipeline calculator to recommend best pipeline number per datanode

2020-09-01 Thread Rui Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188980#comment-17188980
 ] 

Rui Wang commented on HDDS-3103:


[~timmylicheng]

Do you have a plan to work on this JIRA in the near term? Could I take this one?

> Have multi-raft pipeline calculator to recommend best pipeline number per 
> datanode
> --
>
> Key: HDDS-3103
> URL: https://issues.apache.org/jira/browse/HDDS-3103
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Affects Versions: 0.5.0
>Reporter: Li Cheng
>Assignee: Li Cheng
>Priority: Critical
>
> PipelinePlacementPolicy should have a calculator method to recommend better 
> number for pipeline number per node. The number used to come from 
> ozone.datanode.pipeline.limit in config. SCM should be able to consider how 
> many ratis dir and the ratis retry timeout to recommend the best pipeline 
> number for every node.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.

2020-09-01 Thread GitBox


amaliujia commented on a change in pull request #1346:
URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481746270



##
File path: 
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/scm/GetScmRatisStatusSubcommand.java
##
@@ -0,0 +1,26 @@
+package org.apache.hadoop.ozone.admin.scm;
+
+import java.util.List;
+import java.util.concurrent.Callable;
+import org.apache.hadoop.hdds.cli.HddsVersionProvider;
+import org.apache.hadoop.hdds.scm.client.ScmClient;
+import picocli.CommandLine;
+
+@CommandLine.Command(
+name = "listratisstatus",

Review comment:
   That makes sense. Indeed, `status` sounds more like a health check and 
can carry much more information.
   
   Considering `OM` is also actually getting `roles`, we can start from `roles`.

##
File path: 
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/scm/GetScmRatisStatusSubcommand.java
##
@@ -0,0 +1,26 @@
+package org.apache.hadoop.ozone.admin.scm;
+
+import java.util.List;
+import java.util.concurrent.Callable;
+import org.apache.hadoop.hdds.cli.HddsVersionProvider;
+import org.apache.hadoop.hdds.scm.client.ScmClient;
+import picocli.CommandLine;
+
+@CommandLine.Command(
+name = "listratisstatus",

Review comment:
   That makes sense. Indeed, `status` sounds more like a health check and 
can carry much more information.
   
   Considering `OM` is also, essentially, getting `roles`, we can start 
from `roles`.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] amaliujia commented on pull request #1375: HDDS-4189. Change `ozone admin om getserviceroles` to `ozone admin om status`

2020-09-01 Thread GitBox


amaliujia commented on pull request #1375:
URL: https://github.com/apache/hadoop-ozone/pull/1375#issuecomment-685183677


   Context: 
https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481380452



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] amaliujia commented on pull request #1375: HDDS-4189. Change `ozone admin om getserviceroles` to `ozone admin om status`

2020-09-01 Thread GitBox


amaliujia commented on pull request #1375:
URL: https://github.com/apache/hadoop-ozone/pull/1375#issuecomment-685278548


   It seems that there is another opinion about what the command name should be, 
so I will convert this PR to a draft for now to wait for consensus. 
   
   Thanks for all the reviews so far, and sorry for the confusion.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.

2020-09-01 Thread GitBox


amaliujia commented on a change in pull request #1346:
URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481638257



##
File path: 
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/scm/GetScmRatisStatusSubcommand.java
##
@@ -0,0 +1,26 @@
+package org.apache.hadoop.ozone.admin.scm;
+
+import java.util.List;
+import java.util.concurrent.Callable;
+import org.apache.hadoop.hdds.cli.HddsVersionProvider;
+import org.apache.hadoop.hdds.scm.client.ScmClient;
+import picocli.CommandLine;
+
+@CommandLine.Command(
+name = "listratisstatus",

Review comment:
   I am ok with both. 
   
   @adoroszlai  what do you think?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-4186) Adjust RetryPolicy of SCMConnectionManager for SCM/Recon

2020-09-01 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4186:

Description: 
*The problem is:*

If we set up one Recon and one SCM, then shut down the Recon server, all 
Datanodes will become stale/dead very soon on the SCM side.

 

*The root cause is:*

Current RetryPolicy of Datanode for SCM is retryForeverWithFixedSleep:
{code:java}
RetryPolicy retryPolicy =
 RetryPolicies.retryForeverWithFixedSleep(
 1000, TimeUnit.MILLISECONDS);

StorageContainerDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
StorageContainerDatanodeProtocolPB.class, version,
address, UserGroupInformation.getCurrentUser(), hadoopConfig,
NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
retryPolicy).getProxy();{code}
 that for Recon is retryUpToMaximumCountWithFixedSleep:
{code:java}
RetryPolicy retryPolicy =
RetryPolicies.retryUpToMaximumCountWithFixedSleep(10,
6, TimeUnit.MILLISECONDS);

ReconDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
ReconDatanodeProtocolPB.class, version,
address, UserGroupInformation.getCurrentUser(), hadoopConfig,
NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
retryPolicy).getProxy();
{code}
 

The executorService in DatanodeStateMachine is 
Executors.newFixedThreadPool(...), whose default pool size is 2, one for Recon, 
another for SCM.

 

When encountering an rpc failure, call() of RegisterEndpointTask, 
VersionEndpointTask, and HeartbeatEndpointTask will retry while holding the 
rpcEndpoint.lock(). For example:
{code:java}
public EndpointStateMachine.EndPointStates call() throws Exception {
  rpcEndpoint.lock();

  try {


SCMHeartbeatResponseProto response = rpcEndpoint.getEndPoint()
.sendHeartbeat(request);


  } finally {
rpcEndpoint.unlock();
  }

  return rpcEndpoint.getState();
}
{code}
 

The thread running the Recon task will retry due to the rpc failure, meanwhile 
holding the lock of the EndpointStateMachine for Recon. When DatanodeStateMachine 
schedules the next round of SCM/Recon tasks, the only remaining thread will be 
assigned to run the Recon task, and will block waiting for the lock of the 
EndpointStateMachine for Recon.
{code:java}
public EndpointStateMachine.EndPointStates call() throws Exception {
  rpcEndpoint.lock();
  ...{code}
 

*The solution is:*

Since DatanodeStateMachine periodically schedules SCM/Recon tasks, we may 
adjust the RetryPolicy so that it won't retry for longer than 1 min. 

 

*The change has no side effect:*

1) VersionEndpointTask.call() is fine

2) RegisterEndpointTask.call() will query containerReport, nodeReport, 
pipelineReports from OzoneContainer, which is fine.

3) HeartbeatEndpointTask.call() will putBackReports(), which is fine.

 

  was:
Current RetryPolicy of Datanode for SCM is retryForeverWithFixedSleep:
{code:java}
RetryPolicy retryPolicy =
 RetryPolicies.retryForeverWithFixedSleep(
 1000, TimeUnit.MILLISECONDS);

StorageContainerDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
StorageContainerDatanodeProtocolPB.class, version,
address, UserGroupInformation.getCurrentUser(), hadoopConfig,
NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
retryPolicy).getProxy();{code}
 that for Recon is retryUpToMaximumCountWithFixedSleep:
{code:java}
RetryPolicy retryPolicy =
RetryPolicies.retryUpToMaximumCountWithFixedSleep(10,
6, TimeUnit.MILLISECONDS);

ReconDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
ReconDatanodeProtocolPB.class, version,
address, UserGroupInformation.getCurrentUser(), hadoopConfig,
NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
retryPolicy).getProxy();
{code}
 

The executorService in DatanodeStateMachine is 
Executors.newFixedThreadPool(...), whose default pool size is 2, one for Recon, 
another for SCM.

 

When encountering an rpc failure, call() of RegisterEndpointTask, 
VersionEndpointTask, and HeartbeatEndpointTask will retry while holding the 
rpcEndpoint.lock(). For example:
{code:java}
public EndpointStateMachine.EndPointStates call() throws Exception {
  rpcEndpoint.lock();

  try {


SCMHeartbeatResponseProto response = rpcEndpoint.getEndPoint()
.sendHeartbeat(request);


  } finally {
rpcEndpoint.unlock();
  }

  return rpcEndpoint.getState();
}
{code}
 

*The problem is:*

If we set up one Recon and one SCM, then shut down the Recon server, all 
Datanodes will become stale/dead very soon on the SCM side.

 

*The root cause is:*

The thread running the Recon task will retry due to the rpc failure, meanwhile 
holding the lock of the EndpointStateMachine for Recon. When DatanodeStateMachine 
schedules the next round of SCM/Recon tasks, the only remaining thread will be 
assigned to run the Recon task, and will block waiting for the lock of the 
EndpointStateMachine for Recon.
{code:java}
public EndpointStateMachine.EndPointStates call() throws Exception {
  

[GitHub] [hadoop-ozone] GlenGeng opened a new pull request #1373: HDDS-4186: Adjust RetryPolicy of SCMConnectionManager for SCM/Recon

2020-09-01 Thread GitBox


GlenGeng opened a new pull request #1373:
URL: https://github.com/apache/hadoop-ozone/pull/1373


   ## What changes were proposed in this pull request?
   
   Current RetryPolicy of Datanode for SCM is retryForeverWithFixedSleep:
   ```
   RetryPolicy retryPolicy =
RetryPolicies.retryForeverWithFixedSleep(
1000, TimeUnit.MILLISECONDS);
   
   StorageContainerDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
   StorageContainerDatanodeProtocolPB.class, version,
   address, UserGroupInformation.getCurrentUser(), hadoopConfig,
   NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
   retryPolicy).getProxy();
   ```
   
   that for Recon is retryUpToMaximumCountWithFixedSleep:
   ```
   RetryPolicy retryPolicy =
   RetryPolicies.retryUpToMaximumCountWithFixedSleep(10,
   6, TimeUnit.MILLISECONDS);
   
   ReconDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
   ReconDatanodeProtocolPB.class, version,
   address, UserGroupInformation.getCurrentUser(), hadoopConfig,
   NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
   retryPolicy).getProxy();
   ```

   The executorService in DatanodeStateMachine is 
Executors.newFixedThreadPool(...), whose default pool size is 2, one for Recon, 
another for SCM.

   When encountering an rpc failure, call() of RegisterEndpointTask, 
VersionEndpointTask, and HeartbeatEndpointTask will retry while holding the 
rpcEndpoint.lock(). For example:
   ```
   public EndpointStateMachine.EndPointStates call() throws Exception {
 rpcEndpoint.lock();
   
 try {
   
   
   SCMHeartbeatResponseProto response = rpcEndpoint.getEndPoint()
   .sendHeartbeat(request);
   
   
 } finally {
   rpcEndpoint.unlock();
 }
   
 return rpcEndpoint.getState();
   }
   ```

   **The problem is:**
   If we set up one Recon and one SCM, then shut down the Recon server, all 
Datanodes will become stale/dead very soon on the SCM side.

   **The root cause is:**
   The thread running the Recon task will retry due to the rpc failure, meanwhile 
holding the lock of the EndpointStateMachine for Recon. When DatanodeStateMachine 
schedules the next round of SCM/Recon tasks, the only remaining thread will be 
assigned to run the Recon task, and will block waiting for the lock of the 
EndpointStateMachine for Recon.
   ```
   public EndpointStateMachine.EndPointStates call() throws Exception {
 rpcEndpoint.lock();
 ...
```
   
   **The solution is:**
   Since DatanodeStateMachine periodically schedules SCM/Recon tasks, we 
may adjust the RetryPolicy so that it won't retry for longer than 1 min (see 
the sketch after the list below). 
   The change has no side effect:
   1) VersionEndpointTask.call() is fine
   2) RegisterEndpointTask.call() will query containerReport, nodeReport, 
pipelineReports from OzoneContainer, which is fine.
   3) HeartbeatEndpointTask.call() will putBackReports(), which is fine.
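   
   For illustration only, a bounded policy could look like the sketch below; 
the exact method and constants are assumptions, not the final patch:
   ```
   // Hypothetical bounded retry: give up after roughly 1 minute in
   // total, sleeping 1 second between attempts.
   RetryPolicy retryPolicy =
       RetryPolicies.retryUpToMaximumTimeWithFixedSleep(
           60 * 1000, 1000, TimeUnit.MILLISECONDS);
   ```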
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-4186
   
   ## How was this patch tested?
   
   CI
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-4186) Adjust RetryPolicy of SCMConnectionManager for SCM/Recon

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-4186:
-
Labels: pull-request-available  (was: )

> Adjust RetryPolicy of SCMConnectionManager for SCM/Recon
> 
>
> Key: HDDS-4186
> URL: https://issues.apache.org/jira/browse/HDDS-4186
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Blocker
>  Labels: pull-request-available
>
> Current RetryPolicy of Datanode for SCM is retryForeverWithFixedSleep:
> {code:java}
> RetryPolicy retryPolicy =
>  RetryPolicies.retryForeverWithFixedSleep(
>  1000, TimeUnit.MILLISECONDS);
> StorageContainerDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
> StorageContainerDatanodeProtocolPB.class, version,
> address, UserGroupInformation.getCurrentUser(), hadoopConfig,
> NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
> retryPolicy).getProxy();{code}
>  that for Recon is retryUpToMaximumCountWithFixedSleep:
> {code:java}
> RetryPolicy retryPolicy =
> RetryPolicies.retryUpToMaximumCountWithFixedSleep(10,
> 6, TimeUnit.MILLISECONDS);
> ReconDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
> ReconDatanodeProtocolPB.class, version,
> address, UserGroupInformation.getCurrentUser(), hadoopConfig,
> NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
> retryPolicy).getProxy();
> {code}
>  
> The executorService in DatanodeStateMachine is 
> Executors.newFixedThreadPool(...), whose default pool size is 2, one for 
> Recon, another for SCM.
>  
> When encountering an rpc failure, call() of RegisterEndpointTask, 
> VersionEndpointTask, and HeartbeatEndpointTask will retry while holding the 
> rpcEndpoint.lock(). For example:
> {code:java}
> public EndpointStateMachine.EndPointStates call() throws Exception {
>   rpcEndpoint.lock();
>   try {
> 
> SCMHeartbeatResponseProto response = rpcEndpoint.getEndPoint()
> .sendHeartbeat(request);
> 
>   } finally {
> rpcEndpoint.unlock();
>   }
>   return rpcEndpoint.getState();
> }
> {code}
>  
> *The problem is:*
> If we set up one Recon and one SCM, then shut down the Recon server, all 
> Datanodes will become stale/dead very soon on the SCM side.
>  
> *The root cause is:*
> The thread running the Recon task will retry due to the rpc failure, meanwhile 
> holding the lock of the EndpointStateMachine for Recon. When 
> DatanodeStateMachine schedules the next round of SCM/Recon tasks, the only 
> remaining thread will be assigned to run the Recon task, and will block 
> waiting for the lock of the EndpointStateMachine for Recon.
> {code:java}
> public EndpointStateMachine.EndPointStates call() throws Exception {
>   rpcEndpoint.lock();
>   ...{code}
>  
> *The solution is:*
> Since DatanodeStateMachine periodically schedules SCM/Recon tasks, we may 
> adjust the RetryPolicy so that it won't retry for longer than 1 min. 
> The change has no side effect:
> 1) VersionEndpointTask.call() is fine
> 2) RegisterEndpointTask.call() will query containerReport, nodeReport, 
> pipelineReports from OzoneContainer, which is fine.
> 3) HeartbeatEndpointTask.call() will putBackReports(), which is fine.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-4186) Adjust RetryPolicy of SCMConnectionManager

2020-09-01 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4186:

Target Version/s:   (was: 0.7.0)

> Adjust RetryPolicy of SCMConnectionManager
> --
>
> Key: HDDS-4186
> URL: https://issues.apache.org/jira/browse/HDDS-4186
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Blocker
>  Labels: pull-request-available
>
> Teragen was reported to be slow with a low number of mappers compared to HDFS.
> In my test (one pipeline, 3 yarn nodes) a 10 GB teragen with HDFS was ~3 mins 
> but with Ozone it was 6 mins. It could be fixed by using more mappers, but 
> when I investigated the execution I found a few problems regarding the 
> BufferPool management.
>  1. IncrementalChunkBuffer is slow and it might not be required, as BufferPool 
> itself is incremental
>  2. For each write operation bufferPool.allocateBufferIfNeeded is called, 
> which can be a slow operation (positions have to be calculated).
>  3. There is no explicit support for write(byte) operations
> In the flamegraph it's clearly visible that with a low number of mappers the 
> client is busy with buffer operations. After the patch, the rpc call and the 
> checksum calculation take the majority of the time. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-4186) Adjust RetryPolicy of SCMConnectionManager

2020-09-01 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4186:

Summary: Adjust RetryPolicy of SCMConnectionManager  (was: CLONE - Improve 
performance of the BufferPool management of Ozone client)

> Adjust RetryPolicy of SCMConnectionManager
> --
>
> Key: HDDS-4186
> URL: https://issues.apache.org/jira/browse/HDDS-4186
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Blocker
>  Labels: pull-request-available
>
> Teragen was reported to be slow with a low number of mappers compared to HDFS.
> In my test (one pipeline, 3 yarn nodes) a 10 GB teragen with HDFS was ~3 mins 
> but with Ozone it was 6 mins. It could be fixed by using more mappers, but 
> when I investigated the execution I found a few problems regarding the 
> BufferPool management.
>  1. IncrementalChunkBuffer is slow and it might not be required, as BufferPool 
> itself is incremental
>  2. For each write operation bufferPool.allocateBufferIfNeeded is called, 
> which can be a slow operation (positions have to be calculated).
>  3. There is no explicit support for write(byte) operations
> In the flamegraph it's clearly visible that with a low number of mappers the 
> client is busy with buffer operations. After the patch, the rpc call and the 
> checksum calculation take the majority of the time. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-4186) Adjust RetryPolicy of SCMConnectionManager

2020-09-01 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4186:

Labels:   (was: pull-request-available)

> Adjust RetryPolicy of SCMConnectionManager
> --
>
> Key: HDDS-4186
> URL: https://issues.apache.org/jira/browse/HDDS-4186
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Blocker
>
> Teragen was reported to be slow with a low number of mappers compared to HDFS.
> In my test (one pipeline, 3 yarn nodes) a 10 GB teragen with HDFS was ~3 mins 
> but with Ozone it was 6 mins. It could be fixed by using more mappers, but 
> when I investigated the execution I found a few problems regarding the 
> BufferPool management.
>  1. IncrementalChunkBuffer is slow and it might not be required, as BufferPool 
> itself is incremental
>  2. For each write operation bufferPool.allocateBufferIfNeeded is called, 
> which can be a slow operation (positions have to be calculated).
>  3. There is no explicit support for write(byte) operations
> In the flamegraph it's clearly visible that with a low number of mappers the 
> client is busy with buffer operations. After the patch, the rpc call and the 
> checksum calculation take the majority of the time. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-4186) Adjust RetryPolicy of SCMConnectionManager for SCM/Recon

2020-09-01 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4186:

Summary: Adjust RetryPolicy of SCMConnectionManager for SCM/Recon  (was: 
Adjust RetryPolicy of SCMConnectionManager)

> Adjust RetryPolicy of SCMConnectionManager for SCM/Recon
> 
>
> Key: HDDS-4186
> URL: https://issues.apache.org/jira/browse/HDDS-4186
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Blocker
>
> Current RetryPolicy of Datanode for SCM is retryForeverWithFixedSleep:
> {code:java}
> RetryPolicy retryPolicy =
>  RetryPolicies.retryForeverWithFixedSleep(
>  1000, TimeUnit.MILLISECONDS);
> StorageContainerDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
> StorageContainerDatanodeProtocolPB.class, version,
> address, UserGroupInformation.getCurrentUser(), hadoopConfig,
> NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
> retryPolicy).getProxy();{code}
>  that for Recon is retryUpToMaximumCountWithFixedSleep:
> {code:java}
> RetryPolicy retryPolicy =
> RetryPolicies.retryUpToMaximumCountWithFixedSleep(10,
> 6, TimeUnit.MILLISECONDS);
> ReconDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
> ReconDatanodeProtocolPB.class, version,
> address, UserGroupInformation.getCurrentUser(), hadoopConfig,
> NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
> retryPolicy).getProxy();
> {code}
>  
> The executorService in DatanodeStateMachine is 
> Executors.newFixedThreadPool(...), whose default pool size is 2, one for 
> Recon, another for SCM.
>  
> When encountering an rpc failure, call() of RegisterEndpointTask, 
> VersionEndpointTask, and HeartbeatEndpointTask will retry while holding the 
> rpcEndpoint.lock(). For example:
> {code:java}
> public EndpointStateMachine.EndPointStates call() throws Exception {
>   rpcEndpoint.lock();
>   try {
> 
> SCMHeartbeatResponseProto response = rpcEndpoint.getEndPoint()
> .sendHeartbeat(request);
> 
>   } finally {
> rpcEndpoint.unlock();
>   }
>   return rpcEndpoint.getState();
> }
> {code}
>  
> *The problem is:*
> If we set up one Recon and one SCM, then shut down the Recon server, all 
> Datanodes will become stale/dead very soon on the SCM side.
>  
> *The root cause is:*
> The thread running the Recon task will retry due to the rpc failure, meanwhile 
> holding the lock of the EndpointStateMachine for Recon. When 
> DatanodeStateMachine schedules the next round of SCM/Recon tasks, the only 
> remaining thread will be assigned to run the Recon task, and will block 
> waiting for the lock of the EndpointStateMachine for Recon.
> {code:java}
> public EndpointStateMachine.EndPointStates call() throws Exception {
>   rpcEndpoint.lock();
>   ...{code}
>  
> *The solution is:*
> Since DatanodeStateMachine periodically schedules SCM/Recon tasks, we may 
> adjust the RetryPolicy so that it won't retry for longer than 1 min. 
> The change has no side effect:
> 1) VersionEndpointTask.call() is fine
> 2) RegisterEndpointTask.call() will query containerReport, nodeReport, 
> pipelineReports from OzoneContainer, which is fine.
> 3) HeartbeatEndpointTask.call() will putBackReports(), which is fine.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-4186) Adjust RetryPolicy of SCMConnectionManager for SCM/Recon

2020-09-01 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4186:

Component/s: Ozone Datanode

> Adjust RetryPolicy of SCMConnectionManager for SCM/Recon
> 
>
> Key: HDDS-4186
> URL: https://issues.apache.org/jira/browse/HDDS-4186
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Blocker
>  Labels: pull-request-available
>
> Current RetryPolicy of Datanode for SCM is retryForeverWithFixedSleep:
> {code:java}
> RetryPolicy retryPolicy =
>  RetryPolicies.retryForeverWithFixedSleep(
>  1000, TimeUnit.MILLISECONDS);
> StorageContainerDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
> StorageContainerDatanodeProtocolPB.class, version,
> address, UserGroupInformation.getCurrentUser(), hadoopConfig,
> NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
> retryPolicy).getProxy();{code}
>  that for Recon is retryUpToMaximumCountWithFixedSleep:
> {code:java}
> RetryPolicy retryPolicy =
> RetryPolicies.retryUpToMaximumCountWithFixedSleep(10,
> 6, TimeUnit.MILLISECONDS);
> ReconDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
> ReconDatanodeProtocolPB.class, version,
> address, UserGroupInformation.getCurrentUser(), hadoopConfig,
> NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
> retryPolicy).getProxy();
> {code}
>  
> The executorService in DatanodeStateMachine is 
> Executors.newFixedThreadPool(...), whose default pool size is 2, one for 
> Recon, another for SCM.
>  
> When encountering an rpc failure, call() of RegisterEndpointTask, 
> VersionEndpointTask, and HeartbeatEndpointTask will retry while holding the 
> rpcEndpoint.lock(). For example:
> {code:java}
> public EndpointStateMachine.EndPointStates call() throws Exception {
>   rpcEndpoint.lock();
>   try {
> 
> SCMHeartbeatResponseProto response = rpcEndpoint.getEndPoint()
> .sendHeartbeat(request);
> 
>   } finally {
> rpcEndpoint.unlock();
>   }
>   return rpcEndpoint.getState();
> }
> {code}
>  
> *The problem is:*
> If we set up one Recon and one SCM, then shut down the Recon server, all 
> Datanodes will become stale/dead very soon on the SCM side.
>  
> *The root cause is:*
> The thread running the Recon task will retry due to the rpc failure, meanwhile 
> holding the lock of the EndpointStateMachine for Recon. When 
> DatanodeStateMachine schedules the next round of SCM/Recon tasks, the only 
> remaining thread will be assigned to run the Recon task, and will block 
> waiting for the lock of the EndpointStateMachine for Recon.
> {code:java}
> public EndpointStateMachine.EndPointStates call() throws Exception {
>   rpcEndpoint.lock();
>   ...{code}
>  
> *The solution is:*
> Since DatanodeStateMachine periodically schedules SCM/Recon tasks, we may 
> adjust the RetryPolicy so that it won't retry for longer than 1 min. 
> The change has no side effect:
> 1) VersionEndpointTask.call() is fine
> 2) RegisterEndpointTask.call() will query containerReport, nodeReport, 
> pipelineReports from OzoneContainer, which is fine.
> 3) HeartbeatEndpointTask.call() will putBackReports(), which is fine.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-4186) CLONE - Improve performance of the BufferPool management of Ozone client

2020-09-01 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng reassigned HDDS-4186:
---

Assignee: Glen Geng  (was: Marton Elek)

> CLONE - Improve performance of the BufferPool management of Ozone client
> 
>
> Key: HDDS-4186
> URL: https://issues.apache.org/jira/browse/HDDS-4186
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Blocker
>  Labels: pull-request-available
>
> Teragen was reported to be slow with a low number of mappers compared to HDFS.
> In my test (one pipeline, 3 yarn nodes) a 10 GB teragen with HDFS was ~3 mins 
> but with Ozone it was 6 mins. It could be fixed by using more mappers, but 
> when I investigated the execution I found a few problems regarding the 
> BufferPool management.
>  1. IncrementalChunkBuffer is slow and it might not be required, as BufferPool 
> itself is incremental
>  2. For each write operation bufferPool.allocateBufferIfNeeded is called, 
> which can be a slow operation (positions have to be calculated).
>  3. There is no explicit support for write(byte) operations
> In the flamegraph it's clearly visible that with a low number of mappers the 
> client is busy with buffer operations. After the patch, the rpc call and the 
> checksum calculation take the majority of the time. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-4186) CLONE - Improve performance of the BufferPool management of Ozone client

2020-09-01 Thread Glen Geng (Jira)
Glen Geng created HDDS-4186:
---

 Summary: CLONE - Improve performance of the BufferPool management 
of Ozone client
 Key: HDDS-4186
 URL: https://issues.apache.org/jira/browse/HDDS-4186
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Glen Geng
Assignee: Marton Elek


Teragen was reported to be slow with a low number of mappers compared to HDFS.

In my test (one pipeline, 3 yarn nodes) a 10 GB teragen with HDFS was ~3 mins but 
with Ozone it was 6 mins. It could be fixed by using more mappers, but when I 
investigated the execution I found a few problems regarding the BufferPool 
management.

 1. IncrementalChunkBuffer is slow and it might not be required, as BufferPool 
itself is incremental
 2. For each write operation bufferPool.allocateBufferIfNeeded is called, 
which can be a slow operation (positions have to be calculated).
 3. There is no explicit support for write(byte) operations

In the flamegraph it's clearly visible that with a low number of mappers the 
client is busy with buffer operations. After the patch, the rpc call and the 
checksum calculation take the majority of the time. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #1192: HDDS-3947: Sort DNs for client when the key is a file for #getFileStatus #listStatus APIs

2020-09-01 Thread GitBox


adoroszlai commented on a change in pull request #1192:
URL: https://github.com/apache/hadoop-ozone/pull/1192#discussion_r481025587



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/fs/OzoneManagerFS.java
##
@@ -31,7 +31,18 @@
  * Ozone Manager FileSystem interface.
  */
 public interface OzoneManagerFS extends IOzoneAcl {
-  OzoneFileStatus getFileStatus(OmKeyArgs args) throws IOException;
+
+  /**
+   * Get file status for a file or a directory.
+   *
+   * @param args  the args of the key provided by client.
+   * @param clientAddress a hint to the key manager: order the datanodes in the
+   *  returned pipeline by distance between client and datanode.
+   * @return file status.
+   * @throws IOException
+   */
+  OzoneFileStatus getFileStatus(OmKeyArgs args, String clientAddress)

Review comment:
   Should we also keep the two original methods?  Even if backwards 
compatibility is not required for this interface, it would let us avoid 
changing lots of calls just to pass `null`.
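
For illustration, one low-churn way to keep the old signature around (a sketch 
only, assuming a Java 8 default method is acceptable in this interface):

```
default OzoneFileStatus getFileStatus(OmKeyArgs args) throws IOException {
  // Hypothetical convenience overload: no client address hint,
  // so no datanode sorting is requested.
  return getFileStatus(args, null);
}
```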

##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/fs/OzoneManagerFS.java
##
@@ -31,7 +31,18 @@
  * Ozone Manager FileSystem interface.
  */
 public interface OzoneManagerFS extends IOzoneAcl {
-  OzoneFileStatus getFileStatus(OmKeyArgs args) throws IOException;
+
+  /**
+   * Get file status for a file or a directory.
+   *
+   * @param args  the args of the key provided by client.
+   * @param clientAddress a hint to the key manager: order the datanodes in the
+   *  returned pipeline by distance between client and datanode.
+   * @return file status.
+   * @throws IOException

Review comment:
   Thank you for adding updated doc from `KeyManagerImpl` to the interface. 
 Is there a reason exception is not described (the cases when they are thrown)? 
 Also, can you please remove the old javadoc from the implementation?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-4186) Adjust RetryPolicy of SCMConnectionManager

2020-09-01 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4186:

Description: 
Current RetryPolicy of Datanode for SCM is retryForeverWithFixedSleep:
{code:java}
RetryPolicy retryPolicy =
 RetryPolicies.retryForeverWithFixedSleep(
 1000, TimeUnit.MILLISECONDS);

StorageContainerDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
StorageContainerDatanodeProtocolPB.class, version,
address, UserGroupInformation.getCurrentUser(), hadoopConfig,
NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
retryPolicy).getProxy();{code}
 that for Recon is retryUpToMaximumCountWithFixedSleep:
{code:java}
RetryPolicy retryPolicy =
RetryPolicies.retryUpToMaximumCountWithFixedSleep(10,
6, TimeUnit.MILLISECONDS);

ReconDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
ReconDatanodeProtocolPB.class, version,
address, UserGroupInformation.getCurrentUser(), hadoopConfig,
NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
retryPolicy).getProxy();
{code}
 

The executorService in DatanodeStateMachine is 
Executors.newFixedThreadPool(...), whose default pool size is 2, one for Recon, 
another for SCM.

 

When encountering an rpc failure, call() of RegisterEndpointTask, 
VersionEndpointTask, and HeartbeatEndpointTask will retry while holding the 
rpcEndpoint.lock(). For example:
{code:java}
public EndpointStateMachine.EndPointStates call() throws Exception {
  rpcEndpoint.lock();

  try {


SCMHeartbeatResponseProto response = rpcEndpoint.getEndPoint()
.sendHeartbeat(request);


  } finally {
rpcEndpoint.unlock();
  }

  return rpcEndpoint.getState();
}
{code}
 

*The problem is:*

If we set up one Recon and one SCM, then shut down the Recon server, all 
Datanodes will become stale/dead very soon on the SCM side.

 

*The root cause is:*

The thread running the Recon task will retry due to the rpc failure, meanwhile 
holding the lock of the EndpointStateMachine for Recon. When DatanodeStateMachine 
schedules the next round of SCM/Recon tasks, the only remaining thread will be 
assigned to run the Recon task, and will block waiting for the lock of the 
EndpointStateMachine for Recon.
{code:java}
public EndpointStateMachine.EndPointStates call() throws Exception {
  rpcEndpoint.lock();
  ...{code}
 

*The solution is:*

Since DatanodeStateMachine periodically schedules SCM/Recon tasks, we may 
adjust the RetryPolicy so that it won't retry for longer than 1 min. 

The change has no side effect:

1) VersionEndpointTask.call() is fine

2) RegisterEndpointTask.call() will query containerReport, nodeReport, 
pipelineReports from OzoneContainer, which is fine.

3) HeartbeatEndpointTask.call() will putBackReports(), which is fine.

 

  was:
Current RetryPolicy of Datanode for SCM is retryForeverWithFixedSleep:
{code:java}
RetryPolicy retryPolicy =
 RetryPolicies.retryForeverWithFixedSleep(
 1000, TimeUnit.MILLISECONDS);

StorageContainerDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
StorageContainerDatanodeProtocolPB.class, version,
address, UserGroupInformation.getCurrentUser(), hadoopConfig,
NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
retryPolicy).getProxy();{code}
 that for Recon is retryUpToMaximumCountWithFixedSleep:
{code:java}
RetryPolicy retryPolicy =
RetryPolicies.retryUpToMaximumCountWithFixedSleep(10,
6, TimeUnit.MILLISECONDS);

ReconDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
ReconDatanodeProtocolPB.class, version,
address, UserGroupInformation.getCurrentUser(), hadoopConfig,
NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
retryPolicy).getProxy();
{code}
 

The executorService in DatanodeStateMachine is 
Executors.newFixedThreadPool(...), whose default pool size is 2, one for Recon, 
another for SCM.

 

When encountering an rpc failure, call() of RegisterEndpointTask, 
VersionEndpointTask, and HeartbeatEndpointTask will retry while holding the 
rpcEndpoint.lock(). For example:

 
{code:java}
public EndpointStateMachine.EndPointStates call() throws Exception {
  rpcEndpoint.lock();

  try {

    SCMHeartbeatResponseProto response = rpcEndpoint.getEndPoint()
        .sendHeartbeat(request);

  } finally {
    rpcEndpoint.unlock();
  }

  return rpcEndpoint.getState();
}
{code}
 

 

*The problem is:* 

If you set up one Recon and one SCM, then shut down the Recon server, all 
Datanodes will become stale/dead very soon on the SCM side.

 

*The root cause is:*

The thread running the Recon task will retry due to the RPC failure, meanwhile 
holding the lock of the EndpointStateMachine for Recon. When 
DatanodeStateMachine schedules the next round of SCM/Recon tasks, the only 
remaining thread will be assigned to run the Recon task, and it will block 
waiting for the lock of the EndpointStateMachine for Recon.

 
{code:java}
public EndpointStateMachine.EndPointStates call() throws Exception {
  

[jira] [Updated] (HDDS-4186) Adjust RetryPolicy of SCMConnectionManager

2020-09-01 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4186:

Description: 
Current RetryPolicy of Datanode for SCM is retryForeverWithFixedSleep:
{code:java}
RetryPolicy retryPolicy =
    RetryPolicies.retryForeverWithFixedSleep(
        1000, TimeUnit.MILLISECONDS);

StorageContainerDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
    StorageContainerDatanodeProtocolPB.class, version,
    address, UserGroupInformation.getCurrentUser(), hadoopConfig,
    NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
    retryPolicy).getProxy();{code}
The RetryPolicy for Recon is retryUpToMaximumCountWithFixedSleep:
{code:java}
RetryPolicy retryPolicy =
    RetryPolicies.retryUpToMaximumCountWithFixedSleep(10,
        6, TimeUnit.MILLISECONDS);

ReconDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
    ReconDatanodeProtocolPB.class, version,
    address, UserGroupInformation.getCurrentUser(), hadoopConfig,
    NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
    retryPolicy).getProxy();
{code}
 

The executorService in DatanodeStateMachine is 
Executors.newFixedThreadPool(...), whose default pool size is 2: one thread for 
Recon, another for SCM.

 

When an RPC failure is encountered, the call() methods of RegisterEndpointTask, 
VersionEndpointTask, and HeartbeatEndpointTask will retry while holding 
rpcEndpoint.lock(). For example:

 
{code:java}
public EndpointStateMachine.EndPointStates call() throws Exception {
  rpcEndpoint.lock();

  try {

    SCMHeartbeatResponseProto response = rpcEndpoint.getEndPoint()
        .sendHeartbeat(request);

  } finally {
    rpcEndpoint.unlock();
  }

  return rpcEndpoint.getState();
}
{code}
 

 

*The problem is:* 

If you set up one Recon and one SCM, then shut down the Recon server, all 
Datanodes will become stale/dead very soon on the SCM side.

 

*The root cause is:*

The thread running the Recon task will retry due to the RPC failure, meanwhile 
holding the lock of the EndpointStateMachine for Recon. When 
DatanodeStateMachine schedules the next round of SCM/Recon tasks, the only 
remaining thread will be assigned to run the Recon task, and it will block 
waiting for the lock of the EndpointStateMachine for Recon.

 
{code:java}
public EndpointStateMachine.EndPointStates call() throws Exception {
  rpcEndpoint.lock();
  ...{code}
 

 

*The solution is:*

Since DatanodeStateMachine will periodically schedule SCM/Recon tasks, we may 
adjust the RetryPolicy so that a task won't retry for longer than 1 min.

The change has no side effects:

1) VersionEndpointTask.call() is fine

2) RegisterEndpointTask.call() will query containerReport, nodeReport, 
pipelineReports from OzoneContainer, which is fine.

3) HeartbeatEndpointTask.call() will putBackReports(), which is fine.

 

  was:
Current RetryPolicy of Datanode for SCM is retryForeverWithFixedSleep:
{code:java}
RetryPolicy retryPolicy =
    RetryPolicies.retryForeverWithFixedSleep(
        1000, TimeUnit.MILLISECONDS);

StorageContainerDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
    StorageContainerDatanodeProtocolPB.class, version,
    address, UserGroupInformation.getCurrentUser(), hadoopConfig,
    NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
    retryPolicy).getProxy();{code}
 

The RetryPolicy for Recon is retryUpToMaximumCountWithFixedSleep:
{code:java}
RetryPolicy retryPolicy =
    RetryPolicies.retryUpToMaximumCountWithFixedSleep(10,
        6, TimeUnit.MILLISECONDS);

ReconDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
    ReconDatanodeProtocolPB.class, version,
    address, UserGroupInformation.getCurrentUser(), hadoopConfig,
    NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
    retryPolicy).getProxy();
{code}
The executorService in DatanodeStateMachine is now 
Executors.newFixedThreadPool(...), whose pool size is 2: one thread for Recon, 
another for SCM.

 

When an RPC failure is encountered, the call() methods of RegisterEndpointTask, 
VersionEndpointTask, and HeartbeatEndpointTask will retry while holding 
rpcEndpoint.lock().

Here is the problem: if you set up one Recon and one SCM, then shut down the 
Recon server, all Datanodes will become stale/dead very soon. The root cause is 
that the thread working for Recon retries while holding the lock of the 
EndpointStateMachine for Recon; when DatanodeStateMachine schedules the next 
round of tasks, the other thread is blocked waiting for the lock of the 
EndpointStateMachine for Recon.

 

Since DatanodeStateMachine will periodically schedule tasks, we may adjust the 
RetryPolicy so that the execution of a task need not take longer than 1 min.

 


> Adjust RetryPolicy of SCMConnectionManager
> --
>
> Key: HDDS-4186
> URL: https://issues.apache.org/jira/browse/HDDS-4186
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>

[jira] [Updated] (HDDS-4119) Improve performance of the BufferPool management of Ozone client

2020-09-01 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated HDDS-4119:
--
Target Version/s: 1.0.1  (was: 1.1.0)

> Improve performance of the BufferPool management of Ozone client
> 
>
> Key: HDDS-4119
> URL: https://issues.apache.org/jira/browse/HDDS-4119
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Blocker
>  Labels: pull-request-available
>
> Teragen is reported to be slow with a low number of mappers compared to HDFS.
> In my test (one pipeline, 3 yarn nodes) a 10 GB teragen with HDFS took ~3 mins,
> but with Ozone it took 6 mins. It could be fixed by using more mappers, but
> when I investigated the execution I found a few problems regarding the
> BufferPool management.
>  1. IncrementalChunkBuffer is slow and it might not be required as BufferPool 
> itself is incremental
>  2. For each write operation the bufferPool.allocateBufferIfNeeded is called 
> which can be a slow operation (positions should be calculated).
>  3. There is no explicit support for write(byte) operations
> In the flame graph it's clearly visible that with a low number of mappers the
> client is busy with buffer operations. After the patch, the RPC call and the
> checksum calculation take the majority of the time.
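
A minimal sketch of the buffer-caching idea from points 2 and 3 above (the field and method names here are hypothetical; the real BufferPool/ChunkBuffer API may differ): cache the buffer currently being filled so the pool's position arithmetic runs only at buffer boundaries, and give write(byte) a cheap path:
{code:java}
// Hypothetical sketch, not the actual Ozone client code.
private ChunkBuffer currentBuffer;

private ChunkBuffer currentBufferForWrite() {
  if (currentBuffer == null || !currentBuffer.hasRemaining()) {
    // Slow path (position calculations) only on buffer boundaries.
    currentBuffer = bufferPool.allocateBufferIfNeeded();
  }
  return currentBuffer;
}

public void write(byte b) {
  // Explicit single-byte support: no per-write pool bookkeeping.
  currentBufferForWrite().put(b);
}
{code}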



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-4186) Adjust RetryPolicy of SCMConnectionManager

2020-09-01 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4186:

Description: 
Current RetryPolicy of Datanode for SCM is retryForeverWithFixedSleep:
{code:java}
RetryPolicy retryPolicy =
    RetryPolicies.retryForeverWithFixedSleep(
        1000, TimeUnit.MILLISECONDS);

StorageContainerDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
    StorageContainerDatanodeProtocolPB.class, version,
    address, UserGroupInformation.getCurrentUser(), hadoopConfig,
    NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
    retryPolicy).getProxy();{code}
 

The RetryPolicy for Recon is retryUpToMaximumCountWithFixedSleep:
{code:java}
RetryPolicy retryPolicy =
    RetryPolicies.retryUpToMaximumCountWithFixedSleep(10,
        6, TimeUnit.MILLISECONDS);

ReconDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
    ReconDatanodeProtocolPB.class, version,
    address, UserGroupInformation.getCurrentUser(), hadoopConfig,
    NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
    retryPolicy).getProxy();
{code}
The executorService in DatanodeStateMachine is now 
Executors.newFixedThreadPool(...), whose pool size is 2: one thread for Recon, 
another for SCM.

 

When an RPC failure is encountered, the call() methods of RegisterEndpointTask, 
VersionEndpointTask, and HeartbeatEndpointTask will retry while holding 
rpcEndpoint.lock().

Here is the problem: if you set up one Recon and one SCM, then shut down the 
Recon server, all Datanodes will become stale/dead very soon. The root cause is 
that the thread working for Recon retries while holding the lock of the 
EndpointStateMachine for Recon; when DatanodeStateMachine schedules the next 
round of tasks, the other thread is blocked waiting for the lock of the 
EndpointStateMachine for Recon.

 

Since DatanodeStateMachine will periodically schedule tasks, we may adjust the 
RetryPolicy so that the execution of a task need not take longer than 1 min.

 

  was:
Teragen is reported to be slow with a low number of mappers compared to HDFS.

In my test (one pipeline, 3 yarn nodes) a 10 GB teragen with HDFS took ~3 mins, 
but with Ozone it took 6 mins. It could be fixed by using more mappers, but 
when I investigated the execution I found a few problems regarding the 
BufferPool management.

 1. IncrementalChunkBuffer is slow and it might not be required as BufferPool 
itself is incremental
 2. For each write operation the bufferPool.allocateBufferIfNeeded is called 
which can be a slow operation (positions should be calculated).
 3. There is no explicit support for write(byte) operations

In the flame graph it's clearly visible that with a low number of mappers the 
client is busy with buffer operations. After the patch, the RPC call and the 
checksum calculation take the majority of the time.


> Adjust RetryPolicy of SCMConnectionManager
> --
>
> Key: HDDS-4186
> URL: https://issues.apache.org/jira/browse/HDDS-4186
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Blocker
>
> Current RetryPolicy of Datanode for SCM is retryForeverWithFixedSleep:
> {code:java}
> RetryPolicy retryPolicy =
>  RetryPolicies.retryForeverWithFixedSleep(
>  1000, TimeUnit.MILLISECONDS);
> StorageContainerDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
> StorageContainerDatanodeProtocolPB.class, version,
> address, UserGroupInformation.getCurrentUser(), hadoopConfig,
> NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
> retryPolicy).getProxy();{code}
>  
> for Recon is retryUpToMaximumCountWithFixedSleep:
> {code:java}
> RetryPolicy retryPolicy =
> RetryPolicies.retryUpToMaximumCountWithFixedSleep(10,
> 6, TimeUnit.MILLISECONDS);
> ReconDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
> ReconDatanodeProtocolPB.class, version,
> address, UserGroupInformation.getCurrentUser(), hadoopConfig,
> NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
> retryPolicy).getProxy();
> {code}
> The executorService in DatanodeStateMachine is now 
> Executors.newFixedThreadPool(...), whose pool size is 2: one thread for Recon, 
> another for SCM.
>  
> When an RPC failure is encountered, the call() methods of RegisterEndpointTask, 
> VersionEndpointTask, and HeartbeatEndpointTask will retry while holding 
> rpcEndpoint.lock().
> Here is the problem: if you set up one Recon and one SCM, then shut down the 
> Recon server, all Datanodes will become stale/dead very soon. The root cause is 
> that the thread working for Recon retries while holding the lock of the 
> EndpointStateMachine for Recon; when DatanodeStateMachine schedules the next 
> round of tasks, the other thread is blocked waiting for the lock of the 
> EndpointStateMachine for Recon.
>  
> Since DatanodeStateMachine will periodically schedule 

[jira] [Updated] (HDDS-4186) Adjust RetryPolicy of SCMConnectionManager for SCM/Recon

2020-09-01 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4186:

Description: 
*The problem is:*

If you set up one Recon and one SCM, then shut down the Recon server, all 
Datanodes will become stale/dead very soon on the SCM side.

 

*The root cause is:*

Current RetryPolicy of Datanode for SCM is retryForeverWithFixedSleep:
{code:java}
RetryPolicy retryPolicy =
    RetryPolicies.retryForeverWithFixedSleep(
        1000, TimeUnit.MILLISECONDS);

StorageContainerDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
    StorageContainerDatanodeProtocolPB.class, version,
    address, UserGroupInformation.getCurrentUser(), hadoopConfig,
    NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
    retryPolicy).getProxy();{code}
The RetryPolicy for Recon is retryUpToMaximumCountWithFixedSleep:
{code:java}
RetryPolicy retryPolicy =
    RetryPolicies.retryUpToMaximumCountWithFixedSleep(10,
        6, TimeUnit.MILLISECONDS);

ReconDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
    ReconDatanodeProtocolPB.class, version,
    address, UserGroupInformation.getCurrentUser(), hadoopConfig,
    NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
    retryPolicy).getProxy();
{code}
 

The executorService in DatanodeStateMachine is 
Executors.newFixedThreadPool(...), whose default pool size is 2: one thread for 
Recon, another for SCM.

 

When an RPC failure is encountered, the call() methods of RegisterEndpointTask, 
VersionEndpointTask, and HeartbeatEndpointTask will retry while holding 
rpcEndpoint.lock(). For example:
{code:java}
public EndpointStateMachine.EndPointStates call() throws Exception {
  rpcEndpoint.lock();

  try {

    SCMHeartbeatResponseProto response = rpcEndpoint.getEndPoint()
        .sendHeartbeat(request);

  } finally {
    rpcEndpoint.unlock();
  }

  return rpcEndpoint.getState();
}
{code}
 

If Recon is down, the thread running the Recon task will retry due to the RPC 
failure, meanwhile holding the lock of the EndpointStateMachine for Recon. When 
DatanodeStateMachine schedules the next round of SCM/Recon tasks, the only 
remaining thread will be assigned to run the Recon task, and it will block 
waiting for the lock of the EndpointStateMachine for Recon.
{code:java}
public EndpointStateMachine.EndPointStates call() throws Exception {
  rpcEndpoint.lock();
  ...{code}
 

*The solution is:*

Since DatanodeStateMachine will periodically schedule SCM/Recon tasks, we may 
adjust the RetryPolicy so that a task won't retry for longer than 1 min.

 

*The change has no side effects:*

1) VersionEndpointTask.call() is fine

2) RegisterEndpointTask.call() will query containerReport, nodeReport, 
pipelineReports from OzoneContainer, which is fine.

3) HeartbeatEndpointTask.call() will putBackReports(), which is fine.

 

  was:
*The problem is:*

If you set up one Recon and one SCM, then shut down the Recon server, all 
Datanodes will become stale/dead very soon on the SCM side.

 

*The root cause is:*

Current RetryPolicy of Datanode for SCM is retryForeverWithFixedSleep:
{code:java}
RetryPolicy retryPolicy =
    RetryPolicies.retryForeverWithFixedSleep(
        1000, TimeUnit.MILLISECONDS);

StorageContainerDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
    StorageContainerDatanodeProtocolPB.class, version,
    address, UserGroupInformation.getCurrentUser(), hadoopConfig,
    NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
    retryPolicy).getProxy();{code}
The RetryPolicy for Recon is retryUpToMaximumCountWithFixedSleep:
{code:java}
RetryPolicy retryPolicy =
    RetryPolicies.retryUpToMaximumCountWithFixedSleep(10,
        6, TimeUnit.MILLISECONDS);

ReconDatanodeProtocolPB rpcProxy = RPC.getProtocolProxy(
    ReconDatanodeProtocolPB.class, version,
    address, UserGroupInformation.getCurrentUser(), hadoopConfig,
    NetUtils.getDefaultSocketFactory(hadoopConfig), getRpcTimeout(),
    retryPolicy).getProxy();
{code}
 

The executorService in DatanodeStateMachine is 
Executors.newFixedThreadPool(...), whose default pool size is 2: one thread for 
Recon, another for SCM.

 

When an RPC failure is encountered, the call() methods of RegisterEndpointTask, 
VersionEndpointTask, and HeartbeatEndpointTask will retry while holding 
rpcEndpoint.lock(). For example:
{code:java}
public EndpointStateMachine.EndPointStates call() throws Exception {
  rpcEndpoint.lock();

  try {

    SCMHeartbeatResponseProto response = rpcEndpoint.getEndPoint()
        .sendHeartbeat(request);

  } finally {
    rpcEndpoint.unlock();
  }

  return rpcEndpoint.getState();
}
{code}
 

The thread running the Recon task will retry due to the RPC failure, meanwhile 
holding the lock of the EndpointStateMachine for Recon. When 
DatanodeStateMachine schedules the next round of SCM/Recon tasks, the only 
remaining thread will be assigned to run the Recon task, and it will block 
waiting for the lock of the EndpointStateMachine for Recon.
{code:java}
public EndpointStateMachine.EndPointStates call() throws 

[GitHub] [hadoop-ozone] hanishakoneru commented on a change in pull request #1298: HDDS-3869. Use different column families for datanode block and metadata

2020-09-01 Thread GitBox


hanishakoneru commented on a change in pull request #1298:
URL: https://github.com/apache/hadoop-ozone/pull/1298#discussion_r476042841



##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerDataYaml.java
##
@@ -280,6 +280,9 @@ public Object construct(Node node) {
 String state = (String) nodes.get(OzoneConsts.STATE);
 kvData
 .setState(ContainerProtos.ContainerDataProto.State.valueOf(state));
+String schemaVersion = (String) nodes.get(OzoneConsts.SCHEMA_VERSION);
+kvData.setSchemaVersion(schemaVersion);

Review comment:
   When reading an old containerDataYaml that does not contain the schema 
version field, what value would be returned? IIRC it returns null; in that case 
we should set it to version V1.
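   A sketch of that suggestion, defaulting inside the Yaml construct method itself (using only the constants already present in the diff above):
   
       String schemaVersion = (String) nodes.get(OzoneConsts.SCHEMA_VERSION);
       // Old container files have no schemaVersion field, so nodes.get()
       // returns null; fall back to the original single-column-family layout.
       kvData.setSchemaVersion(
           schemaVersion != null ? schemaVersion : OzoneConsts.SCHEMA_V1);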

##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java
##
@@ -159,122 +178,126 @@ public static void 
parseKVContainerData(KeyValueContainerData kvContainerData,
 }
 kvContainerData.setDbFile(dbFile);
 
+if (kvContainerData.getSchemaVersion() == null) {
+  // If this container has not specified a schema version, it is in the old
+  // format with one default column family.
+  kvContainerData.setSchemaVersion(OzoneConsts.SCHEMA_V1);
+}
+
 
 boolean isBlockMetadataSet = false;
 
 try(ReferenceCountedDB containerDB = BlockUtils.getDB(kvContainerData,
 config)) {
 
+  Table<String, Long> metadataTable =
+  containerDB.getStore().getMetadataTable();
+
   // Set pending deleted block count.
-  byte[] pendingDeleteBlockCount =
-  containerDB.getStore().get(DB_PENDING_DELETE_BLOCK_COUNT_KEY);
+  Long pendingDeleteBlockCount =
+  metadataTable.get(OzoneConsts.PENDING_DELETE_BLOCK_COUNT);
   if (pendingDeleteBlockCount != null) {
 kvContainerData.incrPendingDeletionBlocks(
-Longs.fromByteArray(pendingDeleteBlockCount));
+pendingDeleteBlockCount.intValue());

Review comment:
   Any reason for using intValue here instead of the long value as 
incrPendingDeletionBlocks takes in a long parameter?
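   In other words (a sketch): Long auto-unboxes to long, so the narrowing intValue() call can simply be dropped:
   
       if (pendingDeleteBlockCount != null) {
         // Long auto-unboxes to long; no narrowing via intValue() needed.
         kvContainerData.incrPendingDeletionBlocks(pendingDeleteBlockCount);
       }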

##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/BlockManagerImpl.java
##
@@ -262,14 +264,17 @@ public void deleteBlock(Container container, BlockID 
blockID) throws
   getBlockByID(db, blockID);
 
   // Update DB to delete block and set block count and bytes used.
-  BatchOperation batch = new BatchOperation();
-  batch.delete(blockKey);

Review comment:
   blockKey variable is redundant now and can be removed.

##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueContainer.java
##
@@ -487,6 +486,7 @@ public void importContainerData(InputStream input,
   containerData.setState(originalContainerData.getState());
   containerData
   .setContainerDBType(originalContainerData.getContainerDBType());
+  containerData.setSchemaVersion(originalContainerData.getSchemaVersion());

Review comment:
   I see that schema version is being set in 
KeyValueContainerUtil#parseKVContainerData. 
   We can explore the option of setting the default schema version (V1) while 
reading the Yaml itself so that it is never missed. 

##
File path: 
hadoop-hdds/container-service/src/test/resources/123-dn-container.db/LOG
##
@@ -0,0 +1,284 @@
+2020/08/03-15:13:40.359520 7f80eb7a9700 RocksDB version: 6.8.1
+2020/08/03-15:13:40.359563 7f80eb7a9700 Git sha rocksdb_build_git_sha:
+2020/08/03-15:13:40.359566 7f80eb7a9700 Compile date Apr 26 2020

Review comment:
   LOCK and LOG files are not required to load the  DB.

##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java
##
@@ -159,122 +178,126 @@ public static void 
parseKVContainerData(KeyValueContainerData kvContainerData,
 }
 kvContainerData.setDbFile(dbFile);
 
+if (kvContainerData.getSchemaVersion() == null) {
+  // If this container has not specified a schema version, it is in the old
+  // format with one default column family.
+  kvContainerData.setSchemaVersion(OzoneConsts.SCHEMA_V1);
+}
+
 
 boolean isBlockMetadataSet = false;
 
 try(ReferenceCountedDB containerDB = BlockUtils.getDB(kvContainerData,
 config)) {
 
+  Table<String, Long> metadataTable =
+  containerDB.getStore().getMetadataTable();
+
   // Set pending deleted block count.
-  byte[] pendingDeleteBlockCount =
-  containerDB.getStore().get(DB_PENDING_DELETE_BLOCK_COUNT_KEY);
+  Long pendingDeleteBlockCount =
+  metadataTable.get(OzoneConsts.PENDING_DELETE_BLOCK_COUNT);
   if (pendingDeleteBlockCount != null) {
 

[jira] [Assigned] (HDDS-4173) Implement HDDS Version management using the LayoutVersionManager interface.

2020-09-01 Thread Siddharth Wagle (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle reassigned HDDS-4173:
-

Assignee: Prashant Pogde

> Implement HDDS Version management using the LayoutVersionManager interface.
> ---
>
> Key: HDDS-4173
> URL: https://issues.apache.org/jira/browse/HDDS-4173
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Datanode, SCM
>Affects Versions: 1.1.0
>Reporter: Aravindan Vijayan
>Assignee: Prashant Pogde
>Priority: Major
> Fix For: 1.1.0
>
>
> * Create HDDS Layout Feature Catalog similar to the OM Layout Feature Catalog.
> * Any layout change to SCM and Datanode needs to be recorded here as a Layout 
> Feature.
> * This includes new SCM HA requests, new container layouts in DN etc.
> * Create a HDDSLayoutVersionManager similar to OMLayoutVersionManager.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on pull request #1328: HDDS-4102. Normalize Keypath for lookupKey.

2020-09-01 Thread GitBox


bharatviswa504 commented on pull request #1328:
URL: https://github.com/apache/hadoop-ozone/pull/1328#issuecomment-684950217


   @elek 
   There is no issue with this. As the other argument is when the flag is 
disabled to support HCFS.
   
   And I believe we have agreed on the part we need this like 100% HCFS with 
few changes to AWS S3 semantics.
   
   Let me know if you have any concerns.
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 edited a comment on pull request #1328: HDDS-4102. Normalize Keypath for lookupKey.

2020-09-01 Thread GitBox


bharatviswa504 edited a comment on pull request #1328:
URL: https://github.com/apache/hadoop-ozone/pull/1328#issuecomment-684950217


   @elek 
   There is no issue with this, and I believe we have agreed on the part that 
we need this: essentially 100% HCFS with a few changes to AWS S3 semantics. I 
don't see what the problem is in moving forward.
   
   The other argument concerns the case when the flag is disabled, to support 
HCFS.
   
   
   Let me know if you have any concerns with this PR.
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 edited a comment on pull request #1328: HDDS-4102. Normalize Keypath for lookupKey.

2020-09-01 Thread GitBox


bharatviswa504 edited a comment on pull request #1328:
URL: https://github.com/apache/hadoop-ozone/pull/1328#issuecomment-684950217


   @elek 
   There is no issue with this. The other argument concerns the case when the 
flag is disabled, to support HCFS.
   
   And I believe we have agreed on the part that we need this: essentially 
100% HCFS with a few changes to AWS S3 semantics. I don't see what the problem 
is in moving forward.
   
   Let me know if you have any concerns.
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4097) S3/Ozone Filesystem inter-op

2020-09-01 Thread Arpit Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188586#comment-17188586
 ] 

Arpit Agarwal commented on HDDS-4097:
-

If you always create them, then you are basically interpreting key names as 
filesystem paths, so they have to be normalized and interpreted as paths. 
There is no middle ground.

> S3/Ozone Filesystem inter-op
> 
>
> Key: HDDS-4097
> URL: https://issues.apache.org/jira/browse/HDDS-4097
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: Ozone FileSystem Paths Enabled.docx, Ozone filesystem 
> path enabled.xlsx
>
>
> This Jira is to implement changes required to use Ozone buckets when data is 
> ingested via S3 and use the bucket/volume via OzoneFileSystem. Initial 
> implementation for this is done as part of HDDS-3955. There are few API's 
> which have missed the changes during the implementation of HDDS-3955. 
> Attached design document which discusses each API,  and what changes are 
> required.
> Excel sheet has information about each API, from what all interfaces the OM 
> API is used, and what changes are required for the API to support 
> inter-operability.
> Note: The proposal for delete/rename is still under discussion, not yet 
> finalized. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-4181) Add acceptance tests for upgrade, finalization and downgrade

2020-09-01 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188636#comment-17188636
 ] 

Aravindan Vijayan edited comment on HDDS-4181 at 9/1/20, 4:49 PM:
--

Thanks [~elek]. I have seen those changes, and I am planning to build on top of them.


was (Author: avijayan):
Thanks [~elek]. I have seen that changes, and planning to build on top of that.

> Add acceptance tests for upgrade, finalization and downgrade
> 
>
> Key: HDDS-4181
> URL: https://issues.apache.org/jira/browse/HDDS-4181
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Aravindan Vijayan
>Priority: Major
> Fix For: 1.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4181) Add acceptance tests for upgrade, finalization and downgrade

2020-09-01 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188636#comment-17188636
 ] 

Aravindan Vijayan commented on HDDS-4181:
-

Thanks [~elek]. I have seen that changes, and planning to build on top of that.

> Add acceptance tests for upgrade, finalization and downgrade
> 
>
> Key: HDDS-4181
> URL: https://issues.apache.org/jira/browse/HDDS-4181
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Aravindan Vijayan
>Priority: Major
> Fix For: 1.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-4175) Implement Datanode Finalization

2020-09-01 Thread Siddharth Wagle (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle reassigned HDDS-4175:
-

Assignee: Istvan Fajth

> Implement Datanode Finalization
> ---
>
> Key: HDDS-4175
> URL: https://issues.apache.org/jira/browse/HDDS-4175
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Datanode
>Reporter: Aravindan Vijayan
>Assignee: Istvan Fajth
>Priority: Major
> Fix For: 1.1.0
>
>
> * Create FinalizeCommand in SCM and Datanode protocol.
> * Create FinalizeCommand Handler in Datanode.
> * Datanode Finalization should FAIL if there are open containers on it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-4174) Add current HDDS layout version to Datanode heartbeat and registration.

2020-09-01 Thread Siddharth Wagle (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle reassigned HDDS-4174:
-

Assignee: Prashant Pogde

> Add current HDDS layout version to Datanode heartbeat and registration.
> ---
>
> Key: HDDS-4174
> URL: https://issues.apache.org/jira/browse/HDDS-4174
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Datanode
>Reporter: Aravindan Vijayan
>Assignee: Prashant Pogde
>Priority: Major
> Fix For: 1.1.0
>
>
> Add the layout version as a field to proto.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-4178) SCM Finalize command implementation.

2020-09-01 Thread Siddharth Wagle (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle reassigned HDDS-4178:
-

Assignee: Istvan Fajth

> SCM Finalize command implementation.
> 
>
> Key: HDDS-4178
> URL: https://issues.apache.org/jira/browse/HDDS-4178
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Reporter: Aravindan Vijayan
>Assignee: Istvan Fajth
>Priority: Major
> Fix For: 1.1.0
>
>
> * RPC endpoint implementation
> * Ratis request to persist MLV, Trigger DN Finalize, Pipeline close. (WHEN 
> MLV changes)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 edited a comment on pull request #1328: HDDS-4102. Normalize Keypath for lookupKey.

2020-09-01 Thread GitBox


bharatviswa504 edited a comment on pull request #1328:
URL: https://github.com/apache/hadoop-ozone/pull/1328#issuecomment-684950217


   @elek 
   There is no issue with this, and I believe we have agreed on the part that 
we need this: essentially 100% HCFS with a few changes to AWS S3 semantics. I 
don't see what the problem is in moving forward.
   
   The other argument concerns the case when the flag is disabled, to support 
compromised HCFS and 100% AWS.
   
   
   Let me know if you have any concerns about this PR.
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-4187) Fix memory leak of recon

2020-09-01 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated HDDS-4187:
-
Description: 
40 datanodes with 400,000 containers; Recon started with -Xmx 10G. After 
several hours, Recon's memory increases to 12G and it OOMs. The memory leak 
happens on the heap, and the reason is that Recon is slow to process 
ContainerReplicaReports, so the queue of the processing thread grows until OOM.
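
One possible mitigation for this failure mode (an illustrative sketch only, not necessarily the fix chosen for Recon): process reports on an executor with a bounded queue, so that slow processing applies back-pressure instead of accumulating reports until OOM.
{code:java}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Bounded queue: at most 1000 pending ContainerReplicaReports.
// CallerRunsPolicy pushes back on the submitting (RPC handler) thread
// instead of letting the queue grow without limit.
ThreadPoolExecutor reportExecutor = new ThreadPoolExecutor(
    4, 4, 60L, TimeUnit.SECONDS,
    new ArrayBlockingQueue<>(1000),
    new ThreadPoolExecutor.CallerRunsPolicy());
{code}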

 

> Fix memory leak of recon
> 
>
> Key: HDDS-4187
> URL: https://issues.apache.org/jira/browse/HDDS-4187
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
>
> 40 datanodes with 400,000 containers; Recon started with -Xmx 10G. After 
> several hours, Recon's memory increases to 12G and it OOMs. The memory leak 
> happens on the heap, and the reason is that Recon is slow to process 
> ContainerReplicaReports, so the queue of the processing thread grows until OOM.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4097) S3/Ozone Filesystem inter-op

2020-09-01 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188536#comment-17188536
 ] 

Marton Elek commented on HDDS-4097:
---

[~arp] It was discussed in more detail during the community sync (the 
recording is shared on the ozone-dev mailing list).

In short, my proposal is the following:

1. With simple, acceptable key names (/a/b/c, /a/b/c/d), *both S3 and HCFS 
should work out-of-the-box, without any additional settings*. (Based on my 
understanding this is not true today, as we need to turn on 
`ozone.om.enable.filesystem.paths` to get intermediate directories.)

2. There are some conflicts between the AWS S3 and HCFS interfaces. We need a 
new option to express how to resolve the conflicts; let's say we have an 
ozone.key.compatibility setting.

 a) ozone.key.compatibility=aws means that we enable (almost) everything which 
is allowed by AWS S3, but we couldn't show all the keys in the Hadoop 
interface. For example, if a directory and a key are created with the same 
prefix (possible with AWS S3), HCFS will show only the directory, not the key.

 b) ozone.key.compatibility=hadoop is the opposite: we validate the path and 
throw an exception on the S3 interface if a dir and a key are created with the 
same name.
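
For illustration, a minimal sketch (assuming vol1/bucket1 already exist, and using the standard Ozone client API) of how S3-style naming produces the conflict that a) and b) resolve differently: the same name "a/b" ends up as both a key and a directory prefix.
{code:java}
OzoneConfiguration conf = new OzoneConfiguration();
try (OzoneClient client = OzoneClientFactory.getRpcClient(conf)) {
  OzoneBucket bucket = client.getObjectStore()
      .getVolume("vol1").getBucket("bucket1");
  byte[] data = "x".getBytes(StandardCharsets.UTF_8);

  // Legal in AWS S3 semantics: "a/b" as a key...
  try (OzoneOutputStream out = bucket.createKey("a/b", data.length)) {
    out.write(data);
  }
  // ...and "a/b" as a directory prefix of another key.
  try (OzoneOutputStream out = bucket.createKey("a/b/c", data.length)) {
    out.write(data);
  }
  // An HCFS view cannot show "a/b" as both a file and a directory:
  // compatibility=aws would hide the key, while compatibility=hadoop
  // would have rejected the second createKey.
}
{code}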

> S3/Ozone Filesystem inter-op
> 
>
> Key: HDDS-4097
> URL: https://issues.apache.org/jira/browse/HDDS-4097
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: Ozone FileSystem Paths Enabled.docx, Ozone filesystem 
> path enabled.xlsx
>
>
> This Jira is to implement changes required to use Ozone buckets when data is 
> ingested via S3 and use the bucket/volume via OzoneFileSystem. Initial 
> implementation for this is done as part of HDDS-3955. There are few API's 
> which have missed the changes during the implementation of HDDS-3955. 
> Attached design document which discusses each API,  and what changes are 
> required.
> Excel sheet has information about each API, from what all interfaces the OM 
> API is used, and what changes are required for the API to support 
> inter-operability.
> Note: The proposal for delete/rename is still under discussion, not yet 
> finalized. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] timmylicheng commented on a change in pull request #1346: [WIP] HDDS-4115. CLI command to show current SCM leader and follower status.

2020-09-01 Thread GitBox


timmylicheng commented on a change in pull request #1346:
URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481135941



##
File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMClientProtocolServer.java
##
@@ -560,6 +563,13 @@ public boolean getReplicationManagerStatus() {
 return scm.getReplicationManager().isRunning();
   }
 
+  @Override
+  public List<String> getScmRatisStatus() throws IOException {
+    return scm.getScmHAManager()
+        .getRatisServer().getRaftPeers()
+        .stream().map(peer -> peer.getAddress()).collect(Collectors.toList());

Review comment:
   Could potentially throw NPE
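   A possible null-safe sketch (the guards are illustrative; the exact types 
come from this PR):
   
       @Override
       public List<String> getScmRatisStatus() throws IOException {
         SCMHAManager haManager = scm.getScmHAManager();
         if (haManager == null || haManager.getRatisServer() == null
             || haManager.getRatisServer().getRaftPeers() == null) {
           return Collections.emptyList();  // Ratis/HA not set up yet
         }
         return haManager.getRatisServer().getRaftPeers().stream()
             .map(RaftPeer::getAddress)
             .collect(Collectors.toList());
       }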





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4097) S3/Ozone Filesystem inter-op

2020-09-01 Thread Arpit Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188538#comment-17188538
 ] 

Arpit Agarwal commented on HDDS-4097:
-

bq. Using simple, acceptable key names (/a/b/c, /a/b/c/d) both s3 and HCFS 
should work out-of-the box, without any additional settings. 
Unfortunately there is no way you can guarantee that. A filesystem client 
will need all the intermediate directories to exist for navigating the tree.

> S3/Ozone Filesystem inter-op
> 
>
> Key: HDDS-4097
> URL: https://issues.apache.org/jira/browse/HDDS-4097
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: Ozone FileSystem Paths Enabled.docx, Ozone filesystem 
> path enabled.xlsx
>
>
> This Jira is to implement changes required to use Ozone buckets when data is 
> ingested via S3 and use the bucket/volume via OzoneFileSystem. Initial 
> implementation for this is done as part of HDDS-3955. There are few API's 
> which have missed the changes during the implementation of HDDS-3955. 
> Attached design document which discusses each API,  and what changes are 
> required.
> Excel sheet has information about each API, from what all interfaces the OM 
> API is used, and what changes are required for the API to support 
> inter-operability.
> Note: The proposal for delete/rename is still under discussion, not yet 
> finalized. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4097) S3/Ozone Filesystem inter-op

2020-09-01 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188544#comment-17188544
 ] 

Marton Elek commented on HDDS-4097:
---

> Unfortunately there is no way you can guarantee that. A filesystem client 
> will need all the intermediate directories to exist for navigating the tree.

Is there any problem with always creating the intermediate directories? I see 
some possible, minor performance problems, but as RocksDB is already the 
fastest part, it shouldn't be a blocker. Especially as we can support both S3 
and HCFS with this approach.

> S3/Ozone Filesystem inter-op
> 
>
> Key: HDDS-4097
> URL: https://issues.apache.org/jira/browse/HDDS-4097
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: Ozone FileSystem Paths Enabled.docx, Ozone filesystem 
> path enabled.xlsx
>
>
> This Jira is to implement changes required to use Ozone buckets when data is 
> ingested via S3 and use the bucket/volume via OzoneFileSystem. Initial 
> implementation for this is done as part of HDDS-3955. There are few API's 
> which have missed the changes during the implementation of HDDS-3955. 
> Attached design document which discusses each API,  and what changes are 
> required.
> Excel sheet has information about each API, from what all interfaces the OM 
> API is used, and what changes are required for the API to support 
> inter-operability.
> Note: The proposal for delete/rename is still under discussion, not yet 
> finalized. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-4187) Fix memory leak of recon

2020-09-01 Thread runzhiwang (Jira)
runzhiwang created HDDS-4187:


 Summary: Fix memory leak of recon
 Key: HDDS-4187
 URL: https://issues.apache.org/jira/browse/HDDS-4187
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: runzhiwang
Assignee: runzhiwang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-4188) Fix Chinese document broken link.

2020-09-01 Thread Zheng Huang-Mu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Huang-Mu updated HDDS-4188:
-
Attachment: brokenLink.png

> Fix Chinese document broken link.
> -
>
> Key: HDDS-4188
> URL: https://issues.apache.org/jira/browse/HDDS-4188
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Zheng Huang-Mu
>Priority: Minor
> Attachments: brokenLink.png
>
>
> In the Chinese document *Home/概念/概览* (Home/Concepts/Overview), there is a
> broken link. See the attached screenshot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-4188) Fix Chinese document broken link.

2020-09-01 Thread Zheng Huang-Mu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Huang-Mu updated HDDS-4188:
-
Description: 
In the Chinese document *Home/概念/概览* (Home/Concepts/Overview), there is a 
broken link. See the attached screenshot.

  was:
In the Chinese document *Home/概念/概览* (Home/Concepts/Overview), there is a 
broken link. See the attached screenshot.

!brokenLink.png!


> Fix Chinese document broken link.
> -
>
> Key: HDDS-4188
> URL: https://issues.apache.org/jira/browse/HDDS-4188
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Zheng Huang-Mu
>Priority: Minor
> Attachments: brokenLink.png
>
>
> In the Chinese document *Home/概念/概览* (Home/Concepts/Overview), there is a
> broken link. See the attached screenshot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org


