[GitHub] [hadoop-ozone] amaliujia commented on pull request #1346: [WIP] HDDS-4115. CLI command to show current SCM leader and follower status.
amaliujia commented on pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#issuecomment-679615937 R: @timmylicheng This is still WIP. Can you give some suggestions? I just want to make sure I am on the right track. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] amaliujia opened a new pull request #1346: [WIP] HDDS-4115. CLI command to show current SCM leader and follower status.
amaliujia opened a new pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346

## What changes were proposed in this pull request?
CLI command to show current SCM leader and follower status, e.g. `ozone admin scmha listratisstatus`.

## What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-4115

## How was this patch tested?
Unit test.
[jira] [Updated] (HDDS-4115) CLI command to show current SCM leader and follower status
[ https://issues.apache.org/jira/browse/HDDS-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-4115: - Labels: pull-request-available (was: ) > CLI command to show current SCM leader and follower status > -- > > Key: HDDS-4115 > URL: https://issues.apache.org/jira/browse/HDDS-4115 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Li Cheng >Assignee: Rui Wang >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1324: HDDS-4068. Client should not retry same OM on network connection failure
bharatviswa504 commented on a change in pull request #1324: URL: https://github.com/apache/hadoop-ozone/pull/1324#discussion_r476093708

## File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/ha/OMFailoverProxyProvider.java

## @@ -172,9 +174,12 @@ private OzoneManagerProtocolPB createOMProxy(InetSocketAddress omAddress)
     LegacyHadoopConfigurationSource.asHadoopConfiguration(conf);
     RPC.setProtocolEngine(hadoopConf, OzoneManagerProtocolPB.class,
         ProtobufRpcEngine.class);
-    return RPC.getProxy(OzoneManagerProtocolPB.class, omVersion, omAddress, ugi,
-        hadoopConf, NetUtils.getDefaultSocketFactory(hadoopConf),
-        (int) OmUtils.getOMClientRpcTimeOut(conf));
+    RetryPolicy connectionRetryPolicy = RetryPolicies
+        .failoverOnNetworkException(0);
+    return RPC.getProtocolProxy(OzoneManagerProtocolPB.class, omVersion,
+        omAddress, ugi, hadoopConf, NetUtils.getDefaultSocketFactory(
+        hadoopConf), (int) OmUtils.getOMClientRpcTimeOut(conf),
+        connectionRetryPolicy).getProxy();

Review comment: Yes. Since failoverOnNetworkException uses TRY_ONCE_THEN_FAIL as its fallback policy and maxFailovers is zero, it behaves like TRY_ONCE_THEN_FAIL; shouldRetry will fail in the part below, I think.

```
if (failovers >= maxFailovers) {
  return new RetryAction(RetryAction.RetryDecision.FAIL, 0,
      "failovers (" + failovers + ") exceeded maximum allowed ("
      + maxFailovers + ")");
}
```

So, technically we are using it as equivalent to TRY_ONCE_THEN_FAIL in this scenario.
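The argument that `failoverOnNetworkException(0)` degenerates to TRY_ONCE_THEN_FAIL can be sketched with a toy model. This is an illustrative simplification only, not Hadoop's actual `FailoverOnNetworkExceptionRetry` class; all names in it are made up:

```java
// Toy model of the retry decision quoted in the review: with
// maxFailovers == 0, the very first network failure already trips the
// failover limit, so the policy behaves like TRY_ONCE_THEN_FAIL.
public class RetryDecisionModel {
  enum Decision { FAIL, FAILOVER_AND_RETRY }

  // Simplified stand-in for the shouldRetry failover check.
  static Decision shouldRetry(int failovers, int maxFailovers) {
    if (failovers >= maxFailovers) {
      return Decision.FAIL;
    }
    return Decision.FAILOVER_AND_RETRY;
  }

  public static void main(String[] args) {
    // failoverOnNetworkException(0): first failure -> FAIL, i.e. try once.
    System.out.println(shouldRetry(0, 0)); // FAIL
    // A policy with maxFailovers = 1 would fail over once before failing.
    System.out.println(shouldRetry(0, 1)); // FAILOVER_AND_RETRY
  }
}
```

With `maxFailovers = 0` the `failovers >= maxFailovers` guard is true on the first failure, which is why the two policies coincide in this scenario.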
[GitHub] [hadoop-ozone] hanishakoneru commented on a change in pull request #1324: HDDS-4068. Client should not retry same OM on network connection failure
hanishakoneru commented on a change in pull request #1324: URL: https://github.com/apache/hadoop-ozone/pull/1324#discussion_r475972003

## File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/ha/OMFailoverProxyProvider.java

## @@ -172,9 +174,12 @@ private OzoneManagerProtocolPB createOMProxy(InetSocketAddress omAddress)
     LegacyHadoopConfigurationSource.asHadoopConfiguration(conf);
     RPC.setProtocolEngine(hadoopConf, OzoneManagerProtocolPB.class,
         ProtobufRpcEngine.class);
-    return RPC.getProxy(OzoneManagerProtocolPB.class, omVersion, omAddress, ugi,
-        hadoopConf, NetUtils.getDefaultSocketFactory(hadoopConf),
-        (int) OmUtils.getOMClientRpcTimeOut(conf));
+    RetryPolicy connectionRetryPolicy = RetryPolicies
+        .failoverOnNetworkException(0);
+    return RPC.getProtocolProxy(OzoneManagerProtocolPB.class, omVersion,
+        omAddress, ugi, hadoopConf, NetUtils.getDefaultSocketFactory(
+        hadoopConf), (int) OmUtils.getOMClientRpcTimeOut(conf),
+        connectionRetryPolicy).getProxy();

Review comment: It would be the same as the current one, right? Would it suffice to add a comment to explain the retry policy?
[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1324: HDDS-4068. Client should not retry same OM on network connection failure
bharatviswa504 commented on a change in pull request #1324: URL: https://github.com/apache/hadoop-ozone/pull/1324#discussion_r475966746

## File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/ha/OMFailoverProxyProvider.java

## @@ -172,9 +174,12 @@ private OzoneManagerProtocolPB createOMProxy(InetSocketAddress omAddress)
     LegacyHadoopConfigurationSource.asHadoopConfiguration(conf);
     RPC.setProtocolEngine(hadoopConf, OzoneManagerProtocolPB.class,
         ProtobufRpcEngine.class);
-    return RPC.getProxy(OzoneManagerProtocolPB.class, omVersion, omAddress, ugi,
-        hadoopConf, NetUtils.getDefaultSocketFactory(hadoopConf),
-        (int) OmUtils.getOMClientRpcTimeOut(conf));
+    RetryPolicy connectionRetryPolicy = RetryPolicies
+        .failoverOnNetworkException(0);
+    return RPC.getProtocolProxy(OzoneManagerProtocolPB.class, omVersion,
+        omAddress, ugi, hadoopConf, NetUtils.getDefaultSocketFactory(
+        hadoopConf), (int) OmUtils.getOMClientRpcTimeOut(conf),
+        connectionRetryPolicy).getProxy();

Review comment: One question: can we use the RetryPolicy TRY_ONCE_THEN_FAIL here? Because this failoverOnNetworkException also sets the retry count to zero and maxFailovers to zero.
[jira] [Updated] (HDDS-3927) Add OZONE_MANAGER_OPTS and OZONE_DATANODE_OPTS
[ https://issues.apache.org/jira/browse/HDDS-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siyao Meng updated HDDS-3927: - Description: Similar to {{HDFS_NAMENODE_OPTS}}, {{HDFS_DATANODE_OPTS}}, etc., we should have {{OZONE_MANAGER_OPTS}}, {{OZONE_DATANODE_OPTS}} to allow adding JVM args for GC tuning and debugging. CC [~bharat] was:Similar to {{HDFS_NAMENODE_OPTS}}, {{HDFS_DATANODE_OPTS}}, etc., we should have {{OZONE_MANAGER_OPTS}}, {{OZONE_DATANODE_OPTS}} to allow adding JVM args for GC tuning and debugging. > Add OZONE_MANAGER_OPTS and OZONE_DATANODE_OPTS > -- > > Key: HDDS-3927 > URL: https://issues.apache.org/jira/browse/HDDS-3927 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Siyao Meng >Assignee: Siyao Meng >Priority: Major > > Similar to {{HDFS_NAMENODE_OPTS}}, {{HDFS_DATANODE_OPTS}}, etc., we should > have {{OZONE_MANAGER_OPTS}}, {{OZONE_DATANODE_OPTS}} to allow adding JVM args > for GC tuning and debugging. > CC [~bharat]
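As an illustration of how such variables are typically consumed: the variable names come from the jira itself, but the JVM flag values below are made-up examples, not recommended settings.

```shell
# Illustrative only: extra JVM args for the Ozone Manager and Datanode,
# analogous to HDFS_NAMENODE_OPTS / HDFS_DATANODE_OPTS. The flag values
# here are arbitrary examples for GC tuning and debugging.
export OZONE_MANAGER_OPTS="-Xms4g -Xmx4g -XX:+UseG1GC"
export OZONE_DATANODE_OPTS="-Xmx2g -XX:+HeapDumpOnOutOfMemoryError"

# A launcher script would splice these into the JVM command line, roughly:
echo "java ${OZONE_MANAGER_OPTS} ... OzoneManager"
```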
[GitHub] [hadoop-ozone] smengcl commented on pull request #1088: HDDS-3805. [OFS] Remove usage of OzoneClientAdapter interface
smengcl commented on pull request #1088: URL: https://github.com/apache/hadoop-ozone/pull/1088#issuecomment-679346944 After a short discussion with @arp7 and @adoroszlai I think for this PR we won't merge the two classes (`BasicRootedOzoneFileSystem` and `BasicRootedOzoneFileSystemImpl`). I'm opening another jira for this.
[jira] [Updated] (HDDS-4139) Update version number in upgrade tests
[ https://issues.apache.org/jira/browse/HDDS-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-4139: --- Status: Patch Available (was: In Progress) > Update version number in upgrade tests > -- > > Key: HDDS-4139 > URL: https://issues.apache.org/jira/browse/HDDS-4139 > Project: Hadoop Distributed Data Store > Issue Type: Wish > Components: test >Affects Versions: 0.6.0 >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > > Ozone 0.6.0 release is renamed to Ozone 1.0.0, but there are a few leftover > references to 0.6.0, mostly in {{upgrade}} acceptance test.
[jira] [Assigned] (HDDS-3830) Introduce OM layout version 'v0'.
[ https://issues.apache.org/jira/browse/HDDS-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDDS-3830: --- Assignee: Stephen O'Donnell (was: Aravindan Vijayan) > Introduce OM layout version 'v0'. > - > > Key: HDDS-3830 > URL: https://issues.apache.org/jira/browse/HDDS-3830 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Aravindan Vijayan >Assignee: Stephen O'Donnell >Priority: Major > Labels: upgrade-p0 > > The first layout version for OzoneManager will be '0' which will be written > to the version file. Until a future Ozone release with Upgrade & Finalize > support, this will just be a dummy number, to support backward compatibility.
[jira] [Updated] (HDDS-4143) Implement a version factory for OM Apply Transaction that uses the implementation based on layout version.
[ https://issues.apache.org/jira/browse/HDDS-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-4143: Summary: Implement a version factory for OM Apply Transaction that uses the implementation based on layout version. (was: Implement a ) > Implement a version factory for OM Apply Transaction that uses the > implementation based on layout version. > -- > > Key: HDDS-4143 > URL: https://issues.apache.org/jira/browse/HDDS-4143 > Project: Hadoop Distributed Data Store > Issue Type: Task > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Prashant Pogde >Priority: Major > Fix For: 0.7.0 > > > Add the current layout version (MLV) to the OM Ratis request. If there is no > layout version present, we can default to '0'.
[jira] [Updated] (HDDS-4143) Implement a version factory for OM Apply Transaction that uses the implementation based on layout version.
[ https://issues.apache.org/jira/browse/HDDS-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-4143: Parent: HDDS-3698 Issue Type: Sub-task (was: Task) > Implement a version factory for OM Apply Transaction that uses the > implementation based on layout version. > -- > > Key: HDDS-4143 > URL: https://issues.apache.org/jira/browse/HDDS-4143 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Prashant Pogde >Priority: Major > Fix For: 0.7.0 > > > Add the current layout version (MLV) to the OM Ratis request. If there is no > layout version present, we can default to '0'.
[jira] [Updated] (HDDS-3881) Add current layout version to OM Ratis Request.
[ https://issues.apache.org/jira/browse/HDDS-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-3881: Description: To make sure the correct version of the applyTxn step is executed against the request, we should add the version to the OM Request and use that version in the applyTxn step. Add the current layout version (MLV) to the OM Ratis request. If there is no layout version present, we can default to '0'. (was: To make sure the correct version of the applyTxn step is executed against the request, we should add the version to the OM Request and use that version in the applyTxn step. ) > Add current layout version to OM Ratis Request. > --- > > Key: HDDS-3881 > URL: https://issues.apache.org/jira/browse/HDDS-3881 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Manager >Reporter: Aravindan Vijayan >Priority: Major > > To make sure the correct version of the applyTxn step is executed against the > request, we should add the version to the OM Request and use that version in > the applyTxn step. Add the current layout version (MLV) to the OM Ratis > request. If there is no layout version present, we can default to '0'.
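The defaulting rule in the jira description ("if there is no layout version present, we can default to '0'") could be sketched roughly as follows. All names here are hypothetical illustrations, not the actual Ozone request classes:

```java
// Hypothetical sketch: resolve the layout version carried by an OM Ratis
// request, defaulting to 0 when the field is absent (e.g. a request from
// an older client that predates layout versioning).
public class LayoutVersionDefaulting {
  static final int INITIAL_LAYOUT_VERSION = 0;

  // 'requestVersion' is null when the request carried no version field.
  static int effectiveLayoutVersion(Integer requestVersion) {
    return requestVersion == null ? INITIAL_LAYOUT_VERSION : requestVersion;
  }

  public static void main(String[] args) {
    System.out.println(effectiveLayoutVersion(null)); // 0
    System.out.println(effectiveLayoutVersion(2));    // 2
  }
}
```

The applyTxn step would then dispatch on the effective version, so old and new OMs in the same Ratis ring execute the same implementation for a given transaction.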
[jira] [Updated] (HDDS-4143) Implement a
[ https://issues.apache.org/jira/browse/HDDS-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-4143: Summary: Implement a (was: Introduce version in OM Ratis request.) > Implement a > > > Key: HDDS-4143 > URL: https://issues.apache.org/jira/browse/HDDS-4143 > Project: Hadoop Distributed Data Store > Issue Type: Task > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Prashant Pogde >Priority: Major > Fix For: 0.7.0 > > > Add the current layout version (MLV) to the OM Ratis request. If there is no > layout version present, we can default to '0'.
[jira] [Assigned] (HDDS-3881) Add current layout version to OM Ratis Request.
[ https://issues.apache.org/jira/browse/HDDS-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDDS-3881: --- Assignee: Prashant Pogde > Add current layout version to OM Ratis Request. > --- > > Key: HDDS-3881 > URL: https://issues.apache.org/jira/browse/HDDS-3881 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Prashant Pogde >Priority: Major > > To make sure the correct version of the applyTxn step is executed against the > request, we should add the version to the OM Request and use that version in > the applyTxn step. Add the current layout version (MLV) to the OM Ratis > request. If there is no layout version present, we can default to '0'.
[jira] [Created] (HDDS-4143) Introduce version in OM Ratis request.
Aravindan Vijayan created HDDS-4143: --- Summary: Introduce version in OM Ratis request. Key: HDDS-4143 URL: https://issues.apache.org/jira/browse/HDDS-4143 Project: Hadoop Distributed Data Store Issue Type: Task Components: Ozone Manager Reporter: Aravindan Vijayan Assignee: Prashant Pogde Fix For: 0.7.0 Add the current layout version (MLV) to the OM Ratis request. If there is no layout version present, we can default to '0'.
[jira] [Updated] (HDDS-4142) Expose upgrade related state through JMX
[ https://issues.apache.org/jira/browse/HDDS-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-4142: Summary: Expose upgrade related state through JMX (was: Expose upgrade related state through CLI & JMX) > Expose upgrade related state through JMX > > > Key: HDDS-4142 > URL: https://issues.apache.org/jira/browse/HDDS-4142 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Aravindan Vijayan >Assignee: Istvan Fajth >Priority: Major > Fix For: 0.7.0
[jira] [Created] (HDDS-4142) Expose upgrade related state through CLI & JMX
Aravindan Vijayan created HDDS-4142: --- Summary: Expose upgrade related state through CLI & JMX Key: HDDS-4142 URL: https://issues.apache.org/jira/browse/HDDS-4142 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Aravindan Vijayan Assignee: Istvan Fajth Fix For: 0.7.0
[jira] [Updated] (HDDS-4142) Expose upgrade related state through CLI & JMX
[ https://issues.apache.org/jira/browse/HDDS-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-4142: Parent: HDDS-3698 Issue Type: Sub-task (was: Bug) > Expose upgrade related state through CLI & JMX > -- > > Key: HDDS-4142 > URL: https://issues.apache.org/jira/browse/HDDS-4142 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Aravindan Vijayan >Assignee: Istvan Fajth >Priority: Major > Fix For: 0.7.0
[jira] [Updated] (HDDS-4141) Implement Finalize command in Ozone Manager.
[ https://issues.apache.org/jira/browse/HDDS-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-4141: Description: * On the client side, add a new command to finalize OM through CLI. * On the server side, this finalize command should update the internal Upgrade state to "Finalized". This operation can be a No-Op if there are no layout changes across an upgrade. > Implement Finalize command in Ozone Manager. > > > Key: HDDS-4141 > URL: https://issues.apache.org/jira/browse/HDDS-4141 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Istvan Fajth >Priority: Major > Fix For: 0.7.0 > > > * On the client side, add a new command to finalize OM through CLI. > * On the server side, this finalize command should update the internal > Upgrade state to "Finalized". This operation can be a No-Op if there are no > layout changes across an upgrade.
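The server-side behavior described in the jira (flip the upgrade state to "Finalized", as a no-op when there are no layout changes) could be sketched like this. Every name below is hypothetical, not the actual Ozone implementation:

```java
// Hypothetical sketch of the finalize handling described in the jira:
// if the metadata layout version (MLV) already matches the software
// layout version (SLV), finalization is a no-op; otherwise finalize
// and persist the new MLV.
public class FinalizeSketch {
  enum FinalizeResult { ALREADY_FINALIZED, FINALIZATION_DONE }

  static FinalizeResult finalizeUpgrade(int mlv, int slv) {
    if (mlv >= slv) {
      return FinalizeResult.ALREADY_FINALIZED; // no layout changes: no-op
    }
    // ...apply per-feature finalization actions, persist MLV = SLV...
    return FinalizeResult.FINALIZATION_DONE;
  }
}
```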
[jira] [Created] (HDDS-4141) Implement Finalize command in Ozone Manager.
Aravindan Vijayan created HDDS-4141: --- Summary: Implement Finalize command in Ozone Manager. Key: HDDS-4141 URL: https://issues.apache.org/jira/browse/HDDS-4141 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Manager Reporter: Aravindan Vijayan Assignee: Istvan Fajth Fix For: 0.7.0
[jira] [Updated] (HDDS-3829) Introduce Layout Feature interface in Ozone.
[ https://issues.apache.org/jira/browse/HDDS-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-3829: Status: Patch Available (was: Open) > Introduce Layout Feature interface in Ozone. > > > Key: HDDS-3829 > URL: https://issues.apache.org/jira/browse/HDDS-3829 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available
[GitHub] [hadoop-ozone] smengcl commented on pull request #1330: HDDS-4074. [OFS] Implement AbstractFileSystem for RootedOzoneFileSystem
smengcl commented on pull request #1330: URL: https://github.com/apache/hadoop-ozone/pull/1330#issuecomment-679274423 @elek Would you like to take a look at this?
[GitHub] [hadoop-ozone] bharatviswa504 commented on pull request #1271: HDDS-4020. ACL commands like getacl and setacl should return a response only when Native Authorizer is enabled.
bharatviswa504 commented on pull request #1271: URL: https://github.com/apache/hadoop-ozone/pull/1271#issuecomment-679274161 Closing this as HDDS-4089 is trying to fix acl commands when an external authorizer is configured.
[GitHub] [hadoop-ozone] arp7 commented on pull request #1088: HDDS-3805. [OFS] Remove usage of OzoneClientAdapter interface
arp7 commented on pull request #1088: URL: https://github.com/apache/hadoop-ozone/pull/1088#issuecomment-679270871 Folks - what is the benefit of splitting this into smaller Jiras? Is it just to make the code review easier?
[GitHub] [hadoop-ozone] avijayanhwx commented on pull request #1322: HDDS-3829. Introduce Layout Feature interface in Ozone.
avijayanhwx commented on pull request #1322: URL: https://github.com/apache/hadoop-ozone/pull/1322#issuecomment-679268593 > Hi @avijayanhwx, > > thank you for starting this work, and posting an inital version of the core code for the upgrade support. > I have a few general questions and concerns, also added a few comments after a quick review. > > In HDFS the layoutversion is a monotonically decreasing number, we chose to use monotonically increasing version numbers, I am unsure why HDFS chose the negative numbers, can there be some hidden considerations, we might go through, before committing to the positive layoutversions and monotonic increase? I haven't found one. > > Also, in HDFS one layoutversion covers one feature, would it be better to do not let two feetures associated with one layoutversion number? What is the benefit of having two features mapped to the same layout version? I don't feel good about it, though I don't have a specific example yet where it can cause trouble. > > LayoutVersionManager is implemented in a way that it is fully static. I am unsure if this is a good design, I understand the intent that there has to be only one instance of this per component, and seeing it this way it is a reasonable choice to use static a implementation, but it can fire back later when we want to implement tests that involves changing something in this logic, and we can not freely and easily change the behaviour in tests as I fear, also it can introduce invisible interdependencies later that might be hard to detect/factor out. Implementing it in a non-static way does not seem to cause any drawback, even we can be fine with multiple instances for the same component, as the used values will be anyway hardcoded in the real system. What do you think, I would consider making it non-static, as I think it has more possibilities and less limitations later. 
> Hi @avijayanhwx, > > thank you for starting this work, and posting an inital version of the core code for the upgrade support. > I have a few general questions and concerns, also added a few comments after a quick review. > > In HDFS the layoutversion is a monotonically decreasing number, we chose to use monotonically increasing version numbers, I am unsure why HDFS chose the negative numbers, can there be some hidden considerations, we might go through, before committing to the positive layoutversions and monotonic increase? I haven't found one. > > Also, in HDFS one layoutversion covers one feature, would it be better to do not let two feetures associated with one layoutversion number? What is the benefit of having two features mapped to the same layout version? I don't feel good about it, though I don't have a specific example yet where it can cause trouble. > > LayoutVersionManager is implemented in a way that it is fully static. I am unsure if this is a good design, I understand the intent that there has to be only one instance of this per component, and seeing it this way it is a reasonable choice to use static a implementation, but it can fire back later when we want to implement tests that involves changing something in this logic, and we can not freely and easily change the behaviour in tests as I fear, also it can introduce invisible interdependencies later that might be hard to detect/factor out. Implementing it in a non-static way does not seem to cause any drawback, even we can be fine with multiple instances for the same component, as the used values will be anyway hardcoded in the real system. What do you think, I would consider making it non-static, as I think it has more possibilities and less limitations later. Regarding the HDFS layout feature versions, I could not find any document on why that was the implementation choice. So, I went with what I felt was more intuitive. 
I can change the implementation to have only 1 Layout Feature per version. I am looking into making the Version Manager class non-static. The only issue I see is that it may be used in many different code areas within a component, and hence may need to be passed around (if it cannot be instantiated without external help). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
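The static-vs-non-static question above can be made concrete with a small sketch. This is a hypothetical, simplified version manager (class and method names are illustrative, not the actual Ozone `LayoutVersionManager` API): one instance per component, constructed with the persisted and software layout versions, so tests can freely instantiate it with whatever state they need.

```java
// Hypothetical sketch of a non-static layout version manager; not the actual
// Ozone implementation. One instance per component, injectable in tests.
public class LayoutVersionManagerSketch {
    private final int metadataLayoutVersion;  // version persisted on disk
    private final int softwareLayoutVersion;  // highest version this binary knows

    public LayoutVersionManagerSketch(int metadataLayoutVersion,
                                      int softwareLayoutVersion) {
        this.metadataLayoutVersion = metadataLayoutVersion;
        this.softwareLayoutVersion = softwareLayoutVersion;
    }

    /** A feature is usable only once the on-disk layout has reached it. */
    public boolean isAllowed(int featureLayoutVersion) {
        return featureLayoutVersion <= metadataLayoutVersion;
    }

    /** Finalization is needed while on-disk metadata lags the software. */
    public boolean needsFinalization() {
        return metadataLayoutVersion < softwareLayoutVersion;
    }

    public static void main(String[] args) {
        // On-disk version 0, software knows up to version 2.
        LayoutVersionManagerSketch mgr = new LayoutVersionManagerSketch(0, 2);
        System.out.println(mgr.isAllowed(0));        // true
        System.out.println(mgr.isAllowed(1));        // false
        System.out.println(mgr.needsFinalization()); // true
    }
}
```

Because nothing is static, a test can construct several managers with different versions side by side, which is exactly the flexibility the review comment asks for.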
[GitHub] [hadoop-ozone] smengcl closed pull request #1181: [POC] HDDS-3915. Simple trash emptier on OM
smengcl closed pull request #1181: URL: https://github.com/apache/hadoop-ozone/pull/1181
[GitHub] [hadoop-ozone] smengcl commented on pull request #1181: [POC] HDDS-3915. Simple trash emptier on OM
smengcl commented on pull request #1181: URL: https://github.com/apache/hadoop-ozone/pull/1181#issuecomment-679267871 Closing this PR as it is only a POC and will never be merged.
[jira] [Updated] (HDDS-4139) Update version number in upgrade tests
[ https://issues.apache.org/jira/browse/HDDS-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-4139: - Labels: pull-request-available (was: ) > Update version number in upgrade tests > -- > > Key: HDDS-4139 > URL: https://issues.apache.org/jira/browse/HDDS-4139 > Project: Hadoop Distributed Data Store > Issue Type: Wish > Components: test >Affects Versions: 0.6.0 >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > > Ozone 0.6.0 release is renamed to Ozone 1.0.0, but there are a few leftover > references to 0.6.0, mostly in {{upgrade}} acceptance test. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] adoroszlai opened a new pull request #1345: HDDS-4139. Update version number in upgrade tests
adoroszlai opened a new pull request #1345: URL: https://github.com/apache/hadoop-ozone/pull/1345 ## What changes were proposed in this pull request? Update version number `0.6.0` to `1.0.0` in upgrade scripts and tests (content and file/directory names). https://issues.apache.org/jira/browse/HDDS-4139 ## How was this patch tested? Ran `upgrade` acceptance test locally. https://github.com/adoroszlai/hadoop-ozone/runs/1022231789
[GitHub] [hadoop-ozone] avijayanhwx commented on a change in pull request #1322: HDDS-3829. Introduce Layout Feature interface in Ozone.
avijayanhwx commented on a change in pull request #1322: URL: https://github.com/apache/hadoop-ozone/pull/1322#discussion_r475741754 ## File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/response/s3/security/package-info.java ## @@ -19,4 +19,4 @@ /** * Package contains classes related to S3 security responses. */ -package org.apache.hadoop.ozone.om.request.s3.security; +package org.apache.hadoop.ozone.om.response.s3.security; Review comment: Agreed. I had to include this since the plugin that was added for scanning annotations fails the build due to this discrepancy. I will create a separate JIRA to fix this on master, but since it is a tiny change, I would like to leave it here until then to keep the build green. ## File path: hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/upgrade/TestOMVersionManager.java ## @@ -0,0 +1,63 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.ozone.om.upgrade; + +import static org.apache.hadoop.ozone.om.upgrade.OMLayoutFeatureCatalog.OMLayoutFeature.CREATE_EC; +import static org.apache.hadoop.ozone.om.upgrade.OMLayoutFeatureCatalog.OMLayoutFeature.INITIAL_VERSION; +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertFalse; +import static org.junit.Assert.assertTrue; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +import java.io.IOException; + +import org.apache.hadoop.ozone.om.OMStorage; +import org.apache.hadoop.ozone.om.OzoneManager; +import org.apache.hadoop.ozone.om.upgrade.OMLayoutFeatureCatalog.OMLayoutFeature; +import org.apache.hadoop.ozone.upgrade.LayoutFeature; +import org.junit.Test; + +/** + * Test OM layout version management. + */ +public class TestOMVersionManager { + + @Test + public void testOMLayoutVersionManager() throws IOException { +OMStorage omStorage = mock(OMStorage.class); +when(omStorage.getLayoutVersion()).thenReturn(0); +OMVersionManager.init(omStorage); +assertTrue(OMVersionManager.isAllowed(INITIAL_VERSION)); +assertFalse(OMVersionManager.isAllowed(CREATE_EC)); +assertEquals(0, OMVersionManager.getMetadataLayoutVersion()); +assertTrue(OMVersionManager.needsFinalization()); +OMVersionManager.doFinalize(mock(OzoneManager.class)); Review comment: I am sure it is a leftover line ;) I will remove it.
[GitHub] [hadoop-ozone] avijayanhwx commented on a change in pull request #1322: HDDS-3829. Introduce Layout Feature interface in Ozone.
avijayanhwx commented on a change in pull request #1322: URL: https://github.com/apache/hadoop-ozone/pull/1322#discussion_r475740631 ## File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/upgrade/OMLayoutFeatureCatalog.java ## @@ -0,0 +1,71 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.ozone.om.upgrade; + +import java.util.Optional; + +import org.apache.hadoop.ozone.upgrade.LayoutFeature; + +/** + * Catalog of Ozone Manager features. + */ +public class OMLayoutFeatureCatalog { Review comment: Good point. I wanted to have an enclosing class for "ease" of understanding. We have a 'catalog' of features in OM. In the future, we can expect some helper methods to be added at the catalog level which may not be needed at the Feature level.
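The catalog idea discussed above (one feature per layout version, with room for catalog-level helper methods) can be illustrated with a minimal sketch. The feature names mirror those in the quoted test code, but the class shape and the helper method are assumptions for illustration, not the actual `OMLayoutFeatureCatalog` contents.

```java
// Hypothetical sketch of a layout feature catalog: an enclosing class holding
// a feature enum (one feature per layout version) plus a catalog-level helper.
public class LayoutFeatureCatalogSketch {

    /** Each feature is bound to exactly one layout version. */
    public enum OmLayoutFeature {
        INITIAL_VERSION(0),
        CREATE_EC(1);  // illustrative follow-up feature

        private final int layoutVersion;

        OmLayoutFeature(int layoutVersion) {
            this.layoutVersion = layoutVersion;
        }

        public int layoutVersion() {
            return layoutVersion;
        }
    }

    /** Catalog-level helper of the kind the comment anticipates. */
    public static int highestLayoutVersion() {
        int max = 0;
        for (OmLayoutFeature f : OmLayoutFeature.values()) {
            max = Math.max(max, f.layoutVersion());
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(highestLayoutVersion()); // 1
    }
}
```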
[jira] [Updated] (HDDS-4140) Auto-close /pending pull requests after 21 days of inactivity
[ https://issues.apache.org/jira/browse/HDDS-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-4140: - Labels: pull-request-available (was: ) > Auto-close /pending pull requests after 21 days of inactivity > - > > Key: HDDS-4140 > URL: https://issues.apache.org/jira/browse/HDDS-4140 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: build >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > > Earlier we introduced a way to mark inactive pull requests with the "pending" > label (with the help of a /pending comment). > This pull request introduces a new scheduled build which closes "pending" > pull requests after 21 days of inactivity. > IMPORTANT: Only pull requests which are pending on the author will be > closed. We should NEVER close a pull request which is waiting for the attention of > a committer.
[GitHub] [hadoop-ozone] elek opened a new pull request #1344: HDDS-4140. Auto-close /pending pull requests after 21 days of inactivity
elek opened a new pull request #1344: URL: https://github.com/apache/hadoop-ozone/pull/1344 ## What changes were proposed in this pull request? Earlier we introduced a way to mark inactive pull requests with the "pending" label (with the help of a /pending comment). This pull request introduces a new scheduled build which closes "pending" pull requests after 21 days of inactivity. IMPORTANT: Only pull requests which are pending on the author will be closed. We should NEVER close a pull request which is waiting for the attention of a committer. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-4140 ## How was this patch tested? On the https://github.com/elek/hadoop-ozone/ fork with this PR: https://github.com/elek/hadoop-ozone/pull/13
[jira] [Created] (HDDS-4140) Auto-close /pending pull requests after 21 days of inactivity
Marton Elek created HDDS-4140: - Summary: Auto-close /pending pull requests after 21 days of inactivity Key: HDDS-4140 URL: https://issues.apache.org/jira/browse/HDDS-4140 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: build Reporter: Marton Elek Assignee: Marton Elek Earlier we introduced a way to mark inactive pull requests with the "pending" label (with the help of a /pending comment). This pull request introduces a new scheduled build which closes "pending" pull requests after 21 days of inactivity. IMPORTANT: Only pull requests which are pending on the author will be closed. We should NEVER close a pull request which is waiting for the attention of a committer.
[jira] [Resolved] (HDDS-4020) ACL commands like getacl and setacl should return a response only when Native Authorizer is enabled
[ https://issues.apache.org/jira/browse/HDDS-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham resolved HDDS-4020. -- Resolution: Won't Fix > ACL commands like getacl and setacl should return a response only when Native > Authorizer is enabled > --- > > Key: HDDS-4020 > URL: https://issues.apache.org/jira/browse/HDDS-4020 > Project: Hadoop Distributed Data Store > Issue Type: Task > Components: Ozone CLI, Ozone Manager >Affects Versions: 0.5.0 >Reporter: Vivek Ratnavel Subramanian >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > > Currently, the getacl and setacl commands return wrong information when an > external authorizer such as Ranger is enabled. There should be a check to > verify if Native Authorizer is enabled before returning any response for > these two commands. > If an external authorizer is enabled, it should show a nice message about > managing acls in external authorizer.
[GitHub] [hadoop-ozone] bharatviswa504 closed pull request #1271: HDDS-4020. ACL commands like getacl and setacl should return a response only when Native Authorizer is enabled.
bharatviswa504 closed pull request #1271: URL: https://github.com/apache/hadoop-ozone/pull/1271
[GitHub] [hadoop-ozone] bharatviswa504 commented on pull request #1337: HDDS-4129. change MAX_QUOTA_IN_BYTES to Long.MAX_VALUE.
bharatviswa504 commented on pull request #1337: URL: https://github.com/apache/hadoop-ozone/pull/1337#issuecomment-679223880 Closing this, as HDDS-4089 is planning to fix this issue to make ACL commands work with the external authorizer.
[GitHub] [hadoop-ozone] github-actions[bot] commented on pull request #934: HDDS-3605. Support close all pipelines.
github-actions[bot] commented on pull request #934: URL: https://github.com/apache/hadoop-ozone/pull/934#issuecomment-679217966 Thank you very much for the patch. I am closing this PR __temporarily__ as there was no activity recently and it is waiting for response from its author. It doesn't mean that this PR is not important or ignored: feel free to reopen the PR at any time. It only means that attention of committers is not required. We prefer to keep the review queue clean. This ensures PRs in need of review are more visible, which results in faster feedback for all PRs. If you need ANY help to finish this PR, please [contact the community](https://github.com/apache/hadoop-ozone#contact) on the mailing list or the slack channel.
[GitHub] [hadoop-ozone] github-actions[bot] closed pull request #934: HDDS-3605. Support close all pipelines.
github-actions[bot] closed pull request #934: URL: https://github.com/apache/hadoop-ozone/pull/934
[GitHub] [hadoop-ozone] sadanand48 commented on pull request #1232: HDDS-3431. Enable TestContainerReplication test cases
sadanand48 commented on pull request #1232: URL: https://github.com/apache/hadoop-ozone/pull/1232#issuecomment-679216992 Thanks @adoroszlai for reviewing this. Sorry for the late reply. I had followed this comment from `SCMNodeManager`: `Since datanode commands are added through event queue, onMessage method should take care of adding commands to command queue`. The test still works for me and I have also run it [here](https://github.com/sadanand48/hadoop-ozone/actions/runs/176961558).
[GitHub] [hadoop-ozone] adoroszlai commented on pull request #934: HDDS-3605. Support close all pipelines.
adoroszlai commented on pull request #934: URL: https://github.com/apache/hadoop-ozone/pull/934#issuecomment-679212788 /close
[GitHub] [hadoop-ozone] elek commented on pull request #1230: HDDS-2949: mkdir : store directory entries in a separate table
elek commented on pull request #1230: URL: https://github.com/apache/hadoop-ozone/pull/1230#issuecomment-679209035 /pending review comments are not addressed
[GitHub] [hadoop-ozone] elek commented on pull request #1274: HDDS-3810. Add the logic to distribute open containers among the pipelines of a datanode.
elek commented on pull request #1274: URL: https://github.com/apache/hadoop-ozone/pull/1274#issuecomment-679208064 Green build + approval. Will merge this after a new build.
[GitHub] [hadoop-ozone] elek commented on pull request #1192: HDDS-3947: Sort DNs for client when the key is a file for #getFileStatus #listStatus APIs
elek commented on pull request #1192: URL: https://github.com/apache/hadoop-ozone/pull/1192#issuecomment-679207273 /pending "@rakeshadr , can you plz add some details here as to why sorting of dns is required ?"
[GitHub] [hadoop-ozone] elek commented on pull request #1271: HDDS-4020. ACL commands like getacl and setacl should return a response only when Native Authorizer is enabled.
elek commented on pull request #1271: URL: https://github.com/apache/hadoop-ozone/pull/1271#issuecomment-679206934 /pending Review comments are not addressed + no clean build
[GitHub] [hadoop-ozone] elek commented on pull request #928: [WIP] HDDS-3599. Implement ofs://: Add contract test for HA
elek commented on pull request #928: URL: https://github.com/apache/hadoop-ozone/pull/928#issuecomment-679206516 /pending Draft and no activity
[GitHub] [hadoop-ozone] elek commented on pull request #1232: HDDS-3431. Enable TestContainerReplication test cases
elek commented on pull request #1232: URL: https://github.com/apache/hadoop-ozone/pull/1232#issuecomment-679206176 /pending review comments are not answered
[GitHub] [hadoop-ozone] elek commented on pull request #970: HDDS-3666. Fix delete container cli failed.
elek commented on pull request #970: URL: https://github.com/apache/hadoop-ozone/pull/970#issuecomment-679205837 What is the state of this PR? If I understood correctly, we need this change, but it requires a cleanup on this PR...
[GitHub] [hadoop-ozone] elek commented on pull request #1181: [POC] HDDS-3915. Simple trash emptier on OM
elek commented on pull request #1181: URL: https://github.com/apache/hadoop-ozone/pull/1181#issuecomment-679204900 /pending draft PR without activity
[GitHub] [hadoop-ozone] elek commented on pull request #1121: HDDS-3432. Enable TestBlockDeletion test cases.
elek commented on pull request #1121: URL: https://github.com/apache/hadoop-ozone/pull/1121#issuecomment-679204445 /pending "I still see 1/20 failure due to timeout at TestBlockDeletion.testBlockDeletion(TestBlockDeletion.java:174)."
[GitHub] [hadoop-ozone] elek commented on pull request #1110: HDDS-3843. Throw the specific exception other than NPE.
elek commented on pull request #1110: URL: https://github.com/apache/hadoop-ozone/pull/1110#issuecomment-679204042 /pending @maobaolong Do we have the same error with this patch as AWS?
[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #1337: HDDS-4129. change MAX_QUOTA_IN_BYTES to Long.MAX_VALUE.
adoroszlai commented on a change in pull request #1337: URL: https://github.com/apache/hadoop-ozone/pull/1337#discussion_r475697694 ## File path: hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/OzoneConsts.java ## @@ -200,9 +200,9 @@ public static Versioning getVersioning(boolean versioning) { /** - * Max OM Quota size of 1024 PB. + * Max OM Quota size of Long.MAX_VALUE. */ - public static final long MAX_QUOTA_IN_BYTES = 1024L * 1024 * TB; + public static final long MAX_QUOTA_IN_BYTES = Long.MAX_VALUE; Review comment: What about volumes created before this change with the previous max. value? Will those be treated by the new version as if quota had been enabled for them?
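The reviewer's compatibility concern can be shown with a toy example. Assuming a quota counts as "set" when the stored value differs from the current max sentinel (an assumption about the check for illustration, not the actual Ozone logic), a volume persisted under the old 1024 PB sentinel starts to look quota-enabled once the sentinel moves to `Long.MAX_VALUE`:

```java
// Illustrative sketch of the backward-compatibility concern: changing the
// "no quota" sentinel makes previously persisted sentinel values look like
// real quotas. Hypothetical check, not Ozone code.
public class QuotaSentinelSketch {
    public static final long TB = 1024L * 1024 * 1024 * 1024;
    public static final long OLD_MAX_QUOTA_IN_BYTES = 1024L * 1024 * TB; // 1024 PB
    public static final long NEW_MAX_QUOTA_IN_BYTES = Long.MAX_VALUE;

    /** Assumed rule: quota is "set" when it differs from the max sentinel. */
    public static boolean quotaEnabled(long storedQuota, long sentinel) {
        return storedQuota != sentinel;
    }

    public static void main(String[] args) {
        long oldVolumeQuota = OLD_MAX_QUOTA_IN_BYTES; // persisted pre-change
        // Under the old sentinel the volume had no quota...
        System.out.println(quotaEnabled(oldVolumeQuota, OLD_MAX_QUOTA_IN_BYTES)); // false
        // ...but under the new sentinel the same stored value reads as a quota.
        System.out.println(quotaEnabled(oldVolumeQuota, NEW_MAX_QUOTA_IN_BYTES)); // true
    }
}
```

A migration or a check that also recognizes the old sentinel would be needed to keep such volumes quota-free.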
[jira] [Created] (HDDS-4139) Update version number in upgrade tests
Attila Doroszlai created HDDS-4139: -- Summary: Update version number in upgrade tests Key: HDDS-4139 URL: https://issues.apache.org/jira/browse/HDDS-4139 Project: Hadoop Distributed Data Store Issue Type: Wish Components: test Affects Versions: 0.6.0 Reporter: Attila Doroszlai Assignee: Attila Doroszlai Ozone 0.6.0 release is renamed to Ozone 1.0.0, but there are a few leftover references to 0.6.0, mostly in {{upgrade}} acceptance test.
[jira] [Updated] (HDDS-4120) Implement cleanup service for OM open key table
[ https://issues.apache.org/jira/browse/HDDS-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Rose updated HDDS-4120: - Description: Currently, uncommitted keys in the OM open key table remain there until they are committed. A background service should periodically run to remove open keys and their associated blocks from memory if the key is past a certain age. This value will be configurable with the existing ozone.open.key.expire.threshold setting, which currently has a default value of 1 day. Any uncommitted key in the open key table older than this will be marked for deletion, and cleaned up with the existing OM key deleting service. A configurable value should limit the number of open keys that can be removed in one run of the service. [Design Document|https://docs.google.com/document/d/1UgXA27NGBMmTfvrImYgLQtiCfHqbFGDwv0JKv3pJH6E/edit?usp=sharing] was: Currently, uncommitted keys in the OM open key table remain there until they are committed. A background service should periodically run to remove open keys and their associated blocks from memory if the key is past a certain age. We will use a default age of one week (which should be configurable), after which any uncommitted key in the open key table will be marked for deletion, and cleaned up with the existing OM key deleting service. A configurable value should limit the number of open keys that can be removed in one run of the service. [Design Document|https://docs.google.com/document/d/1UgXA27NGBMmTfvrImYgLQtiCfHqbFGDwv0JKv3pJH6E/edit?usp=sharing] > Implement cleanup service for OM open key table > --- > > Key: HDDS-4120 > URL: https://issues.apache.org/jira/browse/HDDS-4120 > Project: Hadoop Distributed Data Store > Issue Type: New Feature > Components: OM HA >Reporter: Ethan Rose >Assignee: Ethan Rose >Priority: Major > > Currently, uncommitted keys in the OM open key table remain there until they > are committed. 
A background service should periodically run to remove open > keys and their associated blocks from memory if the key is past a certain > age. This value will be configurable with the existing > ozone.open.key.expire.threshold setting, which currently has a default value > of 1 day. Any uncommitted key in the open key table older than this will be > marked for deletion, and cleaned up with the existing OM key deleting > service. A configurable value should limit the number of open keys that can > be removed in one run of the service. > > [Design > Document|https://docs.google.com/document/d/1UgXA27NGBMmTfvrImYgLQtiCfHqbFGDwv0JKv3pJH6E/edit?usp=sharing]
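The cleanup described in the issue can be sketched as a simple selection pass: pick open keys older than the `ozone.open.key.expire.threshold` value, capped per service run. Class and method names below are hypothetical; only the 1-day default mirrors the issue text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the open-key cleanup selection; not the actual OM
// service. Keys past the threshold are returned, up to a per-run limit.
public class OpenKeyCleanupSketch {
    // Mirrors the ozone.open.key.expire.threshold default of 1 day.
    public static final long EXPIRE_THRESHOLD_MS = 24L * 60 * 60 * 1000;

    public static class OpenKey {
        final String name;
        final long creationTimeMs;

        public OpenKey(String name, long creationTimeMs) {
            this.name = name;
            this.creationTimeMs = creationTimeMs;
        }
    }

    /** Keys older than the threshold, at most maxKeysPerRun of them. */
    public static List<String> expiredKeys(List<OpenKey> openKeys, long nowMs,
                                           int maxKeysPerRun) {
        List<String> expired = new ArrayList<>();
        for (OpenKey key : openKeys) {
            if (expired.size() >= maxKeysPerRun) {
                break;  // per-run cap, per the configurable limit in the issue
            }
            if (nowMs - key.creationTimeMs > EXPIRE_THRESHOLD_MS) {
                expired.add(key.name);
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        long now = 10 * EXPIRE_THRESHOLD_MS;
        List<OpenKey> keys = Arrays.asList(
            new OpenKey("stale", now - 2 * EXPIRE_THRESHOLD_MS),
            new OpenKey("fresh", now - EXPIRE_THRESHOLD_MS / 2));
        System.out.println(expiredKeys(keys, now, 10)); // [stale]
    }
}
```

In the real design the selected keys would then be handed to the existing OM key deleting service rather than printed.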
[jira] [Updated] (HDDS-4121) Implement OmMetadataMangerImpl#getExpiredOpenKeys
[ https://issues.apache.org/jira/browse/HDDS-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Rose updated HDDS-4121: - Description: Implement the getExpiredOpenKeys method in OmMetadataMangerImpl to return keys in the open key table that are older than a configurable time interval. The method will be modified to take a parameter limiting how many keys are returned. This value will be configurable with the existing ozone.open.key.expire.threshold setting, which currently has a default value of 1 day. (was: Implement the getExpiredOpenKeys method in OmMetadataMangerImpl to return keys in the open key table that are older than a configurable time interval. The method will be modified to take a parameter limiting how many keys are returned. This value will be configurable with the ozone.open.key.purge.limit.per.task setting.) > Implement OmMetadataMangerImpl#getExpiredOpenKeys > - > > Key: HDDS-4121 > URL: https://issues.apache.org/jira/browse/HDDS-4121 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: OM HA >Reporter: Ethan Rose >Assignee: Ethan Rose >Priority: Minor > > Implement the getExpiredOpenKeys method in OmMetadataMangerImpl to return > keys in the open key table that are older than a configurable time interval. > The method will be modified to take a parameter limiting how many keys are > returned. This value will be configurable with the existing > ozone.open.key.expire.threshold setting, which currently has a default value > of 1 day.
[jira] [Commented] (HDDS-4097) S3/Ozone Filesystem inter-op
[ https://issues.apache.org/jira/browse/HDDS-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183281#comment-17183281 ] Marton Elek commented on HDDS-4097: --- Thanks for explaining it [~bharat]. I am closer now, but still have some questions: bq. >> 1. What does it mean from a compatibility point of view? Will it work exactly the same way as Amazon S3? Does it mean that we start to support a different semantic when ozone.om.enable.filesystem.paths is turned on? bq. > Yes, when ozone.om.enable.filesystem.paths is enabled, paths are treated as filesystem paths, so we check file system semantics and normalize the path. Does it mean that turning on `ozone.om.enable.filesystem.paths` breaks AWS S3 compatibility? bq. (And also planning to make this a bucket-level property, instead of cluster-wide, not yet finalized) This bucket-level setting sounds very cool. bq. Related to 1 + 2. Is it possible to create the intermediate "dir" keys but remove them from the list when listed from S3? bq. Yes, it can be. But right now when this property is enabled, we show all intermediate directories also. Arpit Agarwal brought up the point that if we don't show intermediate keys, then when a user tries to create a key with that intermediate path it will fail, and the user will be confused: intermediate paths are not shown, yet the user is not able to create a key there. bq. From a usability point of view, we can show intermediate dirs. Do you see any advantage or any other favorable points in hiding those in the list operation? We can revisit this if required. I am fine with showing them on o3fs/o3/ofs interfaces, but I would prefer to keep the 100% AWS S3 compatibility. If it means that we need to hide the intermediate directories *from S3 output* we might need that change. bq. Not sure what is meant here. Any more info will help to answer the question. The prefix table effort creates prefixes for each parent directory (AFAIK). Do we need this code after a working prefix table?
Will this concept be changed after using the prefix table? And one more question: Why do we need that specific setting at all? If we can provide 100% AWS S3 compatibility with the new approach, why is it required to be optional? Do you see any disadvantage of the new approach? Seems to be harder to test both of the approaches... > S3/Ozone Filesystem inter-op > > > Key: HDDS-4097 > URL: https://issues.apache.org/jira/browse/HDDS-4097 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Attachments: Ozone FileSystem Paths Enabled.docx, Ozone filesystem > path enabled.xlsx > > > This Jira is to implement changes required to use Ozone buckets when data is > ingested via S3 and use the bucket/volume via OzoneFileSystem. Initial > implementation for this is done as part of HDDS-3955. There are a few APIs > that missed the changes during the implementation of HDDS-3955. > The attached design document discusses each API and what changes are > required. > The Excel sheet has information about each API, from what all interfaces the OM > API is used, and what changes are required for the API to support > inter-operability. > Note: The proposal for delete/rename is still under discussion, not yet > finalized.
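The intermediate "dir" keys under discussion can be illustrated with a small sketch of deriving the parent-directory keys that would be materialized for a given S3 key path. `IntermediateDirs` and `parentDirs` are hypothetical names, not OM code.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch of deriving intermediate "directory" keys from a key
 * path when filesystem semantics are enabled; not the actual OM implementation.
 */
public class IntermediateDirs {
  /** For "a/b/c.txt" returns ["a/", "a/b/"], the parents that would be materialized. */
  public static List<String> parentDirs(String keyPath) {
    List<String> dirs = new ArrayList<>();
    String[] parts = keyPath.split("/");
    StringBuilder prefix = new StringBuilder();
    for (int i = 0; i < parts.length - 1; i++) { // every component except the leaf
      prefix.append(parts[i]).append('/');
      dirs.add(prefix.toString());
    }
    return dirs;
  }
}
```

The compatibility question in the thread is then whether keys like "a/" and "a/b/" should appear in S3 list output or be filtered out there while remaining visible through o3fs/ofs.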
[jira] [Updated] (HDDS-4094) Support byte-level write in Freon HadoopFsGenerator
[ https://issues.apache.org/jira/browse/HDDS-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-4094: --- Labels: (was: pull-request-available) > Support byte-level write in Freon HadoopFsGenerator > -- > > Key: HDDS-4094 > URL: https://issues.apache.org/jira/browse/HDDS-4094 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Fix For: 0.7.0 > > > Teragen seems to use the byte-level write method of FSDataOutputStream > (write(byte) is used instead of write(byte[], int, int)). > It seems to be a good idea to extend the existing `ContentGenerator` of Ozone to > support writes in smaller chunks, to make it easier to reproduce > performance problems. > Note: statistics from FileSystem instance: > {code} > Closing file system instance: 1257412274 >write.call: 11066 >write.allTime: 215951 >hsync.call: 1 >hsync.allTime: 3 >hflush.call: 0 >hflush.allTime: 0 >close.call: 4 >close.allTime: 62 > {code} > This was a teragen test with 1GB data (and statistics from one container). > The write method seems to be called many times, which suggests a smaller write > buffer.
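The chunked-write idea above can be sketched as a generator that writes a configurable buffer size, so that small buffers reproduce the byte-level write pattern seen with Teragen. `ChunkedContentGenerator` is a hypothetical name; the real Freon class is `ContentGenerator`.

```java
import java.io.IOException;
import java.io.OutputStream;

/**
 * Sketch of a content generator that writes in configurable chunks.
 * Illustrative only, not the actual Freon implementation.
 */
public class ChunkedContentGenerator {
  private final long totalBytes;
  private final int bufferSize;

  public ChunkedContentGenerator(long totalBytes, int bufferSize) {
    if (bufferSize <= 0) {
      throw new IllegalArgumentException("bufferSize must be positive");
    }
    this.totalBytes = totalBytes;
    this.bufferSize = bufferSize;
  }

  public void write(OutputStream out) throws IOException {
    byte[] buffer = new byte[bufferSize]; // zero-filled payload is enough for load tests
    long remaining = totalBytes;
    while (remaining > 0) {
      int toWrite = (int) Math.min(remaining, buffer.length);
      out.write(buffer, 0, toWrite); // write(byte[], int, int) rather than write(int)
      remaining -= toWrite;
    }
  }
}
```

Setting bufferSize to 1 approximates the per-byte write pattern from the statistics above, while larger buffers model well-behaved clients.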
[jira] [Resolved] (HDDS-4094) Support byte-level write in Freon HadoopFsGenerator
[ https://issues.apache.org/jira/browse/HDDS-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai resolved HDDS-4094. Fix Version/s: 0.7.0 Resolution: Implemented > Support byte-level write in Freon HadoopFsGenerator > -- > > Key: HDDS-4094 > URL: https://issues.apache.org/jira/browse/HDDS-4094 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Fix For: 0.7.0 > > > Teragen seems to use the byte-level write method of FSDataOutputStream > (write(byte) is used instead of write(byte[], int, int)). > It seems to be a good idea to extend the existing `ContentGenerator` of Ozone to > support writes in smaller chunks, to make it easier to reproduce > performance problems. > Note: statistics from FileSystem instance: > {code} > Closing file system instance: 1257412274 >write.call: 11066 >write.allTime: 215951 >hsync.call: 1 >hsync.allTime: 3 >hflush.call: 0 >hflush.allTime: 0 >close.call: 4 >close.allTime: 62 > {code} > This was a teragen test with 1GB data (and statistics from one container). > The write method seems to be called many times, which suggests a smaller write > buffer.
[GitHub] [hadoop-ozone] adoroszlai merged pull request #1310: HDDS-4094. Support byte-level write in Freon HadoopFsGenerator
adoroszlai merged pull request #1310: URL: https://github.com/apache/hadoop-ozone/pull/1310 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-4137) Turn on the verbose mode of safe mode check on testlib
[ https://issues.apache.org/jira/browse/HDDS-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-4137: - Labels: pull-request-available (was: ) > Turn on the verbose mode of safe mode check on testlib > -- > > Key: HDDS-4137 > URL: https://issues.apache.org/jira/browse/HDDS-4137 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: test >Affects Versions: 0.6.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-4118) Allow running Kubernetes example tests on k3d
[ https://issues.apache.org/jira/browse/HDDS-4118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-4118: --- Resolution: Not A Problem Status: Resolved (was: Patch Available) > Allow running Kubernetes example tests on k3d > - > > Key: HDDS-4118 > URL: https://issues.apache.org/jira/browse/HDDS-4118 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > > Kubernetes example tests currently only run on local cluster. With a minor > change we can make the tests work on k3d, too. > bq. [k3d|https://k3d.io/] makes it very easy to create single- and multi-node > k3s clusters in docker, e.g. for local development on Kubernetes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] maobaolong opened a new pull request #1343: HDDS-4137. Turn on the verbose mode of safe mode check on testlib
maobaolong opened a new pull request #1343: URL: https://github.com/apache/hadoop-ozone/pull/1343 ## What changes were proposed in this pull request? Turn on the verbose mode of safe mode check on testlib ## What is the link to the Apache JIRA HDDS-4137 ## How was this patch tested? No need
[jira] [Updated] (HDDS-4135) In ContainerStateManagerV2, modification of RocksDB should be consistent with that of memory state.
[ https://issues.apache.org/jira/browse/HDDS-4135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-4135: - Labels: pull-request-available (was: ) > In ContainerStateManagerV2, modification of RocksDB should be consistent with > that of memory state. > --- > > Key: HDDS-4135 > URL: https://issues.apache.org/jira/browse/HDDS-4135 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Glen Geng >Assignee: Glen Geng >Priority: Major > Labels: pull-request-available > > > Fix a bug in https://issues.apache.org/jira/browse/HDDS-3895 > In ContainerStateManagerV2, both disk state (column families in RocksDB) and > memory state (container maps in memory) are protected by raft, and should > keep their consistency upon each modification. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
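The consistency rule in HDDS-4135 can be sketched as a simple ordering invariant: persist to the DB first, and touch the in-memory map only after the write succeeds, so a DB failure leaves both states unchanged. The `Db` interface and class names below are hypothetical stand-ins for the RocksDB column families and container maps, not the ContainerStateManagerV2 code.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Minimal sketch of keeping disk state and memory state consistent. */
public class ContainerStateStore {
  public interface Db {
    void put(String containerId, String state) throws IOException;
  }

  private final Db db;
  private final Map<String, String> inMemory = new HashMap<>();

  public ContainerStateStore(Db db) {
    this.db = db;
  }

  public void updateState(String containerId, String state) throws IOException {
    db.put(containerId, state);       // may throw; memory is untouched on failure
    inMemory.put(containerId, state); // reached only after a successful DB write
  }

  public String getState(String containerId) {
    return inMemory.get(containerId);
  }
}
```

With this ordering, a failed DB write propagates the exception and leaves no partial modification for the caller to observe.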
[GitHub] [hadoop-ozone] GlenGeng opened a new pull request #1342: HDDS-4135: In ContainerStateManagerV2, modification of RocksDB should be consistent with that of memory state.
GlenGeng opened a new pull request #1342: URL: https://github.com/apache/hadoop-ozone/pull/1342 ## What changes were proposed in this pull request? Fix a bug in https://issues.apache.org/jira/browse/HDDS-3895 In ContainerStateManagerV2, both disk state (column families in RocksDB) and memory state (container maps in memory) are protected by raft, and should keep their consistency upon each modification. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-4135 ## How was this patch tested? CI
[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #1339: HDDS-4131. Container report should update container key count and bytes used if they differ in SCM
adoroszlai commented on a change in pull request #1339: URL: https://github.com/apache/hadoop-ozone/pull/1339#discussion_r475577292 ## File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/AbstractContainerReportHandler.java ## @@ -103,14 +108,44 @@ private void updateContainerStats(final ContainerID containerId, containerInfo.updateSequenceId( replicaProto.getBlockCommitSequenceId()); } + List otherReplicas = + getOtherReplicas(containerId, datanodeDetails); Review comment: Good point, I missed that. Thanks for the explanation.
[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #1339: HDDS-4131. Container report should update container key count and bytes used if they differ in SCM
adoroszlai commented on a change in pull request #1339: URL: https://github.com/apache/hadoop-ozone/pull/1339#discussion_r475523841 ## File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/AbstractContainerReportHandler.java ## @@ -103,13 +108,44 @@ private void updateContainerStats(final ContainerID containerId, containerInfo.updateSequenceId( replicaProto.getBlockCommitSequenceId()); } - if (containerInfo.getUsedBytes() != replicaProto.getUsed()) { -containerInfo.setUsedBytes(replicaProto.getUsed()); + List otherReplicas = + getOtherReplicas(containerId, datanodeDetails); Review comment: I think we can skip filtering the replicas for "others", since the resulting count/bytes values will be the min/max of all replicas, including this one.
[jira] [Updated] (HDDS-4136) Design for Error/Exception handling in state update for container/pipeline V2
[ https://issues.apache.org/jira/browse/HDDS-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Glen Geng updated HDDS-4136: Description: I have a concern about how to handle exceptions occurred in writing RocksDB for container V2, such as allocateContainer, deleteContainer and updateContainerState. For non-HA case, allocateContainer reverts the memory state changes if meet IOException for db operations. deleteContainer and updateContainerState just throw out the IOException and leave the memory state in an inconsistency state. After we enable SCM-HA, if leader SCM succeed the operation, meanwhile any follower SCM fails due to db exception, what can we do to ensure that states of leader and followers won't diverge, a.k.a. ensure the replicated StateMachine for leader and followers ? We have to ensure Atomicity of ACID for state update: If any exception occurred, SCM (no matter leader or follower) should throw exception and keep states unchanged. No partial change is allowed so that leader SCM can safely revert the state change for the whole raft groups. Above analysis also applies to pipeline V2 and etc. was: I have a concern about how to handle exceptions occurred in writing RocksDB for container V2, such as allocateContainer, deleteContainer and updateContainerState. For non-HA case, allocateContainer reverts the memory state changes if meet IOException for db operations. deleteContainer and updateContainerState just throw out the IOException and leave the memory state in an inconsistency state. After we enable SCM-HA, if leader SCM succeed the operation, meanwhile any follower SCM fails due to db exception, what can we do to ensure that states of leader and followers won't diverge, a.k.a. ensure the replicated StateMachine for leader and followers ? We have to ensure Atomicity of ACID for state update. 
If any exception occurred, SCM (no matter leader or follower) should throw exception and keep states unchanged, so that leader SCM can safely revert the state change for the whole raft groups. Above analysis also applies to pipeline V2 and etc. > Design for Error/Exception handling in state update for container/pipeline V2 > - > > Key: HDDS-4136 > URL: https://issues.apache.org/jira/browse/HDDS-4136 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Glen Geng >Assignee: Glen Geng >Priority: Major > > I have a concern about how to handle exceptions occurred in writing RocksDB > for container V2, such as allocateContainer, deleteContainer and > updateContainerState. > For non-HA case, allocateContainer reverts the memory state changes if meet > IOException for db operations. deleteContainer and updateContainerState just > throw out the IOException and leave the memory state in an inconsistency > state. > After we enable SCM-HA, if leader SCM succeed the operation, meanwhile any > follower SCM fails due to db exception, what can we do to ensure that states > of leader and followers won't diverge, a.k.a. ensure the replicated > StateMachine for leader and followers ? > We have to ensure Atomicity of ACID for state update: If any exception > occurred, SCM (no matter leader or follower) should throw exception and keep > states unchanged. No partial change is allowed so that leader SCM can safely > revert the state change for the whole raft groups. > Above analysis also applies to pipeline V2 and etc. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-4136) Design for Error/Exception handling in state update for container/pipeline V2
[ https://issues.apache.org/jira/browse/HDDS-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Glen Geng updated HDDS-4136: Description: I have a concern about how to handle exceptions occurred in writing RocksDB for container V2, such as allocateContainer, deleteContainer and updateContainerState. For non-HA case, allocateContainer reverts the memory state changes if meet IOException for db operations. deleteContainer and updateContainerState just throw out the IOException and leave the memory state in an inconsistency state. After we enable SCM-HA, if leader SCM succeed the operation, meanwhile any follower SCM fails due to db exception, what can we do to ensure that states of leader and followers won't diverge, a.k.a. ensure the replicated StateMachine for leader and followers ? We have to ensure Atomicity of ACID for state update: If any exception occurred, SCM (no matter leader or follower) should throw exception and keep states unchanged. No partial change is allowed so that leader SCM can safely revert the state change for the whole raft groups. Above analysis also applies to pipeline V2 and other issues besides disk failure. was: I have a concern about how to handle exceptions occurred in writing RocksDB for container V2, such as allocateContainer, deleteContainer and updateContainerState. For non-HA case, allocateContainer reverts the memory state changes if meet IOException for db operations. deleteContainer and updateContainerState just throw out the IOException and leave the memory state in an inconsistency state. After we enable SCM-HA, if leader SCM succeed the operation, meanwhile any follower SCM fails due to db exception, what can we do to ensure that states of leader and followers won't diverge, a.k.a. ensure the replicated StateMachine for leader and followers ? 
We have to ensure Atomicity of ACID for state update: If any exception occurred, SCM (no matter leader or follower) should throw exception and keep states unchanged. No partial change is allowed so that leader SCM can safely revert the state change for the whole raft groups. Above analysis also applies to pipeline V2 and etc. > Design for Error/Exception handling in state update for container/pipeline V2 > - > > Key: HDDS-4136 > URL: https://issues.apache.org/jira/browse/HDDS-4136 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Glen Geng >Assignee: Glen Geng >Priority: Major > > I have a concern about how to handle exceptions occurred in writing RocksDB > for container V2, such as allocateContainer, deleteContainer and > updateContainerState. > For non-HA case, allocateContainer reverts the memory state changes if meet > IOException for db operations. deleteContainer and updateContainerState just > throw out the IOException and leave the memory state in an inconsistency > state. > After we enable SCM-HA, if leader SCM succeed the operation, meanwhile any > follower SCM fails due to db exception, what can we do to ensure that states > of leader and followers won't diverge, a.k.a. ensure the replicated > StateMachine for leader and followers ? > We have to ensure Atomicity of ACID for state update: If any exception > occurred, SCM (no matter leader or follower) should throw exception and keep > states unchanged. No partial change is allowed so that leader SCM can safely > revert the state change for the whole raft groups. > Above analysis also applies to pipeline V2 and other issues besides disk > failure. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-4136) Design for Error/Exception handling in state update for container/pipeline V2
[ https://issues.apache.org/jira/browse/HDDS-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Glen Geng updated HDDS-4136: Description: I have a concern about how to handle exceptions occurred in writing RocksDB for container V2, such as allocateContainer, deleteContainer and updateContainerState. For non-HA case, allocateContainer reverts the memory state changes if meet IOException for db operations. deleteContainer and updateContainerState just throw out the IOException and leave the memory state in an inconsistency state. After we enable SCM-HA, if leader SCM succeed the operation, meanwhile any follower SCM fails due to db exception, what can we do to ensure that states of leader and followers won't diverge, a.k.a. ensure the replicated StateMachine for leader and followers ? We have to ensure Atomicity of ACID for state update. If any exception occurred, SCM (no matter leader or follower) should throw exception and keep states unchanged, so that leader SCM can safely revert the state change for the whole raft groups. Above analysis also applies to pipeline V2 and etc. was: I have a concern about how to handle exceptions occurred in writing RocksDB for container V2, such as allocateContainer, deleteContainer and updateContainerState. For non-HA case, allocateContainer reverts the memory state changes if meet IOException for db operations. deleteContainer and updateContainerState just throw out the IOException and leave the memory state in an inconsistency state. After we enable SCM-HA, if leader SCM succeed the operation, meanwhile any follower SCM fails due to db exception, what can we do to ensure that states of leader and followers won't diverge, a.k.a. ensure the replicated StateMachine for leader and followers. We have to ensure Atomicity of ACID for state update. 
If any exception occurred, SCM (no matter leader or follower) should throw exception and keep states unchanged, so that leader SCM can safely revert the state change for the whole raft groups. Above analysis also applies to pipeline V2 and etc. > Design for Error/Exception handling in state update for container/pipeline V2 > - > > Key: HDDS-4136 > URL: https://issues.apache.org/jira/browse/HDDS-4136 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Glen Geng >Assignee: Glen Geng >Priority: Major > > I have a concern about how to handle exceptions occurred in writing RocksDB > for container V2, such as allocateContainer, deleteContainer and > updateContainerState. > For non-HA case, allocateContainer reverts the memory state changes if meet > IOException for db operations. deleteContainer and updateContainerState just > throw out the IOException and leave the memory state in an inconsistency > state. > After we enable SCM-HA, if leader SCM succeed the operation, meanwhile any > follower SCM fails due to db exception, what can we do to ensure that states > of leader and followers won't diverge, a.k.a. ensure the replicated > StateMachine for leader and followers ? > We have to ensure Atomicity of ACID for state update. If any exception > occurred, SCM (no matter leader or follower) should throw exception and keep > states unchanged, so that leader SCM can safely revert the state change for > the whole raft groups. > Above analysis also applies to pipeline V2 and etc. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-4136) Design for Error/Exception handling in state update for container/pipeline V2
[ https://issues.apache.org/jira/browse/HDDS-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Glen Geng updated HDDS-4136: Description: I have a concern about how to handle exceptions that occur while writing RocksDB for container V2, such as allocateContainer, deleteContainer and updateContainerState. For the non-HA case, allocateContainer reverts the memory state changes if it meets an IOException from db operations. deleteContainer and updateContainerState just throw out the IOException and leave the memory state in an inconsistent state. After we enable SCM-HA, if the leader SCM succeeds in the operation while any follower SCM fails due to a db exception, what can we do to ensure that the states of leader and followers won't diverge, a.k.a. ensure a replicated StateMachine for leader and followers? We have to ensure the Atomicity of ACID for state updates. If any exception occurs, SCM (no matter leader or follower) should throw the exception and keep its state unchanged, so that the leader SCM can safely revert the state change for the whole raft group. The above analysis also applies to pipeline V2, etc. was: I have a concern about how to handle exceptions that occur while writing RocksDB for container V2, such as allocateContainer, deleteContainer and updateContainerState. For the non-HA case, allocateContainer reverts the memory state changes if it meets an IOException from db operations. deleteContainer and updateContainerState just throw out the IOException and leave the memory state in an inconsistent state. After we enable SCM-HA, if the leader SCM succeeds in the operation while any follower SCM fails due to a db exception, what can we do to ensure that the states of leader and follower won't diverge, a.k.a. ensure the replicated state machine for leader and followers? We have to ensure the Atomicity of ACID for state updates. If any exception occurs, SCM (no matter leader or follower) should throw the exception and keep its state unchanged, so that the leader SCM can safely revert the state change for the whole raft group. The above analysis also applies to pipeline V2, etc. > Design for Error/Exception handling in state update for container/pipeline V2 > - > > Key: HDDS-4136 > URL: https://issues.apache.org/jira/browse/HDDS-4136 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Glen Geng >Assignee: Glen Geng >Priority: Major > > I have a concern about how to handle exceptions that occur while writing RocksDB > for container V2, such as allocateContainer, deleteContainer and > updateContainerState. > For the non-HA case, allocateContainer reverts the memory state changes if it > meets an IOException from db operations. deleteContainer and updateContainerState just > throw out the IOException and leave the memory state in an inconsistent > state. > After we enable SCM-HA, if the leader SCM succeeds in the operation while any > follower SCM fails due to a db exception, what can we do to ensure that the states > of leader and followers won't diverge, a.k.a. ensure the replicated > StateMachine for leader and followers? > We have to ensure the Atomicity of ACID for state updates. If any exception > occurs, SCM (no matter leader or follower) should throw the exception and keep > its state unchanged, so that the leader SCM can safely revert the state change for > the whole raft group. > The above analysis also applies to pipeline V2, etc.
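The all-or-nothing requirement in HDDS-4136 can be sketched with a copy-on-write update: stage the mutation on a private copy and publish it only on success, so an exception partway through leaves no partial state visible. Everything below is illustrative, not the SCM implementation.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Sketch of an atomic state update: either the whole mutation applies or none of it. */
public class CopyOnWriteState {
  public interface Mutation {
    void apply(Map<String, String> state) throws IOException;
  }

  private Map<String, String> state = new HashMap<>();

  public void update(Mutation mutation) throws IOException {
    Map<String, String> staged = new HashMap<>(state); // work on a private copy
    mutation.apply(staged);                            // may throw midway
    state = staged;                                    // publish only on success
  }

  public String get(String key) {
    return state.get(key);
  }
}
```

Under this discipline a follower that hits a db exception throws without exposing partial changes, which is what lets the leader safely revert the operation for the whole raft group.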
[GitHub] [hadoop-ozone] sodonnel commented on a change in pull request #1339: HDDS-4131. Container report should update container key count and bytes used if they differ in SCM
sodonnel commented on a change in pull request #1339: URL: https://github.com/apache/hadoop-ozone/pull/1339#discussion_r475548705

## File path: hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/TestContainerReportHandler.java

## @@ -483,9 +485,167 @@ public void testQuasiClosedToClosed()

```
    Assert.assertEquals(LifeCycleState.CLOSED, containerOne.getState());
  }

  @Test
  public void openContainerKeyAndBytesUsedUpdatedToMinimumOfAllReplicas()
      throws SCMException {
    final ContainerReportHandler reportHandler = new ContainerReportHandler(
        nodeManager, containerManager);
    final Iterator<DatanodeDetails> nodeIterator = nodeManager.getNodes(
        NodeState.HEALTHY).iterator();

    final DatanodeDetails datanodeOne = nodeIterator.next();
    final DatanodeDetails datanodeTwo = nodeIterator.next();
    final DatanodeDetails datanodeThree = nodeIterator.next();

    final ContainerReplicaProto.State replicaState
        = ContainerReplicaProto.State.OPEN;
    final ContainerInfo containerOne = getContainer(LifeCycleState.OPEN);

    final Set<ContainerID> containerIDSet = new HashSet<>();
    containerIDSet.add(containerOne.containerID());

    containerStateManager.loadContainer(containerOne);
    // Container loaded, no replicas reported from DNs. Expect zeros for
    // usage values.
    assertEquals(0L, containerOne.getUsedBytes());
    assertEquals(0L, containerOne.getNumberOfKeys());

    reportHandler.onMessage(getContainerReportFromDatanode(
        containerOne.containerID(), replicaState,
        datanodeOne, 50L, 60L), publisher);

    // Single replica reported - ensure values are updated
    assertEquals(50L, containerOne.getUsedBytes());
    assertEquals(60L, containerOne.getNumberOfKeys());

    reportHandler.onMessage(getContainerReportFromDatanode(
        containerOne.containerID(), replicaState,
        datanodeTwo, 50L, 60L), publisher);
    reportHandler.onMessage(getContainerReportFromDatanode(
        containerOne.containerID(), replicaState,
        datanodeThree, 50L, 60L), publisher);

    // All 3 DNs are reporting the same values. Counts should be as expected.
    assertEquals(50L, containerOne.getUsedBytes());
    assertEquals(60L, containerOne.getNumberOfKeys());

    // Now each DN reports a different lesser value. Counts should be the min
    // reported.
    reportHandler.onMessage(getContainerReportFromDatanode(
        containerOne.containerID(), replicaState,
        datanodeOne, 1L, 10L), publisher);
    reportHandler.onMessage(getContainerReportFromDatanode(
        containerOne.containerID(), replicaState,
        datanodeTwo, 2L, 11L), publisher);
    reportHandler.onMessage(getContainerReportFromDatanode(
        containerOne.containerID(), replicaState,
        datanodeThree, 3L, 12L), publisher);

    // All 3 DNs are reporting different values. The actual value should be
    // the minimum.
    assertEquals(1L, containerOne.getUsedBytes());
    assertEquals(10L, containerOne.getNumberOfKeys());

    // Have the lowest value report a higher value and ensure the new value
    // is the minimum
    reportHandler.onMessage(getContainerReportFromDatanode(
        containerOne.containerID(), replicaState,
        datanodeOne, 3L, 12L), publisher);

    assertEquals(2L, containerOne.getUsedBytes());
    assertEquals(11L, containerOne.getNumberOfKeys());
  }

  @Test
  public void notOpenContainerKeyAndBytesUsedUpdatedToMaximumOfAllReplicas()
      throws SCMException {
    final ContainerReportHandler reportHandler = new ContainerReportHandler(
        nodeManager, containerManager);
    final Iterator<DatanodeDetails> nodeIterator = nodeManager.getNodes(
        NodeState.HEALTHY).iterator();

    final DatanodeDetails datanodeOne = nodeIterator.next();
    final DatanodeDetails datanodeTwo = nodeIterator.next();
    final DatanodeDetails datanodeThree = nodeIterator.next();

    final ContainerReplicaProto.State replicaState
        = ContainerReplicaProto.State.CLOSED;
    final ContainerInfo containerOne = getContainer(LifeCycleState.CLOSED);

    final Set<ContainerID> containerIDSet = new HashSet<>();
    containerIDSet.add(containerOne.containerID());

    containerStateManager.loadContainer(containerOne);
    // Container loaded, no replicas reported from DNs. Expect zeros for
    // usage values.
    assertEquals(0L, containerOne.getUsedBytes());
    assertEquals(0L, containerOne.getNumberOfKeys());

    reportHandler.onMessage(getContainerReportFromDatanode(
        containerOne.containerID(), replicaState,
        datanodeOne, 50L, 60L), publisher);

    // Single replica reported - ensure values are updated
    assertEquals(50L, containerOne.getUsedBytes());
    assertEquals(60L, containerOne.getNumberOfKeys());

    reportHandler.onMessage(getContainerReportFromDatanode(
        containerOne.containerID(), replicaState,
        datanodeTwo,
```
[GitHub] [hadoop-ozone] sodonnel commented on a change in pull request #1339: HDDS-4131. Container report should update container key count and bytes used if they differ in SCM
sodonnel commented on a change in pull request #1339: URL: https://github.com/apache/hadoop-ozone/pull/1339#discussion_r475547913

## File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/AbstractContainerReportHandler.java

## @@ -103,14 +108,44 @@ private void updateContainerStats(final ContainerID containerId,

```
       containerInfo.updateSequenceId(
           replicaProto.getBlockCommitSequenceId());
     }
+    List<ContainerReplica> otherReplicas =
+        getOtherReplicas(containerId, datanodeDetails);
```

Review comment: I think we still need to filter out the current replica. The reason is that at the point the stats are updated, the newly reported replica has not yet updated the values previously stored for it in SCM memory. This is due to the order of operations here, in AbstractContainerReportHandler:

```
  protected void processContainerReplica(final DatanodeDetails datanodeDetails,
      final ContainerReplicaProto replicaProto) throws IOException {
    final ContainerID containerId = ContainerID
        .valueof(replicaProto.getContainerID());

    if (logger.isDebugEnabled()) {
      logger.debug("Processing replica of container {} from datanode {}",
          containerId, datanodeDetails);
    }

    // Synchronized block should be replaced by container lock,
    // once we have introduced lock inside ContainerInfo.
    synchronized (containerManager.getContainer(containerId)) {
      updateContainerStats(datanodeDetails, containerId, replicaProto);
      updateContainerState(datanodeDetails, containerId, replicaProto);
      updateContainerReplica(datanodeDetails, containerId, replicaProto);
    }
  }
```

The in-memory values for the reported replica are not updated until the updateContainerReplica(...) call, but we call updateContainerStats(...) before that. For example, let's say we have a CLOSED container, so it's the max of the values we use. Existing replicas have values 4, 4, 5, so the value used is 5. The new report changes the 5 to 4, so now we would have: 4, 4, 5 (stale value, will be updated soon), 4 (the new value replacing the stale one). If we don't filter out the stale replica, then the value will be wrong and left as 5 until the next container report is processed, when it would change to 4. Does that make sense? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
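The filtering argument above can be sketched in isolation. This is a hedged, minimal stand-in (hypothetical names, not the actual Ozone classes): for a CLOSED container the stored value is the max over replicas, and the reporter's stale in-memory entry must be excluded or the old maximum survives until the next report.

```java
import java.util.List;
import java.util.UUID;

// Minimal sketch of why the reporting datanode's stale in-memory replica
// must be excluded before aggregating stats (hypothetical types).
public class StatsUpdateSketch {

    // A replica record kept in SCM memory: (datanodeId, usedBytes).
    record Replica(UUID datanodeId, long usedBytes) {}

    // For a CLOSED container the stored value is the max over replicas.
    // We take the max of the newly reported value and all *other* replicas;
    // including the reporter's stale entry would pin the old value.
    static long closedContainerUsedBytes(List<Replica> inMemory,
                                         UUID reporter, long reportedBytes) {
        long max = reportedBytes;
        for (Replica r : inMemory) {
            if (!r.datanodeId().equals(reporter)) {  // skip the stale entry
                max = Math.max(max, r.usedBytes());
            }
        }
        return max;
    }

    public static void main(String[] args) {
        UUID dn1 = UUID.randomUUID();
        UUID dn2 = UUID.randomUUID();
        UUID dn3 = UUID.randomUUID();
        // Existing in-memory values are 4, 4, 5; dn3 now reports 4, but its
        // in-memory entry still says 5 (it is updated later).
        List<Replica> stale = List.of(new Replica(dn1, 4), new Replica(dn2, 4),
            new Replica(dn3, 5));
        // With the stale entry filtered out, the container drops to 4 now
        // rather than on the next report.
        System.out.println(closedContainerUsedBytes(stale, dn3, 4)); // prints 4
    }
}
```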
[jira] [Updated] (HDDS-4138) Improve crc efficiency
[ https://issues.apache.org/jira/browse/HDDS-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

runzhiwang updated HDDS-4138:
-----------------------------
    Description:
HADOOP has implemented several methods to calculate crc:
https://issues.apache.org/jira/browse/HADOOP-15033
We should choose the method with the highest efficiency.
This flame graph is from [~elek]
!screenshot-1.png!

  was:
HADOOP has implemented several method to calculate crc:
https://issues.apache.org/jira/browse/HADOOP-15033
We should choose the method with high efficiency.

> Improve crc efficiency
> ----------------------
>
>                 Key: HDDS-4138
>                 URL: https://issues.apache.org/jira/browse/HDDS-4138
>             Project: Hadoop Distributed Data Store
>          Issue Type: Task
>            Reporter: runzhiwang
>            Assignee: runzhiwang
>            Priority: Major
>         Attachments: screenshot-1.png
>
> HADOOP has implemented several methods to calculate crc:
> https://issues.apache.org/jira/browse/HADOOP-15033
> We should choose the method with the highest efficiency.
> This flame graph is from [~elek]
> !screenshot-1.png!

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
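As context for the efficiency comparison requested here, the JDK itself already ships two of the checksum variants referenced by HADOOP-15033. A small illustrative sketch (not Ozone code): CRC-32C (Castagnoli) is hardware-accelerated on common CPUs, which is one reason the choice of implementation matters for throughput. The printed values are the well-known check values for the test input "123456789".

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CRC32C;

// Compare the two JDK checksum implementations over the standard
// CRC test vector "123456789".
public class CrcSketch {
    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);

        CRC32 crc32 = new CRC32();
        crc32.update(data);
        // Well-known check value for CRC-32: cbf43926
        System.out.println(Long.toHexString(crc32.getValue()));

        CRC32C crc32c = new CRC32C();  // available since Java 9
        crc32c.update(data);
        // Well-known check value for CRC-32C (Castagnoli): e3069283
        System.out.println(Long.toHexString(crc32c.getValue()));
    }
}
```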
[jira] [Updated] (HDDS-4138) Improve crc efficiency
[ https://issues.apache.org/jira/browse/HDDS-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

runzhiwang updated HDDS-4138:
-----------------------------
    Attachment: screenshot-1.png

> Improve crc efficiency
> ----------------------
>
>                 Key: HDDS-4138
>                 URL: https://issues.apache.org/jira/browse/HDDS-4138
>             Project: Hadoop Distributed Data Store
>          Issue Type: Task
>            Reporter: runzhiwang
>            Assignee: runzhiwang
>            Priority: Major
>         Attachments: screenshot-1.png
>
> HADOOP has implemented several methods to calculate crc:
> https://issues.apache.org/jira/browse/HADOOP-15033
> We should choose the method with the highest efficiency.
[jira] [Updated] (HDDS-4138) Improve crc efficiency
[ https://issues.apache.org/jira/browse/HDDS-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

runzhiwang updated HDDS-4138:
-----------------------------
    Description:
HADOOP has implemented several methods to calculate crc:
https://issues.apache.org/jira/browse/HADOOP-15033
We should choose the method with the highest efficiency.

  was:
HADOOP has implemented several method to calculate crc:
https://issues.apache.org/jira/browse/HADOOP-15033
We should choose the method with high efficiency.

> Improve crc efficiency
> ----------------------
>
>                 Key: HDDS-4138
>                 URL: https://issues.apache.org/jira/browse/HDDS-4138
>             Project: Hadoop Distributed Data Store
>          Issue Type: Task
>            Reporter: runzhiwang
>            Assignee: runzhiwang
>            Priority: Major
>
> HADOOP has implemented several methods to calculate crc:
> https://issues.apache.org/jira/browse/HADOOP-15033
> We should choose the method with the highest efficiency.
[jira] [Updated] (HDDS-4138) Improve crc efficiency
[ https://issues.apache.org/jira/browse/HDDS-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

runzhiwang updated HDDS-4138:
-----------------------------
    Description:
HADOOP has implemented several methods to calculate crc:
https://issues.apache.org/jira/browse/HADOOP-15033
We should choose the method with the highest efficiency.

> Improve crc efficiency
> ----------------------
>
>                 Key: HDDS-4138
>                 URL: https://issues.apache.org/jira/browse/HDDS-4138
>             Project: Hadoop Distributed Data Store
>          Issue Type: Task
>            Reporter: runzhiwang
>            Assignee: runzhiwang
>            Priority: Major
>
> HADOOP has implemented several methods to calculate crc:
> https://issues.apache.org/jira/browse/HADOOP-15033
> We should choose the method with the highest efficiency.
[jira] [Created] (HDDS-4138) Improve crc efficiency
runzhiwang created HDDS-4138:
--------------------------------

             Summary: Improve crc efficiency
                 Key: HDDS-4138
                 URL: https://issues.apache.org/jira/browse/HDDS-4138
             Project: Hadoop Distributed Data Store
          Issue Type: Task
            Reporter: runzhiwang
            Assignee: runzhiwang
[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #1339: HDDS-4131. Container report should update container key count and bytes used if they differ in SCM
adoroszlai commented on a change in pull request #1339: URL: https://github.com/apache/hadoop-ozone/pull/1339#discussion_r475524057

## File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/AbstractContainerReportHandler.java

## @@ -103,14 +108,44 @@ private void updateContainerStats(final ContainerID containerId,

```
       containerInfo.updateSequenceId(
           replicaProto.getBlockCommitSequenceId());
     }
+    List<ContainerReplica> otherReplicas =
+        getOtherReplicas(containerId, datanodeDetails);
```

Review comment: I think we can skip filtering the replicas for "others" (and omit the code for that), since the resulting count/bytes values are the min/max of all replicas, including this one, either way. If we loop over all replicas, the iteration for the current replica will be a no-op and will not change the result.

## File path: hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/TestContainerReportHandler.java

## @@ -483,9 +485,167 @@ public void testQuasiClosedToClosed()

```
    Assert.assertEquals(LifeCycleState.CLOSED, containerOne.getState());
  }

  @Test
  public void openContainerKeyAndBytesUsedUpdatedToMinimumOfAllReplicas()
      throws SCMException {
    final ContainerReportHandler reportHandler = new ContainerReportHandler(
        nodeManager, containerManager);
    final Iterator<DatanodeDetails> nodeIterator = nodeManager.getNodes(
        NodeState.HEALTHY).iterator();

    final DatanodeDetails datanodeOne = nodeIterator.next();
    final DatanodeDetails datanodeTwo = nodeIterator.next();
    final DatanodeDetails datanodeThree = nodeIterator.next();

    final ContainerReplicaProto.State replicaState
        = ContainerReplicaProto.State.OPEN;
    final ContainerInfo containerOne = getContainer(LifeCycleState.OPEN);

    final Set<ContainerID> containerIDSet = new HashSet<>();
    containerIDSet.add(containerOne.containerID());

    containerStateManager.loadContainer(containerOne);
    // Container loaded, no replicas reported from DNs. Expect zeros for
    // usage values.
    assertEquals(0L, containerOne.getUsedBytes());
    assertEquals(0L, containerOne.getNumberOfKeys());

    reportHandler.onMessage(getContainerReportFromDatanode(
        containerOne.containerID(), replicaState,
        datanodeOne, 50L, 60L), publisher);

    // Single replica reported - ensure values are updated
    assertEquals(50L, containerOne.getUsedBytes());
    assertEquals(60L, containerOne.getNumberOfKeys());

    reportHandler.onMessage(getContainerReportFromDatanode(
        containerOne.containerID(), replicaState,
        datanodeTwo, 50L, 60L), publisher);
    reportHandler.onMessage(getContainerReportFromDatanode(
        containerOne.containerID(), replicaState,
        datanodeThree, 50L, 60L), publisher);

    // All 3 DNs are reporting the same values. Counts should be as expected.
    assertEquals(50L, containerOne.getUsedBytes());
    assertEquals(60L, containerOne.getNumberOfKeys());

    // Now each DN reports a different lesser value. Counts should be the min
    // reported.
    reportHandler.onMessage(getContainerReportFromDatanode(
        containerOne.containerID(), replicaState,
        datanodeOne, 1L, 10L), publisher);
    reportHandler.onMessage(getContainerReportFromDatanode(
        containerOne.containerID(), replicaState,
        datanodeTwo, 2L, 11L), publisher);
    reportHandler.onMessage(getContainerReportFromDatanode(
        containerOne.containerID(), replicaState,
        datanodeThree, 3L, 12L), publisher);

    // All 3 DNs are reporting different values. The actual value should be
    // the minimum.
    assertEquals(1L, containerOne.getUsedBytes());
    assertEquals(10L, containerOne.getNumberOfKeys());

    // Have the lowest value report a higher value and ensure the new value
    // is the minimum
    reportHandler.onMessage(getContainerReportFromDatanode(
        containerOne.containerID(), replicaState,
        datanodeOne, 3L, 12L), publisher);

    assertEquals(2L, containerOne.getUsedBytes());
    assertEquals(11L, containerOne.getNumberOfKeys());
  }

  @Test
  public void notOpenContainerKeyAndBytesUsedUpdatedToMaximumOfAllReplicas()
      throws SCMException {
    final ContainerReportHandler reportHandler = new ContainerReportHandler(
        nodeManager, containerManager);
    final Iterator<DatanodeDetails> nodeIterator = nodeManager.getNodes(
        NodeState.HEALTHY).iterator();

    final DatanodeDetails datanodeOne = nodeIterator.next();
    final DatanodeDetails datanodeTwo = nodeIterator.next();
    final DatanodeDetails datanodeThree = nodeIterator.next();

    final ContainerReplicaProto.State replicaState
        = ContainerReplicaProto.State.CLOSED;
    final ContainerInfo containerOne = getContainer(LifeCycleState.CLOSED);

    final Set<ContainerID> containerIDSet = new HashSet<>();
```
[GitHub] [hadoop-ozone] captainzmc commented on pull request #1296: HDDS-4053. Volume space: add quotaUsageInBytes and update it when write and delete key.
captainzmc commented on pull request #1296: URL: https://github.com/apache/hadoop-ozone/pull/1296#issuecomment-679067616 @ChenSammi CC
[jira] [Created] (HDDS-4137) Turn on the verbose mode of safe mode check on testlib
maobaolong created HDDS-4137:
--------------------------------

             Summary: Turn on the verbose mode of safe mode check on testlib
                 Key: HDDS-4137
                 URL: https://issues.apache.org/jira/browse/HDDS-4137
             Project: Hadoop Distributed Data Store
          Issue Type: Improvement
          Components: test
    Affects Versions: 0.6.0
            Reporter: maobaolong
            Assignee: maobaolong
[jira] [Updated] (HDDS-4136) Design for Error/Exception handling in state update for container/pipeline V2
[ https://issues.apache.org/jira/browse/HDDS-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Glen Geng updated HDDS-4136:
----------------------------
    Description:
I have a concern about how to handle exceptions that occur when writing RocksDB for container V2, such as allocateContainer, deleteContainer and updateContainerState.

In the non-HA case, allocateContainer reverts the memory state changes if it meets an IOException during db operations. deleteContainer and updateContainerState just throw out the IOException and leave the memory state in an inconsistent state.

After we enable SCM-HA, if the Leader SCM succeeds in the operation while any Follower SCM fails due to a db exception, what can we do to ensure that the states of leader and followers won't diverge, i.e., to preserve the replicated state machine across leader and followers?

We have to ensure the Atomicity of ACID for state updates. If any exception occurs, the SCM (no matter leader or follower) should throw the exception and keep its state unchanged, so that the leader SCM can safely revert the state change for the whole raft group.

The above analysis also applies to pipeline V2, etc.

  was:
I have a concern about how to handle exceptions occurred in writing RocksDB for container V2, such as allocateContainer, deleteContainer and updateContainerState. For non-HA case, allocateContainer reverts the memory state changes if meet IOException for db operations. deleteContainer and updateContainerState just throw out the IOException and leave the memory state in an inconsistency state. After we enable SCM-HA, if Leader SCM succeed the operation, meanwhile any Follower SCM fails due to db exception, what can we do to ensure that states of leader and follower won't diverge, a.k.a., ensure the replicated state machine for leader and folowers. We have to ensure Atomicity of ACID for state update. If any exception occurred, SCM (no matter leader or follower) should throw exception and keep states unchanged, so that leader SCM can safely revert the state change for the whole raft groups. Above analysis also applies ot pipeline V2 and etc.

> Design for Error/Exception handling in state update for container/pipeline V2
> -----------------------------------------------------------------------------
>
>                 Key: HDDS-4136
>                 URL: https://issues.apache.org/jira/browse/HDDS-4136
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>          Components: SCM
>            Reporter: Glen Geng
>            Assignee: Glen Geng
>            Priority: Major
>
> I have a concern about how to handle exceptions that occur when writing
> RocksDB for container V2, such as allocateContainer, deleteContainer and
> updateContainerState.
> In the non-HA case, allocateContainer reverts the memory state changes if it
> meets an IOException during db operations. deleteContainer and
> updateContainerState just throw out the IOException and leave the memory
> state in an inconsistent state.
> After we enable SCM-HA, if the Leader SCM succeeds in the operation while
> any Follower SCM fails due to a db exception, what can we do to ensure that
> the states of leader and followers won't diverge, i.e., to preserve the
> replicated state machine across leader and followers?
> We have to ensure the Atomicity of ACID for state updates. If any exception
> occurs, the SCM (no matter leader or follower) should throw the exception
> and keep its state unchanged, so that the leader SCM can safely revert the
> state change for the whole raft group.
> The above analysis also applies to pipeline V2, etc.
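The atomicity requirement described in this issue can be sketched as a revert-on-failure pattern. This is a hedged, minimal stand-in with hypothetical names, not the actual SCM code: a state update either lands in both memory and the db, or in neither, so a failing follower can surface the error and the leader can revert the change for the whole raft group.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Sketch of an atomic state update: revert the in-memory change when the
// durable (db) write fails, then rethrow so the caller can abort the
// replicated transaction. Names are illustrative only.
public class AtomicStateUpdateSketch {

    private final Map<Long, String> memoryState = new HashMap<>();

    // Stand-in for a RocksDB write that may throw.
    interface DbWriter {
        void write(long id, String state) throws IOException;
    }

    void updateContainerState(long id, String newState, DbWriter db)
            throws IOException {
        String previous = memoryState.get(id);
        memoryState.put(id, newState);  // optimistic in-memory update
        try {
            db.write(id, newState);     // durable write second
        } catch (IOException e) {
            // Revert so memory and db never diverge, then rethrow so the
            // raft layer can fail the whole operation.
            if (previous == null) {
                memoryState.remove(id);
            } else {
                memoryState.put(id, previous);
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        AtomicStateUpdateSketch scm = new AtomicStateUpdateSketch();
        try {
            scm.updateContainerState(1L, "OPEN", (id, s) -> {
                throw new IOException("simulated RocksDB failure");
            });
        } catch (IOException expected) {
            // The memory state was reverted: the container is absent again.
            System.out.println(scm.memoryState.containsKey(1L)); // prints false
        }
    }
}
```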
[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #1336: HDDS-4119. Improve performance of the BufferPool management of Ozone client
adoroszlai commented on a change in pull request #1336: URL: https://github.com/apache/hadoop-ozone/pull/1336#discussion_r475439432

## File path: hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/ChunkBuffer.java

## @@ -88,6 +86,12 @@ default ChunkBuffer put(byte[] b) { return put(ByteBuffer.wrap(b)); }

```
+  /** Similar to {@link ByteBuffer#put(byte[])}. */
+  default ChunkBuffer put(byte b) {
+    byte[] buf = new byte[1];
+    buf[0] = (byte) b;
+    return put(buf, 0, 1); }
```

Review comment:
```suggestion
    return put(buf, 0, 1);
  }
```

## File path: hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/BlockOutputStream.java

## @@ -154,6 +159,16 @@ public BlockOutputStream(BlockID blockID,

```
     this.bufferPool = bufferPool;
     this.bytesPerChecksum = bytesPerChecksum;

+    //number of buffers used before doing a flush
+    currentBuffer = bufferPool.getCurrentBuffer();
+    currentBufferRemaining =
+        currentBuffer != null ? currentBuffer.remaining() : 0;
```

Review comment: Can you please extract these 2 lines to a separate method (and replace the other 2 duplicate fragments, too)?

## File path: hadoop-hdds/client/src/test/java/org/apache/hadoop/hdds/scm/storage/TestBlockOutputStreamCorrectness.java

## @@ -0,0 +1,232 @@

```
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.storage;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicInteger;
+
+import org.apache.hadoop.hdds.client.BlockID;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.MockDatanodeDetails;
+import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos.ChecksumType;
+import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos.ContainerCommandRequestProto;
+import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos.ContainerCommandResponseProto;
+import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos.GetCommittedBlockLengthResponseProto;
+import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos.PutBlockResponseProto;
+import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos.Result;
+import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos.Type;
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos.ReplicationFactor;
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos.ReplicationType;
+import org.apache.hadoop.hdds.scm.XceiverClientManager;
+import org.apache.hadoop.hdds.scm.XceiverClientReply;
+import org.apache.hadoop.hdds.scm.XceiverClientSpi;
+import org.apache.hadoop.hdds.scm.pipeline.Pipeline;
+import org.apache.hadoop.hdds.scm.pipeline.Pipeline.Builder;
+import org.apache.hadoop.hdds.scm.pipeline.Pipeline.PipelineState;
+import org.apache.hadoop.hdds.scm.pipeline.PipelineID;
+
+import org.apache.ratis.thirdparty.com.google.protobuf.ByteString;
+import org.jetbrains.annotations.NotNull;
+import org.junit.Assert;
+import org.junit.Test;
+import org.mockito.Mockito;
+
+/**
+ * UNIT test for BlockOutputStream.
+ *
+ * Compares bytes written to the stream and received in the ChunkWriteRequests.
+ */
+public class TestBlockOutputStreamCorrectness {
+
+  private static final long SEED = 18480315L;
```

Review comment:

## File path: hadoop-hdds/client/src/test/java/org/apache/hadoop/hdds/scm/storage/TestBlockOutputStreamCorrectness.java

## @@ -0,0 +1,232 @@

```
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *
```
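The single-byte `put` reviewed in the ChunkBuffer change can be exercised in isolation. A minimal sketch (a simplified stand-in interface, not the actual Ozone `ChunkBuffer`) of the default-method pattern the diff uses: the single-byte variant delegates to the bulk `put`, so concrete implementations only need to provide `put(byte[], int, int)`.

```java
import java.nio.ByteBuffer;

// Simplified stand-in for the ChunkBuffer interface discussed above.
public class ChunkBufferSketch {

    interface MiniChunkBuffer {
        MiniChunkBuffer put(byte[] b, int off, int len);

        /** Similar to {@link ByteBuffer#put(byte[])}. */
        default MiniChunkBuffer put(byte[] b) {
            return put(b, 0, b.length);
        }

        /** Single-byte variant, as a default method delegating to bulk put. */
        default MiniChunkBuffer put(byte b) {
            byte[] buf = new byte[1];
            buf[0] = b;
            return put(buf, 0, 1);
        }
    }

    public static void main(String[] args) {
        ByteBuffer backing = ByteBuffer.allocate(16);
        // One concrete implementation backed by a plain ByteBuffer.
        MiniChunkBuffer buf = new MiniChunkBuffer() {
            @Override
            public MiniChunkBuffer put(byte[] b, int off, int len) {
                backing.put(b, off, len);
                return this;
            }
        };
        // Write one byte, then two bytes; three bytes land in the buffer.
        buf.put((byte) 0x41).put(new byte[] {0x42, 0x43});
        System.out.println(backing.position()); // prints 3
    }
}
```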
[jira] [Updated] (HDDS-4136) Design for Error/Exception handling in state update for container/pipeline V2
[ https://issues.apache.org/jira/browse/HDDS-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Glen Geng updated HDDS-4136:
----------------------------
    Description:
I have a concern about how to handle exceptions occurred in writing RocksDB for container V2, such as allocateContainer, deleteContainer and updateContainerState.

For non-HA case, allocateContainer reverts the memory state changes if meet IOException for db operations. deleteContainer and updateContainerState just throw out the IOException and leave the memory state in an inconsistency state.

After we enable SCM-HA, if Leader SCM succeed the operation, meanwhile any Follower SCM fails due to db exception, what can we do to ensure that states of leader and follower won't diverge, a.k.a., ensure the replicated state machine for leader and folowers.

We have to ensure Atomicity of ACID for state update. If any exception occurred, SCM (no matter leader or follower) should throw exception and keep states unchanged, so that leader SCM can safely revert the state change for the whole raft groups.

Above analysis also applies ot pipeline V2 and etc.

  was:
I have a concern about how to handle exceptions occurred in writing RocksDB for container V2, such as allocateContainer, deleteContainer and updateContainerState. For non-HA case, allocateContainer reverts the memory state change if meet IOException for db operation. deleteContainer and updateContainerState just throw out the IOException and leave the memory state in an inconsistency state. After we enable SCM-HA, if Leader SCM succeed the operation, meanwhile any Follower SCM fails due to db exception, what can we do to ensure that states of leader and follower won't diverge, a.k.a., ensure the replicated state machine for leader and folowers. We have to ensure Atomicity of ACID for state update. If any exception occurred, SCM (no matter leader or follower) should throw exception and keep states unchanged, so that leader SCM can safely revert the state change for the whole raft groups. Above analysis also applies ot pipeline V2 and etc.

> Design for Error/Exception handling in state update for container/pipeline V2
> -----------------------------------------------------------------------------
>
>                 Key: HDDS-4136
>                 URL: https://issues.apache.org/jira/browse/HDDS-4136
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>          Components: SCM
>            Reporter: Glen Geng
>            Assignee: Glen Geng
>            Priority: Major
>
> I have a concern about how to handle exceptions occurred in writing RocksDB
> for container V2, such as allocateContainer, deleteContainer and
> updateContainerState.
> For non-HA case, allocateContainer reverts the memory state changes if meet
> IOException for db operations. deleteContainer and updateContainerState just
> throw out the IOException and leave the memory state in an inconsistency
> state.
> After we enable SCM-HA, if Leader SCM succeed the operation, meanwhile any
> Follower SCM fails due to db exception, what can we do to ensure that states
> of leader and follower won't diverge, a.k.a., ensure the replicated state
> machine for leader and folowers.
> We have to ensure Atomicity of ACID for state update. If any exception
> occurred, SCM (no matter leader or follower) should throw exception and keep
> states unchanged, so that leader SCM can safely revert the state change for
> the whole raft groups.
> Above analysis also applies ot pipeline V2 and etc.
[jira] [Updated] (HDDS-4136) Design for Error/Exception handling in state update for container/pipeline V2
[ https://issues.apache.org/jira/browse/HDDS-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Glen Geng updated HDDS-4136:
----------------------------
    Description:
I have a concern about how to handle exceptions occurred in writing RocksDB for container V2, such as allocateContainer, deleteContainer and updateContainerState.

For non-HA case, allocateContainer reverts the memory state change if meet IOException for db operation. deleteContainer and updateContainerState just throw out the IOException and leave the memory state in an inconsistency state.

After we enable SCM-HA, if Leader SCM succeed the operation, meanwhile any Follower SCM fails due to db exception, what can we do to ensure that states of leader and follower won't diverge, a.k.a., ensure the replicated state machine for leader and folowers.

We have to ensure Atomicity of ACID for state update. If any exception occurred, SCM (no matter leader or follower) should throw exception and keep states unchanged, so that leader SCM can safely revert the state change for the whole raft groups.

Above analysis also applies ot pipeline V2 and etc.

  was:
I have a concern about how to handling exceptions occurred in writing RocksDB for container V2, such as allocateContainer, deleteContainer and updateContainerState. For non-HA case, allocateContainer reverts the memory state change if meet IOException for db operation. deleteContainer and updateContainerState just throw out the IOException and leave the memory state in an inconsistency state. After we enable SCM-HA, if Leader SCM succeed the operation, meanwhile any Follower SCM fails due to db exception, what can we do to ensure that states of leader and follower won't diverge, a.k.a., ensure the replicated state machine for leader and folowers. We have to ensure Atomicity of ACID for state update. If any exception occurred, SCM (no matter leader or follower) should throw exception and keep states unchanged, so that leader SCM can safely revert the state change for the whole raft groups. Above analysis also applies ot pipeline V2 and etc.

> Design for Error/Exception handling in state update for container/pipeline V2
> -----------------------------------------------------------------------------
>
>                 Key: HDDS-4136
>                 URL: https://issues.apache.org/jira/browse/HDDS-4136
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>          Components: SCM
>            Reporter: Glen Geng
>            Assignee: Glen Geng
>            Priority: Major
>
> I have a concern about how to handle exceptions occurred in writing RocksDB
> for container V2, such as allocateContainer, deleteContainer and
> updateContainerState.
> For non-HA case, allocateContainer reverts the memory state change if meet
> IOException for db operation. deleteContainer and updateContainerState just
> throw out the IOException and leave the memory state in an inconsistency
> state.
> After we enable SCM-HA, if Leader SCM succeed the operation, meanwhile any
> Follower SCM fails due to db exception, what can we do to ensure that states
> of leader and follower won't diverge, a.k.a., ensure the replicated state
> machine for leader and folowers.
> We have to ensure Atomicity of ACID for state update. If any exception
> occurred, SCM (no matter leader or follower) should throw exception and keep
> states unchanged, so that leader SCM can safely revert the state change for
> the whole raft groups.
> Above analysis also applies ot pipeline V2 and etc.
[jira] [Updated] (HDDS-4136) Design for Error/Exception handling in state update for container/pipeline V2
[ https://issues.apache.org/jira/browse/HDDS-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Glen Geng updated HDDS-4136:
----------------------------
    Description:   (was: Fix a bug in https://issues.apache.org/jira/browse/HDDS-3895  In ContainerStateManagerV2, both disk state (column families in RocksDB) and memory state (container maps in memory) are protected by raft, and should keep their consistency upon each modification.)

> Design for Error/Exception handling in state update for container/pipeline V2
> -----------------------------------------------------------------------------
>
>                 Key: HDDS-4136
>                 URL: https://issues.apache.org/jira/browse/HDDS-4136
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>          Components: SCM
>            Reporter: Glen Geng
>            Assignee: Glen Geng
>            Priority: Major
>
[jira] [Updated] (HDDS-4136) Design for Error/Exception handling in state update for container/pipeline V2
[ https://issues.apache.org/jira/browse/HDDS-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Glen Geng updated HDDS-4136: Description: I have a concern about how to handle exceptions that occur while writing to RocksDB for container V2, such as in allocateContainer, deleteContainer and updateContainerState. In the non-HA case, allocateContainer reverts the in-memory state change if the DB operation throws an IOException; deleteContainer and updateContainerState just throw the IOException out and leave the memory state inconsistent. Once we enable SCM-HA, if the leader SCM succeeds in an operation while some follower SCM fails due to a DB exception, what can we do to ensure that the states of leader and followers do not diverge, i.e., to preserve the replicated state machine across leader and followers? We have to ensure the atomicity (the A in ACID) of each state update: if any exception occurs, the SCM (leader or follower alike) should throw the exception and keep its state unchanged, so that the leader SCM can safely revert the state change for the whole raft group. The same analysis applies to pipeline V2, etc. > Design for Error/Exception handling in state update for container/pipeline V2 > - > > Key: HDDS-4136 > URL: https://issues.apache.org/jira/browse/HDDS-4136 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Glen Geng >Assignee: Glen Geng >Priority: Major >
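The "keep state unchanged on exception" discipline the description argues for can be sketched as follows. This is a hypothetical illustration, not the Ozone implementation: the class, method names, and the simulated DB failure are invented for the example. The idea is to write to the DB first and touch the in-memory map only after the DB write succeeds, so a failing replica throws and leaves its state exactly as it was, letting the leader abort the raft transaction without replicas diverging.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch (not the actual Ozone code): enforce atomicity of a
 * state update by performing the DB write first and mutating the in-memory
 * map only after the DB write has succeeded.
 */
class AtomicStateUpdateSketch {
  private final Map<Long, String> containerStates = new HashMap<>();
  private final boolean dbFails; // simulates a RocksDB write failure on this replica

  AtomicStateUpdateSketch(boolean dbFails) {
    this.dbFails = dbFails;
  }

  // Stand-in for a put into the container column family of RocksDB.
  private void writeToDb(long containerId, String state) throws IOException {
    if (dbFails) {
      throw new IOException("simulated RocksDB write failure");
    }
  }

  /** DB first, memory second: if the DB write throws, nothing has changed. */
  void updateContainerState(long containerId, String newState) throws IOException {
    writeToDb(containerId, newState);           // may throw; memory is still untouched
    containerStates.put(containerId, newState); // only reached after DB success
  }

  String getState(long containerId) {
    return containerStates.get(containerId);
  }
}
```

Under this ordering a healthy replica applies the update, while a replica whose DB write throws keeps both its disk and memory state unchanged, which is exactly the precondition for the leader to revert the transaction safely.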
[jira] [Updated] (HDDS-4136) Design for Error/Exception handling in state update for container/pipeline V2
[ https://issues.apache.org/jira/browse/HDDS-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Glen Geng updated HDDS-4136: Summary: Design for Error/Exception handling in state update for container/pipeline V2 (was: Design for Error/Exception handling in state updates for container/pipeline V2) > Design for Error/Exception handling in state update for container/pipeline V2 > - > > Key: HDDS-4136 > URL: https://issues.apache.org/jira/browse/HDDS-4136 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Glen Geng >Assignee: Glen Geng >Priority: Major > > > Fix a bug in https://issues.apache.org/jira/browse/HDDS-3895 > In ContainerStateManagerV2, both disk state (column families in RocksDB) and > memory state (container maps in memory) are protected by raft, and should > keep their consistency upon each modification.
[jira] [Updated] (HDDS-4136) Design for Error/Exception handling in state updates for container/pipeline V2
[ https://issues.apache.org/jira/browse/HDDS-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Glen Geng updated HDDS-4136: Summary: Design for Error/Exception handling in state updates for container/pipeline V2 (was: CLONE - In ContainerStateManagerV2, modification of RocksDB should be consistent with that of memory state.) > Design for Error/Exception handling in state updates for container/pipeline V2 > -- > > Key: HDDS-4136 > URL: https://issues.apache.org/jira/browse/HDDS-4136 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Glen Geng >Assignee: Glen Geng >Priority: Major > > > Fix a bug in https://issues.apache.org/jira/browse/HDDS-3895 > In ContainerStateManagerV2, both disk state (column families in RocksDB) and > memory state (container maps in memory) are protected by raft, and should > keep their consistency upon each modification.
[jira] [Created] (HDDS-4136) CLONE - In ContainerStateManagerV2, modification of RocksDB should be consistent with that of memory state.
Glen Geng created HDDS-4136: --- Summary: CLONE - In ContainerStateManagerV2, modification of RocksDB should be consistent with that of memory state. Key: HDDS-4136 URL: https://issues.apache.org/jira/browse/HDDS-4136 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: SCM Reporter: Glen Geng Assignee: Glen Geng Fix a bug in https://issues.apache.org/jira/browse/HDDS-3895 In ContainerStateManagerV2, both disk state (column families in RocksDB) and memory state (container maps in memory) are protected by raft, and should keep their consistency upon each modification.
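The consistency requirement restated above can also be met with the "revert" pattern that the HDDS-4136 description attributes to the non-HA allocateContainer path: apply the in-memory change optimistically, then roll it back if the RocksDB write throws. The sketch below is hypothetical (names and the simulated failure are invented for illustration), not the Ozone implementation:

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: optimistic in-memory insert with a rollback on DB
 * failure, so memory and disk never disagree after the call returns.
 */
class RevertOnFailureSketch {
  private final Map<Long, String> containerMap = new HashMap<>();
  private final boolean dbFails; // simulates the RocksDB write failing

  RevertOnFailureSketch(boolean dbFails) {
    this.dbFails = dbFails;
  }

  // Stand-in for writing the new container into a RocksDB column family.
  private void writeToDb(long containerId, String state) throws IOException {
    if (dbFails) {
      throw new IOException("simulated RocksDB write failure");
    }
  }

  void allocateContainer(long containerId) throws IOException {
    containerMap.put(containerId, "OPEN"); // optimistic in-memory insert
    try {
      writeToDb(containerId, "OPEN");
    } catch (IOException e) {
      containerMap.remove(containerId);    // revert so memory matches disk
      throw e;                             // caller sees an unchanged state
    }
  }

  boolean contains(long containerId) {
    return containerMap.containsKey(containerId);
  }
}
```

Either ordering (revert-on-failure here, or DB-before-memory) satisfies the invariant; what matters for SCM-HA is that an exception leaves the replica's observable state untouched.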
[GitHub] [hadoop-ozone] maobaolong commented on pull request #1320: HDDS-4111. Keep the CSI.zh.md consistent with CSI.md
maobaolong commented on pull request #1320: URL: https://github.com/apache/hadoop-ozone/pull/1320#issuecomment-678990048 @runitao Thanks for the +1. @cxorm Please take a look at this CSI.zh.md update. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] maobaolong commented on pull request #984: HDDS-3654. Let backgroundCreator create pipeline for the support replication factors alternately
maobaolong commented on pull request #984: URL: https://github.com/apache/hadoop-ozone/pull/984#issuecomment-678989162 @elek Thank you for approving this PR; it has now passed all CI checks.
[GitHub] [hadoop-ozone] captainzmc closed pull request #1296: HDDS-4053. Volume space: add quotaUsageInBytes and update it when write and delete key.
captainzmc closed pull request #1296: URL: https://github.com/apache/hadoop-ozone/pull/1296
[jira] [Updated] (HDDS-4077) Incomplete OzoneFileSystem statistics
[ https://issues.apache.org/jira/browse/HDDS-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-4077: --- Status: Patch Available (was: In Progress) > Incomplete OzoneFileSystem statistics > - > > Key: HDDS-4077 > URL: https://issues.apache.org/jira/browse/HDDS-4077 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Filesystem >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > > OzoneFileSystem does not record some of the operations that are defined in > [Statistic|https://github.com/apache/hadoop-ozone/blob/d7ea4966656cfdb0b53a368eac52d71adb717104/hadoop-ozone/ozonefs-common/src/main/java/org/apache/hadoop/fs/ozone/Statistic.java#L44-L75].
[GitHub] [hadoop-ozone] maobaolong closed pull request #984: HDDS-3654. Let backgroundCreator create pipeline for the support replication factors alternately
maobaolong closed pull request #984: URL: https://github.com/apache/hadoop-ozone/pull/984
[jira] [Updated] (HDDS-4135) In ContainerStateManagerV2, modification of RocksDB should be consistent with that of memory state.
[ https://issues.apache.org/jira/browse/HDDS-4135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Glen Geng updated HDDS-4135: Summary: In ContainerStateManagerV2, modification of RocksDB should be consistent with that of memory state. (was: In ContainerStateManagerV2, modification of RocksDB should be in consistency with that of memory state.) > In ContainerStateManagerV2, modification of RocksDB should be consistent with > that of memory state. > --- > > Key: HDDS-4135 > URL: https://issues.apache.org/jira/browse/HDDS-4135 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Glen Geng >Assignee: Glen Geng >Priority: Major > Labels: pull-request-available > > > Fix a bug in https://issues.apache.org/jira/browse/HDDS-3895 > > In ContainerStateManagerV2, both disk state (column families in RocksDB) and > memory state (container maps in memory) are protected by raft, and should > keep their consistency upon each modification.
[jira] [Updated] (HDDS-4135) In ContainerStateManagerV2, modification of RocksDB should be in consistency with that of memory state.
[ https://issues.apache.org/jira/browse/HDDS-4135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Glen Geng updated HDDS-4135: Description: Fix a bug in https://issues.apache.org/jira/browse/HDDS-3895 In ContainerStateManagerV2, both disk state (column families in RocksDB) and memory state (container maps in memory) are protected by raft, and should keep their consistency upon each modification. was: Fix a bug in https://issues.apache.org/jira/browse/HDDS-3895 > In ContainerStateManagerV2, modification of RocksDB should be in consistency > with that of memory state. > --- > > Key: HDDS-4135 > URL: https://issues.apache.org/jira/browse/HDDS-4135 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Glen Geng >Assignee: Glen Geng >Priority: Major > Labels: pull-request-available > > > Fix a bug in https://issues.apache.org/jira/browse/HDDS-3895 > > In ContainerStateManagerV2, both disk state (column families in RocksDB) and > memory state (container maps in memory) are protected by raft, and should > keep their consistency upon each modification.
[jira] [Updated] (HDDS-4135) In ContainerStateManagerV2, modification of RocksDB should be consistent with that of memory state.
[ https://issues.apache.org/jira/browse/HDDS-4135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Glen Geng updated HDDS-4135: Labels: (was: pull-request-available) > In ContainerStateManagerV2, modification of RocksDB should be consistent with > that of memory state. > --- > > Key: HDDS-4135 > URL: https://issues.apache.org/jira/browse/HDDS-4135 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Glen Geng >Assignee: Glen Geng >Priority: Major > > > Fix a bug in https://issues.apache.org/jira/browse/HDDS-3895 > > In ContainerStateManagerV2, both disk state (column families in RocksDB) and > memory state (container maps in memory) are protected by raft, and should > keep their consistency upon each modification.
[jira] [Updated] (HDDS-4135) In ContainerStateManagerV2, modification of RocksDB should be consistent with that of memory state.
[ https://issues.apache.org/jira/browse/HDDS-4135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Glen Geng updated HDDS-4135: Description: Fix a bug in https://issues.apache.org/jira/browse/HDDS-3895 In ContainerStateManagerV2, both disk state (column families in RocksDB) and memory state (container maps in memory) are protected by raft, and should keep their consistency upon each modification. was: Fix a bug in https://issues.apache.org/jira/browse/HDDS-3895 In ContainerStateManagerV2, both disk state (column families in RocksDB) and memory state (container maps in memory) are protected by raft, and should keep their consistency upon each modification. > In ContainerStateManagerV2, modification of RocksDB should be consistent with > that of memory state. > --- > > Key: HDDS-4135 > URL: https://issues.apache.org/jira/browse/HDDS-4135 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Glen Geng >Assignee: Glen Geng >Priority: Major > > > Fix a bug in https://issues.apache.org/jira/browse/HDDS-3895 > In ContainerStateManagerV2, both disk state (column families in RocksDB) and > memory state (container maps in memory) are protected by raft, and should > keep their consistency upon each modification.