[jira] [Resolved] (HDDS-3074) Make the configuration of container scrub consistent
[ https://issues.apache.org/jira/browse/HDDS-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham resolved HDDS-3074. -- Fix Version/s: 0.6.0 Resolution: Fixed > Make the configuration of container scrub consistent > > > Key: HDDS-3074 > URL: https://issues.apache.org/jira/browse/HDDS-3074 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Reporter: YiSheng Lien > Assignee: Neo Yang > Priority: Minor > Labels: pull-request-available > Fix For: 0.6.0 > > Time Spent: 20m > Remaining Estimate: 0h > > The prefix of the container scrub configuration in > [ozone-site.xml|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/common/src/main/resources/ozone-default.xml] > is *hdds.container.scrub*, but in > [ContainerScrubberConfiguration|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/ContainerScrubberConfiguration.java] > it is *hdds.containerscrub*. > This mismatch means the documented configuration keys have no effect. > For example, when we set *hdds.container.scrub.enabled* to true, the cluster > did not run the container scrub, but when we set *hdds.containerscrub.enable* > to true, it did. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
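The mismatch described in the ticket can be illustrated with a minimal stand-in for the configuration lookup; this sketch uses plain `java.util.Properties` rather than Ozone's real `Configuration` class, and the key names are taken verbatim from the report:

```java
import java.util.Properties;

// Minimal sketch of the HDDS-3074 mismatch: users set the documented key,
// but the code reads a key under a different prefix, so the documented
// setting is silently ignored.
public class ScrubConfigDemo {
    // Key documented in ozone-default.xml (what users set).
    static final String DOCUMENTED_KEY = "hdds.container.scrub.enabled";
    // Key the code actually reads, per the report.
    static final String ACTUAL_KEY = "hdds.containerscrub.enable";

    static boolean scrubEnabled(Properties conf) {
        // Only the key the code asks for is consulted.
        return Boolean.parseBoolean(conf.getProperty(ACTUAL_KEY, "false"));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(DOCUMENTED_KEY, "true"); // following the docs...
        System.out.println(scrubEnabled(conf));   // false: setting ignored

        conf.setProperty(ACTUAL_KEY, "true");     // undocumented prefix works
        System.out.println(scrubEnabled(conf));   // true
    }
}
```

This is why the fix in the PR is to make the prefix in the code match the one in ozone-default.xml.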
[GitHub] [hadoop-ozone] bharatviswa504 commented on issue #722: HDDS-3074. Make the configuration of container scrub consistent.
bharatviswa504 commented on issue #722: HDDS-3074. Make the configuration of container scrub consistent. URL: https://github.com/apache/hadoop-ozone/pull/722#issuecomment-604820134 Thank You @cku328 for the contribution. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] bharatviswa504 commented on issue #722: HDDS-3074. Make the configuration of container scrub consistent.
bharatviswa504 commented on issue #722: HDDS-3074. Make the configuration of container scrub consistent. URL: https://github.com/apache/hadoop-ozone/pull/722#issuecomment-604819927 Test failures are unrelated.
[GitHub] [hadoop-ozone] bharatviswa504 merged pull request #722: HDDS-3074. Make the configuration of container scrub consistent.
bharatviswa504 merged pull request #722: HDDS-3074. Make the configuration of container scrub consistent. URL: https://github.com/apache/hadoop-ozone/pull/722
[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #727: HDDS-3273. getConf does not return all OM addresses.
bharatviswa504 commented on a change in pull request #727: HDDS-3273. getConf does not return all OM addresses. URL: https://github.com/apache/hadoop-ozone/pull/727#discussion_r399039110

## File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java

@@ -89,6 +93,31 @@ public static InetSocketAddress getOmAddress(Configuration conf) {
     return NetUtils.createSocketAddr(getOmRpcAddress(conf));
   }

+  /**
+   * Return list of OM addresses by service ids - when HA is enabled.
+   *
+   * @param conf {@link Configuration}
+   * @return {service.id -> [{@link InetSocketAddress}]}
+   */
+  public static Map<String, List<InetSocketAddress>> getOmHAAddressesById(
+      Configuration conf) {
+    Map<String, List<InetSocketAddress>> result = new HashMap<>();
+    for (String serviceId : conf.getTrimmedStringCollection(
+        OZONE_OM_SERVICE_IDS_KEY)) {
+      if (!result.containsKey(serviceId)) {
+        result.put(serviceId, new ArrayList<>());
+      }
+      for (String nodeId : getOMNodeIds(conf, serviceId)) {
+        String rpcAddr = getOmRpcAddress(conf,
+            addKeySuffixes(OZONE_OM_ADDRESS_KEY, serviceId, nodeId));
+        if (rpcAddr != null) {

Review comment: One minor comment: when the address for one of the nodeIds is undefined, do we want to print "unknown address" instead of silently ignoring it?
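The grouping logic in the diff above can be sketched in isolation. This is a simplified, hypothetical model (plain maps stand in for `Configuration` lookups such as `getOmRpcAddress` and `getOMNodeIds`); nodes whose address resolves to null are silently skipped, which is exactly the behavior the review comment questions:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model of getOmHAAddressesById: for each OM service id, collect
// the RPC addresses of its nodes; nodes with no configured address are
// silently skipped (the point raised in the review comment).
public class OmHaAddressDemo {
    static Map<String, List<String>> groupAddresses(
            Collection<String> serviceIds,
            Map<String, List<String>> nodesByService,
            Map<String, String> addressByNode) {
        Map<String, List<String>> result = new HashMap<>();
        for (String serviceId : serviceIds) {
            List<String> addrs =
                result.computeIfAbsent(serviceId, k -> new ArrayList<>());
            for (String nodeId : nodesByService.getOrDefault(serviceId, List.of())) {
                // Stand-in for getOmRpcAddress(conf, addKeySuffixes(...)):
                // null means "address not configured for this node".
                String rpcAddr = addressByNode.get(serviceId + "." + nodeId);
                if (rpcAddr != null) {
                    addrs.add(rpcAddr);
                }
            }
        }
        return result;
    }
}
```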
[jira] [Updated] (HDDS-3286) Support batchDelete when deleting path.
[ https://issues.apache.org/jira/browse/HDDS-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingchao zhao updated HDDS-3286: Description: Currently, deleting a path fetches all the keys in the directory and then deletes them one by one, which performs poorly. Testing deletion of a path with 100,000 files took 7320.964 sec. We plan to change this part to a batch operation to improve performance. was: Currently, deleting a path fetches all the keys in the directory and then deletes them one by one, which performs poorly. Testing deletion of a path with 100,000 keys took 7320.964 sec. We plan to change this part to a batch operation to improve performance. > Support batchDelete when deleting path. > --- > > Key: HDDS-3286 > URL: https://issues.apache.org/jira/browse/HDDS-3286 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Filesystem > Reporter: mingchao zhao > Priority: Major > > Currently, deleting a path fetches all the keys in the directory and then > deletes them one by one, which performs poorly. Testing deletion of a path > with 100,000 files took 7320.964 sec. > We plan to change this part to a batch operation to improve performance.
[jira] [Created] (HDDS-3286) Support batchDelete when deleting path.
mingchao zhao created HDDS-3286: --- Summary: Support batchDelete when deleting path. Key: HDDS-3286 URL: https://issues.apache.org/jira/browse/HDDS-3286 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: Ozone Filesystem Reporter: mingchao zhao Currently, deleting a path fetches all the keys in the directory and then deletes them one by one, which performs poorly. Testing deletion of a path with 100,000 keys took 7320.964 sec. We plan to change this part to a batch operation to improve performance.
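The proposed change can be sketched as follows. This is a hypothetical illustration (the `KeyStore` interface and method names are stand-ins, not Ozone APIs): instead of issuing one delete call per key under a path, the keys are collected and deleted in batches, so the number of round trips drops from one per key to one per batch.

```java
import java.util.List;

// Hypothetical sketch of batching deletes: each deleteKeys() call models one
// round trip to the server, so batching turns N round trips into N/batchSize.
public class BatchDeleteDemo {
    interface KeyStore {
        void deleteKeys(List<String> keys); // one round trip per batch
    }

    static int deleteInBatches(List<String> keys, int batchSize, KeyStore store) {
        int roundTrips = 0;
        for (int i = 0; i < keys.size(); i += batchSize) {
            // Delete one contiguous slice of keys per round trip.
            store.deleteKeys(keys.subList(i, Math.min(i + batchSize, keys.size())));
            roundTrips++;
        }
        return roundTrips;
    }
}
```

With 100,000 keys and a batch size of 1,000, this makes 100 delete calls instead of 100,000.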
[jira] [Resolved] (HDDS-3271) The block file is not deleted after the key is deleted
[ https://issues.apache.org/jira/browse/HDDS-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingchao zhao resolved HDDS-3271. - Resolution: Not A Problem > The block file is not deleted after the key is deleted > -- > > Key: HDDS-3271 > URL: https://issues.apache.org/jira/browse/HDDS-3271 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Reporter: mingchao zhao > Priority: Major > Attachments: image-2020-03-25-11-41-26-972.png > > > After I successfully deleted the key, I was still able to see the block file > in the chunk directory. The block files were not deleted at all. > !image-2020-03-25-11-41-26-972.png|width=1169,height=143! > This may be an existing bug, and I will confirm the reason.
[GitHub] [hadoop-ozone] swagle commented on issue #718: HDDS-3224. Enforce volume and bucket name rule at create time.
swagle commented on issue #718: HDDS-3224. Enforce volume and bucket name rule at create time. URL: https://github.com/apache/hadoop-ozone/pull/718#issuecomment-604776938 S3BucketCreateRequest checked only for length, but verifyResourceName is more comprehensive. We are still doing the length check in S3BucketDeleteRequest; technically we should not need validation in the delete path, but kept those changes as-is.
[jira] [Updated] (HDDS-3273) OM HA: getconf must return all OMs
[ https://issues.apache.org/jira/browse/HDDS-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-3273: - Labels: pull-request-available (was: ) > OM HA: getconf must return all OMs > -- > > Key: HDDS-3273 > URL: https://issues.apache.org/jira/browse/HDDS-3273 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Tools > Affects Versions: 0.5.0 > Reporter: Dinesh Chitlangia > Assignee: Siddharth Wagle > Priority: Major > Labels: pull-request-available > > Discovered by [~xyao] when testing 0.5.0-beta rc2: > > ozone getconf -ozonemanagers does not return all the om instances > bash-4.2$ ozone getconf -ozonemanagers > 0.0.0.0
[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #625: HDDS-2980. Delete replayed entry from OpenKeyTable during commit
bharatviswa504 commented on a change in pull request #625: HDDS-2980. Delete replayed entry from OpenKeyTable during commit URL: https://github.com/apache/hadoop-ozone/pull/625#discussion_r398972414

## File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/s3/multipart/S3MultipartUploadCommitPartRequest.java

@@ -147,11 +146,6 @@ public OMClientResponse validateAndUpdateCache(OzoneManager ozoneManager,
       throw new OMException("Failed to commit Multipart Upload key, as " +
           openKey + "entry is not found in the openKey table", KEY_NOT_FOUND);
-    } else {
-      // Check the OpenKeyTable if this transaction is a replay of ratis logs.

Review comment: I don't understand why this check was removed.
[GitHub] [hadoop-ozone] adoroszlai commented on issue #725: HDDS-3284. ozonesecure-mr test fails due to lack of disk space
adoroszlai commented on issue #725: HDDS-3284. ozonesecure-mr test fails due to lack of disk space URL: https://github.com/apache/hadoop-ozone/pull/725#issuecomment-604678295 Thanks @sodonnel and @vivekratnavel for the review, and @hanishakoneru for merging it.
[GitHub] [hadoop-ozone] hanishakoneru commented on issue #723: HDDS-3281. Add timeouts to all robot tests
hanishakoneru commented on issue #723: HDDS-3281. Add timeouts to all robot tests URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-604676324 @adoroszlai, I think even with that limitation the timeout will help us isolate the problem. Even if the acceptance suite is cancelled, we could still tell which test contributed to the timeout.
[GitHub] [hadoop-ozone] hanishakoneru merged pull request #725: HDDS-3284. ozonesecure-mr test fails due to lack of disk space
hanishakoneru merged pull request #725: HDDS-3284. ozonesecure-mr test fails due to lack of disk space URL: https://github.com/apache/hadoop-ozone/pull/725
[GitHub] [hadoop-ozone] hanishakoneru commented on issue #725: HDDS-3284. ozonesecure-mr test fails due to lack of disk space
hanishakoneru commented on issue #725: HDDS-3284. ozonesecure-mr test fails due to lack of disk space URL: https://github.com/apache/hadoop-ozone/pull/725#issuecomment-604623673 Thanks @adoroszlai for working on this. I will go ahead and merge this as the change is only in acceptance tests. Thanks @sodonnel and @vivekratnavel for the reviews.
[GitHub] [hadoop-ozone] esahekmat opened a new pull request #726: HDDS-3267. replace containerCache in blockUtils by LoadingCache
esahekmat opened a new pull request #726: HDDS-3267. replace containerCache in blockUtils by LoadingCache URL: https://github.com/apache/hadoop-ozone/pull/726 ## What changes were proposed in this pull request? ContainerCache is removed and replaced by a LoadingCache in BlockUtils. The count in ReferenceCountedDB is removed and the class is renamed to ReferenceDB. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-3267 ## How was this patch tested? mvn clean package acceptance.sh checkstyle.sh
[jira] [Updated] (HDDS-3267) Replace ContainerCache in BlockUtils by LoadingCache
[ https://issues.apache.org/jira/browse/HDDS-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-3267: - Labels: pull-request-available (was: ) > Replace ContainerCache in BlockUtils by LoadingCache > > > Key: HDDS-3267 > URL: https://issues.apache.org/jira/browse/HDDS-3267 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Reporter: Isa Hekmatizadeh > Assignee: Isa Hekmatizadeh > Priority: Minor > Labels: pull-request-available > > As discussed [here|https://github.com/apache/hadoop-ozone/pull/705], the > current version of ContainerCache is used only by BlockUtils and has several > architectural issues. For example: > * It uses a ReentrantLock, which could be replaced by synchronized methods > * It has to maintain a referenceCount for each DBHandler > * It extends LRUMap, while it would be better to hide it by composition > and not expose LRUMap-related methods. > As [~pifta] suggests, we could replace all ContainerCache functionality by > using Guava's LoadingCache. > This new LoadingCache could be configured to evict by size; with that > configuration the behavior would differ slightly, as it may evict DBHandlers > while they are in use (referenceCount>0), but we can configure it to use > reference-based eviction via CacheBuilder.weakValues(). > I want to open this discussion here instead of GitHub, so I created this > ticket.
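The "loading cache" idea behind the proposal can be sketched without Guava. This is not the Guava `LoadingCache` API itself but a minimal stand-in built on `ConcurrentHashMap.computeIfAbsent` (class and value names are hypothetical): the cache builds the value on first access, so callers no longer manage puts and a `ReentrantLock` by hand.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal stand-in for a loading cache: the loader runs only on a cache
// miss, so repeated lookups of the same container DB reuse one handle.
public class DbHandleCacheDemo {
    static final AtomicInteger OPENS = new AtomicInteger();
    static final Map<String, String> CACHE = new ConcurrentHashMap<>();

    // Stand-in loader for "open the container DB at this path".
    static String getDb(String containerPath) {
        return CACHE.computeIfAbsent(containerPath, p -> {
            OPENS.incrementAndGet(); // loader invoked only on a miss
            return "db-handle:" + p;
        });
    }
}
```

Guava's `CacheBuilder` adds what this sketch lacks: size-based or reference-based eviction (e.g. `weakValues()`), which is the trade-off the ticket discusses for handles that are still in use.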
[jira] [Commented] (HDDS-3088) maxRetries value is too large while trying to reconnect to SCM server
[ https://issues.apache.org/jira/browse/HDDS-3088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067895#comment-17067895 ] Arpit Agarwal commented on HDDS-3088: - cc [~shashikant], does this tie in with the retry settings you are looking at? > maxRetries value is too large while trying to reconnect to SCM server > - > > Key: HDDS-3088 > URL: https://issues.apache.org/jira/browse/HDDS-3088 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM > Reporter: Nilotpal Nandi > Assignee: Nanda kumar > Priority: Major > > The maxRetries value is 2147483647, which is too high; the client keeps > retrying the connection to the SCM server indefinitely. > > {noformat} > 2020-02-27 05:54:43,430 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: quasar-hqknwz-8.quasar-hqknwz.root.hwx.site/172.27.14.1:9861. > Already tried 10535 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=2147483647, sleepTime=1000 > MILLISECONDS) > 2020-02-27 05:54:44,431 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: quasar-hqknwz-8.quasar-hqknwz.root.hwx.site/172.27.14.1:9861. > Already tried 10536 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=2147483647, sleepTime=1000 > MILLISECONDS) > 2020-02-27 05:54:45,432 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: quasar-hqknwz-8.quasar-hqknwz.root.hwx.site/172.27.14.1:9861. > Already tried 10537 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=2147483647, sleepTime=1000 > MILLISECONDS) > 2020-02-27 05:54:46,433 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: quasar-hqknwz-8.quasar-hqknwz.root.hwx.site/172.27.14.1:9861. > Already tried 10538 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=2147483647, sleepTime=1000 > MILLISECONDS){noformat}
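The semantics of the policy in the log can be sketched as a simple retry check (a hypothetical stand-in for `RetryUpToMaximumCountWithFixedSleep`, not Hadoop's actual class): with `maxRetries = 2147483647` (`Integer.MAX_VALUE`), the client effectively never gives up, which is the problem reported above.

```java
// Sketch of "retry up to maxRetries with a fixed sleep": the only stopping
// condition is the attempt count, so an effectively-infinite maxRetries
// means the client retries forever.
public class RetryPolicyDemo {
    static boolean shouldRetry(int attemptsSoFar, int maxRetries) {
        return attemptsSoFar < maxRetries;
    }

    public static void main(String[] args) {
        // The logged policy: maxRetries = 2147483647 (Integer.MAX_VALUE).
        System.out.println(shouldRetry(10_538, Integer.MAX_VALUE)); // true: keeps retrying
        // A bounded policy would have given up long before attempt 10,538.
        System.out.println(shouldRetry(10_538, 100));               // false
    }
}
```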
[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #399: HDDS-2424. Add the recover-trash command server side handling.
bharatviswa504 commented on a change in pull request #399: HDDS-2424. Add the recover-trash command server side handling. URL: https://github.com/apache/hadoop-ozone/pull/399#discussion_r398700964

## File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/response/key/OMTrashRecoverResponse.java

@@ -0,0 +1,64 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.om.response.key;
+
+import org.apache.hadoop.ozone.OmUtils;
+import org.apache.hadoop.ozone.om.OMMetadataManager;
+import org.apache.hadoop.ozone.om.helpers.OmKeyInfo;
+import org.apache.hadoop.ozone.om.helpers.RepeatedOmKeyInfo;
+import org.apache.hadoop.ozone.om.response.OMClientResponse;
+import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos
+    .OMResponse;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+
+import java.io.IOException;
+import javax.annotation.Nullable;
+import javax.annotation.Nonnull;
+
+/**
+ * Response for RecoverTrash request.
+ */
+public class OMTrashRecoverResponse extends OMClientResponse {
+  private OmKeyInfo omKeyInfo;
+
+  public OMTrashRecoverResponse(@Nullable OmKeyInfo omKeyInfo,
+      @Nonnull OMResponse omResponse) {
+    super(omResponse);
+    this.omKeyInfo = omKeyInfo;
+  }
+
+  @Override
+  public void addToDBBatch(OMMetadataManager omMetadataManager,
+      BatchOperation batchOperation) throws IOException {
+
+    /* TODO: HDDS-2425. HDDS-2426. */
+    String trashKey = omMetadataManager
+        .getOzoneKey(omKeyInfo.getVolumeName(),
+            omKeyInfo.getBucketName(), omKeyInfo.getKeyName());
+    RepeatedOmKeyInfo repeatedOmKeyInfo = omMetadataManager
+        .getDeletedTable().get(trashKey);
+    omKeyInfo = OmUtils.prepareKeyForRecover(omKeyInfo, repeatedOmKeyInfo);
+    omMetadataManager.getDeletedTable()
+        .deleteWithBatch(batchOperation, omKeyInfo.getKeyName());
+    /* TODO: trashKey should be updated to destinationBucket. */

Review comment: I am fine with recovering the last deleted key if that is the expected behavior.

> (And when recovering the latest key, I think we should clear the old deleted key.)

We should not delete the other keys, as those keys will be picked up by the background trash service and their data needs to be deleted. Doing it that way is also not correct, from my understanding. Let us say we put those keys in the delete table and the background delete key service picks them up and sends them to SCM for deletion; at that point we receive a recover-trash command, so there is a chance that we recover a key which has no data, since we already submitted the deletion request to SCM, and SCM, in turn, sends it to the DN. How shall we handle this kind of scenario? Deletion from the delete table happens when a key purge request happens. Code snippet link [#link](https://github.com/apache/hadoop-ozone/blob/master/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyDeletingService.java#L167)
[jira] [Updated] (HDDS-3285) MiniOzoneChaosCluster exits because of deadline exceeding
[ https://issues.apache.org/jira/browse/HDDS-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-3285: Description:

2020-03-26 21:26:48,869 [pool-326-thread-2] INFO util.ExitUtil (ExitUtil.java:terminate(210)) - Exiting with status 1: java.io.IOException: java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: ClientCall started after deadline exceeded: -4.330590725s from now
{code}
2020-03-26 21:26:48,866 [pool-326-thread-2] ERROR loadgenerators.LoadExecutors (LoadExecutors.java:load(64)) - FileSystem LOADGEN: null Exiting due to exception
java.io.IOException: java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: ClientCall started after deadline exceeded: -4.330590725s from now
    at org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithRetry(XceiverClientGrpc.java:359)
    at org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithTraceIDAndRetry(XceiverClientGrpc.java:281)
    at org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommand(XceiverClientGrpc.java:259)
    at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.getBlock(ContainerProtocolCalls.java:119)
    at org.apache.hadoop.hdds.scm.storage.BlockInputStream.getChunkInfos(BlockInputStream.java:199)
    at org.apache.hadoop.hdds.scm.storage.BlockInputStream.initialize(BlockInputStream.java:133)
    at org.apache.hadoop.hdds.scm.storage.BlockInputStream.read(BlockInputStream.java:254)
    at org.apache.hadoop.ozone.client.io.KeyInputStream.read(KeyInputStream.java:197)
    at org.apache.hadoop.fs.ozone.OzoneFSInputStream.read(OzoneFSInputStream.java:63)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.ozone.utils.LoadBucket$ReadOp.doPostOp(LoadBucket.java:205)
    at org.apache.hadoop.ozone.utils.LoadBucket$Op.execute(LoadBucket.java:121)
    at org.apache.hadoop.ozone.utils.LoadBucket$ReadOp.execute(LoadBucket.java:180)
    at org.apache.hadoop.ozone.utils.LoadBucket.readKey(LoadBucket.java:82)
    at org.apache.hadoop.ozone.loadgenerators.FilesystemLoadGenerator.generateLoad(FilesystemLoadGenerator.java:54)
    at org.apache.hadoop.ozone.loadgenerators.LoadExecutors.load(LoadExecutors.java:62)
    at org.apache.hadoop.ozone.loadgenerators.LoadExecutors.lambda$startLoad$0(LoadExecutors.java:78)
    at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: ClientCall started after deadline exceeded: -4.330590725s from now
    at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
    at org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithRetry(XceiverClientGrpc.java:336)
    ... 20 more
Caused by: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: ClientCall started after deadline exceeded: -4.330590725s from now
    at org.apache.ratis.thirdparty.io.grpc.Status.asRuntimeException(Status.java:533)
    at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:442)
    at org.apache.ratis.thirdparty.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at org.apache.ratis.thirdparty.io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:700)
    at org.apache.ratis.thirdparty.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at org.apache.ratis.thirdparty.io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:399)
    at
[jira] [Updated] (HDDS-3285) MiniOzoneChaosCluster exits because of deadline exceeding
[ https://issues.apache.org/jira/browse/HDDS-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-3285: Description: 2020-03-26 21:26:48,869 [pool-326-thread-2] INFO util.ExitUtil (ExitUtil.java:terminate(210)) - Exiting with status 1: java.io.IOException: java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: ClientCall started after deadline exceeded: -4.330590725s from now > MiniOzoneChaosCluster exits because of deadline exceeding > - > > Key: HDDS-3285 > URL: https://issues.apache.org/jira/browse/HDDS-3285 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode > Reporter: Mukul Kumar Singh > Priority: Major > Labels: MiniOzoneChaosCluster > Attachments: complete.log.gz > > > 2020-03-26 21:26:48,869 [pool-326-thread-2] INFO util.ExitUtil > (ExitUtil.java:terminate(210)) - Exiting with status 1: java.io.IOException: > java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: > DEADLINE_EXCEEDED: ClientCall started after > deadline exceeded: -4.330590725s from now
[jira] [Created] (HDDS-3285) MiniOzoneChaosCluster exits because of deadline exceeding
Mukul Kumar Singh created HDDS-3285: --- Summary: MiniOzoneChaosCluster exits because of deadline exceeding Key: HDDS-3285 URL: https://issues.apache.org/jira/browse/HDDS-3285 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Reporter: Mukul Kumar Singh Attachments: complete.log.gz
[jira] [Commented] (HDDS-3267) Replace ContainerCache in BlockUtils by LoadingCache
[ https://issues.apache.org/jira/browse/HDDS-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067774#comment-17067774 ] Mukul Kumar Singh commented on HDDS-3267: - [~esa.hekmat], I have added you as a contributor to the Ozone project and also assigned the jira as well. > Replace ContainerCache in BlockUtils by LoadingCache > > > Key: HDDS-3267 > URL: https://issues.apache.org/jira/browse/HDDS-3267 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Isa Hekmatizadeh >Assignee: Isa Hekmatizadeh >Priority: Minor > > As discussed in [here|https://github.com/apache/hadoop-ozone/pull/705] > current version of ContainerCache is just used by BlockUtils and has several > architectural issues. for example: > * It uses a ReentrantLock which could be replaced by synchronized methods > * It should maintain a referenceCount for each DBHandler > * It extends LRUMap while it would be better to hide it by the composition > and not expose LRUMap related methods. > As [~pifta] suggests, we could replace all ContainerCache functionality by > using Guava LoadingCache. > This new LoadingCache could be configured to evict by size, by this > configuration the functionality would be slightly different as it may evict > DBHandlers while they are in use (referenceCount>0) but we can configure it > to use reference base eviction based on CacheBuilder.weakValues() > I want to open this discussion here instead of Github so I created this > ticket.
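The reference-counting concern raised in the ticket can be illustrated with a small JDK-only sketch (Guava is deliberately not used here; `RefCountedCache` and its method names are hypothetical, not Ozone's ContainerCache API). A purely size-based LRU would evict the eldest handle regardless of whether callers still hold it; the eviction check below refuses to drop an entry whose reference count is non-zero, which is the guarantee `CacheBuilder.weakValues()` approximates by keeping a value alive while it is strongly referenced:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch of the eviction concern: a size-bounded LRU map may evict
// a handle that callers still hold (referenceCount > 0). All names here
// are illustrative, not the real ContainerCache/DBHandler types.
class RefCountedCache<K, V> {
  static final class Entry<T> {
    final T value;
    int refCount;
    Entry(T v) { value = v; }
  }

  private final int maxSize;
  private final Map<K, Entry<V>> map;

  RefCountedCache(int maxSize) {
    this.maxSize = maxSize;
    // accessOrder=true gives LRU iteration order, like the LRUMap in question.
    this.map = new LinkedHashMap<K, Entry<V>>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, Entry<V>> eldest) {
        // Only evict when over capacity AND the handle is idle; a plain
        // size-based policy would drop it regardless of refCount.
        return size() > RefCountedCache.this.maxSize
            && eldest.getValue().refCount == 0;
      }
    };
  }

  synchronized V acquire(K key, java.util.function.Function<K, V> loader) {
    Entry<V> e = map.computeIfAbsent(key, k -> new Entry<>(loader.apply(k)));
    e.refCount++;
    return e.value;
  }

  synchronized void release(K key) {
    Entry<V> e = map.get(key);
    if (e != null && e.refCount > 0) {
      e.refCount--;
    }
  }

  synchronized int size() { return map.size(); }
}
```

With synchronized methods standing in for the ReentrantLock, an in-use entry survives capacity pressure and is only reclaimed once released, which is the behavioral difference the ticket flags between size-based and reference-based eviction.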
[jira] [Assigned] (HDDS-3267) Replace ContainerCache in BlockUtils by LoadingCache
[ https://issues.apache.org/jira/browse/HDDS-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDDS-3267: --- Assignee: Isa Hekmatizadeh > Replace ContainerCache in BlockUtils by LoadingCache > > > Key: HDDS-3267 > URL: https://issues.apache.org/jira/browse/HDDS-3267 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Isa Hekmatizadeh >Assignee: Isa Hekmatizadeh >Priority: Minor > > As discussed in [here|https://github.com/apache/hadoop-ozone/pull/705] > current version of ContainerCache is just used by BlockUtils and has several > architectural issues. for example: > * It uses a ReentrantLock which could be replaced by synchronized methods > * It should maintain a referenceCount for each DBHandler > * It extends LRUMap while it would be better to hide it by the composition > and not expose LRUMap related methods. > As [~pifta] suggests, we could replace all ContainerCache functionality by > using Guava LoadingCache. > This new LoadingCache could be configured to evict by size, by this > configuration the functionality would be slightly different as it may evict > DBHandlers while they are in use (referenceCount>0) but we can configure it > to use reference base eviction based on CacheBuilder.weakValues() > I want to open this discussion here instead of Github so I created this > ticket.
[GitHub] [hadoop-ozone] sodonnel commented on issue #678: HDDS-3179 Pipeline placement based on Topology does not have fallback
sodonnel commented on issue #678: HDDS-3179 Pipeline placement based on Topology does not have fallback URL: https://github.com/apache/hadoop-ozone/pull/678#issuecomment-604486699 Thanks for quickly addressing the final issues. I am +1 on this now. I will commit it later pending the CI checks looking good. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] timmylicheng commented on issue #678: HDDS-3179 Pipeline placement based on Topology does not have fallback
timmylicheng commented on issue #678: HDDS-3179 Pipeline placement based on Topology does not have fallback URL: https://github.com/apache/hadoop-ozone/pull/678#issuecomment-604466974 > Thanks for the updates here. I think the code looks much cleaner now with the debug statements and refactored block in getResultSet(). > > There are just a couple of minor changes needed to finish this one off. Thanks for the detailed review. It really helps. @sodonnel
[GitHub] [hadoop-ozone] timmylicheng commented on a change in pull request #678: HDDS-3179 Pipeline placement based on Topology does not have fallback
timmylicheng commented on a change in pull request #678: HDDS-3179 Pipeline placement based on Topology does not have fallback URL: https://github.com/apache/hadoop-ozone/pull/678#discussion_r398622023 ## File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/PipelinePlacementPolicy.java ## @@ -403,7 +408,7 @@ private boolean checkAllNodesAreEqual(NetworkTopology topology) { @VisibleForTesting protected DatanodeDetails chooseNodeFromNetworkTopology( NetworkTopology networkTopology, DatanodeDetails anchor, - List excludedNodes) { + List excludedNodes) throws SCMException { Review comment: Updated.
[GitHub] [hadoop-ozone] timmylicheng commented on a change in pull request #678: HDDS-3179 Pipeline placement based on Topology does not have fallback
timmylicheng commented on a change in pull request #678: HDDS-3179 Pipeline placement based on Topology does not have fallback URL: https://github.com/apache/hadoop-ozone/pull/678#discussion_r398621936 ## File path: hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/pipeline/TestPipelinePlacementPolicy.java ## @@ -43,36 +47,98 @@ private MockNodeManager nodeManager; private OzoneConfiguration conf; private PipelinePlacementPolicy placementPolicy; + private NetworkTopologyImpl cluster; private static final int PIPELINE_PLACEMENT_MAX_NODES_COUNT = 10; + private List nodesWithOutRackAwareness = new ArrayList<>(); + private List nodesWithRackAwareness = new ArrayList<>(); + @Before public void init() throws Exception { -nodeManager = new MockNodeManager(true, -PIPELINE_PLACEMENT_MAX_NODES_COUNT); +cluster = initTopology(); +// start with nodes with rack awareness. +nodeManager = new MockNodeManager(cluster, getNodesWithRackAwareness(), +false, PIPELINE_PLACEMENT_MAX_NODES_COUNT); conf = new OzoneConfiguration(); conf.setInt(OZONE_DATANODE_PIPELINE_LIMIT, 5); placementPolicy = new PipelinePlacementPolicy( nodeManager, new PipelineStateManager(), conf); } + private NetworkTopologyImpl initTopology() { +NodeSchema[] schemas = new NodeSchema[] +{ROOT_SCHEMA, RACK_SCHEMA, LEAF_SCHEMA}; +NodeSchemaManager.getInstance().init(schemas, true); +NetworkTopologyImpl topology = +new NetworkTopologyImpl(NodeSchemaManager.getInstance()); +return topology; + } + + private List getNodesWithRackAwareness() { +List datanodes = new ArrayList<>(); +for (Node node : NODES) { + DatanodeDetails datanode = overwriteLocationInNode( + getNodesWithoutRackAwareness(), node); + nodesWithRackAwareness.add(datanode); + datanodes.add(datanode); +} +return datanodes; + } + + private DatanodeDetails getNodesWithoutRackAwareness() { +DatanodeDetails node = MockDatanodeDetails.randomDatanodeDetails(); +nodesWithOutRackAwareness.add(node); +return node; + } + @Test - public void 
testChooseNodeBasedOnNetworkTopology() { -List healthyNodes = -nodeManager.getNodes(HddsProtos.NodeState.HEALTHY); -DatanodeDetails anchor = placementPolicy.chooseNode(healthyNodes); + public void testChooseNodeBasedOnNetworkTopology() throws SCMException { +DatanodeDetails anchor = placementPolicy.chooseNode(nodesWithRackAwareness); // anchor should be removed from healthyNodes after being chosen. -Assert.assertFalse(healthyNodes.contains(anchor)); +Assert.assertFalse(nodesWithRackAwareness.contains(anchor)); List excludedNodes = new ArrayList<>(PIPELINE_PLACEMENT_MAX_NODES_COUNT); excludedNodes.add(anchor); DatanodeDetails nextNode = placementPolicy.chooseNodeFromNetworkTopology( nodeManager.getClusterNetworkTopologyMap(), anchor, excludedNodes); Assert.assertFalse(excludedNodes.contains(nextNode)); -// nextNode should not be the same as anchor. +// next node should not be the same as anchor. Assert.assertTrue(anchor.getUuid() != nextNode.getUuid()); +// next node should be on the same rack based on topology. +Assert.assertEquals(anchor.getNetworkLocation(), +nextNode.getNetworkLocation()); } + @Test + public void testChooseNodeWithSingleNodeRack() throws SCMException { +// There is only one node on 3 racks altogether. +List datanodes = new ArrayList<>(); +for (Node node : SINGLE_NODE_RACK) { + DatanodeDetails datanode = overwriteLocationInNode( + MockDatanodeDetails.randomDatanodeDetails(), node); + datanodes.add(datanode); +} +MockNodeManager localNodeManager = new MockNodeManager(null, datanodes, Review comment: You are right. I updated this part. Thanks!
[GitHub] [hadoop-ozone] sodonnel commented on issue #678: HDDS-3179 Pipeline placement based on Topology does not have fallback
sodonnel commented on issue #678: HDDS-3179 Pipeline placement based on Topology does not have fallback URL: https://github.com/apache/hadoop-ozone/pull/678#issuecomment-604452924 Thanks for the updates here. I think the code looks much cleaner now with the debug statements and refactored block in getResultSet(). There are just a couple of minor changes needed to finish this one off.
[GitHub] [hadoop-ozone] sodonnel commented on a change in pull request #678: HDDS-3179 Pipeline placement based on Topology does not have fallback
sodonnel commented on a change in pull request #678: HDDS-3179 Pipeline placement based on Topology does not have fallback URL: https://github.com/apache/hadoop-ozone/pull/678#discussion_r398601166 ## File path: hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/pipeline/TestPipelinePlacementPolicy.java ## @@ -43,36 +47,98 @@ private MockNodeManager nodeManager; private OzoneConfiguration conf; private PipelinePlacementPolicy placementPolicy; + private NetworkTopologyImpl cluster; private static final int PIPELINE_PLACEMENT_MAX_NODES_COUNT = 10; + private List nodesWithOutRackAwareness = new ArrayList<>(); + private List nodesWithRackAwareness = new ArrayList<>(); + @Before public void init() throws Exception { -nodeManager = new MockNodeManager(true, -PIPELINE_PLACEMENT_MAX_NODES_COUNT); +cluster = initTopology(); +// start with nodes with rack awareness. +nodeManager = new MockNodeManager(cluster, getNodesWithRackAwareness(), +false, PIPELINE_PLACEMENT_MAX_NODES_COUNT); conf = new OzoneConfiguration(); conf.setInt(OZONE_DATANODE_PIPELINE_LIMIT, 5); placementPolicy = new PipelinePlacementPolicy( nodeManager, new PipelineStateManager(), conf); } + private NetworkTopologyImpl initTopology() { +NodeSchema[] schemas = new NodeSchema[] +{ROOT_SCHEMA, RACK_SCHEMA, LEAF_SCHEMA}; +NodeSchemaManager.getInstance().init(schemas, true); +NetworkTopologyImpl topology = +new NetworkTopologyImpl(NodeSchemaManager.getInstance()); +return topology; + } + + private List getNodesWithRackAwareness() { +List datanodes = new ArrayList<>(); +for (Node node : NODES) { + DatanodeDetails datanode = overwriteLocationInNode( + getNodesWithoutRackAwareness(), node); + nodesWithRackAwareness.add(datanode); + datanodes.add(datanode); +} +return datanodes; + } + + private DatanodeDetails getNodesWithoutRackAwareness() { +DatanodeDetails node = MockDatanodeDetails.randomDatanodeDetails(); +nodesWithOutRackAwareness.add(node); +return node; + } + @Test - public void 
testChooseNodeBasedOnNetworkTopology() { -List healthyNodes = -nodeManager.getNodes(HddsProtos.NodeState.HEALTHY); -DatanodeDetails anchor = placementPolicy.chooseNode(healthyNodes); + public void testChooseNodeBasedOnNetworkTopology() throws SCMException { +DatanodeDetails anchor = placementPolicy.chooseNode(nodesWithRackAwareness); // anchor should be removed from healthyNodes after being chosen. -Assert.assertFalse(healthyNodes.contains(anchor)); +Assert.assertFalse(nodesWithRackAwareness.contains(anchor)); List excludedNodes = new ArrayList<>(PIPELINE_PLACEMENT_MAX_NODES_COUNT); excludedNodes.add(anchor); DatanodeDetails nextNode = placementPolicy.chooseNodeFromNetworkTopology( nodeManager.getClusterNetworkTopologyMap(), anchor, excludedNodes); Assert.assertFalse(excludedNodes.contains(nextNode)); -// nextNode should not be the same as anchor. +// next node should not be the same as anchor. Assert.assertTrue(anchor.getUuid() != nextNode.getUuid()); +// next node should be on the same rack based on topology. +Assert.assertEquals(anchor.getNetworkLocation(), +nextNode.getNetworkLocation()); } + @Test + public void testChooseNodeWithSingleNodeRack() throws SCMException { +// There is only one node on 3 racks altogether. +List datanodes = new ArrayList<>(); +for (Node node : SINGLE_NODE_RACK) { + DatanodeDetails datanode = overwriteLocationInNode( + MockDatanodeDetails.randomDatanodeDetails(), node); + datanodes.add(datanode); +} +MockNodeManager localNodeManager = new MockNodeManager(null, datanodes, Review comment: This test doesn't reproduce the error if the fix in this PR is removed. I commented out the fix in PipelinePlacement policy and added back in the old logic and ran this and it still passed. The reason, is that PipelinePlacementPolicy did not believe networkTopology was present. 
Changing this line as follows makes the test reproduce the problem: ``` MockNodeManager localNodeManager = new MockNodeManager(initTopology(), datanodes, false, datanodes.size()); ``` Note I added `initTopology()` when constructing the NodeManager. With that, the test fails without the fix, and passes with the fix in this PR, so that is good.
[GitHub] [hadoop-ozone] sodonnel commented on a change in pull request #678: HDDS-3179 Pipeline placement based on Topology does not have fallback
sodonnel commented on a change in pull request #678: HDDS-3179 Pipeline placement based on Topology does not have fallback URL: https://github.com/apache/hadoop-ozone/pull/678#discussion_r398582704 ## File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/PipelinePlacementPolicy.java ## @@ -403,7 +408,7 @@ private boolean checkAllNodesAreEqual(NetworkTopology topology) { @VisibleForTesting protected DatanodeDetails chooseNodeFromNetworkTopology( NetworkTopology networkTopology, DatanodeDetails anchor, - List excludedNodes) { + List excludedNodes) throws SCMException { Review comment: You can remove the `throws SCMException` in the method definition now, as it is no longer used.
[GitHub] [hadoop-ozone] sodonnel commented on issue #678: HDDS-3179 Pipeline placement based on Topology does not have fallback
sodonnel commented on issue #678: HDDS-3179 Pipeline placement based on Topology does not have fallback URL: https://github.com/apache/hadoop-ozone/pull/678#issuecomment-604425486 Acceptance is failing with: ``` == Execute PI calculation| FAIL | 1 != 0 -- Execute WordCount | FAIL | 1 != 0 -- ozonesecure-mr-mapreduce :: Execute MR jobs | FAIL | ``` This is known issue and will be fixed soon - HDDS-3284. Integration tests are flaky. it-Freon: ``` [ERROR] Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 395.168 s <<< FAILURE! - in org.apache.hadoop.ozone.freon.TestRandomKeyGenerator [ERROR] bigFileThan2GB(org.apache.hadoop.ozone.freon.TestRandomKeyGenerator) Time elapsed: 326.297 s <<< FAILURE! java.lang.AssertionError: expected:<1> but was:<0> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) ``` One of these issues covers it: https://issues.apache.org/jira/browse/HDDS-3266 and it-freon: https://issues.apache.org/jira/browse/HDDS-3257 it-client I am not sure
[jira] [Commented] (HDDS-3267) Replace ContainerCache in BlockUtils by LoadingCache
[ https://issues.apache.org/jira/browse/HDDS-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067652#comment-17067652 ] Isa Hekmatizadeh commented on HDDS-3267: please assign this task to me, I'm currently working on it > Replace ContainerCache in BlockUtils by LoadingCache > > > Key: HDDS-3267 > URL: https://issues.apache.org/jira/browse/HDDS-3267 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Isa Hekmatizadeh >Priority: Minor > > As discussed in [here|https://github.com/apache/hadoop-ozone/pull/705] > current version of ContainerCache is just used by BlockUtils and has several > architectural issues. for example: > * It uses a ReentrantLock which could be replaced by synchronized methods > * It should maintain a referenceCount for each DBHandler > * It extends LRUMap while it would be better to hide it by the composition > and not expose LRUMap related methods. > As [~pifta] suggests, we could replace all ContainerCache functionality by > using Guava LoadingCache. > This new LoadingCache could be configured to evict by size, by this > configuration the functionality would be slightly different as it may evict > DBHandlers while they are in use (referenceCount>0) but we can configure it > to use reference base eviction based on CacheBuilder.weakValues() > I want to open this discussion here instead of Github so I created this > ticket.
[GitHub] [hadoop-ozone] sodonnel commented on issue #725: HDDS-3284. ozonesecure-mr test fails due to lack of disk space
sodonnel commented on issue #725: HDDS-3284. ozonesecure-mr test fails due to lack of disk space URL: https://github.com/apache/hadoop-ozone/pull/725#issuecomment-604396747 Yea, I have seen random failures on it-client. As this is a yarn only change, I cannot see how it could impact the integration tests, so I think we are good to merge it.
[GitHub] [hadoop-ozone] adoroszlai commented on issue #725: HDDS-3284. ozonesecure-mr test fails due to lack of disk space
adoroszlai commented on issue #725: HDDS-3284. ozonesecure-mr test fails due to lack of disk space URL: https://github.com/apache/hadoop-ozone/pull/725#issuecomment-604392865 Thanks @sodonnel. Since it only changes acceptance tests, I think we can merge it even if `it-client` happens to fail.
[GitHub] [hadoop-ozone] sodonnel commented on issue #725: HDDS-3284. ozonesecure-mr test fails due to lack of disk space
sodonnel commented on issue #725: HDDS-3284. ozonesecure-mr test fails due to lack of disk space URL: https://github.com/apache/hadoop-ozone/pull/725#issuecomment-604386700 Thanks for looking into this. +1 on this from me, pending CI, but some of the "it-*" tests are a bit flaky.
[jira] [Comment Edited] (HDDS-3271) The block file is not deleted after the key is deleted
[ https://issues.apache.org/jira/browse/HDDS-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067551#comment-17067551 ] mingchao zhao edited comment on HDDS-3271 at 3/26/20, 10:37 AM: Hi [~msingh] Thank you for your reply. >The SCM maintains a log of deletes blocks and deletes them after the container >is closed. By testing, I found that after closing the container, the block is actually removed. I found that ozone has a configuration ozone.block.deleting.service.interval (default 60sec).It's going to be executed every minute. It is true that block will be remove [when the container is closed|https://github.com/apache/hadoop-ozone/blob/c8f14a560beb9a83c7d98388614a5ba36d7638f6/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/block/DatanodeDeletedBlockTransactions.java#L66]. was (Author: micahzhao): Hi [~msingh] Thank you for your discussion. >The SCM maintains a log of deletes blocks and deletes them after the container >is closed. By testing, I found that after closing the container, the block is actually removed. I found that ozone has a configuration ozone.block.deleting.service.interval (default 60sec).It's going to be executed every minute. It is true that block will be remove [when the container is closed|https://github.com/apache/hadoop-ozone/blob/c8f14a560beb9a83c7d98388614a5ba36d7638f6/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/block/DatanodeDeletedBlockTransactions.java#L66]. > The block file is not deleted after the key is deleted > -- > > Key: HDDS-3271 > URL: https://issues.apache.org/jira/browse/HDDS-3271 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: mingchao zhao >Priority: Major > Attachments: image-2020-03-25-11-41-26-972.png > > > When I successfully deleted the key, I was still able to see the block file > in the chunk directory. Block files are not deleted altogether. > !image-2020-03-25-11-41-26-972.png|width=1169,height=143! 
> This may be an existing bug, and I will confirm the reason.
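The interval discussed above is an ordinary ozone-site.xml setting. A sketch of how it would be tuned (the 60s value is the default cited in the comment; verify the exact key and default against the ozone-default.xml shipped with your Ozone version):

```
<!-- How often the datanode block-deleting service wakes up to purge
     blocks whose delete transactions have been dispatched by SCM
     (which, per the comment above, happens after container close). -->
<property>
  <name>ozone.block.deleting.service.interval</name>
  <value>60s</value>
</property>
```

Lowering the interval makes deleted block files disappear from the chunk directory sooner, at the cost of more frequent background scans.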
[jira] [Comment Edited] (HDDS-3271) The block file is not deleted after the key is deleted
[ https://issues.apache.org/jira/browse/HDDS-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067551#comment-17067551 ] mingchao zhao edited comment on HDDS-3271 at 3/26/20, 10:36 AM: Hi [~msingh] Thank you for your discussion. >The SCM maintains a log of deletes blocks and deletes them after the container >is closed. By testing, I found that after closing the container, the block is actually removed. I found that ozone has a configuration ozone.block.deleting.service.interval (default 60sec).It's going to be executed every minute. It is true that block will be remove [when the container is closed|https://github.com/apache/hadoop-ozone/blob/c8f14a560beb9a83c7d98388614a5ba36d7638f6/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/block/DatanodeDeletedBlockTransactions.java#L66]. was (Author: micahzhao): Hi [~msingh] Thank you for your discussion. >The SCM maintains a log of deletes blocks and deletes them after the container >is closed. By testing, I found that after closing the container, the block is actually removed. Except when the container is closed, there is any other strategies for timed deletion? I found that ozone has a configuration ozone.block.deleting.service.interval (default 60sec).This should be deleted every minute, but it does not take effect. > The block file is not deleted after the key is deleted > -- > > Key: HDDS-3271 > URL: https://issues.apache.org/jira/browse/HDDS-3271 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: mingchao zhao >Priority: Major > Attachments: image-2020-03-25-11-41-26-972.png > > > When I successfully deleted the key, I was still able to see the block file > in the chunk directory. Block files are not deleted altogether. > !image-2020-03-25-11-41-26-972.png|width=1169,height=143! > This may be an existing bug, and I will confirm the reason. 
[jira] [Updated] (HDDS-3284) ozonesecure-mr test fails due to lack of disk space
[ https://issues.apache.org/jira/browse/HDDS-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-3284: --- Status: Patch Available (was: In Progress) > ozonesecure-mr test fails due to lack of disk space > --- > > Key: HDDS-3284 > URL: https://issues.apache.org/jira/browse/HDDS-3284 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > {{ozonesecure-mr}} acceptance test is failing with {{No space available in > any of the local directories.}}
[GitHub] [hadoop-ozone] adoroszlai opened a new pull request #725: HDDS-3284. ozonesecure-mr test fails due to lack of disk space
adoroszlai opened a new pull request #725: HDDS-3284. ozonesecure-mr test fails due to lack of disk space URL: https://github.com/apache/hadoop-ozone/pull/725 ## What changes were proposed in this pull request? Disable YARN disk utilization check in `ozonesecure-mr` acceptance test. Plenty of disk space is available in CI, but more than 90% of the disk is used: ``` Filesystem Size Used Avail Use% Mounted on ... /dev/sda1 84G 75G 8.3G 91% / ``` Thus directory checker marks it as invalid: ``` WARN DirectoryCollection:418 - Directory /tmp/hadoop-hadoop/nm-local-dir error, used space above threshold of 90.0%, removing from list of valid directories WARN DirectoryCollection:418 - Directory /opt/hadoop/logs/userlogs error, used space above threshold of 90.0%, removing from list of valid directories ``` https://issues.apache.org/jira/browse/HDDS-3284 ## How was this patch tested? https://github.com/adoroszlai/hadoop-ozone/runs/535925282 ``` == Execute PI calculation| PASS | -- Execute WordCount | PASS | -- ozonesecure-mr-mapreduce :: Execute MR jobs | PASS | ```
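The PR text above does not show the actual diff. One common way to relax the check it describes is the standard NodeManager disk-health property, shown here purely as an illustration (the PR may disable the check differently):

```
<!-- Hypothetical yarn-site.xml fragment: raise the disk-utilization
     cutoff (default 90.0) so nearly-full CI disks are not marked
     invalid by the NodeManager directory checker. -->
<property>
  <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
  <value>99.0</value>
</property>
```

With the cutoff above the 91% usage reported in the PR, `/tmp/hadoop-hadoop/nm-local-dir` and the userlogs directory remain in the list of valid directories and the MR jobs can run.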
[jira] [Updated] (HDDS-3284) ozonesecure-mr test fails due to lack of disk space
[ https://issues.apache.org/jira/browse/HDDS-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-3284: - Labels: pull-request-available (was: ) > ozonesecure-mr test fails due to lack of disk space > --- > > Key: HDDS-3284 > URL: https://issues.apache.org/jira/browse/HDDS-3284 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > > {{ozonesecure-mr}} acceptance test is failing with {{No space available in > any of the local directories.}}
[jira] [Commented] (HDDS-3271) The block file is not deleted after the key is deleted
[ https://issues.apache.org/jira/browse/HDDS-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067551#comment-17067551 ] mingchao zhao commented on HDDS-3271: - Hi [~msingh] Thank you for the discussion. >The SCM maintains a log of deleted blocks and deletes them after the container >is closed. Through testing, I found that the block is indeed removed after the container is closed. But apart from closing the container, are there any other strategies for timed deletion? I found that Ozone has a configuration ozone.block.deleting.service.interval (default 60sec). This should trigger deletion every minute, but it does not seem to take effect. > The block file is not deleted after the key is deleted > -- > > Key: HDDS-3271 > URL: https://issues.apache.org/jira/browse/HDDS-3271 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: mingchao zhao >Priority: Major > Attachments: image-2020-03-25-11-41-26-972.png > > > When I successfully deleted the key, I was still able to see the block file > in the chunk directory. The block files are not deleted at all. > !image-2020-03-25-11-41-26-972.png|width=1169,height=143! > This may be an existing bug, and I will confirm the reason. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
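The block deleting service interval discussed above is set in ozone-site.xml. A minimal sketch follows; the property name appears in ozone-default.xml, and the 60-second value shown is the default mentioned in the comment:

```xml
<!-- ozone-site.xml: interval at which the block deleting service
     scans for blocks marked for deletion. 60s is the default value
     discussed in this thread. -->
<property>
  <name>ozone.block.deleting.service.interval</name>
  <value>60s</value>
</property>
```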
[GitHub] [hadoop-ozone] timmylicheng edited a comment on issue #720: HDDS-3185 Construct a standalone ratis server for SCM.
timmylicheng edited a comment on issue #720: HDDS-3185 Construct a standalone ratis server for SCM. URL: https://github.com/apache/hadoop-ozone/pull/720#issuecomment-604254744 @elek Hey Marton, sorry, this patch was meant to be merged into HDDS-2823 as a dev branch for SCM HA. So far the doc work is ongoing in parallel with prototyping. The doc still needs improvement and I will find time to work on it. This patch only builds a prototype of SCM HA to get a feel for it. There is no debate about having a Ratis server and state machine for SCM; the current design discussion is about how to handle Raft transactions for all the different types of actions and reports on SCM. Besides analyzing the current code base, I'm also probing by prototyping: I'm starting off by implementing a standalone Ratis server and proceeding in steps. I will try to finalize the design doc and schedule a call to include more of the community. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-3001) NFS support for Ozone
[ https://issues.apache.org/jira/browse/HDDS-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067384#comment-17067384 ] Prashant Pogde commented on HDDS-3001: -- Attaching the design document. Please take a look. > NFS support for Ozone > - > > Key: HDDS-3001 > URL: https://issues.apache.org/jira/browse/HDDS-3001 > Project: Hadoop Distributed Data Store > Issue Type: New Feature > Components: Ozone Filesystem >Affects Versions: 0.5.0 >Reporter: Prashant Pogde >Assignee: Prashant Pogde >Priority: Major > Attachments: NFS Support for Ozone.pdf > > > Provide NFS support for Ozone -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-3001) NFS support for Ozone
[ https://issues.apache.org/jira/browse/HDDS-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Pogde updated HDDS-3001: - Attachment: NFS Support for Ozone.pdf > NFS support for Ozone > - > > Key: HDDS-3001 > URL: https://issues.apache.org/jira/browse/HDDS-3001 > Project: Hadoop Distributed Data Store > Issue Type: New Feature > Components: Ozone Filesystem >Affects Versions: 0.5.0 >Reporter: Prashant Pogde >Assignee: Prashant Pogde >Priority: Major > Attachments: NFS Support for Ozone.pdf > > > Provide NFS support for Ozone -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-3254) Datanode memory increase so much
[ https://issues.apache.org/jira/browse/HDDS-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067371#comment-17067371 ] runzhiwang edited comment on HDDS-3254 at 3/26/20, 6:06 AM: [~shashikant] I have stopped sending requests to s3gateway for one day, but the datanode physical memory does not come down, and the CPU stays above 100%. I will try to reproduce it with the following JVM options, and I will check the RetryCache object size. -Xms10g -Xmx10g -Xmn4g -XX:+UseParallelGC -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=1024m -XX:SurvivorRatio=4 -verbose:gc -Xloggc:/var/datanode_gc.log -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/datanode_dump.hprof -XX:ParallelGCThreads=8 -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly was (Author: yjxxtd): [~shashikant] I have stop send request to s3gateway for one day, but the datanode physical memory does not come down, and the CPU is more than 100%. I will try to reproduce it with some jvm options. And I will check RetryCache object size. -Xms10g -Xmx10g -Xmn4g -XX:+UseParallelGC -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=1024m -XX:SurvivorRatio=4 -verbose:gc -Xloggc:/var/datanode_gc.log -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/datanode_dump.hprof -XX:ParallelGCThreads=8 -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly > Datanode memory increase so much > > > Key: HDDS-3254 > URL: https://issues.apache.org/jira/browse/HDDS-3254 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major > Attachments: image-2020-03-24-10-05-41-212.png, > image-2020-03-24-16-32-43-973.png, image-2020-03-24-16-33-20-795.png > > > As the image shows, the physical memory of the datanode increased to 11.2GB, > and then it crashed. I will find the root cause. > !image-2020-03-24-10-05-41-212.png! 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org