[jira] [Updated] (HDDS-3951) Rename the num.write.chunk.thread key

2020-07-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-3951:
-
Labels: pull-request-available  (was: )

> Rename the num.write.chunk.thread key
> -
>
> Key: HDDS-3951
> URL: https://issues.apache.org/jira/browse/HDDS-3951
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Affects Versions: 0.5.0
>Reporter: maobaolong
>Assignee: maobaolong
>Priority: Major
>  Labels: pull-request-available
>
> dfs.container.ratis.num.write.chunk.thread -> 
> dfs.container.ratis.num.write.chunk.thread.per.disk




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] maobaolong opened a new pull request #1187: HDDS-3951. Rename the num.write.chunk.thread key.

2020-07-09 Thread GitBox


maobaolong opened a new pull request #1187:
URL: https://github.com/apache/hadoop-ozone/pull/1187


   ## What changes were proposed in this pull request?
   
   Rename `dfs.container.ratis.num.write.chunk.thread` to 
`dfs.container.ratis.num.write.chunk.thread.per.disk`
   
   to express the meaning of this key more clearly: if it is configured to 10, then the 
thread count would be 10 * numOfDisk. 
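   As a minimal, hypothetical Java sketch of that arithmetic (the class name, values, and 
executor below are illustrative assumptions, not the actual datanode code):
   
```java
// Hypothetical sketch only: shows why ".per.disk" is the clearer name --
// the configured value is multiplied by the number of data volumes.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class WriteChunkThreadsExample {
  public static void main(String[] args) {
    int threadsPerDisk = 10;   // hypothetical value of the renamed key
    int numOfDisk = 4;         // hypothetical number of data volumes on the datanode
    int totalThreads = threadsPerDisk * numOfDisk;
    System.out.println("write chunk threads = " + totalThreads);   // prints 40

    // A fixed pool of that size, standing in for the real write-chunk executor.
    ExecutorService writeChunkExecutor = Executors.newFixedThreadPool(totalThreads);
    writeChunkExecutor.shutdown();
  }
}
```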
   
   
   ## What is the link to the Apache JIRA
   
   HDDS-3951
   
   ## How was this patch tested?
   
   No tests needed; this change just renames a config key.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3951) Rename the num.write.chunk.thread key

2020-07-09 Thread maobaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maobaolong updated HDDS-3951:
-
Description: dfs.container.ratis.num.write.chunk.thread -> 
dfs.container.ratis.num.write.chunk.thread.per.disk
  (was: 
dfs.container.ratis.num.write.chunk.thread -> 
dfs.container.ratis.num.write.chunk.thread.per.volume
)

> Rename the num.write.chunk.thread key
> -
>
> Key: HDDS-3951
> URL: https://issues.apache.org/jira/browse/HDDS-3951
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Affects Versions: 0.5.0
>Reporter: maobaolong
>Assignee: maobaolong
>Priority: Major
>
> dfs.container.ratis.num.write.chunk.thread -> 
> dfs.container.ratis.num.write.chunk.thread.per.disk




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] captainzmc commented on a change in pull request #1150: HDDS-3903. OzoneRpcClient support batch rename keys.

2020-07-09 Thread GitBox


captainzmc commented on a change in pull request #1150:
URL: https://github.com/apache/hadoop-ozone/pull/1150#discussion_r452634999



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/response/key/OMKeysRenameResponse.java
##
@@ -0,0 +1,135 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.om.response.key;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Optional;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+import org.apache.hadoop.hdds.utils.db.cache.CacheKey;
+import org.apache.hadoop.hdds.utils.db.cache.CacheValue;
+import org.apache.hadoop.ozone.om.OMMetadataManager;
+import org.apache.hadoop.ozone.om.OmRenameKeyInfo;
+import org.apache.hadoop.ozone.om.helpers.OmKeyInfo;
+import org.apache.hadoop.ozone.om.response.CleanupTableInfo;
+import org.apache.hadoop.ozone.om.response.OMClientResponse;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.OMResponse;
+
+import javax.annotation.Nonnull;
+import java.io.IOException;
+import java.util.List;
+
+import static org.apache.hadoop.ozone.om.OmMetadataManagerImpl.KEY_TABLE;
+import static 
org.apache.hadoop.ozone.om.lock.OzoneManagerLock.Resource.BUCKET_LOCK;
+
+/**
+ * Response for RenameKeys request.
+ */
+@CleanupTableInfo(cleanupTables = {KEY_TABLE})
+public class OMKeysRenameResponse extends OMClientResponse {
+
+  private List<OmRenameKeyInfo> renameKeyInfoList;
+  private long trxnLogIndex;
+  private String fromKeyName = null;
+  private String toKeyName = null;
+
+  public OMKeysRenameResponse(@Nonnull OMResponse omResponse,
+  List<OmRenameKeyInfo> renameKeyInfoList,
+  long trxnLogIndex) {
+super(omResponse);
+this.renameKeyInfoList = renameKeyInfoList;
+this.trxnLogIndex = trxnLogIndex;
+  }
+
+
+  /**
+   * For when the request is not successful or it is a replay transaction.
+   * For a successful request, the other constructor should be used.
+   */
+  public OMKeysRenameResponse(@Nonnull OMResponse omResponse) {
+super(omResponse);
+checkStatusNotOK();
+  }
+
+  @Override
+  public void addToDBBatch(OMMetadataManager omMetadataManager,
+   BatchOperation batchOperation) throws IOException {
+boolean acquiredLock = false;
+for (OmRenameKeyInfo omRenameKeyInfo : renameKeyInfoList) {
+  String volumeName = omRenameKeyInfo.getNewKeyInfo().getVolumeName();
+  String bucketName = omRenameKeyInfo.getNewKeyInfo().getBucketName();
+  fromKeyName = omRenameKeyInfo.getFromKeyName();
+  OmKeyInfo newKeyInfo = omRenameKeyInfo.getNewKeyInfo();
+  toKeyName = newKeyInfo.getKeyName();
+  Table<String, OmKeyInfo> keyTable = omMetadataManager
+  .getKeyTable();
+  try {
+acquiredLock =
+omMetadataManager.getLock().acquireWriteLock(BUCKET_LOCK,
+volumeName, bucketName);
+// If toKeyName is null, then we need to only delete the fromKeyName
+// from KeyTable. This is the case of replay where toKey exists but
+// fromKey has not been deleted.
+if (deleteFromKeyOnly()) {

Review comment:
   Thanks Bharat for the suggestion. I have taken a close look at the 
implementation in #1169, which has some very nice changes. In this PR I will 
synchronize with the #1169 changes to make sure they are implemented the same way.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-3951) Rename the num.write.chunk.thread key

2020-07-09 Thread maobaolong (Jira)
maobaolong created HDDS-3951:


 Summary: Rename the num.write.chunk.thread key
 Key: HDDS-3951
 URL: https://issues.apache.org/jira/browse/HDDS-3951
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: Ozone Datanode
Affects Versions: 0.5.0
Reporter: maobaolong
Assignee: maobaolong


dfs.container.ratis.num.write.chunk.thread -> 
dfs.container.ratis.num.write.chunk.thread.per.volume




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3813) Integrate with ratis 1.0.0 release binaries for ozone 0.6.0 release

2020-07-09 Thread Xiaoyu Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDDS-3813:
-
Summary: Integrate with ratis 1.0.0 release binaries for ozone 0.6.0 
release  (was: Integrate with ratis 0.6.0 release binaries for ozone 0.6.0 
release)

> Integrate with ratis 1.0.0 release binaries for ozone 0.6.0 release
> ---
>
> Key: HDDS-3813
> URL: https://issues.apache.org/jira/browse/HDDS-3813
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Sammi Chen
>Assignee: Lokesh Jain
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3833) Use Pipeline choose policy to choose pipeline from exist pipeline list

2020-07-09 Thread maobaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maobaolong updated HDDS-3833:
-
Target Version/s: 0.7.0

> Use Pipeline choose policy to choose pipeline from exist pipeline list
> --
>
> Key: HDDS-3833
> URL: https://issues.apache.org/jira/browse/HDDS-3833
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: maobaolong
>Assignee: maobaolong
>Priority: Major
>  Labels: pull-request-available
>
> With this policy-driven mode, we can develop various pipeline-choosing policies 
> to satisfy complex production environments.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3885) Create Datanode home page

2020-07-09 Thread Xiaoyu Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDDS-3885:
-
 Target Version/s: 0.6.0
Affects Version/s: (was: 0.6.0)
   0.5.0

> Create Datanode home page
> -
>
> Key: HDDS-3885
> URL: https://issues.apache.org/jira/browse/HDDS-3885
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Affects Versions: 0.5.0
>Reporter: maobaolong
>Assignee: maobaolong
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3941) Enable core dump when crash in C++

2020-07-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-3941:
-
Labels: pull-request-available  (was: )

> Enable core dump when crash in C++
> --
>
> Key: HDDS-3941
> URL: https://issues.apache.org/jira/browse/HDDS-3941
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] runzhiwang opened a new pull request #1186: HDDS-3941. Enable core dump when crash in C++

2020-07-09 Thread GitBox


runzhiwang opened a new pull request #1186:
URL: https://github.com/apache/hadoop-ozone/pull/1186


   ## What changes were proposed in this pull request?
   
   **What's the problem?**
   This PR is related to HDDS-3933 (fix memory leak because of too many 
Datanode State Machine Threads).
   When the memory leak happens, the Datanode usually generates core.pid, because it 
crashes in RocksDB while creating a new thread, as the image shows, and it rarely 
produces a crash log.
   
![image](https://user-images.githubusercontent.com/51938049/87117625-97c9c380-c2ab-11ea-96e2-58c197cb6275.png)
   
   But because the default value of `core file size` is zero, core.pid cannot be 
generated. So when the Datanode crashes in RocksDB, we cannot get any 
information about why it crashed. 
   
   **How to fix it?**
   Set `ulimit -c unlimited` to enable core dumps when a crash happens in RocksDB.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-3941
   
   ## How was this patch tested?
   
   Existing unit tests.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3933) Fix memory leak because of too many Datanode State Machine Thread

2020-07-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-3933:
-
Labels: pull-request-available  (was: )

> Fix memory leak because of too many Datanode State Machine Thread
> -
>
> Key: HDDS-3933
> URL: https://issues.apache.org/jira/browse/HDDS-3933
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
>  Labels: pull-request-available
> Attachments: jstack.txt, screenshot-1.png, screenshot-2.png, 
> screenshot-3.png
>
>
> When creating the 22345th Datanode State Machine Thread, OOM happened.
> !screenshot-1.png! 
>  !screenshot-2.png! 
>  !screenshot-3.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] runzhiwang opened a new pull request #1185: HDDS-3933. Fix memory leak because of too many Datanode State Machine Thread

2020-07-09 Thread GitBox


runzhiwang opened a new pull request #1185:
URL: https://github.com/apache/hadoop-ozone/pull/1185


   ## What changes were proposed in this pull request?
   
   **What's the problem?**
   The Datanode creates more than 20K Datanode State Machine Threads, and then OOM 
happens.
   
![image](https://user-images.githubusercontent.com/51938049/87116288-fdb44c00-c2a7-11ea-80b0-e3f77e1fe3ab.png)
   
   **What's the reason?**
   The 20K Datanode State Machine Threads were created by newCachedThreadPool.
   
![image](https://user-images.githubusercontent.com/51938049/87116364-2fc5ae00-c2a8-11ea-98c9-604aa027c349.png)
   
   Almost all of them were waiting on a lock.
   
![image](https://user-images.githubusercontent.com/51938049/87116509-8cc16400-c2a8-11ea-82d5-c23dfa5438f0.png)
   
   Only one Datanode State Machine Thread got the lock, and it blocked in 
submitRequest. Because this thread was blocked and could not release the lock, 
newCachedThreadPool kept creating new threads without bound.
   
![image](https://user-images.githubusercontent.com/51938049/87116744-2ab52e80-c2a9-11ea-8ba1-b57e1cade46d.png)
   
   **How to fix it?**
   1. Avoid newCachedThreadPool, because it creates new threads without bound 
when no thread is available in the pool.
   2. Cancel the future when the task times out (a minimal sketch of both ideas follows below).
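   
   As a minimal, hypothetical Java sketch of these two ideas (the class name, pool size, 
and timeouts below are illustrative assumptions, not the actual Ozone datanode code):
   
```java
// Hypothetical sketch only: a bounded pool instead of newCachedThreadPool,
// plus cancelling the future on timeout so worker threads are not leaked.
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedStateMachinePoolExample {
  public static void main(String[] args) throws InterruptedException {
    // 1. Bounded pool: at most 4 threads; extra tasks wait in the queue
    //    instead of spawning new threads the way newCachedThreadPool does.
    ExecutorService executor = Executors.newFixedThreadPool(4);

    Future<?> future = executor.submit(() -> {
      try {
        Thread.sleep(60_000);                 // simulate a task stuck on a remote call
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();   // honor cancellation
      }
    });

    try {
      future.get(1, TimeUnit.SECONDS);
    } catch (TimeoutException e) {
      // 2. Cancel on timeout: the stuck task is interrupted and its worker
      //    thread returns to the pool instead of being held forever.
      future.cancel(true);
    } catch (ExecutionException e) {
      e.printStackTrace();
    }

    executor.shutdownNow();
  }
}
```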
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-3933
   
   ## How was this patch tested?
   
   Existing unit tests.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1169: HDDS-3930. Fix OMKeyDeletesRequest.

2020-07-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1169:
URL: https://github.com/apache/hadoop-ozone/pull/1169#discussion_r452613394



##
File path: hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto
##
@@ -848,7 +850,18 @@ message DeleteKeyRequest {
 }
 
 message DeleteKeysRequest {
-repeated KeyArgs keyArgs = 1;
+optional DeleteKeyArgs deleteKeys = 1;

Review comment:
   From proto3 onwards all fields are optional.
   So I followed that approach and declared it optional. (If something is changed in the 
future, we can still remain backward compatible.)





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HDDS-3841) FLAKY-UT: TestSecureOzoneRpcClient timeout

2020-07-09 Thread maobaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maobaolong updated HDDS-3841:
-
Comment: was deleted

(was: [WARNING] Tests run: 68, Failures: 0, Errors: 0, Skipped: 3, Time 
elapsed: 58.105 s - in 
org.apache.hadoop.ozone.client.rpc.TestSecureOzoneRpcClient
[INFO] 
[INFO] Results:
[INFO] 
[ERROR] Failures: 
[ERROR]   TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower:225)

> FLAKY-UT: TestSecureOzoneRpcClient timeout
> --
>
> Key: HDDS-3841
> URL: https://issues.apache.org/jira/browse/HDDS-3841
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 0.7.0
>Reporter: maobaolong
>Priority: Major
>
> [WARNING] Tests run: 68, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 
> 58.105 s - in org.apache.hadoop.ozone.client.rpc.TestSecureOzoneRpcClient
> [INFO]
> [INFO] Results:
> [INFO]
> [ERROR] Failures:
> [ERROR] TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower:225



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3841) FLAKY-UT: TestSecureOzoneRpcClient timeout

2020-07-09 Thread maobaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maobaolong updated HDDS-3841:
-
Description: 
[WARNING] Tests run: 68, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 
58.105 s - in org.apache.hadoop.ozone.client.rpc.TestSecureOzoneRpcClient
[INFO]
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower:225

  was:If a failure test appeared in your CI checks, and you are sure it is not 
relation with your PR, so, paste the stale test log here.


> FLAKY-UT: TestSecureOzoneRpcClient timeout
> --
>
> Key: HDDS-3841
> URL: https://issues.apache.org/jira/browse/HDDS-3841
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 0.7.0
>Reporter: maobaolong
>Priority: Major
>
> [WARNING] Tests run: 68, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 
> 58.105 s - in org.apache.hadoop.ozone.client.rpc.TestSecureOzoneRpcClient
> [INFO]
> [INFO] Results:
> [INFO]
> [ERROR] Failures:
> [ERROR] TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower:225



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3841) FLAKY-UT: TestSecureOzoneRpcClient timeout

2020-07-09 Thread maobaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maobaolong updated HDDS-3841:
-
Summary: FLAKY-UT: TestSecureOzoneRpcClient timeout  (was: Stale 
tests(timeout or other reason))

> FLAKY-UT: TestSecureOzoneRpcClient timeout
> --
>
> Key: HDDS-3841
> URL: https://issues.apache.org/jira/browse/HDDS-3841
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 0.7.0
>Reporter: maobaolong
>Priority: Major
>
> If a failing test appears in your CI checks and you are sure it is not 
> related to your PR, paste the flaky test log here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HDDS-3841) FLAKY-UT: TestSecureOzoneRpcClient timeout

2020-07-09 Thread maobaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maobaolong updated HDDS-3841:
-
Comment: was deleted

(was: {code:}
[INFO] Running 
org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay[INFO] 
Running 
org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay[ERROR] 
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 94.969 s <<< 
FAILURE! - in 
org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay[ERROR] 
testPipelineExclusionWithPipelineFailure(org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay)
  Time elapsed: 94.881 s  <<< FAILURE!java.lang.AssertionError at 
org.junit.Assert.fail(Assert.java:86) at 
org.junit.Assert.assertTrue(Assert.java:41) at 
org.junit.Assert.assertTrue(Assert.java:52) at 
org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay.testPipelineExclusionWithPipelineFailure(TestFailureHandlingByClientFlushDelay.java:200)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498) at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{code}
)

> FLAKY-UT: TestSecureOzoneRpcClient timeout
> --
>
> Key: HDDS-3841
> URL: https://issues.apache.org/jira/browse/HDDS-3841
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 0.7.0
>Reporter: maobaolong
>Priority: Major
>
> If a failing test appears in your CI checks and you are sure it is not 
> related to your PR, paste the flaky test log here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-3841) FLAKY-UT: TestSecureOzoneRpcClient timeout

2020-07-09 Thread maobaolong (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17141235#comment-17141235
 ] 

maobaolong edited comment on HDDS-3841 at 7/10/20, 4:13 AM:


{code:}
[INFO] Running 
org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay[INFO] 
Running 
org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay[ERROR] 
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 94.969 s <<< 
FAILURE! - in 
org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay[ERROR] 
testPipelineExclusionWithPipelineFailure(org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay)
  Time elapsed: 94.881 s  <<< FAILURE!java.lang.AssertionError at 
org.junit.Assert.fail(Assert.java:86) at 
org.junit.Assert.assertTrue(Assert.java:41) at 
org.junit.Assert.assertTrue(Assert.java:52) at 
org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay.testPipelineExclusionWithPipelineFailure(TestFailureHandlingByClientFlushDelay.java:200)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498) at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{code}



was (Author: maobaolong):
{code:}
[INFO] Running 
org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay[INFO] 
Running 
org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay[ERROR] 
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 94.969 s <<< 
FAILURE! - in 
org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay[ERROR] 
testPipelineExclusionWithPipelineFailure(org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay)
  Time elapsed: 94.881 s  <<< FAILURE!java.lang.AssertionError at 
org.junit.Assert.fail(Assert.java:86) at 
org.junit.Assert.assertTrue(Assert.java:41) at 
org.junit.Assert.assertTrue(Assert.java:52) at 
org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay.testPipelineExclusionWithPipelineFailure(TestFailureHandlingByClientFlushDelay.java:200)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498) at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{code}


> FLAKY-UT: TestSecureOzoneRpcClient timeout
> --
>
> Key: HDDS-3841
> URL: https://issues.apache.org/jira/browse/HDDS-3841
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 0.7.0
>Reporter: maobaolong
>Priority: Major
>
> If a failing test appears in your CI checks and you are sure it is not 
> related to your PR, paste the flaky test log here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1169: HDDS-3930. Fix OMKeyDeletesRequest.

2020-07-09 Thread GitBox


xiaoyuyao commented on a change in pull request #1169:
URL: https://github.com/apache/hadoop-ozone/pull/1169#discussion_r452612030



##
File path: hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto
##
@@ -867,10 +867,10 @@ message DeletedKeys {
 }
 
 message DeleteKeysResponse {
-repeated KeyInfo deletedKeys = 1;
-repeated KeyInfo unDeletedKeys = 2;

Review comment:
   Looks good to me. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1169: HDDS-3930. Fix OMKeyDeletesRequest.

2020-07-09 Thread GitBox


xiaoyuyao commented on a change in pull request #1169:
URL: https://github.com/apache/hadoop-ozone/pull/1169#discussion_r452611714



##
File path: hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto
##
@@ -848,7 +850,18 @@ message DeleteKeyRequest {
 }
 
 message DeleteKeysRequest {
-repeated KeyArgs keyArgs = 1;
+optional DeleteKeyArgs deleteKeys = 1;

Review comment:
   Should this be required?

##
File path: hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto
##
@@ -848,7 +850,18 @@ message DeleteKeyRequest {
 }
 
 message DeleteKeysRequest {
-repeated KeyArgs keyArgs = 1;
+optional DeleteKeyArgs deleteKeys = 1;

Review comment:
   Should this be required instead of optional?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] xiaoyuyao commented on pull request #1163: HDDS-3920. Too many redudant replications due to fail to get node's a…

2020-07-09 Thread GitBox


xiaoyuyao commented on pull request #1163:
URL: https://github.com/apache/hadoop-ozone/pull/1163#issuecomment-656470926


   LGTM, +1 pending CI. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1444) Allocate block fails in MiniOzoneChaosCluster because of InsufficientDatanodesException

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-1444:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Allocate block fails in MiniOzoneChaosCluster because of 
> InsufficientDatanodesException
> ---
>
> Key: HDDS-1444
> URL: https://issues.apache.org/jira/browse/HDDS-1444
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Affects Versions: 0.3.0
>Reporter: Mukul Kumar Singh
>Priority: Major
>  Labels: TriagePending
>
> MiniOzoneChaosCluster is failing with InsufficientDatanodesException  while 
> writing keys to the Ozone Cluster
> {code}
> org.apache.hadoop.hdds.scm.pipeline.InsufficientDatanodesException: Cannot 
> create pipeline of factor 3 using 2 nodes.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1163: HDDS-3920. Too many redudant replications due to fail to get node's a…

2020-07-09 Thread GitBox


xiaoyuyao commented on a change in pull request #1163:
URL: https://github.com/apache/hadoop-ozone/pull/1163#discussion_r452609872



##
File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerReportHandler.java
##
@@ -102,8 +102,15 @@ public ContainerReportHandler(final NodeManager 
nodeManager,
   public void onMessage(final ContainerReportFromDatanode reportFromDatanode,
 final EventPublisher publisher) {
 
-final DatanodeDetails datanodeDetails =
+final DatanodeDetails dnFromReport =
 reportFromDatanode.getDatanodeDetails();
+DatanodeDetails datanodeDetails =

Review comment:
   That's a good catch. Thanks for the details. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2039) Some ozone unit test takes too long to finish.

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-2039:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Test)

> Some ozone unit test takes too long to finish.
> --
>
> Key: HDDS-2039
> URL: https://issues.apache.org/jira/browse/HDDS-2039
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Xiaoyu Yao
>Priority: Major
>
> Here are a few 
> {code}
> [INFO] Running org.apache.hadoop.ozone.om.TestOzoneManagerHA
> [INFO] Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
> 436.08 s - in org.apache.hadoop.ozone.om.TestOzoneManagerHA
> [INFO] Running org.apache.hadoop.ozone.om.TestOzoneManager
> [INFO] Tests run: 26, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
> 259.566 s - in org.apache.hadoop.ozone.om.TestOzoneManager
> [INFO] Running org.apache.hadoop.ozone.om.TestScmSafeMode
> [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
> 129.653 s - in org.apache.hadoop.ozone.om.TestScmSafeMode
> [INFO] Running org.apache.hadoop.ozone.om.TestOzoneManagerRestart
> [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
> 843.129 s - in org.apache.hadoop.ozone.om.TestOzoneManagerRestart
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2643) TestOzoneDelegationTokenSecretManager#testRenewTokenFailureRenewalTime fails intermittently

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-2643:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Test)

> TestOzoneDelegationTokenSecretManager#testRenewTokenFailureRenewalTime fails 
> intermittently
> ---
>
> Key: HDDS-2643
> URL: https://issues.apache.org/jira/browse/HDDS-2643
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Lokesh Jain
>Priority: Major
>  Labels: TriagePending
>
> TestOzoneDelegationTokenSecretManager.testRenewTokenFailureRenewalTime fails 
> intermittently with the following error.
> {code:java}
> [ERROR] Failures: 
> [ERROR]   
> TestOzoneDelegationTokenSecretManager.testRenewTokenFailureRenewalTime:253 
> Expecting java.io.IOException with text is expired but got :  Expected to 
> find 'is expired' but got unexpected exception: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: token 
> (OzoneToken owner=testUser, renewer=testUser, realUser=testUser, 
> issueDate=1574938955794, maxDate=1574938965794, sequenceNumber=1, 
> masterKeyId=1, strToSign=null, signature=null, awsAccessKeyId=null) can't be 
> found in cache
> at 
> org.apache.hadoop.ozone.security.OzoneDelegationTokenSecretManager.validateToken(OzoneDelegationTokenSecretManager.java:362)
> at 
> org.apache.hadoop.ozone.security.OzoneDelegationTokenSecretManager.renewToken(OzoneDelegationTokenSecretManager.java:244)
> at 
> org.apache.hadoop.ozone.security.TestOzoneDelegationTokenSecretManager.lambda$testRenewTokenFailureRenewalTime$2(TestOzoneDelegationTokenSecretManager.java:254)
> at 
> org.apache.hadoop.test.LambdaTestUtils.lambda$intercept$0(LambdaTestUtils.java:527)
> at 
> org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:491)
> at 
> org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:522)
> at 
> org.apache.hadoop.ozone.security.TestOzoneDelegationTokenSecretManager.testRenewTokenFailureRenewalTime(TestOzoneDelegationTokenSecretManager.java:253)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: 

[jira] [Updated] (HDDS-2644) TestTableCacheImpl#testPartialTableCacheWithOverrideAndDelete fails intermittently

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-2644:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Test)

> TestTableCacheImpl#testPartialTableCacheWithOverrideAndDelete fails 
> intermittently
> --
>
> Key: HDDS-2644
> URL: https://issues.apache.org/jira/browse/HDDS-2644
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Lokesh Jain
>Priority: Major
>  Labels: TriagePending
>
> {code:java}
> [ERROR] Tests run: 10, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.87 
> s <<< FAILURE! - in org.apache.hadoop.hdds.utils.db.cache.TestTableCacheImpl
> [ERROR] 
> testPartialTableCacheWithOverrideAndDelete[0](org.apache.hadoop.hdds.utils.db.cache.TestTableCacheImpl)
>   Time elapsed: 0.044 s  <<< FAILURE!
> java.lang.AssertionError: expected:<2> but was:<6>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.junit.Assert.assertEquals(Assert.java:542)
> at 
> org.apache.hadoop.hdds.utils.db.cache.TestTableCacheImpl.testPartialTableCacheWithOverrideAndDelete(TestTableCacheImpl.java:308)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> at org.junit.runners.Suite.runChild(Suite.java:127)
> at org.junit.runners.Suite.runChild(Suite.java:26)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2649) TestOzoneManagerHttpServer#testHttpPolicy fails intermittently

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-2649:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Test)

> TestOzoneManagerHttpServer#testHttpPolicy fails intermittently
> --
>
> Key: HDDS-2649
> URL: https://issues.apache.org/jira/browse/HDDS-2649
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Lokesh Jain
>Priority: Major
>  Labels: TriagePending
>
> TestOzoneManagerHttpServer#testHttpPolicy fails with the following exception.
> {code:java}
> [ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.42 
> s <<< FAILURE! - in org.apache.hadoop.ozone.om.TestOzoneManagerHttpServer
> [ERROR] 
> testHttpPolicy[1](org.apache.hadoop.ozone.om.TestOzoneManagerHttpServer)  
> Time elapsed: 0.343 s  <<< FAILURE!
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.hadoop.ozone.om.TestOzoneManagerHttpServer.testHttpPolicy(TestOzoneManagerHttpServer.java:110)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> at org.junit.runners.Suite.runChild(Suite.java:127)
> at org.junit.runners.Suite.runChild(Suite.java:26)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1249) Fix TestOzoneManagerHttpServer & TestStorageContainerManagerHttpServer

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-1249:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Fix TestOzoneManagerHttpServer & TestStorageContainerManagerHttpServer
> --
>
> Key: HDDS-1249
> URL: https://issues.apache.org/jira/browse/HDDS-1249
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager, SCM
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Nanda kumar
>Priority: Major
>  Labels: TriagePending
>
> Fix the following unit test failures
> {code}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdds.scm.TestStorageContainerManagerHttpServer.testHttpPolicy(TestStorageContainerManagerHttpServer.java:114)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at org.junit.runners.Suite.runChild(Suite.java:127)
>   at org.junit.runners.Suite.runChild(Suite.java:26)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}
> and
> {code}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdds.scm.TestStorageContainerManagerHttpServer.testHttpPolicy(TestStorageContainerManagerHttpServer.java:109)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> 

[jira] [Updated] (HDDS-1537) TestContainerPersistence#testDeleteBlockTwice is failing

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-1537:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Improvement)

> TestContainerPersistence#testDeleteBlockTwice is failing 
> -
>
> Key: HDDS-1537
> URL: https://issues.apache.org/jira/browse/HDDS-1537
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Datanode
>Reporter: Mukul Kumar Singh
>Priority: Major
>  Labels: TriagePending
>
> The test is failing with the following exception.
> {code}
> [ERROR] Tests run: 18, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 4.132 s <<< FAILURE! - in 
> org.apache.hadoop.ozone.container.common.impl.TestContainerPersistence
> [ERROR] 
> testDeleteBlockTwice(org.apache.hadoop.ozone.container.common.impl.TestContainerPersistence)
>   Time elapsed: 0.058 s  <<< FAILURE!
> java.lang.AssertionError: Expected test to throw (an instance of 
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException 
> and exception with message a string containing "Unable to find the block.")
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.junit.rules.ExpectedException.failDueToMissingException(ExpectedException.java:184)
>   at 
> org.junit.rules.ExpectedException.access$100(ExpectedException.java:85)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:170)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1342) TestOzoneManagerHA#testOMProxyProviderFailoverOnConnectionFailure fails intermittently

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-1342:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> TestOzoneManagerHA#testOMProxyProviderFailoverOnConnectionFailure fails 
> intermittently
> --
>
> Key: HDDS-1342
> URL: https://issues.apache.org/jira/browse/HDDS-1342
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Lokesh Jain
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: TriagePending
>
> The test fails intermittently. The link to the test report can be found below.
> [https://builds.apache.org/job/PreCommit-HDDS-Build/2582/testReport/]
> {code:java}
> java.net.ConnectException: Call From ea902c1cb730/172.17.0.3 to 
> localhost:10174 failed on connection exception: java.net.ConnectException: 
> Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:755)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1515)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1457)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1367)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy34.submitRequest(Unknown Source)
>   at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>   at com.sun.proxy.$Proxy34.submitRequest(Unknown Source)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:66)
>   at com.sun.proxy.$Proxy34.submitRequest(Unknown Source)
>   at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.submitRequest(OzoneManagerProtocolClientSideTranslatorPB.java:310)
>   at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.createVolume(OzoneManagerProtocolClientSideTranslatorPB.java:343)
>   at 
> org.apache.hadoop.ozone.client.rpc.RpcClient.createVolume(RpcClient.java:275)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.ozone.client.OzoneClientInvocationHandler.invoke(OzoneClientInvocationHandler.java:54)
>   at com.sun.proxy.$Proxy86.createVolume(Unknown Source)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:66)
>   at com.sun.proxy.$Proxy86.createVolume(Unknown Source)
>   at 
> 

[jira] [Updated] (HDDS-1316) TestContainerStateManagerIntegration#testReplicaMap fails with ChillModePrecheck

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-1316:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> TestContainerStateManagerIntegration#testReplicaMap fails with 
> ChillModePrecheck
> 
>
> Key: HDDS-1316
> URL: https://issues.apache.org/jira/browse/HDDS-1316
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Priority: Major
>  Labels: Triaged
>
> TestContainerStateManagerIntegration#testReplicaMap fails with 
> ChillModePrecheck
> {code}
> [ERROR] Tests run: 8, Failures: 0, Errors: 1, Skipped: 1, Time elapsed: 
> 41.475 s <<< FAILURE! - in 
> org.apache.hadoop.hdds.scm.container.TestContainerStateManagerIntegration
> [ERROR] 
> testReplicaMap(org.apache.hadoop.hdds.scm.container.TestContainerStateManagerIntegration)
>   Time elapsed: 4.589 s  <<< ERROR!
> org.apache.hadoop.hdds.scm.exceptions.SCMException: ChillModePrecheck failed 
> for allocateContainer
>   at 
> org.apache.hadoop.hdds.scm.chillmode.ChillModePrecheck.check(ChillModePrecheck.java:51)
>   at 
> org.apache.hadoop.hdds.scm.chillmode.ChillModePrecheck.check(ChillModePrecheck.java:31)
>   at org.apache.hadoop.hdds.scm.ScmUtils.preCheck(ScmUtils.java:53)
>   at 
> org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer.allocateContainer(SCMClientProtocolServer.java:180)
>   at 
> org.apache.hadoop.hdds.scm.container.TestContainerStateManagerIntegration.testReplicaMap(TestContainerStateManagerIntegration.java:386)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2934) OM HA S3 test failure

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-2934:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> OM HA S3 test failure
> -
>
> Key: HDDS-2934
> URL: https://issues.apache.org/jira/browse/HDDS-2934
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Attila Doroszlai
>Priority: Major
>  Labels: Triaged, intermittent
> Attachments: docker-ozone-om-ha-s3-ozone-om-ha-s3-s3-scm.log, 
> robot-ozone-om-ha-s3-ozone-om-ha-s3-s3-scm.xml
>
>
> OM HA S3 test ({{ozone-om-ha-s3}}) failed in one CI run at the following test 
> case, then most subsequent test cases failed, too:
> {code}
> 2020-01-22T06:33:16.7322540Z Test Multipart Upload Put With Copy and range
>  | FAIL |
> 2020-01-22T06:33:16.7323058Z 255 != 0
> {code}
> Docker log has several of the following exception starting around above time:
> {code}
> OMNotLeaderException: OM:om1 is not the leader. Suggested leader is OM:om3.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] xiaoyuyao commented on pull request #1174: HDDS-3918. ConcurrentModificationException in ContainerReportHandler.…

2020-07-09 Thread GitBox


xiaoyuyao commented on pull request #1174:
URL: https://github.com/apache/hadoop-ozone/pull/1174#issuecomment-656465017


   Thanks @adoroszlai for the review and @ChenSammi for reporting the issue. PR 
has been merged. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-3918) ConcurrentModificationException in ContainerReportHandler.onMessage

2020-07-09 Thread Xiaoyu Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao resolved HDDS-3918.
--
Fix Version/s: 0.6.0
   Resolution: Fixed

> ConcurrentModificationException in ContainerReportHandler.onMessage
> ---
>
> Key: HDDS-3918
> URL: https://issues.apache.org/jira/browse/HDDS-3918
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Sammi Chen
>Assignee: Xiaoyu Yao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.6.0
>
> Attachments: TestCME.java
>
>
> 2020-07-03 14:51:45,489 [EventQueue-ContainerReportForContainerReportHandler] 
> ERROR org.apache.hadoop.hdds.server.events.SingleThreadExecutor: Error on 
> execution message 
> org.apache.hadoop.hdds.scm.server.SCMDatanodeHeartbeatDispatcher$ContainerReportFromDatanode@8f6e7cb
> java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextNode(HashMap.java:1445)
> at java.util.HashMap$KeyIterator.next(HashMap.java:1469)
> at 
> java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1044)
> at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> at java.util.HashSet.<init>(HashSet.java:120)
> at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:127)
> at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:50)
> at 
> org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2020-07-03 14:51:45,648 [EventQueue-ContainerReportForContainerReportHandler] 
> ERROR org.apache.hadoop.hdds.server.events.SingleThreadExecutor: Error on 
> execution message 
> org.apache.hadoop.hdds.scm.server.SCMDatanodeHeartbeatDispatcher$ContainerReportFromDatanode@49d2b84b
> java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextNode(HashMap.java:1445)
> at java.util.HashMap$KeyIterator.next(HashMap.java:1469)
> at 
> java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1044)
> at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> at java.util.HashSet.<init>(HashSet.java:120)
> at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:127)
> at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:50)
> at 
> org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
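
A small standalone sketch of the race visible in the traces above: the HashSet 
copy constructor iterates a live, unmodifiable view of a map while another thread 
mutates the backing map, which can throw ConcurrentModificationException. All 
names below are illustrative, not the SCM code, and the actual fix merged in 
PR #1174 is not reproduced here.

{code:java}
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ContainerReportCmeSketch {

  public static void main(String[] args) throws Exception {
    Map<Long, String> containerIds = new HashMap<>();
    for (long i = 0; i < 100_000; i++) {
      containerIds.put(i, "replica-" + i);
    }
    // Unmodifiable *view*: readers still observe concurrent writes to the map.
    Set<Long> view = Collections.unmodifiableMap(containerIds).keySet();

    ExecutorService writer = Executors.newSingleThreadExecutor();
    writer.submit(() -> {
      for (long i = 100_000; i < 200_000; i++) {
        containerIds.put(i, "replica-" + i);   // concurrent report updates
      }
    });

    try {
      // Same shape as the new HashSet<>(...) call in the stack trace: copying
      // the live view may observe a structural modification mid-iteration.
      Set<Long> snapshot = new HashSet<>(view);
      System.out.println("copied " + snapshot.size() + " ids without a CME");
    } catch (ConcurrentModificationException e) {
      System.out.println("reproduced: " + e);
    } finally {
      writer.shutdown();
    }
  }
}
{code}

Taking the snapshot under the same lock the writers use, or moving the backing 
structure to a concurrent collection, are the usual remedies for this pattern.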



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] xiaoyuyao merged pull request #1174: HDDS-3918. ConcurrentModificationException in ContainerReportHandler.…

2020-07-09 Thread GitBox


xiaoyuyao merged pull request #1174:
URL: https://github.com/apache/hadoop-ozone/pull/1174


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1166: HDDS-3914. Remove LevelDB configuration option for DN Metastore

2020-07-09 Thread GitBox


xiaoyuyao commented on a change in pull request #1166:
URL: https://github.com/apache/hadoop-ozone/pull/1166#discussion_r452605910



##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueContainerCheck.java
##
@@ -186,8 +186,8 @@ private void checkContainerFile() throws IOException {
 }
 
 dbType = onDiskContainerData.getContainerDBType();
-if (!dbType.equals(OZONE_METADATA_STORE_IMPL_ROCKSDB) &&
-!dbType.equals(OZONE_METADATA_STORE_IMPL_LEVELDB)) {
+if (!dbType.equals(CONTAINER_DB_TYPE_ROCKSDB) &&

Review comment:
   That makes sense to me. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1681) TestNodeReportHandler failing because of NPE

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-1681:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> TestNodeReportHandler failing because of NPE
> 
>
> Key: HDDS-1681
> URL: https://issues.apache.org/jira/browse/HDDS-1681
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Mukul Kumar Singh
>Assignee: Nanda kumar
>Priority: Major
>  Labels: TriagePending
>
> {code}
> [INFO] Running org.apache.hadoop.hdds.scm.node.TestNodeReportHandler
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.469 
> s <<< FAILURE! - in org.apache.hadoop.hdds.scm.node.TestNodeReportHandler
> [ERROR] testNodeReport(org.apache.hadoop.hdds.scm.node.TestNodeReportHandler) 
>  Time elapsed: 0.31 s  <<< ERROR!
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdds.scm.node.SCMNodeManager.<init>(SCMNodeManager.java:122)
>   at 
> org.apache.hadoop.hdds.scm.node.TestNodeReportHandler.resetEventCollector(TestNodeReportHandler.java:53)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2011) TestRandomKeyGenerator fails due to timeout

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-2011:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> TestRandomKeyGenerator fails due to timeout
> ---
>
> Key: HDDS-2011
> URL: https://issues.apache.org/jira/browse/HDDS-2011
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Attila Doroszlai
>Priority: Major
> Attachments: 
> org.apache.hadoop.ozone.freon.TestRandomKeyGenerator-output.txt
>
>
> {{TestRandomKeyGenerator#bigFileThan2GB}} is failing intermittently due to 
> timeout in Ratis {{appendEntries}}.  Commit on pipeline fails, and new 
> pipeline cannot be created with 2 nodes (there are 5 nodes total).
> Most recent one: 
> https://github.com/elek/ozone-ci/tree/master/trunk/trunk-nightly-pz9vg/integration/hadoop-ozone/tools



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1343) TestNodeFailure times out intermittently

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-1343:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> TestNodeFailure times out intermittently
> 
>
> Key: HDDS-1343
> URL: https://issues.apache.org/jira/browse/HDDS-1343
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Lokesh Jain
>Priority: Major
>  Labels: TriagePending
>
> TestNodeFailure times out while waiting for cluster to be ready. This is done 
> in cluster setup.
> {code:java}
> java.lang.Thread.State: WAITING (on object monitor)
> at sun.misc.Unsafe.park(Native Method)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
>   at 
> org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:389)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl.waitForClusterToBeReady(MiniOzoneClusterImpl.java:140)
>   at 
> org.apache.hadoop.hdds.scm.pipeline.TestNodeFailure.init(TestNodeFailure.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}
> 5 datanodes out of 6 are able to heartbeat in the test result 
> [https://builds.apache.org/job/PreCommit-HDDS-Build/2582/testReport/].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2083) Fix TestQueryNode#testStaleNodesCount

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-2083:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Fix TestQueryNode#testStaleNodesCount
> -
>
> Key: HDDS-2083
> URL: https://issues.apache.org/jira/browse/HDDS-2083
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Dinesh Chitlangia
>Priority: Major
>  Labels: TriagePending
> Attachments: stacktrace.rtf
>
>
> It appears this test is failing due to several threads in waiting state.
> Attached complete stack trace.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1960) TestMiniChaosOzoneCluster may run until OOME

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-1960:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> TestMiniChaosOzoneCluster may run until OOME
> 
>
> Key: HDDS-1960
> URL: https://issues.apache.org/jira/browse/HDDS-1960
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Attila Doroszlai
>Priority: Critical
>  Labels: MiniOzoneChaosCluster
>
> {{TestMiniChaosOzoneCluster}} runs a load generator on a cluster for a nominal 
> 1 minute, but it may run indefinitely until the JVM crashes with an 
> OutOfMemoryError.
> In 0.4.1 nightly build it crashed 29/30 times (and no tests were executed in 
> the remaining one run due to some other error).
> Latest:
> https://github.com/elek/ozone-ci/blob/3f553ed6ad358ba61a302967617de737d7fea01a/byscane/byscane-nightly-wggqd/integration/output.log#L5661-L5662
> When it crashes, it leaves GBs of data lying around.
> HDDS-1952 disabled this test in CI runs.  It can still be run manually (eg. 
> {{mvn -Phdds -pl :hadoop-ozone-integration-test 
> -Dtest=TestMiniChaosOzoneCluster test}}).  The goal of this task is to 
> investigate the root cause of the runaway nature of this test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2085) TestBlockManager#testMultipleBlockAllocationWithClosedContainer timed out

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-2085:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> TestBlockManager#testMultipleBlockAllocationWithClosedContainer timed out
> -
>
> Key: HDDS-2085
> URL: https://issues.apache.org/jira/browse/HDDS-2085
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Dinesh Chitlangia
>Priority: Major
>  Labels: TriagePending, ozone-flaky-test
>
> {code:java}
> ---
> Test set: org.apache.hadoop.hdds.scm.block.TestBlockManager
> ---
> Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 7.697 s <<< 
> FAILURE! - in org.apache.hadoop.hdds.scm.block.TestBlockManager
> testMultipleBlockAllocationWithClosedContainer(org.apache.hadoop.hdds.scm.block.TestBlockManager)
>   Time elapsed: 3.619 s  <<< ERROR!
> java.util.concurrent.TimeoutException: 
> Timed out waiting for condition. Thread diagnostics:
> Timestamp: 2019-09-03 08:46:46,870
> "Socket Reader #1 for port 32840"  prio=5 tid=14 runnable
> java.lang.Thread.State: RUNNABLE
> at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
> at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
> at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
> at 
> org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:1097)
> at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:1076)
> "Socket Reader #1 for port 43576"  prio=5 tid=22 runnable
> java.lang.Thread.State: RUNNABLE
> at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
> at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
> at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
> at 
> org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:1097)
> at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:1076)
> "surefire-forkedjvm-command-thread" daemon prio=5 tid=8 runnable
> java.lang.Thread.State: RUNNABLE
> at java.io.FileInputStream.readBytes(Native Method)
> at java.io.FileInputStream.read(FileInputStream.java:255)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
> at java.io.DataInputStream.readInt(DataInputStream.java:387)
> at 
> org.apache.maven.surefire.booter.MasterProcessCommand.decode(MasterProcessCommand.java:115)
> at 
> org.apache.maven.surefire.booter.CommandReader$CommandRunnable.run(CommandReader.java:390)
> at java.lang.Thread.run(Thread.java:748)
> "surefire-forkedjvm-ping-30s" daemon prio=5 tid=9 timed_waiting
> java.lang.Thread.State: TIMED_WAITING
> at sun.misc.Unsafe.park(Native Method)
> at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> "Thread-15" daemon prio=5 tid=30 timed_waiting
> java.lang.Thread.State: TIMED_WAITING
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.hdds.scm.safemode.SafeModeHandler.lambda$onMessage$0(SafeModeHandler.java:114)
> at 
> org.apache.hadoop.hdds.scm.safemode.SafeModeHandler$$Lambda$33/1541519391.run(Unknown
>  Source)
> at java.lang.Thread.run(Thread.java:748)
> "process reaper" daemon prio=10 tid=10 timed_waiting
> java.lang.Thread.State: TIMED_WAITING
> 

[jira] [Updated] (HDDS-1936) ozonesecure s3 test fails intermittently

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-1936:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> ozonesecure s3 test fails intermittently
> 
>
> Key: HDDS-1936
> URL: https://issues.apache.org/jira/browse/HDDS-1936
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Attila Doroszlai
>Priority: Major
>  Labels: TriagePending
>
> Sometimes acceptance tests fail at ozonesecure s3 test, starting with:
> {code:title=https://ci.anzix.net/job/ozone/17607/artifact/hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/compose/result/log.html#s1-s18-s1-t1-k3-k1-k2}
> Completed 29 Bytes/29 Bytes (6 Bytes/s) with 1 file(s) remaining
> upload failed: ../../tmp/testfile to s3://bucket-07853/testfile An error 
> occurred (500) when calling the PutObject operation (reached max retries: 4): 
> Internal Server Error
> {code}
> followed by:
> {code:title=https://ci.anzix.net/job/ozone/17607/artifact/hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/compose/result/log.html#s1-s18-s5-t1}
> ('Connection aborted.', error(32, 'Broken pipe'))
> {code}
> in subsequent test cases.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2263) Intermittent failure in TestOzoneContainer#testContainerCreateDiskFull

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-2263:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Intermittent failure in TestOzoneContainer#testContainerCreateDiskFull
> --
>
> Key: HDDS-2263
> URL: https://issues.apache.org/jira/browse/HDDS-2263
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Attila Doroszlai
>Priority: Minor
>
> {code:title=https://github.com/elek/ozone-ci-q4/blob/9bdc8cbd50e9a46c193da288ebea74de8aaea094/pr/pr-hdds-2239-kl4xt/unit/hadoop-hdds/container-service/org.apache.hadoop.ozone.container.ozoneimpl.TestOzoneContainer.txt#L4-L11}
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 16.246 s <<< 
> FAILURE! - in org.apache.hadoop.ozone.container.ozoneimpl.TestOzoneContainer
> testContainerCreateDiskFull(org.apache.hadoop.ozone.container.ozoneimpl.TestOzoneContainer)
>   Time elapsed: 11.562 s  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
> ...
>   at 
> org.apache.hadoop.ozone.container.ozoneimpl.TestOzoneContainer.testContainerCreateDiskFull(TestOzoneContainer.java:176)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2961) Intermittent failure in TestSCMContainerPlacementPolicyMetrics

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-2961:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Intermittent failure in TestSCMContainerPlacementPolicyMetrics
> --
>
> Key: HDDS-2961
> URL: https://issues.apache.org/jira/browse/HDDS-2961
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Attila Doroszlai
>Priority: Major
>
> TestSCMContainerPlacementPolicyMetrics fails intermittently, probably due to 
> usage of {{Thread.sleep}}.
> {code:title=https://github.com/adoroszlai/hadoop-ozone/runs/417318332}
> 2020-01-30T14:46:16.2217714Z [INFO] Running 
> org.apache.hadoop.ozone.scm.TestSCMContainerPlacementPolicyMetrics
> 2020-01-30T14:47:46.5509550Z [ERROR] Tests run: 1, Failures: 0, Errors: 1, 
> Skipped: 0, Time elapsed: 90.285 s <<< FAILURE! - in 
> org.apache.hadoop.ozone.scm.TestSCMContainerPlacementPolicyMetrics
> 2020-01-30T14:47:46.5532268Z [ERROR] 
> test(org.apache.hadoop.ozone.scm.TestSCMContainerPlacementPolicyMetrics)  
> Time elapsed: 90.147 s  <<< ERROR!
> 2020-01-30T14:47:46.5584246Z java.lang.Exception: test timed out after 6 
> milliseconds
> 2020-01-30T14:47:46.5588764Z  at java.lang.Thread.sleep(Native Method)
> 2020-01-30T14:47:46.5605748Z  at 
> org.apache.hadoop.ozone.scm.TestSCMContainerPlacementPolicyMetrics.test(TestSCMContainerPlacementPolicyMetrics.java:135)
> {code}
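
As a general pattern rather than the actual fix, a bounded poll via 
{{GenericTestUtils.waitFor}} (already visible in other traces in this thread) 
usually removes this kind of sleep-based flakiness. The condition below is a 
made-up stand-in, and the three-argument waitFor signature is assumed from its 
use elsewhere in these stack traces.

{code:java}
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicLong;
import org.apache.hadoop.test.GenericTestUtils;

public class WaitForInsteadOfSleepSketch {

  // Stand-in for whatever metric or state the real test waits on.
  private static final AtomicLong placementCount = new AtomicLong();

  static void waitForPlacement() throws TimeoutException, InterruptedException {
    // Poll every 100 ms and fail after 30 s, instead of sleeping a fixed
    // amount and hoping the metric has been updated by then.
    GenericTestUtils.waitFor(() -> placementCount.get() > 0, 100, 30_000);
  }

  public static void main(String[] args) throws Exception {
    placementCount.set(5);   // simulate the event the test is waiting for
    waitForPlacement();
    System.out.println("condition reached");
  }
}
{code}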



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2986) MiniOzoneChaosCluster exits because of datanode shutdown

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-2986:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> MiniOzoneChaosCluster exits because of datanode shutdown
> 
>
> Key: HDDS-2986
> URL: https://issues.apache.org/jira/browse/HDDS-2986
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Datanode
>Reporter: Mukul Kumar Singh
>Assignee: Siddharth Wagle
>Priority: Major
>  Labels: MiniOzoneChaosCluster
> Attachments: HDDS-2986.test.patch
>
>
> MiniOzoneChaosCluster exits because of datanode shutdown
> {code}
> 2020-02-06 18:50:50,760 [Datanode State Machine Thread - 0] ERROR 
> statemachine.DatanodeStateMachine (DatanodeStateMachine.java:start(219)) - 
> DatanodeStateMachine Shutdown due to an critical error
> 2020-02-06 18:50:50,772 [RatisApplyTransactionExecutor 0] INFO  
> interfaces.Container (KeyValueContainer.java:flushAndSyncDB(400)) - Container 
> 30 is synced with bcsId 7490.
> 2020-02-06 18:50:50,774 [RatisApplyTransactionExecutor 0] INFO  
> interfaces.Container (KeyValueContainer.java:close(338)) - Container 30 is 
> closed with bcsId 7490.
> 2020-02-06 18:50:50,774 [Datanode State Machine Thread - 0] ERROR 
> report.ReportManager (ReportManager.java:shutdown(82)) - Failed to shutdown 
> Report Manager
> java.lang.InterruptedException
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
> at 
> java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1475)
> at 
> org.apache.hadoop.ozone.container.common.report.ReportManager.shutdown(ReportManager.java:80)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.stopDaemon(DatanodeStateMachine.java:411)
> at 
> org.apache.hadoop.ozone.HddsDatanodeService.stop(HddsDatanodeService.java:474)
> at 
> org.apache.hadoop.ozone.HddsDatanodeService.terminateDatanode(HddsDatanodeService.java:454)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.start(DatanodeStateMachine.java:220)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$startDaemon$0(DatanodeStateMachine.java:365)
> at java.lang.Thread.run(Thread.java:748)
> 2020-02-06 18:50:50,774 [Datanode State Machine Thread - 0] ERROR 
> statemachine.DatanodeStateMachine (DatanodeStateMachine.java:close(272)) - 
> Error attempting to shutdown.
> java.lang.InterruptedException
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2067)
> at 
> java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1475)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.close(DatanodeStateMachine.java:264)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.stopDaemon(DatanodeStateMachine.java:412)
> at 
> org.apache.hadoop.ozone.HddsDatanodeService.stop(HddsDatanodeService.java:474)
> at 
> org.apache.hadoop.ozone.HddsDatanodeService.terminateDatanode(HddsDatanodeService.java:454)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.start(DatanodeStateMachine.java:220)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$startDaemon$0(DatanodeStateMachine.java:365)
> at java.lang.Thread.run(Thread.java:748)
> {code}
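
The traces above show InterruptedException escaping from awaitTermination on the 
shutdown path. A common defensive pattern, sketched generically below and not 
taken from the Ozone code, is to restore the interrupt flag and fall back to 
shutdownNow so the shutdown still completes.

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class GracefulShutdownSketch {

  // Stops an executor while tolerating interruption: re-assert the interrupt
  // flag and force-stop the pool instead of propagating the exception.
  static void shutdownQuietly(ExecutorService executor) {
    executor.shutdown();
    try {
      if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
        executor.shutdownNow();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();  // keep the interrupt visible to callers
      executor.shutdownNow();              // best-effort hard stop
    }
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    pool.submit(() -> { /* pretend report publisher work */ });
    shutdownQuietly(pool);
    System.out.println("shutdown requested, terminated=" + pool.isTerminated());
  }
}
{code}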



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3039) SCM sometimes cannot exit safe mode

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3039:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> SCM sometimes cannot exit safe mode
> ---
>
> Key: HDDS-3039
> URL: https://issues.apache.org/jira/browse/HDDS-3039
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Reporter: Attila Doroszlai
>Priority: Critical
>  Labels: Triaged
>
> Sometimes SCM cannot exit safe mode:
> {code:title=https://github.com/apache/hadoop-ozone/pull/563/checks?check_run_id=453543576}
> 2020-02-18T19:12:28.1108180Z [ERROR] Tests run: 1, Failures: 0, Errors: 1, 
> Skipped: 0, Time elapsed: 139.821 s <<< FAILURE! - in 
> org.apache.hadoop.ozone.fsck.TestContainerMapper
> 2020-02-18T19:12:28.1169327Z [ERROR] 
> org.apache.hadoop.ozone.fsck.TestContainerMapper  Time elapsed: 139.813 s  
> <<< ERROR!
> 2020-02-18T19:12:28.1202534Z java.util.concurrent.TimeoutException: 
> ...
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl.waitForClusterToBeReady(MiniOzoneClusterImpl.java:164)
>   at 
> org.apache.hadoop.ozone.fsck.TestContainerMapper.init(TestContainerMapper.java:71)
> {code}
> despite nodes and pipeline being ready:
> {code}
> 2020-02-18 19:10:18,045 [main] INFO  ozone.MiniOzoneClusterImpl 
> (MiniOzoneClusterImpl.java:lambda$waitForClusterToBeReady$0(169)) - Nodes are 
> ready. Got 3 of 3 DN Heartbeats.
> ...
> 2020-02-18 19:10:18,847 [RatisPipelineUtilsThread] INFO  
> pipeline.PipelineStateManager (PipelineStateManager.java:addPipeline(54)) - 
> Created pipeline Pipeline[ Id: b56478a3-8816-459e-a007-db5ee4a5572e, Nodes: 
> 86e97873-2dbd-4f1b-b418-cf9fba405476{ip: 172.17.0.2, host: bedb6e0ff851, 
> networkLocation: /default-rack, certSerialId: 
> null}0fb407c1-4cda-4b3e-8e64-20c845872684{ip: 172.17.0.2, host: bedb6e0ff851, 
> networkLocation: /default-rack, certSerialId: 
> null}31baa82d-441c-41be-94c9-8dd7468b728e{ip: 172.17.0.2, host: bedb6e0ff851, 
> networkLocation: /default-rack, certSerialId: null}, Type:RATIS, 
> Factor:THREE, State:ALLOCATED, leaderId:null ]
> ...
> 2020-02-18 19:12:17,108 [main] INFO  ozone.MiniOzoneClusterImpl 
> (MiniOzoneClusterImpl.java:lambda$waitForClusterToBeReady$0(169)) - Nodes are 
> ready. Got 3 of 3 DN Heartbeats.
> 2020-02-18 19:12:17,108 [main] INFO  ozone.MiniOzoneClusterImpl 
> (MiniOzoneClusterImpl.java:lambda$waitForClusterToBeReady$0(172)) - Waiting 
> for cluster to exit safe mode
> 2020-02-18 19:12:17,151 [main] INFO  ozone.MiniOzoneClusterImpl 
> (MiniOzoneClusterImpl.java:shutdown(370)) - Shutting down the Mini Ozone 
> Cluster
> {code}
> [~shashikant] also noticed this in other integration tests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3131) TestMiniChaosOzoneCluster timeout

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3131:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> TestMiniChaosOzoneCluster timeout
> -
>
> Key: HDDS-3131
> URL: https://issues.apache.org/jira/browse/HDDS-3131
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Attila Doroszlai
>Priority: Critical
> Attachments: unit (1).zip, unit (2).zip
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> TestMiniChaosOzoneCluster times out in CI runs rather frequently:
> https://github.com/apache/hadoop-ozone/runs/486890736
> https://github.com/apache/hadoop-ozone/runs/486890004
> https://github.com/apache/hadoop-ozone/runs/486836962
> Logs are available in "unit" artifacts.
> CC [~msingh]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3145) Intermittent failure in TestEndPoint#testGetVersionAssertRpcTimeOut

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3145:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Intermittent failure in TestEndPoint#testGetVersionAssertRpcTimeOut
> ---
>
> Key: HDDS-3145
> URL: https://issues.apache.org/jira/browse/HDDS-3145
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Attila Doroszlai
>Priority: Major
>
> {{TestEndPoint#testGetVersionAssertRpcTimeOut}} fails intermittently because 
> the measured response time exceeds the asserted bound:
> {code:title=https://github.com/adoroszlai/hadoop-ozone/runs/495215641}
> [ERROR] Tests run: 15, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 6.898 s <<< FAILURE! - in 
> org.apache.hadoop.ozone.container.common.TestEndPoint
> 2020-03-09T13:57:54.6010903Z [ERROR] 
> testGetVersionAssertRpcTimeOut(org.apache.hadoop.ozone.container.common.TestEndPoint)
>   Time elapsed: 1.168 s  <<< FAILURE!
> Expected: a value less than or equal to <1100L>
>  but: <1107L> was greater than <1100L>
>   ...
>   at 
> org.apache.hadoop.ozone.container.common.TestEndPoint.testGetVersionAssertRpcTimeOut(TestEndPoint.java:261)
> {code}
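
The failure above is a wall-clock assertion missing its bound by a few 
milliseconds. One common mitigation, shown only as a sketch with illustrative 
numbers rather than a proposed fix, is to fold expected scheduling jitter into 
the Hamcrest bound.

{code:java}
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.lessThanOrEqualTo;

public class RpcTimeoutAssertionSketch {

  // Assumed: the client is configured with a 1000 ms RPC timeout and the test
  // measures how long the failing call actually took.
  private static final long RPC_TIMEOUT_MS = 1000;
  private static final long SCHEDULING_JITTER_MS = 200;

  static void assertTimedOutPromptly(long elapsedMs) {
    assertThat(elapsedMs, lessThanOrEqualTo(RPC_TIMEOUT_MS + SCHEDULING_JITTER_MS));
  }

  public static void main(String[] args) {
    assertTimedOutPromptly(1107);  // the value from the report now passes
  }
}
{code}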



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3146) Intermittent timeout in TestOzoneRpcClient

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3146:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Intermittent timeout in TestOzoneRpcClient
> --
>
> Key: HDDS-3146
> URL: https://issues.apache.org/jira/browse/HDDS-3146
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Attila Doroszlai
>Priority: Major
> Attachments: 
> org.apache.hadoop.ozone.client.rpc.TestOzoneRpcClient-output.txt
>
>
> {code:title=https://github.com/apache/hadoop-ozone/runs/495197228}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M1:test (default-test) 
> on project hadoop-ozone-integration-test: There was a timeout or other error 
> in the fork
> ...
> org.apache.hadoop.ozone.client.rpc.TestOzoneRpcClient
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3265) Intermittent timeout in TestRatisPipelineLeader

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3265:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Intermittent timeout in TestRatisPipelineLeader
> ---
>
> Key: HDDS-3265
> URL: https://issues.apache.org/jira/browse/HDDS-3265
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Attila Doroszlai
>Assignee: Siddharth Wagle
>Priority: Major
> Attachments: 
> TEST-org.apache.hadoop.hdds.scm.TestRatisPipelineLeader.xml, 
> org.apache.hadoop.hdds.scm.TestRatisPipelineLeader-output.txt, 
> org.apache.hadoop.hdds.scm.TestRatisPipelineLeader-output.txt, 
> org.apache.hadoop.hdds.scm.TestRatisPipelineLeader.txt
>
>
> TestRatisPipelineLeader sometimes times out while waiting to create a container:
> {code:title=https://github.com/apache/hadoop-ozone/runs/527779039}
> [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 174.712 s <<< FAILURE! - in org.apache.hadoop.hdds.scm.TestRatisPipelineLeader
> [ERROR] 
> testLeaderIdUsedOnFirstCall(org.apache.hadoop.hdds.scm.TestRatisPipelineLeader)
>   Time elapsed: 120.01 s  <<< ERROR!
> java.lang.Exception: test timed out after 12 milliseconds
> ...
>   at 
> org.apache.hadoop.hdds.scm.XceiverClientSpi.sendCommand(XceiverClientSpi.java:134)
>   at 
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.createContainer(ContainerProtocolCalls.java:406)
>   at 
> org.apache.hadoop.hdds.scm.TestRatisPipelineLeader.testLeaderIdUsedOnFirstCall(TestRatisPipelineLeader.java:100)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3950) FLAKY-UT: TestFailureHandlingByClientFlushDelay timeout

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3950:
-
Target Version/s: 0.7.0

> FLAKY-UT: TestFailureHandlingByClientFlushDelay timeout
> ---
>
> Key: HDDS-3950
> URL: https://issues.apache.org/jira/browse/HDDS-3950
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 0.6.0
>Reporter: maobaolong
>Priority: Major
>
> {code:java}
> [INFO] Running 
> org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay[INFO]
>  Running 
> org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay[ERROR]
>  Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 94.969 s <<< 
> FAILURE! - in 
> org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay[ERROR]
>  
> testPipelineExclusionWithPipelineFailure(org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay)
>   Time elapsed: 94.881 s  <<< FAILURE!java.lang.AssertionError at 
> org.junit.Assert.fail(Assert.java:86) at 
> org.junit.Assert.assertTrue(Assert.java:41) at 
> org.junit.Assert.assertTrue(Assert.java:52) at 
> org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay.testPipelineExclusionWithPipelineFailure(TestFailureHandlingByClientFlushDelay.java:200)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
> at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3841) Stale tests(timeout or other reason)

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3841:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Test)

> Stale tests(timeout or other reason)
> 
>
> Key: HDDS-3841
> URL: https://issues.apache.org/jira/browse/HDDS-3841
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 0.7.0
>Reporter: maobaolong
>Priority: Major
>
> If a failing test appears in your CI checks and you are sure it is not 
> related to your PR, paste the stale test log here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3841) Stale tests(timeout or other reason)

2020-07-09 Thread maobaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maobaolong updated HDDS-3841:
-
Summary: Stale tests(timeout or other reason)  (was: FLAKY-UT: 
TestSecureOzoneRpcClient timeout)

> Stale tests(timeout or other reason)
> 
>
> Key: HDDS-3841
> URL: https://issues.apache.org/jira/browse/HDDS-3841
> Project: Hadoop Distributed Data Store
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.7.0
>Reporter: maobaolong
>Priority: Major
>
> If a failing test appears in your CI checks and you are sure it is not 
> related to your PR, paste the stale test log here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3950) FLAKY-UT: TestFailureHandlingByClientFlushDelay timeout

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3950:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Improvement)

> FLAKY-UT: TestFailureHandlingByClientFlushDelay timeout
> ---
>
> Key: HDDS-3950
> URL: https://issues.apache.org/jira/browse/HDDS-3950
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 0.6.0
>Reporter: maobaolong
>Priority: Major
>
> {code:java}
> [INFO] Running 
> org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay[INFO]
>  Running 
> org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay[ERROR]
>  Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 94.969 s <<< 
> FAILURE! - in 
> org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay[ERROR]
>  
> testPipelineExclusionWithPipelineFailure(org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay)
>   Time elapsed: 94.881 s  <<< FAILURE!java.lang.AssertionError at 
> org.junit.Assert.fail(Assert.java:86) at 
> org.junit.Assert.assertTrue(Assert.java:41) at 
> org.junit.Assert.assertTrue(Assert.java:52) at 
> org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay.testPipelineExclusionWithPipelineFailure(TestFailureHandlingByClientFlushDelay.java:200)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
> at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-3950) FLAKY-UT: TestFailureHandlingByClientFlushDelay timeout

2020-07-09 Thread maobaolong (Jira)
maobaolong created HDDS-3950:


 Summary: FLAKY-UT: TestFailureHandlingByClientFlushDelay timeout
 Key: HDDS-3950
 URL: https://issues.apache.org/jira/browse/HDDS-3950
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: test
Affects Versions: 0.6.0
Reporter: maobaolong


{code:java}
[INFO] Running 
org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay[INFO] 
Running 
org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay[ERROR] 
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 94.969 s <<< 
FAILURE! - in 
org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay[ERROR] 
testPipelineExclusionWithPipelineFailure(org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay)
  Time elapsed: 94.881 s  <<< FAILURE!java.lang.AssertionError at 
org.junit.Assert.fail(Assert.java:86) at 
org.junit.Assert.assertTrue(Assert.java:41) at 
org.junit.Assert.assertTrue(Assert.java:52) at 
org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClientFlushDelay.testPipelineExclusionWithPipelineFailure(TestFailureHandlingByClientFlushDelay.java:200)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498) at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{code}
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3907) Topology related acceptance test is flaky

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3907:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Topology related acceptance test is flaky
> -
>
> Key: HDDS-3907
> URL: https://issues.apache.org/jira/browse/HDDS-3907
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Marton Elek
>Priority: Blocker
>
> Examples:
> https://github.com/elek/ozone-build-results/tree/master/2020/06/30/1318/acceptance
> https://github.com/elek/ozone-build-results/tree/master/2020/06/30/1321/acceptance
> https://github.com/elek/ozone-build-results/tree/master/2020/06/30/1334/acceptance
> Some strange errors:
> {code}
> scm_1 | 2020-06-30 19:17:50,787 [RatisPipelineUtilsThread] ERROR 
> pipeline.SCMPipelineManager: Failed to create pipeline of type RATIS and 
> factor ONE. Exception: Cannot create pipeline of factor 1 using 0 nodes. Used 
> 6 nodes. Healthy nodes 6
> scm_1 | 2020-06-30 19:17:50,788 [RatisPipelineUtilsThread] ERROR 
> pipeline.SCMPipelineManager: Failed to create pipeline of type RATIS and 
> factor THREE. Exception: Pipeline creation failed because nodes are engaged 
> in other pipelines and every node can only be engaged in max 2 pipelines. 
> Required 3. Found 0
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1435) ITestOzoneContractOpen.testOpenFileTwice is flaky

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-1435:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> ITestOzoneContractOpen.testOpenFileTwice is flaky
> -
>
> Key: HDDS-1435
> URL: https://issues.apache.org/jira/browse/HDDS-1435
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Marton Elek
>Priority: Major
>  Labels: TriagePending, ozone-flaky-test
> Attachments: ci.log, ci2.log
>
>
> It is failed with:
> {code}
> java.lang.Exception: test timed out after 3 milliseconds
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:382)
>   at 
> org.apache.hadoop.hdds.scm.pipeline.TestRatisPipelineUtils.waitForPipelines(TestRatisPipelineUtils.java:125)
>   at 
> org.apache.hadoop.hdds.scm.pipeline.TestRatisPipelineUtils.testPipelineCreationOnNodeRestart(TestRatisPipelineUtils.java:120)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}
> Log file is uploaded.
> AFAIK the problem is that RatisPipelineUtils.createPipeline connected to a 
> non-leader node.
> But it's not clear to me how that is possible. According to the log (attached), 
> the leader election continued after the NonLeaderException. It may just be a 
> sign that one datanode was missing from the pipeline...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3841) FLAKY-UT: TestSecureOzoneRpcClient timeout

2020-07-09 Thread maobaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maobaolong updated HDDS-3841:
-
Summary: FLAKY-UT: TestSecureOzoneRpcClient timeout  (was: Stale 
tests(timeout or other reason))

> FLAKY-UT: TestSecureOzoneRpcClient timeout
> --
>
> Key: HDDS-3841
> URL: https://issues.apache.org/jira/browse/HDDS-3841
> Project: Hadoop Distributed Data Store
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.7.0
>Reporter: maobaolong
>Priority: Major
>
> If a failing test appears in your CI checks and you are sure it is not 
> related to your PR, paste the stale test log here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1967) TestBlockOutputStreamWithFailures is flaky

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-1967:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> TestBlockOutputStreamWithFailures is flaky
> --
>
> Key: HDDS-1967
> URL: https://issues.apache.org/jira/browse/HDDS-1967
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Nanda kumar
>Priority: Major
>  Labels: TriagePending, ozone-flaky-test
> Attachments: 
> TEST-org.apache.hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures.xml,
>  
> org.apache.hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures-output.txt,
>  org.apache.hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures.txt
>
>
> {{TestBlockOutputStreamWithFailures}} is flaky. 
> {noformat}
> [ERROR] 
> test2DatanodesFailure(org.apache.hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures)
>   Time elapsed: 23.816 s  <<< FAILURE!
> java.lang.AssertionError: expected:<4> but was:<8>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.junit.Assert.assertEquals(Assert.java:542)
> at 
> org.apache.hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures.test2DatanodesFailure(TestBlockOutputStreamWithFailures.java:425)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {noformat}
> {noformat}
> [ERROR] 
> testWatchForCommitDatanodeFailure(org.apache.hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures)
>   Time elapsed: 30.895 s  <<< FAILURE!
> java.lang.AssertionError: expected:<2> but was:<3>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.junit.Assert.assertEquals(Assert.java:542)
> at 
> org.apache.hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures.testWatchForCommitDatanodeFailure(TestBlockOutputStreamWithFailures.java:366)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> 

[jira] [Updated] (HDDS-1656) Fix flaky test TestWatchForCommit#testWatchForCommitWithKeyWrite

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-1656:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Fix flaky test TestWatchForCommit#testWatchForCommitWithKeyWrite
> 
>
> Key: HDDS-1656
> URL: https://issues.apache.org/jira/browse/HDDS-1656
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Ajay Kumar
>Priority: Major
>  Labels: TriagePending
>
> TestWatchForCommit#testWatchForCommitWithKeyWrite is not consistent across 
> runs: sometimes it passes, while on other occasions it fails with different 
> assertion or DN-related errors.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1409) TestOzoneClientRetriesOnException is flaky

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-1409:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Test)

> TestOzoneClientRetriesOnException is flaky
> --
>
> Key: HDDS-1409
> URL: https://issues.apache.org/jira/browse/HDDS-1409
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Nanda kumar
>Priority: Major
>  Labels: TriagePending, ozone-flaky-test
>
> TestOzoneClientRetriesOnException is flaky, we get the below exception when 
> it fails.
> {noformat}
> [ERROR] 
> testMaxRetriesByOzoneClient(org.apache.hadoop.ozone.client.rpc.TestOzoneClientRetriesOnException)
>   Time elapsed: 16.227 s  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.ozone.client.rpc.TestOzoneClientRetriesOnException.testMaxRetriesByOzoneClient(TestOzoneClientRetriesOnException.java:197)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2082) Fix flaky TestContainerStateMachineFailures#testApplyTransactionFailure

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-2082:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Fix flaky TestContainerStateMachineFailures#testApplyTransactionFailure
> ---
>
> Key: HDDS-2082
> URL: https://issues.apache.org/jira/browse/HDDS-2082
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Dinesh Chitlangia
>Priority: Major
>  Labels: TriagePending, ozone-flaky-test
>
> {code:java}
> ---
> Test set: org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailures
> ---
> Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 102.615 s <<< 
> FAILURE! - in 
> org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailures
> testApplyTransactionFailure(org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailures)
>   Time elapsed: 15.677 s  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailures.testApplyTransactionFailure(TestContainerStateMachineFailures.java:349)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3151) Intermittent timeout in TestCloseContainerHandlingByClient#testMultiBlockWrites3

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3151:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Intermittent timeout in 
> TestCloseContainerHandlingByClient#testMultiBlockWrites3
> 
>
> Key: HDDS-3151
> URL: https://issues.apache.org/jira/browse/HDDS-3151
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Attila Doroszlai
>Priority: Major
> Attachments: 
> org.apache.hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient-output.txt,
>  org.apache.hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient.txt
>
>
> {code:title=https://github.com/apache/hadoop-ozone/runs/495906854}
> Tests run: 8, Failures: 0, Errors: 1, Skipped: 1, Time elapsed: 210.963 s <<< 
> FAILURE! - in 
> org.apache.hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient
> testMultiBlockWrites3(org.apache.hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient)
>   Time elapsed: 108.777 s  <<< ERROR!
> java.util.concurrent.TimeoutException:
> ...
>   at 
> org.apache.hadoop.ozone.container.TestHelper.waitForContainerClose(TestHelper.java:251)
>   at 
> org.apache.hadoop.ozone.container.TestHelper.waitForContainerClose(TestHelper.java:151)
>   at 
> org.apache.hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient.waitForContainerClose(TestCloseContainerHandlingByClient.java:342)
>   at 
> org.apache.hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient.testMultiBlockWrites3(TestCloseContainerHandlingByClient.java:310)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3313) OM HA acceptance test is flaky

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3313:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> OM HA acceptance test is flaky
> --
>
> Key: HDDS-3313
> URL: https://issues.apache.org/jira/browse/HDDS-3313
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Attila Doroszlai
>Assignee: Hanisha Koneru
>Priority: Critical
> Attachments: acceptance.zip
>
>
> {{ozone-om-ha}} test is failing intermittently.  Example on master: 
> https://github.com/apache/hadoop-ozone/runs/549544110
> {code:title=failure 1}
> 2020-03-31T19:34:02.3757399Z 
> ==
> 2020-03-31T19:34:02.3762775Z ozone-om-ha-testOMHA :: Smoketest ozone cluster 
> startup   
> 2020-03-31T19:34:02.3763313Z 
> ==
> 2020-03-31T19:34:07.9174050Z Stop Leader OM and Verify Failover   
>  | FAIL |
> 2020-03-31T19:34:07.9174675Z 255 != 0
> 2020-03-31T19:34:07.9176048Z 
> --
> 2020-03-31T19:34:37.4682717Z Test Multiple Failovers  
>  | FAIL |
> 2020-03-31T19:34:37.4682899Z 1 != 0
> 2020-03-31T19:34:37.4683766Z 
> --
> 2020-03-31T19:35:24.9569154Z Restart OM and Verify Ratis Logs 
>  | FAIL |
> 2020-03-31T19:35:24.9569529Z 255 != 0
> 2020-03-31T19:35:24.9574925Z 
> --
> 2020-03-31T19:35:24.9575613Z ozone-om-ha-testOMHA :: Smoketest ozone cluster 
> startup   | FAIL |
> 2020-03-31T19:35:24.9575952Z 3 critical tests, 0 passed, 3 failed
> 2020-03-31T19:35:24.9576076Z 3 tests total, 0 passed, 3 failed
> {code}
> {code:title=failure 2}
> 2020-03-31T20:36:29.5715868Z 
> ==
> 2020-03-31T20:36:29.5743517Z ozone-om-ha-testOMHA :: Smoketest ozone cluster 
> startup   
> 2020-03-31T20:36:29.5744025Z 
> ==
> 2020-03-31T20:37:08.4625840Z Stop Leader OM and Verify Failover   
>  | PASS |
> 2020-03-31T20:37:08.4626644Z 
> --
> 2020-03-31T20:39:47.9721513Z Test Multiple Failovers  
>  | PASS |
> 2020-03-31T20:39:47.9723424Z 
> --
> 2020-03-31T21:25:29.1203036Z Restart OM and Verify Ratis Logs 
>  | FAIL |
> 2020-03-31T21:25:29.1204001Z Test timeout 8 minutes exceeded.
> 2020-03-31T21:25:29.1204954Z 
> --
> 2020-03-31T21:25:29.1220689Z ozone-om-ha-testOMHA :: Smoketest ozone cluster 
> startup   | FAIL |
> 2020-03-31T21:25:29.1224446Z 3 critical tests, 2 passed, 1 failed
> 2020-03-31T21:25:29.1224833Z 3 tests total, 2 passed, 1 failed
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2760) Intermittent timeout in TestCloseContainerEventHandler

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-2760:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Intermittent timeout in TestCloseContainerEventHandler
> --
>
> Key: HDDS-2760
> URL: https://issues.apache.org/jira/browse/HDDS-2760
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Attila Doroszlai
>Priority: Minor
>  Labels: TriagePending, ozone-flaky-test
>
> TestCloseContainerEventHandler depends on the wall clock and fails intermittently:
> {code}
> 2019-12-17T11:29:56.1873334Z [INFO] Running 
> org.apache.hadoop.hdds.scm.container.TestCloseContainerEventHandler
> 2019-12-17T11:31:10.0593259Z [ERROR] Tests run: 4, Failures: 1, Errors: 0, 
> Skipped: 0, Time elapsed: 71.343 s <<< FAILURE! - in 
> org.apache.hadoop.hdds.scm.container.TestCloseContainerEventHandler
> 2019-12-17T11:31:10.0604096Z [ERROR] 
> testCloseContainerEventWithRatis(org.apache.hadoop.hdds.scm.container.TestCloseContainerEventHandler)
>   Time elapsed: 66.214 s  <<< FAILURE!
> 2019-12-17T11:31:10.0604347Z java.lang.AssertionError: Messages are not 
> processed in the given timeframe. Queued: 5 Processed: 0
> 2019-12-17T11:31:10.0614937Z  at 
> org.apache.hadoop.hdds.server.events.EventQueue.processAll(EventQueue.java:238)
> 2019-12-17T11:31:10.0616610Z  at 
> org.apache.hadoop.hdds.scm.container.TestCloseContainerEventHandler.testCloseContainerEventWithRatis(TestCloseContainerEventHandler.java:149)
> {code}
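
The flakiness described above comes from expecting queued events to be processed within a fixed wall-clock window. A minimal sketch of the usual remedy, assuming only that hadoop-common's GenericTestUtils is on the test classpath; the AtomicInteger below stands in for whatever counter the real event handler would update:

{code:java}
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.hadoop.test.GenericTestUtils;

public class DeterministicWaitSketch {
  public static void main(String[] args) throws TimeoutException, InterruptedException {
    // Stand-in for "events processed so far"; in the real test the event
    // handler itself would increment this counter.
    AtomicInteger processed = new AtomicInteger();

    // Something asynchronous completes at an unpredictable time.
    new Thread(() -> {
      try {
        Thread.sleep(500);
      } catch (InterruptedException ignored) {
        Thread.currentThread().interrupt();
      }
      processed.incrementAndGet();
    }).start();

    // Poll until the condition holds (or a generous timeout expires) instead
    // of asserting that it became true within a fixed wall-clock window.
    GenericTestUtils.waitFor(() -> processed.get() >= 1, 100, 30_000);
  }
}
{code}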



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3294) Flaky test TestContainerStateMachineFailureOnRead#testReadStateMachineFailureClosesPipeline

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3294:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Flaky test 
> TestContainerStateMachineFailureOnRead#testReadStateMachineFailureClosesPipeline
> ---
>
> Key: HDDS-3294
> URL: https://issues.apache.org/jira/browse/HDDS-3294
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Siyao Meng
>Priority: Major
>  Labels: TriagePending, flaky-test, ozone-flaky-test
> Attachments: 
> org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailureOnRead-output.txt,
>  org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailureOnRead.txt
>
>
> Shows up in a PR: https://github.com/apache/hadoop-ozone/runs/540133363
> {code:title=log}
> [INFO] Running 
> org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailureOnRead
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 49.766 s <<< FAILURE! - in 
> org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailureOnRead
> [ERROR] 
> testReadStateMachineFailureClosesPipeline(org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailureOnRead)
>   Time elapsed: 49.623 s  <<< ERROR!
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailureOnRead.testReadStateMachineFailureClosesPipeline(TestContainerStateMachineFailureOnRead.java:204)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}
> {code:title=Location of NPE at 
> TestContainerStateMachineFailureOnRead.java:204}
> // delete the container dir from leader
> FileUtil.fullyDelete(new File(
> leaderDn.get().getDatanodeStateMachine()
> .getContainer().getContainerSet()
> 
> .getContainer(omKeyLocationInfo.getContainerID()).getContainerData() <-- this 
> line
> .getContainerPath()));
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] timmylicheng commented on a change in pull request #1049: HDDS-3662 Decouple finalizeAndDestroyPipeline.

2020-07-09 Thread GitBox


timmylicheng commented on a change in pull request #1049:
URL: https://github.com/apache/hadoop-ozone/pull/1049#discussion_r452602085



##
File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/PipelineManagerV2Impl.java
##
@@ -410,18 +399,29 @@ public void scrubPipeline(ReplicationType type, 
ReplicationFactor factor)
 ScmConfigKeys.OZONE_SCM_PIPELINE_ALLOCATED_TIMEOUT,
 ScmConfigKeys.OZONE_SCM_PIPELINE_ALLOCATED_TIMEOUT_DEFAULT,
 TimeUnit.MILLISECONDS);
-List<Pipeline> needToSrubPipelines = stateManager.getPipelines(type, 
factor,
-Pipeline.PipelineState.ALLOCATED).stream()
-.filter(p -> currentTime.toEpochMilli() - p.getCreationTimestamp()
-.toEpochMilli() >= pipelineScrubTimeoutInMills)
-.collect(Collectors.toList());
-for (Pipeline p : needToSrubPipelines) {
-  LOG.info("Scrubbing pipeline: id: " + p.getId().toString() +
-  " since it stays at ALLOCATED stage for " +
-  Duration.between(currentTime, p.getCreationTimestamp()).toMinutes() +
-  " mins.");
-  finalizeAndDestroyPipeline(p, false);
+
+List<Pipeline> candidates = stateManager.getPipelines(type, factor);
+
+for (Pipeline p : candidates) {
+  // scrub pipelines who stay ALLOCATED for too long.
+  if (p.getPipelineState() == Pipeline.PipelineState.ALLOCATED &&
+  (currentTime.toEpochMilli() - p.getCreationTimestamp()
+  .toEpochMilli() >= pipelineScrubTimeoutInMills)) {
+LOG.info("Scrubbing pipeline: id: " + p.getId().toString() +
+" since it stays at ALLOCATED stage for " +
+Duration.between(currentTime, p.getCreationTimestamp())
+.toMinutes() + " mins.");
+closePipeline(p, false);
+  }
+  // scrub pipelines who stay CLOSED for too long.
+  if (p.getPipelineState() == Pipeline.PipelineState.CLOSED) {

Review comment:
   I updated this to use the scheduler to schedule pipeline removal after the 
containers are closed.
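
   A minimal sketch of that approach (PipelineOps, closeContainers and 
removePipeline are placeholder names, not the actual PipelineManagerV2Impl API): 
close the containers right away, then let a scheduler destroy the pipeline after 
a grace period.

   ```java
   import java.util.concurrent.Executors;
   import java.util.concurrent.ScheduledExecutorService;
   import java.util.concurrent.TimeUnit;

   public class DeferredPipelineRemovalSketch {

     /** Placeholder for the real pipeline/container operations. */
     interface PipelineOps {
       void closeContainers(String pipelineId);
       void removePipeline(String pipelineId);
     }

     private final ScheduledExecutorService scheduler =
         Executors.newSingleThreadScheduledExecutor();

     /** Close the pipeline's containers now, remove the pipeline itself later. */
     void closeAndScheduleRemoval(PipelineOps ops, String pipelineId, long graceMillis) {
       ops.closeContainers(pipelineId);
       scheduler.schedule(() -> ops.removePipeline(pipelineId),
           graceMillis, TimeUnit.MILLISECONDS);
     }
   }
   ```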





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3297) TestOzoneClientKeyGenerator is flaky

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3297:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> TestOzoneClientKeyGenerator is flaky
> 
>
> Key: HDDS-3297
> URL: https://issues.apache.org/jira/browse/HDDS-3297
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Marton Elek
>Priority: Critical
>  Labels: TriagePending, flaky-test, ozone-flaky-test
> Attachments: 
> org.apache.hadoop.ozone.freon.TestOzoneClientKeyGenerator-output.txt
>
>
> Sometimes it hangs and is only stopped after a timeout.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3338) TestOzoneManagerRocksDBLogging.shutdown times out

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3338:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> TestOzoneManagerRocksDBLogging.shutdown times out
> -
>
> Key: HDDS-3338
> URL: https://issues.apache.org/jira/browse/HDDS-3338
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Siyao Meng
>Priority: Major
>  Labels: TriagePending
>
> Failed in it-hdds-om:
> {code}
> [INFO] Running org.apache.hadoop.ozone.om.TestKeyPurging
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 36.933 
> s - in org.apache.hadoop.ozone.om.TestKeyPurging
> [INFO] Running org.apache.hadoop.ozone.om.TestOzoneManagerRocksDBLogging
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 60.44 
> s <<< FAILURE! - in org.apache.hadoop.ozone.om.TestOzoneManagerRocksDBLogging
> [ERROR] 
> testOMRocksDBLoggingEnabled(org.apache.hadoop.ozone.om.TestOzoneManagerRocksDBLogging)
>   Time elapsed: 60.023 s  <<< ERROR!
> java.lang.Exception: test timed out after 6 milliseconds
>   at java.lang.Object.wait(Native Method)
>   at 
> java.util.concurrent.ForkJoinTask.externalAwaitDone(ForkJoinTask.java:334)
>   at java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:405)
>   at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:734)
>   at 
> java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:159)
>   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:173)
>   at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
>   at 
> java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
>   at 
> java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:650)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl.stopDatanodes(MiniOzoneClusterImpl.java:463)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl.stop(MiniOzoneClusterImpl.java:404)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl.shutdown(MiniOzoneClusterImpl.java:391)
>   at 
> org.apache.hadoop.ozone.om.TestOzoneManagerRocksDBLogging.shutdown(TestOzoneManagerRocksDBLogging.java:63)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> {code}
> {code}
> [INFO] Results:
> [INFO] 
> [ERROR] Errors: 
> [ERROR]   TestOzoneManagerRocksDBLogging.shutdown:63->Object.wait:-2 ?  test 
> timed out a...
> [INFO] 
> [ERROR] Tests run: 62, Failures: 0, Errors: 1, Skipped: 12
> {code}
> Output is FLOODED with the same message:
> {code:title=org.apache.hadoop.ozone.om.TestOzoneManagerRocksDBLogging-output.txt}
> 2020-04-01 20:52:54,072 [Datanode State Machine Thread - 0] ERROR 
> statemachine.DatanodeStateMachine (DatanodeStateMachine.java:start(232)) - 
> Unable to finish the execution.
> java.util.concurrent.RejectedExecutionException: Task 
> java.util.concurrent.ExecutorCompletionService$QueueingFuture@24ae1cc8 
> rejected from 
> org.apache.hadoop.util.concurrent.HadoopThreadPoolExecutor@24ed6a81[Terminated,
>  pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 35]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
>   at 
> java.util.concurrent.ExecutorCompletionService.submit(ExecutorCompletionService.java:181)
>   at 
> org.apache.hadoop.ozone.container.common.states.datanode.RunningDatanodeState.execute(RunningDatanodeState.java:144)
>   at 
> org.apache.hadoop.ozone.container.common.statemachine.StateContext.execute(StateContext.java:419)
>   at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.start(DatanodeStateMachine.java:219)
>   at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$startDaemon$0(DatanodeStateMachine.java:386)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> Artifact: 
> https://github.com/apache/hadoop-ozone/suites/565485547/artifacts/3737405
> Could be related to HDDS-2866 ?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3358) Intermittent test failure related to a race conditon during PipelineManager close

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3358:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Improvement)

> Intermittent test failure related to a race conditon during PipelineManager 
> close
> -
>
> Key: HDDS-3358
> URL: https://issues.apache.org/jira/browse/HDDS-3358
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>  Labels: TriagePending, flaky-test, ozone-flaky-test
> Attachments: 
> org.apache.hadoop.hdds.scm.node.TestSCMNodeManager-output.txt
>
>
> The failing test:
> TestSCMNodeManager
> The end of the log is:
> {code}
> 2020-04-08 10:49:44,544 ERROR events.SingleThreadExecutor 
> (SingleThreadExecutor.java:lambda$onMessage$1(84)) - Error on execution 
> message 19844615-0d70-4172-8c34-96e5b7295ef2{ip: 196.189.243.187, host: 
> localhost-196.189.243.187, networkLocation: /default-rack, certSerialId: null}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.finalizeAndDestroyPipeline(SCMPipelineManager.java:380)
> at 
> org.apache.hadoop.hdds.scm.node.StaleNodeHandler.onMessage(StaleNodeHandler.java:63)
> at 
> org.apache.hadoop.hdds.scm.node.StaleNodeHandler.onMessage(StaleNodeHandler.java:38)
> at 
> org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2020-04-08 10:49:44,544 INFO  node.StaleNodeHandler 
> (StaleNodeHandler.java:onMessage(58)) - Datanode 
> 0914e56d-c7f8-4e0a-8fd1-845a9806172b{ip: 57.46.156.17, host: 
> localhost-57.46.156.17, networkLocation: /default-rack, certSerialId: null} 
> moved to stale state. Finalizing its pipelines 
> [PipelineID=fd1f9e92-2f90-43e7-8406-94ba6ac356b0, 
> PipelineID=8d380e3c-b632-4bda-aa7a-554774fba09d]
> 2020-04-08 10:49:44,544 INFO  pipeline.SCMPipelineManager 
> (SCMPipelineManager.java:finalizeAndDestroyPipeline(373)) - Destroying 
> pipeline:Pipeline[ Id: fd1f9e92-2f90-43e7-8406-94ba6ac356b0, Nodes: 
> 0914e56d-c7f8-4e0a-8fd1-845a9806172b{ip: 57.46.156.17, host: 
> localhost-57.46.156.17, networkLocation: /default-rack, certSerialId: null}, 
> Type:RATIS, Factor:ONE, State:ALLOCATED, leaderId:null, 
> CreationTimestamp2020-04-08T10:49:37.441Z]
> 2020-04-08 10:49:44,544 INFO  pipeline.PipelineStateManager 
> (PipelineStateManager.java:finalizePipeline(120)) - Pipeline Pipeline[ Id: 
> fd1f9e92-2f90-43e7-8406-94ba6ac356b0, Nodes: 
> 0914e56d-c7f8-4e0a-8fd1-845a9806172b{ip: 57.46.156.17, host: 
> localhost-57.46.156.17, networkLocation: /default-rack, certSerialId: null}, 
> Type:RATIS, Factor:ONE, State:CLOSED, leaderId:null, 
> CreationTimestamp2020-04-08T10:49:37.441Z] moved to CLOSED state
> 2020-04-08 10:49:44,544 ERROR events.SingleThreadExecutor 
> (SingleThreadExecutor.java:lambda$onMessage$1(84)) - Error on execution 
> message 0914e56d-c7f8-4e0a-8fd1-845a9806172b{ip: 57.46.156.17, host: 
> localhost-57.46.156.17, networkLocation: /default-rack, certSerialId: null}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.finalizeAndDestroyPipeline(SCMPipelineManager.java:380)
> at 
> org.apache.hadoop.hdds.scm.node.StaleNodeHandler.onMessage(StaleNodeHandler.java:63)
> at 
> org.apache.hadoop.hdds.scm.node.StaleNodeHandler.onMessage(StaleNodeHandler.java:38)
> at 
> org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2020-04-08 10:49:44,544 INFO  pipeline.RatisPipelineProvider 
> (RatisPipelineProvider.java:lambda$close$4(208)) - Send 
> pipeline:PipelineID=e0e155c6-9fbe-46a7-b742-e805ea9baacf close command to 
> datanode 30a24b04-1289-4c30-a28a-034edfe29e3d
> 2020-04-08 10:49:44,545 WARN  events.EventQueue 
> (EventQueue.java:fireEvent(151)) - Processing of 
> TypedEvent{payloadType=CommandForDatanode, name='Datanode_Command'} is 
> skipped, EventQueue is not running
> 2020-04-08 10:49:44,544 INFO  node.StaleNodeHandler 
> (StaleNodeHandler.java:onMessage(58)) - Datanode 
> 59bdd26b-05da-47d1-8c3f-8350d55d7299{ip: 248.147.58.17, host: 
> 

[GitHub] [hadoop-ozone] iamabug commented on pull request #1184: HDDS-2767. security/SecuringTDE.md

2020-07-09 Thread GitBox


iamabug commented on pull request #1184:
URL: https://github.com/apache/hadoop-ozone/pull/1184#issuecomment-656459321


   @cxorm @smengcl @xiaoyuyao Please help review this when you have time, thanks.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3363) Intermittent failure in testContainerImportExport

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3363:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Intermittent failure in testContainerImportExport
> -
>
> Key: HDDS-3363
> URL: https://issues.apache.org/jira/browse/HDDS-3363
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Attila Doroszlai
>Priority: Major
>
> {code:title=https://github.com/adoroszlai/hadoop-ozone/runs/571992849}
> 2020-04-08T20:30:49.0510599Z [ERROR] Tests run: 22, Failures: 0, Errors: 1, 
> Skipped: 0, Time elapsed: 3.669 s <<< FAILURE! - in 
> org.apache.hadoop.ozone.container.keyvalue.TestKeyValueContainer
> 2020-04-08T20:30:49.0535678Z [ERROR] 
> testContainerImportExport[1](org.apache.hadoop.ozone.container.keyvalue.TestKeyValueContainer)
>   Time elapsed: 0.079 s  <<< ERROR!
> 2020-04-08T20:30:49.0552584Z java.io.IOException: request to write '4096' 
> bytes exceeds size in header of '19906' bytes for entry 'db/LOG'
> 2020-04-08T20:30:49.0572746Z  at 
> org.apache.commons.compress.archivers.tar.TarArchiveOutputStream.write(TarArchiveOutputStream.java:385)
> 2020-04-08T20:30:49.0572897Z  at 
> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2147)
> 2020-04-08T20:30:49.0582579Z  at 
> org.apache.commons.io.IOUtils.copy(IOUtils.java:2102)
> 2020-04-08T20:30:49.0593659Z  at 
> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2123)
> 2020-04-08T20:30:49.0603340Z  at 
> org.apache.commons.io.IOUtils.copy(IOUtils.java:2078)
> 2020-04-08T20:30:49.0613502Z  at 
> org.apache.hadoop.ozone.container.keyvalue.TarContainerPacker.includeFile(TarContainerPacker.java:225)
> 2020-04-08T20:30:49.0631425Z  at 
> org.apache.hadoop.ozone.container.keyvalue.TarContainerPacker.includePath(TarContainerPacker.java:215)
> 2020-04-08T20:30:49.0637525Z  at 
> org.apache.hadoop.ozone.container.keyvalue.TarContainerPacker.pack(TarContainerPacker.java:155)
> 2020-04-08T20:30:49.0648504Z  at 
> org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.exportContainerData(KeyValueContainer.java:549)
> 2020-04-08T20:30:49.0659852Z  at 
> org.apache.hadoop.ozone.container.keyvalue.TestKeyValueContainer.testContainerImportExport(TestKeyValueContainer.java:233)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3506) TestOzoneFileInterfaces is flaky

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3506:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> TestOzoneFileInterfaces is flaky
> 
>
> Key: HDDS-3506
> URL: https://issues.apache.org/jira/browse/HDDS-3506
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Marton Elek
>Assignee: Istvan Fajth
>Priority: Critical
>  Labels: TriagePending, flaky-test, ozone-flaky-test
> Attachments: 
> TEST-org.apache.hadoop.fs.ozone.TestOzoneFileInterfaces.xml
>
>
> TestOzoneFileInterfaces.testOzoneManagerLocatedFileStatusBlockOffsetsWithMultiBlockFile
>  is flaky and failed multiple times on master:
> {code}
> ./2020/04/24/822/it-filesystem/hadoop-ozone/integration-test/org.apache.hadoop.fs.ozone.TestOzoneFileInterfaces.txt
> ./2020/04/24/822/it-filesystem/hadoop-ozone/integration-test/TEST-org.apache.hadoop.fs.ozone.TestOzoneFileInterfaces.xml
> ./2020/04/24/822/it-filesystem/output.log
> ./2020/04/27/830/it-filesystem/hadoop-ozone/integration-test/org.apache.hadoop.fs.ozone.TestOzoneFileInterfaces.txt
> ./2020/04/27/830/it-filesystem/hadoop-ozone/integration-test/TEST-org.apache.hadoop.fs.ozone.TestOzoneFileInterfaces.xml
> ./2020/04/27/830/it-filesystem/output.log
> ./2020/04/28/831/it-filesystem/hadoop-ozone/integration-test/org.apache.hadoop.fs.ozone.TestOzoneFileInterfaces.txt
> ./2020/04/28/831/it-filesystem/hadoop-ozone/integration-test/TEST-org.apache.hadoop.fs.ozone.TestOzoneFileInterfaces.xml
> ./2020/04/28/831/it-filesystem/output.log
> ./2020/04/28/833/it-filesystem/hadoop-ozone/integration-test/org.apache.hadoop.fs.ozone.TestOzoneFileInterfaces.txt
> ./2020/04/28/833/it-filesystem/hadoop-ozone/integration-test/TEST-org.apache.hadoop.fs.ozone.TestOzoneFileInterfaces.xml
> ./2020/04/28/833/it-filesystem/output.log
> {code}
> I am disabling it until it is fixed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3762) Intermittent failure in TestDeleteWithSlowFollower

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3762:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Intermittent failure in TestDeleteWithSlowFollower
> --
>
> Key: HDDS-3762
> URL: https://issues.apache.org/jira/browse/HDDS-3762
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 0.6.0
>Reporter: Attila Doroszlai
>Priority: Major
>
> TestDeleteWithSlowFollower failed soon after it was re-enabled in HDDS-3330.
> {code:title=https://github.com/apache/hadoop-ozone/runs/753363338}
> [INFO] Running org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower
> [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 28.647 s <<< FAILURE! - in 
> org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower
> [ERROR] 
> testDeleteKeyWithSlowFollower(org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower)
>   Time elapsed: 0.163 s  <<< FAILURE!
> java.lang.AssertionError
>   ...
>   at org.junit.Assert.assertNotNull(Assert.java:631)
>   at 
> org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:225)
> {code}
> CC [~shashikant] [~elek]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3516) Intermittent failure in TestReadRetries

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3516:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Intermittent failure in TestReadRetries
> ---
>
> Key: HDDS-3516
> URL: https://issues.apache.org/jira/browse/HDDS-3516
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Attila Doroszlai
>Priority: Major
>  Labels: TriagePending, flaky-test, ozone-flaky-test
>
> {code:title=https://github.com/apache/hadoop-ozone/runs/626422906}
> ---
> Test set: org.apache.hadoop.ozone.client.rpc.TestReadRetries
> ---
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 56.35 s <<< 
> FAILURE! - in org.apache.hadoop.ozone.client.rpc.TestReadRetries
> testPutKeyAndGetKeyThreeNodes(org.apache.hadoop.ozone.client.rpc.TestReadRetries)
>   Time elapsed: 10.265 s  <<< FAILURE!
> java.lang.AssertionError
>   ...
>   at 
> org.apache.hadoop.ozone.client.rpc.TestReadRetries.testPutKeyAndGetKeyThreeNodes(TestReadRetries.java:181)
> {code}
> https://github.com/elek/ozone-build-results/tree/master/2020/04/28/836/it-client



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2767) security/SecuringTDE.md

2020-07-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-2767:
-
Labels: pull-request-available  (was: )

> security/SecuringTDE.md
> ---
>
> Key: HDDS-2767
> URL: https://issues.apache.org/jira/browse/HDDS-2767
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Xiang Zhang
>Assignee: Xiang Zhang
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] iamabug opened a new pull request #1184: HDDS-2767. security/SecuringTDE.md

2020-07-09 Thread GitBox


iamabug opened a new pull request #1184:
URL: https://github.com/apache/hadoop-ozone/pull/1184


   ## What changes were proposed in this pull request?
   
   Translation of 
https://hadoop.apache.org/ozone/docs/0.5.0-beta/security/securingtde.html
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-2767
   
   ## How was this patch tested?
   
   hugo server 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3874) ITestRootedOzoneContract tests are flaky

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3874:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> ITestRootedOzoneContract tests are flaky
> 
>
> Key: HDDS-3874
> URL: https://issues.apache.org/jira/browse/HDDS-3874
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Marton Elek
>Assignee: Siyao Meng
>Priority: Blocker
>
> Different tests fail for similar reasons:
> {code}
> java.lang.Exception: test timed out after 18 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
>   at 
> java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
>   at 
> java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
>   at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
>   at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.waitOnFlushFutures(BlockOutputStream.java:537)
>   at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.handleFlush(BlockOutputStream.java:499)
>   at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.close(BlockOutputStream.java:514)
>   at 
> org.apache.hadoop.ozone.client.io.BlockOutputStreamEntry.close(BlockOutputStreamEntry.java:149)
>   at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleStreamAction(KeyOutputStream.java:483)
>   at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleFlushOrClose(KeyOutputStream.java:457)
>   at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:510)
>   at 
> org.apache.hadoop.fs.ozone.OzoneFSOutputStream.close(OzoneFSOutputStream.java:56)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
>   at 
> org.apache.hadoop.fs.contract.ContractTestUtils.createFile(ContractTestUtils.java:638)
>   at 
> org.apache.hadoop.fs.contract.AbstractContractOpenTest.testOpenFileTwice(AbstractContractOpenTest.java:135)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}
> Example:
> https://github.com/elek/ozone-build-results/blob/master/2020/06/16/1051/it-filesystem-contract/hadoop-ozone/integration-test/org.apache.hadoop.fs.ozone.contract.rooted.ITestRootedOzoneContractOpen.txt
> The same problem shows up here:
> https://github.com/elek/hadoop-ozone/runs/810175295?check_suite_focus=true 
> (contract)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3873) Intermittent failure in Recon acceptance test due to too many pipelines

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3873:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Intermittent failure in Recon acceptance test due to too many pipelines
> ---
>
> Key: HDDS-3873
> URL: https://issues.apache.org/jira/browse/HDDS-3873
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Minor
>
> The Recon API acceptance test checks the number of pipelines too strictly:
> {code}
> Check if Recon picks up DN heartbeats | FAIL |
> ...
> {"pipelines":5,...' does not contain '"pipelines":4'
> {code}
> https://github.com/apache/hadoop-ozone/pull/1050/checks?check_run_id=810139657
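
A minimal sketch of a more tolerant check, in Java for illustration only (the actual acceptance test is a robot test, and the response body shown is a stand-in): parse the JSON and assert a lower bound on the field instead of matching an exact substring.

{code:java}
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class TolerantPipelineCountCheck {
  public static void main(String[] args) throws Exception {
    // Stand-in for the Recon API response body; the real test fetches it over HTTP.
    String body = "{\"pipelines\":5,\"containers\":0}";

    // Assert a lower bound on the parsed field rather than requiring the exact
    // substring "\"pipelines\":4".
    JsonNode root = new ObjectMapper().readTree(body);
    int pipelines = root.get("pipelines").asInt();
    if (pipelines < 4) {
      throw new AssertionError("Expected at least 4 pipelines, got " + pipelines);
    }
  }
}
{code}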



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3938) Flaky TestWatchForCommit#test2WayCommitForTimeoutException

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3938:
-
Parent: HDDS-1127
Issue Type: Sub-task  (was: Bug)

> Flaky TestWatchForCommit#test2WayCommitForTimeoutException
> --
>
> Key: HDDS-3938
> URL: https://issues.apache.org/jira/browse/HDDS-3938
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 0.6.0
>Reporter: Siyao Meng
>Priority: Major
>
> In PR#1255 
> https://github.com/apache/hadoop-ozone/runs/813994346?check_suite_focus=true:
> {code:title=https://github.com/elek/ozone-build-results/blob/master/2020/06/27/1255/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestWatchForCommit-output.txt}
> java.util.concurrent.ExecutionException: 
> org.apache.ratis.protocol.GroupMismatchException: 
> a498c7dc-27d9-4ae8-a233-895baee1c3ae: group-C4714E1CC0B9 not found.
>   at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
>   at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
>   at 
> org.apache.hadoop.hdds.scm.XceiverClientRatis.watchForCommit(XceiverClientRatis.java:262)
>   at 
> org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.testWatchForCommitForGroupMismatchException(TestWatchForCommit.java:351)
> {code}
> In PR#1459 
> https://github.com/apache/hadoop-ozone/runs/844177861?check_suite_focus=true:
> {code:title=https://github.com/elek/ozone-build-results/blob/master/2020/07/07/1459/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestWatchForCommit-output.txt}
> java.util.concurrent.ExecutionException: 
> org.apache.ratis.protocol.GroupMismatchException: 
> a7b8b74b-f98f-42e2-9f4c-7068bd51e221: group-DCED9E4CDB5B not found.
>   at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
>   at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
>   at 
> org.apache.hadoop.hdds.scm.XceiverClientRatis.watchForCommit(XceiverClientRatis.java:262)
>   at 
> org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.testWatchForCommitForGroupMismatchException(TestWatchForCommit.java:348)
> {code}
> And there are two more instances that can be found in 
> https://elek.github.io/ozone-build-results/.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3938) Flaky TestWatchForCommit#test2WayCommitForTimeoutException

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3938:
-
Labels: 0.7.0  (was: )

> Flaky TestWatchForCommit#test2WayCommitForTimeoutException
> --
>
> Key: HDDS-3938
> URL: https://issues.apache.org/jira/browse/HDDS-3938
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 0.6.0
>Reporter: Siyao Meng
>Priority: Major
>  Labels: 0.7.0
>
> In PR#1255 
> https://github.com/apache/hadoop-ozone/runs/813994346?check_suite_focus=true:
> {code:title=https://github.com/elek/ozone-build-results/blob/master/2020/06/27/1255/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestWatchForCommit-output.txt}
> java.util.concurrent.ExecutionException: 
> org.apache.ratis.protocol.GroupMismatchException: 
> a498c7dc-27d9-4ae8-a233-895baee1c3ae: group-C4714E1CC0B9 not found.
>   at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
>   at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
>   at 
> org.apache.hadoop.hdds.scm.XceiverClientRatis.watchForCommit(XceiverClientRatis.java:262)
>   at 
> org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.testWatchForCommitForGroupMismatchException(TestWatchForCommit.java:351)
> {code}
> In PR#1459 
> https://github.com/apache/hadoop-ozone/runs/844177861?check_suite_focus=true:
> {code:title=https://github.com/elek/ozone-build-results/blob/master/2020/07/07/1459/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestWatchForCommit-output.txt}
> java.util.concurrent.ExecutionException: 
> org.apache.ratis.protocol.GroupMismatchException: 
> a7b8b74b-f98f-42e2-9f4c-7068bd51e221: group-DCED9E4CDB5B not found.
>   at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
>   at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
>   at 
> org.apache.hadoop.hdds.scm.XceiverClientRatis.watchForCommit(XceiverClientRatis.java:262)
>   at 
> org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.testWatchForCommitForGroupMismatchException(TestWatchForCommit.java:348)
> {code}
> And there are two more instances that can be found in 
> https://elek.github.io/ozone-build-results/.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-3949) Flaky test cases

2020-07-09 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen resolved HDDS-3949.
--
Resolution: Duplicate

> Flaky test cases
> 
>
> Key: HDDS-3949
> URL: https://issues.apache.org/jira/browse/HDDS-3949
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Sammi Chen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-3949) Flaky test cases

2020-07-09 Thread Sammi Chen (Jira)
Sammi Chen created HDDS-3949:


 Summary: Flaky test cases
 Key: HDDS-3949
 URL: https://issues.apache.org/jira/browse/HDDS-3949
 Project: Hadoop Distributed Data Store
  Issue Type: New Feature
Reporter: Sammi Chen






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3912) Change SCM ContainerDB key to proto structure to support backward compatibility.

2020-07-09 Thread Sammi Chen (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155061#comment-17155061
 ] 

Sammi Chen commented on HDDS-3912:
--

Hi [~pifta] and [~avijayan], do we need to put this into 0.6.0 to fix the 
compatibility issue ASAP? 

> Change SCM ContainerDB key to proto structure to support backward 
> compatibility.
> 
>
> Key: HDDS-3912
> URL: https://issues.apache.org/jira/browse/HDDS-3912
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>Reporter: Aravindan Vijayan
>Assignee: Istvan Fajth
>Priority: Major
>
> Currently, the 'key' type of the SCM container DB is 
> org.apache.hadoop.hdds.scm.container.ContainerID which is not backed up by a 
> proto equivalent. Hence, we use a long codec to serialize and deserialize the 
> key from long to byte[ ] and back. 
> {code}
>   public static final DBColumnFamilyDefinition
>   CONTAINERS =
>   new DBColumnFamilyDefinition(
>   "containers",
>   ContainerID.class,
>   new ContainerIDCodec(),
>   ContainerInfo.class,
>   new ContainerInfoCodec());
> {code}
> In the future if we have to support a container id type that is more than 
> just a long, then changing the ContainerID class will break backward 
> compatibility. To handle this incompatibility in the future, we have to 
> either migrate old data or provide fallback conversion codecs for the old 
> data type. Hence, it is good to wrap this long into a proto structure. 
> cc [~nanda619]  / [~arp].
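
The compatibility concern in the description can be illustrated with protobuf's generic int64 wrapper: persisting the id through a proto message instead of a raw 8-byte long leaves room to add optional fields later without migrating old keys. This is a hedged sketch only; the real change would define a dedicated ContainerID message in the HDDS proto files rather than reuse Int64Value.

{code:java}
import com.google.protobuf.Int64Value;
import com.google.protobuf.InvalidProtocolBufferException;

final class ProtoWrappedContainerIdCodec {

  // Serialize the container id through a proto wrapper instead of
  // ByteBuffer.putLong(), so the persisted key format can evolve.
  byte[] toPersistedFormat(long containerId) {
    return Int64Value.newBuilder().setValue(containerId).build().toByteArray();
  }

  // Readers keep working if the wrapper message later gains optional fields.
  long fromPersistedFormat(byte[] rawKey) throws InvalidProtocolBufferException {
    return Int64Value.parseFrom(rawKey).getValue();
  }
}
{code}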



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3916) Change the return type in DeleteKeys and renameKeys in om client side.

2020-07-09 Thread Sammi Chen (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155059#comment-17155059
 ] 

Sammi Chen commented on HDDS-3916:
--

Hi [~micahzhao], will this break API-level backward compatibility? 

> Change the return type in DeleteKeys and renameKeys in om client side.
> --
>
> Key: HDDS-3916
> URL: https://issues.apache.org/jira/browse/HDDS-3916
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Manager
>Reporter: mingchao zhao
>Priority: Major
>
> Currently all delete and rename operations in OzoneBucket.java return void. 
>  We have already put the lists of unDeletedKeys and unRenamedKeys into the 
> Response. We also need to change the return type on the client side, so that 
> the user can get the collection of keys whose operations were unsuccessful.
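
A hedged sketch of what that client-side change could look like; the method names mirror OzoneBucket, but the signatures below are illustrative assumptions, not the committed API:

{code:java}
import java.util.List;
import java.util.Map;

interface BatchKeyOperations {

  /** Returns the subset of {@code keys} that could not be deleted. */
  List<String> deleteKeys(List<String> keys);

  /** Returns the (fromKey -> toKey) pairs that could not be renamed. */
  Map<String, String> renameKeys(Map<String, String> keyMap);
}
{code}

Regarding the compatibility question above: changing a void method to return a value is source-compatible for callers that ignore the result, but it is binary-incompatible because the method descriptor changes, so existing compiled clients would need to be rebuilt.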



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3933) Fix memory leak because of too many Datanode State Machine Thread

2020-07-09 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated HDDS-3933:
-
Summary: Fix memory leak because of too many Datanode State Machine Thread  
(was: memory leak because of too many Datanode State Machine Thread)

> Fix memory leak because of too many Datanode State Machine Thread
> -
>
> Key: HDDS-3933
> URL: https://issues.apache.org/jira/browse/HDDS-3933
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
> Attachments: jstack.txt, screenshot-1.png, screenshot-2.png, 
> screenshot-3.png
>
>
> When creating the 22345th Datanode State Machine Thread, an OutOfMemoryError occurred.
> !screenshot-1.png! 
>  !screenshot-2.png! 
>  !screenshot-3.png! 
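
One common fix for this failure mode is to run the state-machine tasks on a bounded, named thread pool rather than spawning one thread per task, so the thread count (and its stack memory) stays constant under load. A generic sketch of that pattern, not the actual datanode change:

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedStateMachineExecutor {

  private final ExecutorService pool;

  BoundedStateMachineExecutor(int maxThreads) {
    ThreadFactory factory = new ThreadFactory() {
      private final AtomicInteger count = new AtomicInteger();
      @Override
      public Thread newThread(Runnable r) {
        Thread t = new Thread(r,
            "Datanode State Machine Thread - " + count.getAndIncrement());
        t.setDaemon(true);
        return t;
      }
    };
    // All submitted tasks share maxThreads threads instead of one thread each.
    this.pool = Executors.newFixedThreadPool(maxThreads, factory);
  }

  void submit(Runnable task) {
    pool.submit(task);
  }

  void shutdown() {
    pool.shutdown();
  }
}
{code}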



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] ChenSammi commented on pull request #1162: HDDS-3921. IllegalArgumentException triggered in SCMContainerPlacemen…

2020-07-09 Thread GitBox


ChenSammi commented on pull request #1162:
URL: https://github.com/apache/hadoop-ozone/pull/1162#issuecomment-656443918


   testDeleteKeyWithSlowFollower failed at the leader membership check step. The 
test passes locally, so it appears to be a timing issue and not related to this patch. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] prashantpogde commented on a change in pull request #1182: HDDS-3926. OM Token Identifier table should use in-house serialization.

2020-07-09 Thread GitBox


prashantpogde commented on a change in pull request #1182:
URL: https://github.com/apache/hadoop-ozone/pull/1182#discussion_r452573445



##
File path: hadoop-hdds/interface-client/src/main/proto/proto.lock
##
@@ -1476,6 +1476,21 @@
   }
 ],
 "messages": [
+  {

Review comment:
   Integration test failure seems unrelated to the changes here.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] prashantpogde commented on a change in pull request #1182: HDDS-3926. OM Token Identifier table should use in-house serialization.

2020-07-09 Thread GitBox


prashantpogde commented on a change in pull request #1182:
URL: https://github.com/apache/hadoop-ozone/pull/1182#discussion_r452573311



##
File path: hadoop-hdds/interface-client/src/main/proto/proto.lock
##
@@ -1476,6 +1476,21 @@
   }
 ],
 "messages": [
+  {

Review comment:
   It was auto-generated. I did not modify anything in this file.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] iamabug commented on a change in pull request #1175: HDDS-2766. security/SecuringDataNodes.md

2020-07-09 Thread GitBox


iamabug commented on a change in pull request #1175:
URL: https://github.com/apache/hadoop-ozone/pull/1175#discussion_r452569796



##
File path: hadoop-hdds/docs/content/security/SecuringDatanodes.zh.md
##
@@ -0,0 +1,53 @@
+---
+title: "安全化 Datanode"
+date: "2019-April-03"
+weight: 2
+summary:  解释安全化 datanode 的不同模式,包括 Kerberos、证书的手动颁发和自动颁发等。
+icon: th
+---
+
+
+
+过去,Hadoop 中 datanode 的安全机制是通过在节点上创建 Keytab 文件实现的,而 Ozone 改用 datanode 证书,在安全的 
Ozone 集群中,datanode 不再需要 Kerberos。
+
+但是我们也支持传统的基于 Kerberos 的认证来方便现有用户,用户只需要在 hdfs-site.xml 里配置下面参数即可:
+
+参数名|描述
+|--
+dfs.datanode.kerberos.principal| datanode 的服务主体名  比如:dn/_h...@realm.com
+dfs.datanode.keytab.file| datanode 进程所使用的 keytab 文件
+hdds.datanode.http.kerberos.principal| datanode http 服务器的服务主体名

Review comment:
   Thanks for the link and commit. @smengcl 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1169: HDDS-3930. Fix OMKeyDeletesRequest.

2020-07-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1169:
URL: https://github.com/apache/hadoop-ozone/pull/1169#discussion_r452561831



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeysDeleteRequest.java
##
@@ -116,93 +111,116 @@ public OMClientResponse 
validateAndUpdateCache(OzoneManager ozoneManager,
 OMResponse.Builder omResponse = OmResponseUtil.getOMResponseBuilder(
 getOmRequest());
 OMMetadataManager omMetadataManager = ozoneManager.getMetadataManager();
+
+// As right now, only client exposed API is for a single volume and
+// bucket. So, all entries will have same volume name and bucket name.
+// So, we can validate once.
+if (deleteKeyArgsList.size() > 0) {
+  volumeName = deleteKeyArgsList.get(0).getVolumeName();
+  bucketName = deleteKeyArgsList.get(0).getBucketName();
+}
+
+boolean acquiredLock =
+omMetadataManager.getLock().acquireWriteLock(BUCKET_LOCK, volumeName,
+bucketName);
+
+int indexFailed = 0;
 try {
-  for (KeyArgs deleteKeyArgs : deleteKeyArgsList) {
+
+  // Validate bucket and volume exists or not.
+  if (deleteKeyArgsList.size() > 0) {
+validateBucketAndVolume(omMetadataManager, volumeName, bucketName);
+  }
+
+
+  // Check if any of the key in the batch cannot be deleted. If exists the
+  // batch delete will be failed.
+
+  for (indexFailed = 0; indexFailed < deleteKeyArgsList.size();
+   indexFailed++) {
+KeyArgs deleteKeyArgs = deleteKeyArgsList.get(indexFailed);
+auditMap = buildKeyArgsAuditMap(deleteKeyArgs);
 volumeName = deleteKeyArgs.getVolumeName();
 bucketName = deleteKeyArgs.getBucketName();
 keyName = deleteKeyArgs.getKeyName();
 String objectKey = omMetadataManager.getOzoneKey(volumeName, 
bucketName,
 keyName);
 OmKeyInfo omKeyInfo = omMetadataManager.getKeyTable().get(objectKey);
-omKeyInfoList.add(omKeyInfo);
-unDeletedKeys.add(omKeyInfo);
-  }
 
-  // Check if any of the key in the batch cannot be deleted. If exists the
-  // batch will delete failed.
-  for (KeyArgs deleteKeyArgs : deleteKeyArgsList) {
-volumeName = deleteKeyArgs.getVolumeName();
-bucketName = deleteKeyArgs.getBucketName();
-keyName = deleteKeyArgs.getKeyName();
-auditMap = buildKeyArgsAuditMap(deleteKeyArgs);
+
+// Do we need to fail the batch if one of the key does not exist?
+// For now following the previous code behavior. If this code changes
+// behavior, this will be incompatible change across upgrades, and we
+// need to version the Requests and do logic accordingly.
+
+if (omKeyInfo == null) {
+  LOG.error("Key does not exist {}", objectKey);
+  throw new OMException("Key Not Found " + objectKey, KEY_NOT_FOUND);
+}
+
 // check Acl
 checkKeyAcls(ozoneManager, volumeName, bucketName, keyName,

Review comment:
   Done

##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/response/key/OMKeysDeleteResponse.java
##
@@ -69,65 +64,36 @@ public OMKeysDeleteResponse(@Nonnull OMResponse omResponse) 
{
   public void addToDBBatch(OMMetadataManager omMetadataManager,
BatchOperation batchOperation) throws IOException {
 
+String volumeName = "";
+String bucketName = "";
+String keyName = "";
 for (OmKeyInfo omKeyInfo : omKeyInfoList) {

Review comment:
   Done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1169: HDDS-3930. Fix OMKeyDeletesRequest.

2020-07-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1169:
URL: https://github.com/apache/hadoop-ozone/pull/1169#discussion_r452561712



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeysDeleteRequest.java
##
@@ -116,89 +111,112 @@ public OMClientResponse 
validateAndUpdateCache(OzoneManager ozoneManager,
 OMResponse.Builder omResponse = OmResponseUtil.getOMResponseBuilder(
 getOmRequest());
 OMMetadataManager omMetadataManager = ozoneManager.getMetadataManager();
+
+// As right now, only client exposed API is for a single volume and
+// bucket. So, all entries will have same volume name and bucket name.
+// So, we can validate once.
+if (deleteKeyArgsList.size() > 0) {
+  volumeName = deleteKeyArgsList.get(0).getVolumeName();
+  bucketName = deleteKeyArgsList.get(0).getBucketName();
+}
+
+boolean acquiredLock =
+omMetadataManager.getLock().acquireWriteLock(BUCKET_LOCK, volumeName,
+bucketName);
+
+int indexFailed = 0;
 try {
-  for (KeyArgs deleteKeyArgs : deleteKeyArgsList) {
+
+  // Validate bucket and volume exists or not.
+  if (deleteKeyArgsList.size() > 0) {
+validateBucketAndVolume(omMetadataManager, volumeName, bucketName);
+  }
+
+
+  // Check if any of the key in the batch cannot be deleted. If exists the
+  // batch delete will be failed.
+
+  for (indexFailed = 0; indexFailed < deleteKeyArgsList.size();
+   indexFailed++) {
+KeyArgs deleteKeyArgs = deleteKeyArgsList.get(0);
+auditMap = buildKeyArgsAuditMap(deleteKeyArgs);
 volumeName = deleteKeyArgs.getVolumeName();
 bucketName = deleteKeyArgs.getBucketName();
 keyName = deleteKeyArgs.getKeyName();
 String objectKey = omMetadataManager.getOzoneKey(volumeName, 
bucketName,
 keyName);
 OmKeyInfo omKeyInfo = omMetadataManager.getKeyTable().get(objectKey);
-omKeyInfoList.add(omKeyInfo);
-unDeletedKeys.add(omKeyInfo);
-  }
 
-  // Check if any of the key in the batch cannot be deleted. If exists the
-  // batch will delete failed.
-  for (KeyArgs deleteKeyArgs : deleteKeyArgsList) {
-volumeName = deleteKeyArgs.getVolumeName();
-bucketName = deleteKeyArgs.getBucketName();
-keyName = deleteKeyArgs.getKeyName();
-auditMap = buildKeyArgsAuditMap(deleteKeyArgs);
+
+// Do we need to fail the batch if one of the key does not exist?
+// For now following the previous code behavior. If this code changes
+// behavior, this will be incompatible change across upgrades, and we
+// need to version the Requests and do logic accordingly.
+
+if (omKeyInfo == null) {
+  LOG.error("Key does not exist {}", objectKey);
+  throw new OMException("Key Not Found " + objectKey, KEY_NOT_FOUND);
+}
+
 // check Acl
 checkKeyAcls(ozoneManager, volumeName, bucketName, keyName,
 IAccessAuthorizer.ACLType.DELETE, OzoneObj.ResourceType.KEY);
 
-String objectKey = omMetadataManager.getOzoneKey(
-volumeName, bucketName, keyName);
-
-// Validate bucket and volume exists or not.
-validateBucketAndVolume(omMetadataManager, volumeName, bucketName);
-
-OmKeyInfo omKeyInfo = omMetadataManager.getKeyTable().get(objectKey);
+omKeyInfoList.add(omKeyInfo);
+  }
 
-if (omKeyInfo == null) {
-  throw new OMException("Key not found: " + keyName, KEY_NOT_FOUND);
-}
 
-// Check if this transaction is a replay of ratis logs.
-if (isReplay(ozoneManager, omKeyInfo, trxnLogIndex)) {
-  // Replay implies the response has already been returned to
-  // the client. So take no further action and return a dummy
-  // OMClientResponse.
-  throw new OMReplayException();
-}
+  // Mark all keys in cache as deleted.
+  for (KeyArgs deleteKeyArgs : deleteKeyArgsList) {
+volumeName = deleteKeyArgs.getVolumeName();
+bucketName = deleteKeyArgs.getBucketName();
+keyName = deleteKeyArgs.getKeyName();
+omMetadataManager.getKeyTable().addCacheEntry(
+new CacheKey<>(omMetadataManager.getOzoneKey(volumeName, 
bucketName,
+keyName)),
+new CacheValue<>(Optional.absent(), trxnLogIndex));
   }
 
+
   omClientResponse = new OMKeysDeleteResponse(omResponse
-  .setDeleteKeysResponse(DeleteKeysResponse.newBuilder()).build(),
-  omKeyInfoList, trxnLogIndex, ozoneManager.isRatisEnabled());
+  .setDeleteKeysResponse(DeleteKeysResponse.newBuilder()
+  .setStatus(true)).build(), omKeyInfoList, trxnLogIndex,
+  ozoneManager.isRatisEnabled());
   result = Result.SUCCESS;
+
 } catch (IOException ex) {
-  if (ex instanceof OMReplayException) {
-

[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1169: HDDS-3930. Fix OMKeyDeletesRequest.

2020-07-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1169:
URL: https://github.com/apache/hadoop-ozone/pull/1169#discussion_r452561797



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeysDeleteRequest.java
##
@@ -116,93 +111,116 @@ public OMClientResponse 
validateAndUpdateCache(OzoneManager ozoneManager,
 OMResponse.Builder omResponse = OmResponseUtil.getOMResponseBuilder(
 getOmRequest());
 OMMetadataManager omMetadataManager = ozoneManager.getMetadataManager();
+
+// As right now, only client exposed API is for a single volume and
+// bucket. So, all entries will have same volume name and bucket name.
+// So, we can validate once.
+if (deleteKeyArgsList.size() > 0) {
+  volumeName = deleteKeyArgsList.get(0).getVolumeName();
+  bucketName = deleteKeyArgsList.get(0).getBucketName();
+}
+
+boolean acquiredLock =
+omMetadataManager.getLock().acquireWriteLock(BUCKET_LOCK, volumeName,
+bucketName);
+
+int indexFailed = 0;
 try {
-  for (KeyArgs deleteKeyArgs : deleteKeyArgsList) {
+
+  // Validate bucket and volume exists or not.
+  if (deleteKeyArgsList.size() > 0) {
+validateBucketAndVolume(omMetadataManager, volumeName, bucketName);
+  }
+
+
+  // Check if any of the key in the batch cannot be deleted. If exists the
+  // batch delete will be failed.
+
+  for (indexFailed = 0; indexFailed < deleteKeyArgsList.size();
+   indexFailed++) {
+KeyArgs deleteKeyArgs = deleteKeyArgsList.get(indexFailed);
+auditMap = buildKeyArgsAuditMap(deleteKeyArgs);
 volumeName = deleteKeyArgs.getVolumeName();
 bucketName = deleteKeyArgs.getBucketName();
 keyName = deleteKeyArgs.getKeyName();
 String objectKey = omMetadataManager.getOzoneKey(volumeName, 
bucketName,
 keyName);
 OmKeyInfo omKeyInfo = omMetadataManager.getKeyTable().get(objectKey);
-omKeyInfoList.add(omKeyInfo);
-unDeletedKeys.add(omKeyInfo);
-  }
 
-  // Check if any of the key in the batch cannot be deleted. If exists the
-  // batch will delete failed.
-  for (KeyArgs deleteKeyArgs : deleteKeyArgsList) {
-volumeName = deleteKeyArgs.getVolumeName();
-bucketName = deleteKeyArgs.getBucketName();
-keyName = deleteKeyArgs.getKeyName();
-auditMap = buildKeyArgsAuditMap(deleteKeyArgs);
+
+// Do we need to fail the batch if one of the key does not exist?
+// For now following the previous code behavior. If this code changes
+// behavior, this will be incompatible change across upgrades, and we
+// need to version the Requests and do logic accordingly.
+
+if (omKeyInfo == null) {

Review comment:
   Done. For now, kept it simple by returning unDeletedKeys.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-3948) Release/reacquire lock for OMKeysDeleteRequest

2020-07-09 Thread Bharat Viswanadham (Jira)
Bharat Viswanadham created HDDS-3948:


 Summary: Release/reacquire lock for OMKeysDeleteRequest
 Key: HDDS-3948
 URL: https://issues.apache.org/jira/browse/HDDS-3948
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Bharat Viswanadham


This is created based on the comment from @xiaoyu during review of PR #1169

https://github.com/apache/hadoop-ozone/pull/1169#discussion_r452474846

Also, holding the bucket lock once for the whole operation may not be a good idea for a 
large delete, as it will prevent other clients from operating on the same bucket. Grabbing 
and releasing the lock per key or in small batches gives other clients opportunities to 
access the bucket.

This might be helpful for non-HA without ratis enabled.
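
A rough sketch of the batching idea: acquire and release the bucket write lock per small batch instead of holding it across the whole key list, so other clients can interleave. The lock and delete calls below are hypothetical placeholders, not the real OM classes:

{code:java}
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class BatchedBucketDelete {

  private final ReentrantReadWriteLock bucketLock = new ReentrantReadWriteLock();

  void deleteKeys(List<String> keys, int batchSize) {
    for (int start = 0; start < keys.size(); start += batchSize) {
      int end = Math.min(start + batchSize, keys.size());
      bucketLock.writeLock().lock();      // hold the lock only for this batch
      try {
        for (String key : keys.subList(start, end)) {
          deleteKeyUnderLock(key);
        }
      } finally {
        bucketLock.writeLock().unlock();  // let other clients in between batches
      }
    }
  }

  private void deleteKeyUnderLock(String key) {
    // placeholder for the actual key table / cache update
  }
}
{code}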



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1169: HDDS-3930. Fix OMKeyDeletesRequest.

2020-07-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1169:
URL: https://github.com/apache/hadoop-ozone/pull/1169#discussion_r452560417



##
File path: hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto
##
@@ -867,10 +867,10 @@ message DeletedKeys {
 }
 
 message DeleteKeysResponse {
-repeated KeyInfo deletedKeys = 1;
-repeated KeyInfo unDeletedKeys = 2;

Review comment:
   Created https://issues.apache.org/jira/browse/HDDS-3948 for this.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1169: HDDS-3930. Fix OMKeyDeletesRequest.

2020-07-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1169:
URL: https://github.com/apache/hadoop-ozone/pull/1169#discussion_r452559759



##
File path: hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto
##
@@ -867,10 +867,10 @@ message DeletedKeys {
 }
 
 message DeleteKeysResponse {
-repeated KeyInfo deletedKeys = 1;
-repeated KeyInfo unDeletedKeys = 2;

Review comment:
   In HA, and when Ratis is enabled by default for non-HA, requests are applied by a 
single-threaded executor now, so this optimization is not really required. For 
non-HA without Ratis it might help; I will open a new Jira for this improvement.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-3944) OM StateMachine unpause fails with NPE

2020-07-09 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-3944.
--
Fix Version/s: 0.6.0
   Resolution: Fixed

> OM StateMachine unpause fails with NPE
> --
>
> Key: HDDS-3944
> URL: https://issues.apache.org/jira/browse/HDDS-3944
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: OM HA
>Affects Versions: 0.6.0
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.6.0
>
>
> Noticed this NPE in OM logs for OM HA [acceptance 
> test|https://github.com/apache/hadoop-ozone/pull/1173/checks?check_run_id=847204159]:
> {code}
> 2020-07-07 20:54:23 WARN  RaftServerImpl:1247 - om2@group-D66704EFC61C: 
> Failed to notify StateMachine to InstallSnapshot. Exception: 
> java.lang.NullPointerException: When ratis is enabled indexToTerm should not 
> be null
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on pull request #1183: HDDS-3944. OM StateMachine unpause fails with NPE

2020-07-09 Thread GitBox


bharatviswa504 commented on pull request #1183:
URL: https://github.com/apache/hadoop-ozone/pull/1183#issuecomment-656398721


   Thank You @adoroszlai for the contribution and @hanishakoneru for the review.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org


