[jira] [Work logged] (HDDS-1786) Datanodes takeSnapshot should delete previously created snapshots

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1786?focusedWorklogId=283131&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283131
 ]

ASF GitHub Bot logged work on HDDS-1786:


Author: ASF GitHub Bot
Created on: 26/Jul/19 05:53
Start Date: 26/Jul/19 05:53
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #1163: 
HDDS-1786 : Datanodes takeSnapshot should delete previously created s…
URL: https://github.com/apache/hadoop/pull/1163#discussion_r307597008
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestContainerStateMachineInt.java
 ##
 @@ -53,7 +53,7 @@
  * Tests the containerStateMachine failure handling.
  */
 
-public class TestContainerStateMachine {
+public class TestContainerStateMachineInt {
 
 Review comment:
   Why was this change made? Is there a reason for changing the class name?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283131)
Time Spent: 1h 20m  (was: 1h 10m)

> Datanodes takeSnapshot should delete previously created snapshots
> -
>
> Key: HDDS-1786
> URL: https://issues.apache.org/jira/browse/HDDS-1786
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Right now, after taking a new snapshot, the previous snapshot file is 
> left in the raft log directory. When a new snapshot is taken, the previous 
> snapshots should be deleted.
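
A minimal sketch of the intended cleanup, assuming the snapshots are plain files 
in the Raft log directory (the "snapshot." file-name prefix and the helper class 
below are illustrative, not the actual ContainerStateMachine code):

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

final class SnapshotCleanupSketch {
  // After persisting the new snapshot, drop every older snapshot file so only
  // the latest one remains in the raft log directory.
  static void deleteOlderSnapshots(Path raftLogDir, Path latestSnapshot)
      throws IOException {
    try (Stream<Path> files = Files.list(raftLogDir)) {
      files.filter(p -> p.getFileName().toString().startsWith("snapshot."))
           .filter(p -> !p.equals(latestSnapshot))
           .forEach(p -> p.toFile().delete());
    }
  }
}
{code}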



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1786) Datanodes takeSnapshot should delete previously created snapshots

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1786?focusedWorklogId=283130&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283130
 ]

ASF GitHub Bot logged work on HDDS-1786:


Author: ASF GitHub Bot
Created on: 26/Jul/19 05:52
Start Date: 26/Jul/19 05:52
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #1163: 
HDDS-1786 : Datanodes takeSnapshot should delete previously created s…
URL: https://github.com/apache/hadoop/pull/1163#discussion_r307597008
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestContainerStateMachineInt.java
 ##
 @@ -53,7 +53,7 @@
  * Tests the containerStateMachine failure handling.
  */
 
-public class TestContainerStateMachine {
+public class TestContainerStateMachineInt {
 
 Review comment:
   Why was this change made? Is there a reason for it?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283130)
Time Spent: 1h 10m  (was: 1h)

> Datanodes takeSnapshot should delete previously created snapshots
> -
>
> Key: HDDS-1786
> URL: https://issues.apache.org/jira/browse/HDDS-1786
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Right now, after taking a new snapshot, the previous snapshot file is 
> left in the raft log directory. When a new snapshot is taken, the previous 
> snapshots should be deleted.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1864) Turn on topology aware read in TestFailureHandlingByClient

2019-07-25 Thread Mukul Kumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDDS-1864:

Status: Patch Available  (was: Open)

> Turn on topology aware read in TestFailureHandlingByClient
> --
>
> Key: HDDS-1864
> URL: https://issues.apache.org/jira/browse/HDDS-1864
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Sammi Chen
>Assignee: Sammi Chen
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1619) Support volume addACL operations for OM HA.

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1619?focusedWorklogId=283118&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283118
 ]

ASF GitHub Bot logged work on HDDS-1619:


Author: ASF GitHub Bot
Created on: 26/Jul/19 05:36
Start Date: 26/Jul/19 05:36
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #1147: 
HDDS-1619. Support volume addACL operations for OM HA. Contributed by…
URL: https://github.com/apache/hadoop/pull/1147#discussion_r307592917
 
 

 ##
 File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/volume/acl/OMVolumeAddAclRequest.java
 ##
 @@ -0,0 +1,126 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.om.request.volume.acl;
+
+import com.google.common.base.Optional;
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.ozone.OzoneAcl;
+import org.apache.hadoop.ozone.om.OMMetadataManager;
+import org.apache.hadoop.ozone.om.OMMetrics;
+import org.apache.hadoop.ozone.om.OzoneManager;
+import org.apache.hadoop.ozone.om.exceptions.OMException;
+import org.apache.hadoop.ozone.om.helpers.OmVolumeArgs;
+import org.apache.hadoop.ozone.om.request.OMClientRequest;
+import org.apache.hadoop.ozone.om.response.OMClientResponse;
+import org.apache.hadoop.ozone.om.response.volume.OMVolumeAddAclResponse;
+import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.AddAclRequest;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.AddAclResponse;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.OMRequest;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.OMResponse;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.Status;
+
+import org.apache.hadoop.ozone.security.acl.IAccessAuthorizer;
+import org.apache.hadoop.ozone.security.acl.OzoneObj;
+import org.apache.hadoop.ozone.security.acl.OzoneObjInfo;
+import org.apache.hadoop.utils.db.cache.CacheKey;
+import org.apache.hadoop.utils.db.cache.CacheValue;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+import static 
org.apache.hadoop.ozone.om.lock.OzoneManagerLock.Resource.VOLUME_LOCK;
+
+/**
+ * Handles add acl request.
+ */
+public class OMVolumeAddAclRequest extends OMClientRequest {
+  private static final Logger LOG =
+      LoggerFactory.getLogger(OMVolumeAddAclRequest.class);
+
+  public OMVolumeAddAclRequest(OMRequest omRequest) {
+    super(omRequest);
+  }
+
+  @Override
+  public OMClientResponse validateAndUpdateCache(OzoneManager ozoneManager,
+      long transactionLogIndex) {
+    AddAclRequest addAclRequest = getOmRequest().getAddAclRequest();
+    Preconditions.checkNotNull(addAclRequest);
+
+    OMResponse.Builder omResponse = OMResponse.newBuilder().setCmdType(
+        OzoneManagerProtocolProtos.Type.AddAcl).setStatus(
+        Status.OK).setSuccess(true);
+
+    if (!addAclRequest.hasAcl()) {
+      omResponse.setStatus(Status.INVALID_REQUEST).setSuccess(false);
+      return new OMVolumeAddAclResponse(null, omResponse.build());
+    }
+
+    OzoneObjInfo ozoneObj = OzoneObjInfo.fromProtobuf(addAclRequest.getObj());
 
 Review comment:
   ozoneObj is only used to get the volume name; we can get the volume name 
directly from addAclRequest.getObj().getPath(), because we only reach this 
point after checking that resType is Volume.
   
   Similarly, we can avoid the conversion OzoneAcl ozoneAcl = 
OzoneAcl.fromProtobuf(addAclRequest.getAcl()) if OmVolumeArgs can directly 
store the acl `List`. That way we avoid the protobuf conversion.
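   
   A rough sketch of the suggestion (illustrative only: `getPath()` returning the 
volume name follows the comment above, and keeping the acl list in protobuf form 
inside OmVolumeArgs is a hypothetical change, not the committed code):
   
   ```java
   // Skip the OzoneObjInfo conversion and read the volume name off the request,
   // since the resType check has already guaranteed this is a volume.
   String volume = addAclRequest.getObj().getPath();
   // The OzoneAcl.fromProtobuf(...) call could likewise be skipped if
   // OmVolumeArgs stored the acl list in its protobuf form directly.
   ```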
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

  

[jira] [Work logged] (HDDS-1619) Support volume addACL operations for OM HA.

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1619?focusedWorklogId=283117&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283117
 ]

ASF GitHub Bot logged work on HDDS-1619:


Author: ASF GitHub Bot
Created on: 26/Jul/19 05:34
Start Date: 26/Jul/19 05:34
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #1147: 
HDDS-1619. Support volume addACL operations for OM HA. Contributed by…
URL: https://github.com/apache/hadoop/pull/1147#discussion_r307593872
 
 

 ##
 File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/volume/acl/OMVolumeAddAclRequest.java
 ##
 @@ -0,0 +1,126 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.om.request.volume.acl;
+
+import com.google.common.base.Optional;
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.ozone.OzoneAcl;
+import org.apache.hadoop.ozone.om.OMMetadataManager;
+import org.apache.hadoop.ozone.om.OMMetrics;
+import org.apache.hadoop.ozone.om.OzoneManager;
+import org.apache.hadoop.ozone.om.exceptions.OMException;
+import org.apache.hadoop.ozone.om.helpers.OmVolumeArgs;
+import org.apache.hadoop.ozone.om.request.OMClientRequest;
+import org.apache.hadoop.ozone.om.response.OMClientResponse;
+import org.apache.hadoop.ozone.om.response.volume.OMVolumeAddAclResponse;
+import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.AddAclRequest;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.AddAclResponse;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.OMRequest;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.OMResponse;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.Status;
+
+import org.apache.hadoop.ozone.security.acl.IAccessAuthorizer;
+import org.apache.hadoop.ozone.security.acl.OzoneObj;
+import org.apache.hadoop.ozone.security.acl.OzoneObjInfo;
+import org.apache.hadoop.utils.db.cache.CacheKey;
+import org.apache.hadoop.utils.db.cache.CacheValue;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+import static 
org.apache.hadoop.ozone.om.lock.OzoneManagerLock.Resource.VOLUME_LOCK;
+
+/**
+ * Handles add acl request.
+ */
+public class OMVolumeAddAclRequest extends OMClientRequest {
+  private static final Logger LOG =
+      LoggerFactory.getLogger(OMVolumeAddAclRequest.class);
+
+  public OMVolumeAddAclRequest(OMRequest omRequest) {
+    super(omRequest);
+  }
+
+  @Override
+  public OMClientResponse validateAndUpdateCache(OzoneManager ozoneManager,
+      long transactionLogIndex) {
+    AddAclRequest addAclRequest = getOmRequest().getAddAclRequest();
+    Preconditions.checkNotNull(addAclRequest);
+
+    OMResponse.Builder omResponse = OMResponse.newBuilder().setCmdType(
+        OzoneManagerProtocolProtos.Type.AddAcl).setStatus(
+        Status.OK).setSuccess(true);
+
+    if (!addAclRequest.hasAcl()) {
+      omResponse.setStatus(Status.INVALID_REQUEST).setSuccess(false);
+      return new OMVolumeAddAclResponse(null, omResponse.build());
+    }
+
+    OzoneObjInfo ozoneObj = OzoneObjInfo.fromProtobuf(addAclRequest.getObj());
+    OzoneAcl ozoneAcl = OzoneAcl.fromProtobuf(addAclRequest.getAcl());
+    String volume = ozoneObj.getVolumeName();
+
+    OMMetrics omMetrics = ozoneManager.getMetrics();
+    omMetrics.incNumVolumeUpdates();
+    IOException exception = null;
+    OmVolumeArgs omVolumeArgs = null;
+
+    OMMetadataManager omMetadataManager = ozoneManager.getMetadataManager();
+
+    try {
+      // check Acl
+      if (ozoneManager.getAclsEnabled()) {
 
 Review comment:
   Got it.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:

[jira] [Work logged] (HDDS-1864) Turn on topology aware read in TestFailureHandlingByClient

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1864?focusedWorklogId=283115&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283115
 ]

ASF GitHub Bot logged work on HDDS-1864:


Author: ASF GitHub Bot
Created on: 26/Jul/19 05:33
Start Date: 26/Jul/19 05:33
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1168: HDDS-1864. Turn 
on topology aware read in TestFailureHandlingByClient.
URL: https://github.com/apache/hadoop/pull/1168#issuecomment-515315317
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 44 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 1 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | +1 | mvninstall | 575 | trunk passed |
   | +1 | compile | 363 | trunk passed |
   | +1 | checkstyle | 65 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 840 | branch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 155 | trunk passed |
   | 0 | spotbugs | 424 | Used deprecated FindBugs config; considering 
switching to SpotBugs. |
   | +1 | findbugs | 618 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | +1 | mvninstall | 554 | the patch passed |
   | +1 | compile | 355 | the patch passed |
   | +1 | javac | 355 | the patch passed |
   | +1 | checkstyle | 66 | the patch passed |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 658 | patch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 160 | the patch passed |
   | +1 | findbugs | 622 | the patch passed |
   ||| _ Other Tests _ |
   | -1 | unit | 285 | hadoop-hdds in the patch failed. |
   | -1 | unit | 1794 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 43 | The patch does not generate ASF License warnings. |
   | | | 7381 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.ozone.client.rpc.TestMultiBlockWritesWithDnFailures |
   |   | hadoop.ozone.om.TestScmSafeMode |
   |   | hadoop.ozone.client.rpc.TestSecureOzoneRpcClient |
   |   | hadoop.ozone.client.rpc.TestOzoneRpcClient |
   |   | hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis |
   |   | hadoop.ozone.client.rpc.TestOzoneAtRestEncryption |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=19.03.1 Server=19.03.1 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1168/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1168 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 9b7fbe1ebdcb 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / ce99cc3 |
   | Default Java | 1.8.0_212 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1168/1/artifact/out/patch-unit-hadoop-hdds.txt
 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1168/1/artifact/out/patch-unit-hadoop-ozone.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1168/1/testReport/ |
   | Max. process+thread count | 4923 (vs. ulimit of 5500) |
   | modules | C: hadoop-ozone/integration-test U: 
hadoop-ozone/integration-test |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1168/1/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283115)
Time Spent: 20m  (was: 10m)

> Turn on topology aware read in TestFailureHandlingByClient
> --
>
> Key: HDDS-1864
> URL: https://issues.apache.org/jira/browse/HDDS-1864
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Sammi Chen
>Assignee: Sammi Chen
>Priority: Major
>  Labels: 

[jira] [Commented] (HDDS-1861) Fix TableCacheImpl cleanup logic

2019-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893337#comment-16893337
 ] 

Hudson commented on HDDS-1861:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16989 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16989/])
HDDS-1861. Fix TableCacheImpl cleanup logic. (#1165) (github: rev 
3426777140d6dfd7bda13c23eb030fea75201307)
* (edit) 
hadoop-hdds/common/src/main/java/org/apache/hadoop/utils/db/cache/TableCacheImpl.java


> Fix TableCacheImpl cleanup logic
> 
>
> Key: HDDS-1861
> URL: https://issues.apache.org/jira/browse/HDDS-1861
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently in cleanup, we iterate over epochEntries and clean up the entries 
> from the cache and the epochEntries set.
>  
> epochEntries is a TreeSet<>, which is not a concurrent data structure in Java. 
> We may sometimes see issues when cleanup tries to remove entries while another 
> thread tries to add entries to the cache. So, we need to use a concurrent set 
> there.
>  
> During cluster testing, this has been seen randomly a few times:
>  
> {code:java}
> 019-07-25 15:28:41,087 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 5 on 9862, call Call#8974 Retry#0 
> org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 
> 10.65.15.233:35222 java.lang.NullPointerException at 
> java.util.TreeMap.fixAfterInsertion(TreeMap.java:2295) at 
> java.util.TreeMap.put(TreeMap.java:582) at 
> java.util.TreeSet.add(TreeSet.java:255) at 
> org.apache.hadoop.utils.db.cache.TableCacheImpl.put(TableCacheImpl.java:75) 
> at org.apache.hadoop.utils.db.TypedTable.addCacheEntry(TypedTable.java:218) 
> at 
> org.apache.hadoop.ozone.om.request.key.OMKeyRequest.prepareCreateKeyResponse(OMKeyRequest.java:292)
>  at 
> org.apache.hadoop.ozone.om.request.key.OMKeyCreateRequest.validateAndUpdateCache(OMKeyCreateRequest.java:188)
>  at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:134)
>  at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) at 
> java.security.AccessController.doPrivileged(Native Method){code}
>  
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  
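
A minimal sketch of the direction described above, keeping the sorted-by-epoch 
semantics of the existing TreeSet (class, field, and method names below are 
illustrative, not the actual TableCacheImpl code):

{code:java}
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

class EpochEntriesSketch<E extends Comparable<E>> {
  // ConcurrentSkipListSet keeps the ordering TreeSet provided, but is safe for
  // one thread cleaning up while IPC handler threads keep adding entries.
  private final NavigableSet<E> epochEntries = new ConcurrentSkipListSet<>();

  void add(E entry) {
    epochEntries.add(entry);
  }

  // Remove every entry up to and including the given epoch bound; the headSet
  // view writes through to the backing concurrent set.
  void cleanup(E upToInclusive) {
    epochEntries.headSet(upToInclusive, true).clear();
  }
}
{code}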



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1861) Fix TableCacheImpl cleanup logic

2019-07-25 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-1861:
-
   Resolution: Fixed
Fix Version/s: 0.5.0
   Status: Resolved  (was: Patch Available)

> Fix TableCacheImpl cleanup logic
> 
>
> Key: HDDS-1861
> URL: https://issues.apache.org/jira/browse/HDDS-1861
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently in cleanup, we iterate over epochEntries and clean up the entries 
> from the cache and the epochEntries set.
>  
> epochEntries is a TreeSet<>, which is not a concurrent data structure in Java. 
> We may sometimes see issues when cleanup tries to remove entries while another 
> thread tries to add entries to the cache. So, we need to use a concurrent set 
> there.
>  
> During cluster testing, this has been seen randomly a few times:
>  
> {code:java}
> 019-07-25 15:28:41,087 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 5 on 9862, call Call#8974 Retry#0 
> org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 
> 10.65.15.233:35222 java.lang.NullPointerException at 
> java.util.TreeMap.fixAfterInsertion(TreeMap.java:2295) at 
> java.util.TreeMap.put(TreeMap.java:582) at 
> java.util.TreeSet.add(TreeSet.java:255) at 
> org.apache.hadoop.utils.db.cache.TableCacheImpl.put(TableCacheImpl.java:75) 
> at org.apache.hadoop.utils.db.TypedTable.addCacheEntry(TypedTable.java:218) 
> at 
> org.apache.hadoop.ozone.om.request.key.OMKeyRequest.prepareCreateKeyResponse(OMKeyRequest.java:292)
>  at 
> org.apache.hadoop.ozone.om.request.key.OMKeyCreateRequest.validateAndUpdateCache(OMKeyCreateRequest.java:188)
>  at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:134)
>  at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) at 
> java.security.AccessController.doPrivileged(Native Method){code}
>  
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1861) Fix TableCacheImpl cleanup logic

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1861?focusedWorklogId=283114&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283114
 ]

ASF GitHub Bot logged work on HDDS-1861:


Author: ASF GitHub Bot
Created on: 26/Jul/19 05:30
Start Date: 26/Jul/19 05:30
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #1165: 
HDDS-1861. Fix TableCacheImpl cleanup logic.
URL: https://github.com/apache/hadoop/pull/1165
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283114)
Time Spent: 0.5h  (was: 20m)

> Fix TableCacheImpl cleanup logic
> 
>
> Key: HDDS-1861
> URL: https://issues.apache.org/jira/browse/HDDS-1861
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently in cleanup, we iterate over epochEntries and clean up the entries 
> from the cache and the epochEntries set.
>  
> epochEntries is a TreeSet<>, which is not a concurrent data structure in Java. 
> We may sometimes see issues when cleanup tries to remove entries while another 
> thread tries to add entries to the cache. So, we need to use a concurrent set 
> there.
>  
> During cluster testing, this has been seen randomly a few times:
>  
> {code:java}
> 019-07-25 15:28:41,087 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 5 on 9862, call Call#8974 Retry#0 
> org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 
> 10.65.15.233:35222 java.lang.NullPointerException at 
> java.util.TreeMap.fixAfterInsertion(TreeMap.java:2295) at 
> java.util.TreeMap.put(TreeMap.java:582) at 
> java.util.TreeSet.add(TreeSet.java:255) at 
> org.apache.hadoop.utils.db.cache.TableCacheImpl.put(TableCacheImpl.java:75) 
> at org.apache.hadoop.utils.db.TypedTable.addCacheEntry(TypedTable.java:218) 
> at 
> org.apache.hadoop.ozone.om.request.key.OMKeyRequest.prepareCreateKeyResponse(OMKeyRequest.java:292)
>  at 
> org.apache.hadoop.ozone.om.request.key.OMKeyCreateRequest.validateAndUpdateCache(OMKeyCreateRequest.java:188)
>  at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:134)
>  at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) at 
> java.security.AccessController.doPrivileged(Native Method){code}
>  
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1861) Fix TableCacheImpl cleanup logic

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1861?focusedWorklogId=283113&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283113
 ]

ASF GitHub Bot logged work on HDDS-1861:


Author: ASF GitHub Bot
Created on: 26/Jul/19 05:29
Start Date: 26/Jul/19 05:29
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on issue #1165: HDDS-1861. Fix 
TableCacheImpl cleanup logic.
URL: https://github.com/apache/hadoop/pull/1165#issuecomment-515314662
 
 
   Thank you @arp7 for the review.
   Test failures are not related to this patch. I will commit this to trunk.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283113)
Time Spent: 20m  (was: 10m)

> Fix TableCacheImpl cleanup logic
> 
>
> Key: HDDS-1861
> URL: https://issues.apache.org/jira/browse/HDDS-1861
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently in cleanup, we iterate over epochEntries and clean up the entries 
> from the cache and the epochEntries set.
>  
> epochEntries is a TreeSet<>, which is not a concurrent data structure in Java. 
> We may sometimes see issues when cleanup tries to remove entries while another 
> thread tries to add entries to the cache. So, we need to use a concurrent set 
> there.
>  
> During cluster testing, this has been seen randomly a few times:
>  
> {code:java}
> 019-07-25 15:28:41,087 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 5 on 9862, call Call#8974 Retry#0 
> org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 
> 10.65.15.233:35222 java.lang.NullPointerException at 
> java.util.TreeMap.fixAfterInsertion(TreeMap.java:2295) at 
> java.util.TreeMap.put(TreeMap.java:582) at 
> java.util.TreeSet.add(TreeSet.java:255) at 
> org.apache.hadoop.utils.db.cache.TableCacheImpl.put(TableCacheImpl.java:75) 
> at org.apache.hadoop.utils.db.TypedTable.addCacheEntry(TypedTable.java:218) 
> at 
> org.apache.hadoop.ozone.om.request.key.OMKeyRequest.prepareCreateKeyResponse(OMKeyRequest.java:292)
>  at 
> org.apache.hadoop.ozone.om.request.key.OMKeyCreateRequest.validateAndUpdateCache(OMKeyCreateRequest.java:188)
>  at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:134)
>  at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) at 
> java.security.AccessController.doPrivileged(Native Method){code}
>  
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1619) Support volume addACL operations for OM HA.

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1619?focusedWorklogId=283111&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283111
 ]

ASF GitHub Bot logged work on HDDS-1619:


Author: ASF GitHub Bot
Created on: 26/Jul/19 05:28
Start Date: 26/Jul/19 05:28
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #1147: 
HDDS-1619. Support volume addACL operations for OM HA. Contributed by…
URL: https://github.com/apache/hadoop/pull/1147#discussion_r307592917
 
 

 ##
 File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/volume/acl/OMVolumeAddAclRequest.java
 ##
 @@ -0,0 +1,126 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.om.request.volume.acl;
+
+import com.google.common.base.Optional;
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.ozone.OzoneAcl;
+import org.apache.hadoop.ozone.om.OMMetadataManager;
+import org.apache.hadoop.ozone.om.OMMetrics;
+import org.apache.hadoop.ozone.om.OzoneManager;
+import org.apache.hadoop.ozone.om.exceptions.OMException;
+import org.apache.hadoop.ozone.om.helpers.OmVolumeArgs;
+import org.apache.hadoop.ozone.om.request.OMClientRequest;
+import org.apache.hadoop.ozone.om.response.OMClientResponse;
+import org.apache.hadoop.ozone.om.response.volume.OMVolumeAddAclResponse;
+import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.AddAclRequest;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.AddAclResponse;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.OMRequest;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.OMResponse;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.Status;
+
+import org.apache.hadoop.ozone.security.acl.IAccessAuthorizer;
+import org.apache.hadoop.ozone.security.acl.OzoneObj;
+import org.apache.hadoop.ozone.security.acl.OzoneObjInfo;
+import org.apache.hadoop.utils.db.cache.CacheKey;
+import org.apache.hadoop.utils.db.cache.CacheValue;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+import static 
org.apache.hadoop.ozone.om.lock.OzoneManagerLock.Resource.VOLUME_LOCK;
+
+/**
+ * Handles add acl request.
+ */
+public class OMVolumeAddAclRequest extends OMClientRequest {
+  private static final Logger LOG =
+      LoggerFactory.getLogger(OMVolumeAddAclRequest.class);
+
+  public OMVolumeAddAclRequest(OMRequest omRequest) {
+    super(omRequest);
+  }
+
+  @Override
+  public OMClientResponse validateAndUpdateCache(OzoneManager ozoneManager,
+      long transactionLogIndex) {
+    AddAclRequest addAclRequest = getOmRequest().getAddAclRequest();
+    Preconditions.checkNotNull(addAclRequest);
+
+    OMResponse.Builder omResponse = OMResponse.newBuilder().setCmdType(
+        OzoneManagerProtocolProtos.Type.AddAcl).setStatus(
+        Status.OK).setSuccess(true);
+
+    if (!addAclRequest.hasAcl()) {
+      omResponse.setStatus(Status.INVALID_REQUEST).setSuccess(false);
+      return new OMVolumeAddAclResponse(null, omResponse.build());
+    }
+
+    OzoneObjInfo ozoneObj = OzoneObjInfo.fromProtobuf(addAclRequest.getObj());
 
 Review comment:
   ozoneObj is only used to get the volume name; we can get the volume name 
directly from addAclRequest.getObj().getPath(), because we only reach this 
point after checking that resType is Volume.
   
   Similarly, we can avoid the conversion OzoneAcl ozoneAcl = 
OzoneAcl.fromProtobuf(addAclRequest.getAcl()) if OmVolumeArgs can directly 
store the acl List. That way we avoid the protobuf conversion.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---


[jira] [Assigned] (HDDS-1553) Add metrics in rack aware container placement policy

2019-07-25 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao reassigned HDDS-1553:


Assignee: Junjie Chen  (was: Sammi Chen)

> Add metrics in rack aware container placement policy
> 
>
> Key: HDDS-1553
> URL: https://issues.apache.org/jira/browse/HDDS-1553
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Sammi Chen
>Assignee: Junjie Chen
>Priority: Major
>
> To collect the following statistics:
> 1. total requested datanode count (A)
> 2. successfully allocated datanode count without constraint compromise (B)
> 3. successfully allocated datanode count with some constraint compromise (C)
> B includes C; failed allocations = (A - B)
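
A small illustrative sketch of how these counters relate (plain fields here; the 
actual patch would presumably register them through the Hadoop metrics system, 
and the names below are not from the patch):

{code:java}
class PlacementMetricsSketch {
  long totalRequested;            // A: datanodes requested
  long totalAllocated;            // B: successfully allocated (includes C)
  long allocatedWithCompromise;   // C: allocated only after relaxing a constraint

  // failed allocations = A - B, per the relationship described above
  long failedAllocations() {
    return totalRequested - totalAllocated;
  }
}
{code}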



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1855) TestStorageContainerManager#testScmProcessDatanodeHeartbeat is failing

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1855?focusedWorklogId=283108&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283108
 ]

ASF GitHub Bot logged work on HDDS-1855:


Author: ASF GitHub Bot
Created on: 26/Jul/19 05:12
Start Date: 26/Jul/19 05:12
Worklog Time Spent: 10m 
  Work Description: nandakumar131 commented on issue #1153: HDDS-1855. 
TestStorageContainerManager#testScmProcessDatanodeHeartbeat is failing.
URL: https://github.com/apache/hadoop/pull/1153#issuecomment-515311574
 
 
   @ChenSammi, the test was failing on my local machine as well. This is because 
`CachedDNSToSwitchMapping` always uses IP addresses for the resolution.
   
   ```
   public List<String> resolve(List<String> names) {
     // normalize all input names to be in the form of IP addresses
     names = NetUtils.normalizeHostNames(names);
   
     List<String> result = new ArrayList<String>(names.size());
     if (names.isEmpty()) {
       return result;
     }
   
     List<String> uncachedHosts = getUncachedHosts(names);
   
     // Resolve the uncached hosts
     List<String> resolvedHosts = rawMapping.resolve(uncachedHosts);
     // cache them
     cacheResolvedHosts(uncachedHosts, resolvedHosts);
     // now look up the entire list in the cache
     return getCachedHosts(names);
   }
   ```
   
   **Note:** // normalize all input names to be in the form of IP addresses
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283108)
Time Spent: 40m  (was: 0.5h)

> TestStorageContainerManager#testScmProcessDatanodeHeartbeat is failing
> --
>
> Key: HDDS-1855
> URL: https://issues.apache.org/jira/browse/HDDS-1855
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {{TestStorageContainerManager#testScmProcessDatanodeHeartbeat}} is failing 
> with the following exception
> {noformat}
> [ERROR] Tests run: 9, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 106.315 s <<< FAILURE! - in 
> org.apache.hadoop.ozone.TestStorageContainerManager
> [ERROR] 
> testScmProcessDatanodeHeartbeat(org.apache.hadoop.ozone.TestStorageContainerManager)
>   Time elapsed: 21.97 s  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.ozone.TestStorageContainerManager.testScmProcessDatanodeHeartbeat(TestStorageContainerManager.java:531)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1619) Support volume addACL operations for OM HA.

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1619?focusedWorklogId=283104&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283104
 ]

ASF GitHub Bot logged work on HDDS-1619:


Author: ASF GitHub Bot
Created on: 26/Jul/19 05:03
Start Date: 26/Jul/19 05:03
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on pull request #1147: HDDS-1619. 
Support volume addACL operations for OM HA. Contributed by…
URL: https://github.com/apache/hadoop/pull/1147#discussion_r307589215
 
 

 ##
 File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/volume/acl/OMVolumeAddAclRequest.java
 ##
 @@ -0,0 +1,126 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.om.request.volume.acl;
+
+import com.google.common.base.Optional;
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.ozone.OzoneAcl;
+import org.apache.hadoop.ozone.om.OMMetadataManager;
+import org.apache.hadoop.ozone.om.OMMetrics;
+import org.apache.hadoop.ozone.om.OzoneManager;
+import org.apache.hadoop.ozone.om.exceptions.OMException;
+import org.apache.hadoop.ozone.om.helpers.OmVolumeArgs;
+import org.apache.hadoop.ozone.om.request.OMClientRequest;
+import org.apache.hadoop.ozone.om.response.OMClientResponse;
+import org.apache.hadoop.ozone.om.response.volume.OMVolumeAddAclResponse;
+import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.AddAclRequest;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.AddAclResponse;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.OMRequest;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.OMResponse;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.Status;
+
+import org.apache.hadoop.ozone.security.acl.IAccessAuthorizer;
+import org.apache.hadoop.ozone.security.acl.OzoneObj;
+import org.apache.hadoop.ozone.security.acl.OzoneObjInfo;
+import org.apache.hadoop.utils.db.cache.CacheKey;
+import org.apache.hadoop.utils.db.cache.CacheValue;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+import static 
org.apache.hadoop.ozone.om.lock.OzoneManagerLock.Resource.VOLUME_LOCK;
+
+/**
+ * Handles add acl request.
+ */
+public class OMVolumeAddAclRequest extends OMClientRequest {
+  private static final Logger LOG =
+      LoggerFactory.getLogger(OMVolumeAddAclRequest.class);
+
+  public OMVolumeAddAclRequest(OMRequest omRequest) {
+    super(omRequest);
+  }
+
+  @Override
+  public OMClientResponse validateAndUpdateCache(OzoneManager ozoneManager,
+      long transactionLogIndex) {
+    AddAclRequest addAclRequest = getOmRequest().getAddAclRequest();
+    Preconditions.checkNotNull(addAclRequest);
+
+    OMResponse.Builder omResponse = OMResponse.newBuilder().setCmdType(
+        OzoneManagerProtocolProtos.Type.AddAcl).setStatus(
+        Status.OK).setSuccess(true);
+
+    if (!addAclRequest.hasAcl()) {
+      omResponse.setStatus(Status.INVALID_REQUEST).setSuccess(false);
+      return new OMVolumeAddAclResponse(null, omResponse.build());
+    }
+
+    OzoneObjInfo ozoneObj = OzoneObjInfo.fromProtobuf(addAclRequest.getObj());
+    OzoneAcl ozoneAcl = OzoneAcl.fromProtobuf(addAclRequest.getAcl());
+    String volume = ozoneObj.getVolumeName();
+
+    OMMetrics omMetrics = ozoneManager.getMetrics();
+    omMetrics.incNumVolumeUpdates();
+    IOException exception = null;
+    OmVolumeArgs omVolumeArgs = null;
+
+    OMMetadataManager omMetadataManager = ozoneManager.getMetadataManager();
+
+    try {
+      // check Acl
+      if (ozoneManager.getAclsEnabled()) {
 
 Review comment:
   Do you mean Kerberos? ACL can be used without Kerberos. But generally, ACL 
only makes sense when you can securely identify a user. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above 

[jira] [Work logged] (HDDS-1619) Support volume addACL operations for OM HA.

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1619?focusedWorklogId=283102&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283102
 ]

ASF GitHub Bot logged work on HDDS-1619:


Author: ASF GitHub Bot
Created on: 26/Jul/19 05:01
Start Date: 26/Jul/19 05:01
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on pull request #1147: HDDS-1619. 
Support volume addACL operations for OM HA. Contributed by…
URL: https://github.com/apache/hadoop/pull/1147#discussion_r307588979
 
 

 ##
 File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/volume/acl/OMVolumeAddAclRequest.java
 ##
 @@ -0,0 +1,126 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.om.request.volume.acl;
+
+import com.google.common.base.Optional;
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.ozone.OzoneAcl;
+import org.apache.hadoop.ozone.om.OMMetadataManager;
+import org.apache.hadoop.ozone.om.OMMetrics;
+import org.apache.hadoop.ozone.om.OzoneManager;
+import org.apache.hadoop.ozone.om.exceptions.OMException;
+import org.apache.hadoop.ozone.om.helpers.OmVolumeArgs;
+import org.apache.hadoop.ozone.om.request.OMClientRequest;
+import org.apache.hadoop.ozone.om.response.OMClientResponse;
+import org.apache.hadoop.ozone.om.response.volume.OMVolumeAddAclResponse;
+import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.AddAclRequest;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.AddAclResponse;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.OMRequest;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.OMResponse;
+import 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.Status;
+
+import org.apache.hadoop.ozone.security.acl.IAccessAuthorizer;
+import org.apache.hadoop.ozone.security.acl.OzoneObj;
+import org.apache.hadoop.ozone.security.acl.OzoneObjInfo;
+import org.apache.hadoop.utils.db.cache.CacheKey;
+import org.apache.hadoop.utils.db.cache.CacheValue;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+import static 
org.apache.hadoop.ozone.om.lock.OzoneManagerLock.Resource.VOLUME_LOCK;
+
+/**
+ * Handles add acl request.
+ */
+public class OMVolumeAddAclRequest extends OMClientRequest {
+  private static final Logger LOG =
+      LoggerFactory.getLogger(OMVolumeAddAclRequest.class);
+
+  public OMVolumeAddAclRequest(OMRequest omRequest) {
+    super(omRequest);
+  }
+
+  @Override
+  public OMClientResponse validateAndUpdateCache(OzoneManager ozoneManager,
+      long transactionLogIndex) {
+    AddAclRequest addAclRequest = getOmRequest().getAddAclRequest();
+    Preconditions.checkNotNull(addAclRequest);
+
+    OMResponse.Builder omResponse = OMResponse.newBuilder().setCmdType(
+        OzoneManagerProtocolProtos.Type.AddAcl).setStatus(
+        Status.OK).setSuccess(true);
+
+    if (!addAclRequest.hasAcl()) {
+      omResponse.setStatus(Status.INVALID_REQUEST).setSuccess(false);
+      return new OMVolumeAddAclResponse(null, omResponse.build());
+    }
+
+    OzoneObjInfo ozoneObj = OzoneObjInfo.fromProtobuf(addAclRequest.getObj());
 
 Review comment:
   Can you elaborate on the OmVolumeArgs protobuf conversion? OzoneManager does 
not take VolumeArgs directly. All volume/bucket/key objects are wrapped into 
OzoneObj. Even though we have separate Requests, the OzoneObj from the RPC layer 
still has to be converted to the specific structure. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283102)
Time Spent: 2h 20m  (was: 2h 10m)

> Support volume addACL operations for OM HA.
> 

[jira] [Commented] (HDFS-12703) Exceptions are fatal to decommissioning monitor

2019-07-25 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893313#comment-16893313
 ] 

He Xiaoqiao commented on HDFS-12703:


OK, thanks [~xkrogen] for checking. I just filed a new JIRA, HDFS-14672, to track 
the backport.

> Exceptions are fatal to decommissioning monitor
> ---
>
> Key: HDFS-12703
> URL: https://issues.apache.org/jira/browse/HDFS-12703
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Daryn Sharp
>Assignee: He Xiaoqiao
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12703.001.patch, HDFS-12703.002.patch, 
> HDFS-12703.003.patch, HDFS-12703.004.patch, HDFS-12703.005.patch, 
> HDFS-12703.006.patch, HDFS-12703.007.patch, HDFS-12703.008.patch, 
> HDFS-12703.009.patch, HDFS-12703.010.patch, HDFS-12703.011.patch, 
> HDFS-12703.012.patch, HDFS-12703.013.patch
>
>
> The {{DecommissionManager.Monitor}} runs as an executor scheduled task.  If 
> an exception occurs, all decommissioning ceases until the NN is restarted.  
> Per javadoc for {{executor#scheduleAtFixedRate}}: *If any execution of the 
> task encounters an exception, subsequent executions are suppressed*.  The 
> monitor thread is alive but blocked waiting for an executor task that will 
> never come.  The code currently disposes of the future so the actual 
> exception that aborted the task is gone.
> Failover is insufficient since the task is also likely dead on the standby.  
> Replication queue init after the transition to active will fix the under 
> replication of blocks on currently decommissioning nodes but future nodes 
> never decommission.  The standby must be bounced prior to failover – and 
> hopefully the error condition does not reoccur.
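
A minimal sketch of the usual mitigation for this scheduleAtFixedRate pitfall: 
catch everything inside the periodic task so a single failure cannot suppress 
all later executions (class and method names are illustrative, not the actual 
DecommissionManager code):

{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class MonitorSketch implements Runnable {
  @Override
  public void run() {
    try {
      check();                      // the real periodic decommission work
    } catch (Throwable t) {
      // If this escaped, scheduleAtFixedRate would silently stop scheduling
      // the task; log and carry on instead.
      System.err.println("Monitor iteration failed, will retry: " + t);
    }
  }

  private void check() {
    // decommission scanning would go here
  }

  public static void main(String[] args) {
    ScheduledExecutorService exec = Executors.newScheduledThreadPool(1);
    exec.scheduleAtFixedRate(new MonitorSketch(), 0, 30, TimeUnit.SECONDS);
  }
}
{code}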



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14672) Backport HDFS-12703 to branch-2

2019-07-25 Thread He Xiaoqiao (JIRA)
He Xiaoqiao created HDFS-14672:
--

 Summary: Backport HDFS-12703 to branch-2
 Key: HDFS-14672
 URL: https://issues.apache.org/jira/browse/HDFS-14672
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: He Xiaoqiao
Assignee: He Xiaoqiao


Currently, the fix for the decommission monitor exception that is fatal to the 
namenode (HDFS-12703) is only in trunk (branch-3). This JIRA aims to backport 
that bugfix to branch-2.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14669) TestDirectoryScanner#testDirectoryScannerInFederatedCluster fails intermittently in trunk

2019-07-25 Thread qiang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893305#comment-16893305
 ] 

qiang Liu commented on HDFS-14669:
--

[~ayushtkn]

The function writeFile(FileSystem fs, int numFiles) is meant to create "numFiles" 
files, but it creates them all under the same name.

I moved the path-generation line into the loop and added a number suffix to it, 
so the files will have different names.

This avoids the race condition between the directory scan and the deletion of 
the outdated block.
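
A hedged sketch of that change, as a fragment of the test class (the file length, 
replication, and seed arguments, and the "/testFile" base name, are placeholders 
rather than the actual TestDirectoryScanner values):

{code:java}
// Build a distinct Path inside the loop so "numFiles" really yields numFiles
// different files instead of rewriting one name repeatedly.
private void writeFile(FileSystem fs, int numFiles) throws IOException {
  for (int i = 0; i < numFiles; i++) {
    Path path = new Path("/testFile." + i);      // distinct name per iteration
    DFSTestUtil.createFile(fs, path, 100, (short) 1, 0L);
  }
}
{code}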

> TestDirectoryScanner#testDirectoryScannerInFederatedCluster fails 
> intermittently in trunk
> -
>
> Key: HDFS-14669
> URL: https://issues.apache.org/jira/browse/HDFS-14669
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: datanode
>Affects Versions: 3.2.0
> Environment: env free
>Reporter: qiang Liu
>Assignee: qiang Liu
>Priority: Minor
>  Labels: scanner, test
> Attachments: HDFS-14669-trunk-001.patch, HDFS-14669-trunk.002.patch
>
>
> org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner#testDirectoryScannerInFederatedCluster
>  randomly fails because it writes files with the same name: the intent is to 
> write 2 files, but both get the same name, which causes a race condition 
> between the datanode deleting a block and the scan counting blocks.
>  
> Ref :: 
> [https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1207/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDirectoryScanner/testDirectoryScannerInFederatedCluster/]
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1839) Change topology sorting related logs in Pipeline from INFO to DEBUG

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1839?focusedWorklogId=283085=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283085
 ]

ASF GitHub Bot logged work on HDDS-1839:


Author: ASF GitHub Bot
Created on: 26/Jul/19 03:54
Start Date: 26/Jul/19 03:54
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on issue #1158: HDDS-1839: Change 
topology sorting related logs in Pipeline from INFO to DEBUG
URL: https://github.com/apache/hadoop/pull/1158#issuecomment-515299050
 
 
   /retest
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283085)
Time Spent: 1h  (was: 50m)

> Change topology sorting related logs in Pipeline from INFO to DEBUG
> ---
>
> Key: HDDS-1839
> URL: https://issues.apache.org/jira/browse/HDDS-1839
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Affects Versions: 0.4.1
>Reporter: Xiaoyu Yao
>Assignee: Junjie Chen
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> This will avoid output like 
> {code}
> 2019-07-19 22:36:40 INFO  Pipeline:342 - Serialize nodesInOrder 
> [610d4084-7cce-4691-b43a-f9dd5cdb8809\{ip: 192.168.144.3, host: 
> ozonesecure-mr_datanode_1.ozonesecure-mr_default, networkLocation: 
> /default-rack, certSerialId: null}] in pipeline 
> PipelineID=f9ba269c-aba9-4a42-946c-4048d02cb7d1
> 2019-07-19 22:36:40 INFO  Pipeline:342 - Deserialize nodesInOrder 
> [610d4084-7cce-4691-b43a-f9dd5cdb8809\{ip: 192.168.144.3, host: 
> ozonesecure-mr_datanode_1.ozonesecure-mr_default, networkLocation: 
> /default-rack, certSerialId: null}] in pipeline 
> PipelineID=f9ba269c-aba9-4a42-946c-4048d02cb7d1
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1839) Change topology sorting related logs in Pipeline from INFO to DEBUG

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1839?focusedWorklogId=283086=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283086
 ]

ASF GitHub Bot logged work on HDDS-1839:


Author: ASF GitHub Bot
Created on: 26/Jul/19 03:54
Start Date: 26/Jul/19 03:54
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on issue #1158: HDDS-1839: Change 
topology sorting related logs in Pipeline from INFO to DEBUG
URL: https://github.com/apache/hadoop/pull/1158#issuecomment-515299074
 
 
   +1 pending Jenkins.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283086)
Time Spent: 1h 10m  (was: 1h)

> Change topology sorting related logs in Pipeline from INFO to DEBUG
> ---
>
> Key: HDDS-1839
> URL: https://issues.apache.org/jira/browse/HDDS-1839
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Affects Versions: 0.4.1
>Reporter: Xiaoyu Yao
>Assignee: Junjie Chen
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> This will avoid output like 
> {code}
> 2019-07-19 22:36:40 INFO  Pipeline:342 - Serialize nodesInOrder 
> [610d4084-7cce-4691-b43a-f9dd5cdb8809\{ip: 192.168.144.3, host: 
> ozonesecure-mr_datanode_1.ozonesecure-mr_default, networkLocation: 
> /default-rack, certSerialId: null}] in pipeline 
> PipelineID=f9ba269c-aba9-4a42-946c-4048d02cb7d1
> 2019-07-19 22:36:40 INFO  Pipeline:342 - Deserialize nodesInOrder 
> [610d4084-7cce-4691-b43a-f9dd5cdb8809\{ip: 192.168.144.3, host: 
> ozonesecure-mr_datanode_1.ozonesecure-mr_default, networkLocation: 
> /default-rack, certSerialId: null}] in pipeline 
> PipelineID=f9ba269c-aba9-4a42-946c-4048d02cb7d1
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1833) RefCountedDB printing of stacktrace should be moved to trace logging

2019-07-25 Thread Siddharth Wagle (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893300#comment-16893300
 ] 

Siddharth Wagle commented on HDDS-1833:
---

[~eyang] I believe slf4j does not eagerly compute the parameterized statement, 
so we are probably OK, but I will verify that. Thanks for the pointer.
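
For reference, a small sketch (assuming only slf4j-api on the classpath) of the two 
points in play: the parameterized message is rendered only when the level is enabled, 
but an expensive argument such as a formatted stack trace is still computed at the 
call site unless it is guarded explicitly.

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

final class TraceLoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(TraceLoggingSketch.class);

  void incrementRefCount(String containerPath) {
    // The {} placeholder is rendered only if TRACE is enabled, so the
    // message string itself is not built eagerly.
    LOG.trace("Incremented reference count for {}", containerPath);

    // But the *argument* is still evaluated before the call, so wrap
    // expensive ones (like capturing a stack trace) in an explicit guard.
    if (LOG.isTraceEnabled()) {
      LOG.trace("Reference count call site: {}",
          java.util.Arrays.toString(Thread.currentThread().getStackTrace()));
    }
  }
}
{code}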

> RefCountedDB printing of stacktrace should be moved to trace logging
> 
>
> Key: HDDS-1833
> URL: https://issues.apache.org/jira/browse/HDDS-1833
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Siddharth Wagle
>Priority: Major
>  Labels: newbie
> Attachments: HDDS-1833.01.patch, HDDS-1833.02.patch, 
> HDDS-1833.03.patch
>
>
> RefCountedDB logs the stackTrace for both increment and decrement, this 
> pollutes the logs.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14461) RBF: Fix intermittently failing kerberos related unit test

2019-07-25 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893294#comment-16893294
 ] 

Eric Yang edited comment on HDFS-14461 at 7/26/19 3:43 AM:
---

[~elgoiri] I think it is premature to start using PRs.  I have outlined a number 
of shortcomings of using PRs in the dev mailing list.  We may want to wait for 
some of the outstanding issues to close before recommending PRs.
[~hexiaoqiao]
{quote}
1. is there any other way to wait for keys to be persisted rather than 
Thread.sleep(1000)?
{quote}

{code}
while (!file.exists()) {}
{code}

But it would be nicer if you do it the way that [~crh] suggested.
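
As one illustrative alternative (plain Java, not necessarily what [~crh] had in 
mind), the spin loop above can be bounded so the test fails fast with a clear 
message instead of hanging:

{code:java}
import java.io.File;
import java.util.concurrent.TimeoutException;

final class WaitForFile {
  // Hypothetical helper: poll for the keytab (or any file) with a deadline,
  // avoiding both a fixed Thread.sleep(1000) and an unbounded busy loop.
  static void waitForFile(File file, long timeoutMillis)
      throws InterruptedException, TimeoutException {
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (!file.exists()) {
      if (System.currentTimeMillis() > deadline) {
        throw new TimeoutException("File was not created in time: " + file);
      }
      Thread.sleep(50);  // short poll keeps the wait responsive
    }
  }
}
{code}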

{quote}
2. do we need to define configuration item `hadoop.http.authentication.*` at 
CommonConfigurationKeys?
{quote}

I think it goes to: CommonConfigurationKeysPublic.java.  Some keys are already 
there.  I think it's nice but optional.

{quote}
3. I am confused how the TestRouterHttpDelegationToken test passed in the 
pre-commit build for HDFS-13972?
{quote}

I think it was testing with anonymous allowed which implicitly passed through, 
but I can't be sure.

{quote}
4. it seems that NoAuthFilter is not effective anymore, and I will try to delete it.
{quote}

Ok


was (Author: eyang):
[~elgoiri] I think it is premature to start using PRs.  I have outlined a number 
of shortcomings of using PRs in the dev mailing list.  We may want to wait for 
some of the outstanding issues to close before recommending PRs.
[~hexiaoqiao]
{quote}
1. is there any other way to wait for keys to be persisted rather than 
Thread.sleep(1000)?
{quote}

{code}
while (!file.exists()) {}
{code}

{quote}
2. do we need to define configuration item `hadoop.http.authentication.*` at 
CommonConfigurationKeys?
{quote}

I think it goes to: CommonConfigurationKeysPublic.java.  Some keys are already 
there.  I think it's nice but optional.

{quote}
3. I am confused how the TestRouterHttpDelegationToken test passed in the 
pre-commit build for HDFS-13972?
{quote}

I think it was testing with anonymous allowed which implicitly passed through, 
but I can't be sure.

{quote}
4. it seems that NoAuthFilter is not effective anymore, and I will try to delete it.
{quote}

Ok

> RBF: Fix intermittently failing kerberos related unit test
> --
>
> Key: HDFS-14461
> URL: https://issues.apache.org/jira/browse/HDFS-14461
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: CR Hota
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-14461.001.patch
>
>
> TestRouterHttpDelegationToken#testGetDelegationToken fails intermittently. It 
> may be due to some race condition before using the keytab that's created for 
> testing.
>  
> {code:java}
>  Failed
> org.apache.hadoop.hdfs.server.federation.security.TestRouterHttpDelegationToken.testGetDelegationToken
>  Failing for the past 1 build (Since 
> [!https://builds.apache.org/static/1e9ab9cc/images/16x16/red.png! 
> #26721|https://builds.apache.org/job/PreCommit-HDFS-Build/26721/] )
>  [Took 89 
> ms.|https://builds.apache.org/job/PreCommit-HDFS-Build/26721/testReport/org.apache.hadoop.hdfs.server.federation.security/TestRouterHttpDelegationToken/testGetDelegationToken/history]
>   
>  Error Message
> org.apache.hadoop.security.KerberosAuthException: failure to login: for 
> principal: router/localh...@example.com from keytab 
> /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs-rbf/target/test/data/SecurityConfUtil/test.keytab
>  javax.security.auth.login.LoginException: Integrity check on decrypted field 
> failed (31) - PREAUTH_FAILED
> h3. Stacktrace
> org.apache.hadoop.service.ServiceStateException: 
> org.apache.hadoop.security.KerberosAuthException: failure to login: for 
> principal: router/localh...@example.com from keytab 
> /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs-rbf/target/test/data/SecurityConfUtil/test.keytab
>  javax.security.auth.login.LoginException: Integrity check on decrypted field 
> failed (31) - PREAUTH_FAILED at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>  at org.apache.hadoop.service.AbstractService.init(AbstractService.java:173) 
> at 
> org.apache.hadoop.hdfs.server.federation.security.TestRouterHttpDelegationToken.setup(TestRouterHttpDelegationToken.java:99)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> 

[jira] [Created] (HDDS-1865) Use "dfs.network.topology.aware.read.enable" to control both client side and OM side topology aware read logic

2019-07-25 Thread Sammi Chen (JIRA)
Sammi Chen created HDDS-1865:


 Summary: Use "dfs.network.topology.aware.read.enable" to control 
both client side and OM side topology aware read logic
 Key: HDDS-1865
 URL: https://issues.apache.org/jira/browse/HDDS-1865
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Sammi Chen
Assignee: Sammi Chen






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14461) RBF: Fix intermittently failing kerberos related unit test

2019-07-25 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893294#comment-16893294
 ] 

Eric Yang commented on HDFS-14461:
--

[~elgoiri] I think it is premature to start using PRs.  I have outlined a number 
of shortcomings of using PRs in the dev mailing list.  We may want to wait for 
some of the outstanding issues to close before recommending PRs.
[~hexiaoqiao]
{quote}
1. is there any other way to wait for keys to be persisted rather than 
Thread.sleep(1000)?
{quote}

{code}
while (!file.exists()) {}
{code}

{quote}
2. do we need to define configuration item `hadoop.http.authentication.*` at 
CommonConfigurationKeys?
{quote}

I think it goes to: CommonConfigurationKeysPublic.java.  Some keys are already 
there.  I think it's nice but optional.

{quote}
3. I am confused how the TestRouterHttpDelegationToken test passed in the 
pre-commit build for HDFS-13972?
{quote}

I think it was testing with anonymous allowed which implicitly passed through, 
but I can't be sure.

{quote}
4. it seems that NoAuthFilter is not effective anymore, and I will try to delete it.
{quote}

Ok

> RBF: Fix intermittently failing kerberos related unit test
> --
>
> Key: HDFS-14461
> URL: https://issues.apache.org/jira/browse/HDFS-14461
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: CR Hota
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-14461.001.patch
>
>
> TestRouterHttpDelegationToken#testGetDelegationToken fails intermittently. It 
> may be due to some race condition before using the keytab that's created for 
> testing.
>  
> {code:java}
>  Failed
> org.apache.hadoop.hdfs.server.federation.security.TestRouterHttpDelegationToken.testGetDelegationToken
>  Failing for the past 1 build (Since 
> [!https://builds.apache.org/static/1e9ab9cc/images/16x16/red.png! 
> #26721|https://builds.apache.org/job/PreCommit-HDFS-Build/26721/] )
>  [Took 89 
> ms.|https://builds.apache.org/job/PreCommit-HDFS-Build/26721/testReport/org.apache.hadoop.hdfs.server.federation.security/TestRouterHttpDelegationToken/testGetDelegationToken/history]
>   
>  Error Message
> org.apache.hadoop.security.KerberosAuthException: failure to login: for 
> principal: router/localh...@example.com from keytab 
> /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs-rbf/target/test/data/SecurityConfUtil/test.keytab
>  javax.security.auth.login.LoginException: Integrity check on decrypted field 
> failed (31) - PREAUTH_FAILED
> h3. Stacktrace
> org.apache.hadoop.service.ServiceStateException: 
> org.apache.hadoop.security.KerberosAuthException: failure to login: for 
> principal: router/localh...@example.com from keytab 
> /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs-rbf/target/test/data/SecurityConfUtil/test.keytab
>  javax.security.auth.login.LoginException: Integrity check on decrypted field 
> failed (31) - PREAUTH_FAILED at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>  at org.apache.hadoop.service.AbstractService.init(AbstractService.java:173) 
> at 
> org.apache.hadoop.hdfs.server.federation.security.TestRouterHttpDelegationToken.setup(TestRouterHttpDelegationToken.java:99)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) 
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363) at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>  at 
> 

[jira] [Work logged] (HDDS-1856) Merge HA and Non-HA code in OM

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1856?focusedWorklogId=283061=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283061
 ]

ASF GitHub Bot logged work on HDDS-1856:


Author: ASF GitHub Bot
Created on: 26/Jul/19 03:13
Start Date: 26/Jul/19 03:13
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1166: HDDS-1856. Merge 
HA and Non-HA code in OM.
URL: https://github.com/apache/hadoop/pull/1166#issuecomment-515292094
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 101 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 1 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 25 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | +1 | mvninstall | 683 | trunk passed |
   | +1 | compile | 391 | trunk passed |
   | +1 | checkstyle | 76 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 1037 | branch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 174 | trunk passed |
   | 0 | spotbugs | 431 | Used deprecated FindBugs config; considering 
switching to SpotBugs. |
   | +1 | findbugs | 630 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | +1 | mvninstall | 594 | the patch passed |
   | +1 | compile | 388 | the patch passed |
   | +1 | javac | 388 | the patch passed |
   | +1 | checkstyle | 81 | the patch passed |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 745 | patch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 171 | the patch passed |
   | -1 | findbugs | 442 | hadoop-ozone generated 21 new + 0 unchanged - 0 
fixed = 21 total (was 0) |
   ||| _ Other Tests _ |
   | -1 | unit | 329 | hadoop-hdds in the patch failed. |
   | -1 | unit | 2408 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 41 | The patch does not generate ASF License warnings. |
   | | | 8680 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | FindBugs | module:hadoop-ozone |
   |  |  Inconsistent synchronization of 
org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.readyFutureQueue; 
locked 50% of time  Unsynchronized access at OzoneManagerDoubleBuffer.java:50% 
of time  Unsynchronized access at OzoneManagerDoubleBuffer.java:[line 171] |
   |  |  Null pointer dereference of omClientResponse in 
org.apache.hadoop.ozone.om.request.bucket.OMBucketCreateRequest.validateAndUpdateCache(OzoneManager,
 long, OzoneManagerDoubleBufferHelper) on exception path  Dereferenced at 
OMBucketCreateRequest.java:in 
org.apache.hadoop.ozone.om.request.bucket.OMBucketCreateRequest.validateAndUpdateCache(OzoneManager,
 long, OzoneManagerDoubleBufferHelper) on exception path  Dereferenced at 
OMBucketCreateRequest.java:[line 175] |
   |  |  Null pointer dereference of omClientResponse in 
org.apache.hadoop.ozone.om.request.bucket.OMBucketDeleteRequest.validateAndUpdateCache(OzoneManager,
 long, OzoneManagerDoubleBufferHelper) on exception path  Dereferenced at 
OMBucketDeleteRequest.java:in 
org.apache.hadoop.ozone.om.request.bucket.OMBucketDeleteRequest.validateAndUpdateCache(OzoneManager,
 long, OzoneManagerDoubleBufferHelper) on exception path  Dereferenced at 
OMBucketDeleteRequest.java:[line 140] |
   |  |  Null pointer dereference of omClientResponse in 
org.apache.hadoop.ozone.om.request.bucket.OMBucketSetPropertyRequest.validateAndUpdateCache(OzoneManager,
 long, OzoneManagerDoubleBufferHelper) on exception path  Dereferenced at 
OMBucketSetPropertyRequest.java:in 
org.apache.hadoop.ozone.om.request.bucket.OMBucketSetPropertyRequest.validateAndUpdateCache(OzoneManager,
 long, OzoneManagerDoubleBufferHelper) on exception path  Dereferenced at 
OMBucketSetPropertyRequest.java:[line 186] |
   |  |  Null pointer dereference of omClientResponse in 
org.apache.hadoop.ozone.om.request.file.OMDirectoryCreateRequest.validateAndUpdateCache(OzoneManager,
 long, OzoneManagerDoubleBufferHelper) on exception path  Dereferenced at 
OMDirectoryCreateRequest.java:in 
org.apache.hadoop.ozone.om.request.file.OMDirectoryCreateRequest.validateAndUpdateCache(OzoneManager,
 long, OzoneManagerDoubleBufferHelper) on exception path  Dereferenced at 
OMDirectoryCreateRequest.java:[line 194] |
   |  |  Possible null pointer dereference of omClientResponse in 
org.apache.hadoop.ozone.om.request.file.OMDirectoryCreateRequest.validateAndUpdateCache(OzoneManager,
 long, OzoneManagerDoubleBufferHelper)  Dereferenced at 
OMDirectoryCreateRequest.java:omClientResponse in 

[jira] [Work logged] (HDDS-1863) Freon RandomKeyGenerator even if keySize is set to 0, it returns some random data to key

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1863?focusedWorklogId=283065=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283065
 ]

ASF GitHub Bot logged work on HDDS-1863:


Author: ASF GitHub Bot
Created on: 26/Jul/19 03:19
Start Date: 26/Jul/19 03:19
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1167: HDDS-1863. Freon 
RandomKeyGenerator even if keySize is set to 0, it returns some random data to 
key.
URL: https://github.com/apache/hadoop/pull/1167#issuecomment-515293051
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 50 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | -1 | test4tests | 0 | The patch doesn't appear to include any new or 
modified tests.  Please justify why no new tests are needed for this patch. 
Also please list what manual steps were performed to verify this patch. |
   ||| _ trunk Compile Tests _ |
   | +1 | mvninstall | 615 | trunk passed |
   | +1 | compile | 349 | trunk passed |
   | +1 | checkstyle | 62 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 790 | branch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 144 | trunk passed |
   | 0 | spotbugs | 412 | Used deprecated FindBugs config; considering 
switching to SpotBugs. |
   | +1 | findbugs | 598 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | +1 | mvninstall | 525 | the patch passed |
   | +1 | compile | 342 | the patch passed |
   | +1 | javac | 342 | the patch passed |
   | +1 | checkstyle | 63 | the patch passed |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 619 | patch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 141 | the patch passed |
   | +1 | findbugs | 619 | the patch passed |
   ||| _ Other Tests _ |
   | -1 | unit | 293 | hadoop-hdds in the patch failed. |
   | -1 | unit | 1822 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 36 | The patch does not generate ASF License warnings. |
   | | | 7233 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.ozone.TestStorageContainerManager |
   |   | hadoop.ozone.client.rpc.TestMultiBlockWritesWithDnFailures |
   |   | hadoop.ozone.om.TestScmSafeMode |
   |   | hadoop.ozone.client.rpc.TestSecureOzoneRpcClient |
   |   | hadoop.ozone.client.rpc.TestOzoneRpcClient |
   |   | hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis |
   |   | hadoop.ozone.client.rpc.TestOzoneAtRestEncryption |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=19.03.1 Server=19.03.1 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1167/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1167 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux a3d800fa42c7 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 6b8107a |
   | Default Java | 1.8.0_212 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1167/1/artifact/out/patch-unit-hadoop-hdds.txt
 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1167/1/artifact/out/patch-unit-hadoop-ozone.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1167/1/testReport/ |
   | Max. process+thread count | 4662 (vs. ulimit of 5500) |
   | modules | C: hadoop-ozone/tools U: hadoop-ozone/tools |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1167/1/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283065)
Time Spent: 20m  (was: 10m)

> Freon RandomKeyGenerator even if keySize is set to 0, it returns some random 
> data to key
> 
>
> Key: HDDS-1863
> 

[jira] [Created] (HDDS-1864) Turn on topology aware read in TestFailureHandlingByClient

2019-07-25 Thread Sammi Chen (JIRA)
Sammi Chen created HDDS-1864:


 Summary: Turn on topology aware read in TestFailureHandlingByClient
 Key: HDDS-1864
 URL: https://issues.apache.org/jira/browse/HDDS-1864
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Sammi Chen
Assignee: Sammi Chen






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1864) Turn on topology aware read in TestFailureHandlingByClient

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-1864:
-
Labels: pull-request-available  (was: )

> Turn on topology aware read in TestFailureHandlingByClient
> --
>
> Key: HDDS-1864
> URL: https://issues.apache.org/jira/browse/HDDS-1864
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Sammi Chen
>Assignee: Sammi Chen
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1864) Turn on topology aware read in TestFailureHandlingByClient

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1864?focusedWorklogId=283072=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283072
 ]

ASF GitHub Bot logged work on HDDS-1864:


Author: ASF GitHub Bot
Created on: 26/Jul/19 03:29
Start Date: 26/Jul/19 03:29
Worklog Time Spent: 10m 
  Work Description: ChenSammi commented on pull request #1168: HDDS-1864. 
Turn on topology aware read in TestFailureHandlingByClient.
URL: https://github.com/apache/hadoop/pull/1168
 
 
   https://issues.apache.org/jira/browse/HDDS-1864. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283072)
Time Spent: 10m
Remaining Estimate: 0h

> Turn on topology aware read in TestFailureHandlingByClient
> --
>
> Key: HDDS-1864
> URL: https://issues.apache.org/jira/browse/HDDS-1864
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Sammi Chen
>Assignee: Sammi Chen
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14669) TestDirectoryScanner#testDirectoryScannerInFederatedCluster fails intermittently in trunk

2019-07-25 Thread Ayush Saxena (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893283#comment-16893283
 ] 

Ayush Saxena commented on HDFS-14669:
-

Thanx [~iamgd67] for the patch.
Can you explain a bit about the issue and how it gets sorted out by moving the 
path initialization inside the loop?
You also need to rebase, I think.

> TestDirectoryScanner#testDirectoryScannerInFederatedCluster fails 
> intermittently in trunk
> -
>
> Key: HDFS-14669
> URL: https://issues.apache.org/jira/browse/HDFS-14669
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: datanode
>Affects Versions: 3.2.0
> Environment: env free
>Reporter: qiang Liu
>Assignee: qiang Liu
>Priority: Minor
>  Labels: scanner, test
> Attachments: HDFS-14669-trunk-001.patch, HDFS-14669-trunk.002.patch
>
>
> org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner#testDirectoryScannerInFederatedCluster
>  randomly fails because it writes files with the same name: the intent is to 
> write 2 files, but both get the same name, which causes a race condition 
> between the datanode deleting a block and the scan counting blocks.
>  
> Ref :: 
> [https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1207/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDirectoryScanner/testDirectoryScannerInFederatedCluster/]
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1707) SCMContainerPlacementRackAware#chooseDatanodes throws not enough datanodes when all nodes(40) are up

2019-07-25 Thread Sammi Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893282#comment-16893282
 ] 

Sammi Chen commented on HDDS-1707:
--

Thanks [~msingh] for reporting this. It has been fixed by the code change in HDDS-1713.

> SCMContainerPlacementRackAware#chooseDatanodes throws not enough datanodes 
> when all nodes(40) are up
> 
>
> Key: HDDS-1707
> URL: https://issues.apache.org/jira/browse/HDDS-1707
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: Mukul Kumar Singh
>Priority: Major
>
> SCMContainerPlacementRackAware#chooseDatanodes is failing with the following 
> error repeatedly.
> {code}
> 2019-06-17 22:15:52,455 WARN 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: Exception while 
> replicating container 407.
> org.apache.hadoop.hdds.scm.exceptions.SCMException: No enough datanodes to 
> choose.
> at 
> org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware.chooseDatanodes(SCMContainerPlacementRackAware.java:100)
> at 
> org.apache.hadoop.hdds.scm.container.ReplicationManager.handleUnderReplicatedContainer(ReplicationManager.java:487)
> at 
> org.apache.hadoop.hdds.scm.container.ReplicationManager.processContainer(ReplicationManager.java:293)
> at 
> java.util.concurrent.ConcurrentHashMap$KeySetView.forEach(ConcurrentHashMap.java:4649)
> at 
> java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080)
> at 
> org.apache.hadoop.hdds.scm.container.ReplicationManager.run(ReplicationManager.java:205)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1707) SCMContainerPlacementRackAware#chooseDatanodes throws not enough datanodes when all nodes(40) are up

2019-07-25 Thread Sammi Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen resolved HDDS-1707.
--
Resolution: Fixed
  Assignee: Sammi Chen

> SCMContainerPlacementRackAware#chooseDatanodes throws not enough datanodes 
> when all nodes(40) are up
> 
>
> Key: HDDS-1707
> URL: https://issues.apache.org/jira/browse/HDDS-1707
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: Mukul Kumar Singh
>Assignee: Sammi Chen
>Priority: Major
>
> SCMContainerPlacementRackAware#chooseDatanodes is failing with the following 
> error repeatedly.
> {code}
> 2019-06-17 22:15:52,455 WARN 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: Exception while 
> replicating container 407.
> org.apache.hadoop.hdds.scm.exceptions.SCMException: No enough datanodes to 
> choose.
> at 
> org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware.chooseDatanodes(SCMContainerPlacementRackAware.java:100)
> at 
> org.apache.hadoop.hdds.scm.container.ReplicationManager.handleUnderReplicatedContainer(ReplicationManager.java:487)
> at 
> org.apache.hadoop.hdds.scm.container.ReplicationManager.processContainer(ReplicationManager.java:293)
> at 
> java.util.concurrent.ConcurrentHashMap$KeySetView.forEach(ConcurrentHashMap.java:4649)
> at 
> java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080)
> at 
> org.apache.hadoop.hdds.scm.container.ReplicationManager.run(ReplicationManager.java:205)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14303) check block directory logic not correct when there is only meta file, print no meaning warn log

2019-07-25 Thread Ayush Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-14303:

  Resolution: Fixed
Target Version/s: 2.9.2, 3.2.0  (was: 3.2.0, 2.9.2)
  Status: Resolved  (was: Patch Available)

> check block directory logic not correct when there is only meta file, print 
> no meaning warn log
> ---
>
> Key: HDFS-14303
> URL: https://issues.apache.org/jira/browse/HDFS-14303
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Affects Versions: 2.7.3, 3.2.0, 2.9.2, 2.8.5
> Environment: env free
>Reporter: qiang Liu
>Assignee: qiang Liu
>Priority: Minor
>  Labels: easy-fix
> Fix For: 2.10.0, 3.0.4, 3.3.0, 2.8.6, 3.2.1, 2.9.3, 3.1.3
>
> Attachments: HDFS-14303-addendum-01.patch, 
> HDFS-14303-addendum-02.patch, HDFS-14303-branch-2.005.patch, 
> HDFS-14303-branch-2.009.patch, HDFS-14303-branch-2.010.patch, 
> HDFS-14303-branch-2.015.patch, HDFS-14303-branch-2.017.patch, 
> HDFS-14303-branch-2.7.001.patch, HDFS-14303-branch-2.7.004.patch, 
> HDFS-14303-branch-2.7.006.patch, HDFS-14303-branch-2.9.011.patch, 
> HDFS-14303-branch-2.9.012.patch, HDFS-14303-branch-2.9.013.patch, 
> HDFS-14303-trunk.014.patch, HDFS-14303-trunk.015.patch, 
> HDFS-14303-trunk.016.patch, HDFS-14303-trunk.016.path, 
> HDFS-14303.branch-3.2.017.patch
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> The check-block-directory logic is not correct when there is only a meta 
> file; it prints a meaningless warn log, e.g.:
>  WARN DirectoryScanner:? - Block: 1101939874 has to be upgraded to block 
> ID-based layout. Actual block file path: 
> /data14/hadoop/data/current/BP-1461038173-10.8.48.152-1481686842620/current/finalized/subdir174/subdir68,
>  expected block file path: 
> /data14/hadoop/data/current/BP-1461038173-10.8.48.152-1481686842620/current/finalized/subdir174/subdir68/subdir68



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14303) check block directory logic not correct when there is only meta file, print no meaning warn log

2019-07-25 Thread Ayush Saxena (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893281#comment-16893281
 ] 

Ayush Saxena commented on HDFS-14303:
-

Committed addendum v2 to trunk.
Thanx [~iamgd67]

> check block directory logic not correct when there is only meta file, print 
> no meaning warn log
> ---
>
> Key: HDFS-14303
> URL: https://issues.apache.org/jira/browse/HDFS-14303
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Affects Versions: 2.7.3, 3.2.0, 2.9.2, 2.8.5
> Environment: env free
>Reporter: qiang Liu
>Assignee: qiang Liu
>Priority: Minor
>  Labels: easy-fix
> Fix For: 2.10.0, 3.0.4, 3.3.0, 2.8.6, 3.2.1, 2.9.3, 3.1.3
>
> Attachments: HDFS-14303-addendum-01.patch, 
> HDFS-14303-addendum-02.patch, HDFS-14303-branch-2.005.patch, 
> HDFS-14303-branch-2.009.patch, HDFS-14303-branch-2.010.patch, 
> HDFS-14303-branch-2.015.patch, HDFS-14303-branch-2.017.patch, 
> HDFS-14303-branch-2.7.001.patch, HDFS-14303-branch-2.7.004.patch, 
> HDFS-14303-branch-2.7.006.patch, HDFS-14303-branch-2.9.011.patch, 
> HDFS-14303-branch-2.9.012.patch, HDFS-14303-branch-2.9.013.patch, 
> HDFS-14303-trunk.014.patch, HDFS-14303-trunk.015.patch, 
> HDFS-14303-trunk.016.patch, HDFS-14303-trunk.016.path, 
> HDFS-14303.branch-3.2.017.patch
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> The check-block-directory logic is not correct when there is only a meta 
> file; it prints a meaningless warn log, e.g.:
>  WARN DirectoryScanner:? - Block: 1101939874 has to be upgraded to block 
> ID-based layout. Actual block file path: 
> /data14/hadoop/data/current/BP-1461038173-10.8.48.152-1481686842620/current/finalized/subdir174/subdir68,
>  expected block file path: 
> /data14/hadoop/data/current/BP-1461038173-10.8.48.152-1481686842620/current/finalized/subdir174/subdir68/subdir68



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14303) check block directory logic not correct when there is only meta file, print no meaning warn log

2019-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893278#comment-16893278
 ] 

Hudson commented on HDFS-14303:
---

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16988 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16988/])
HDFS-14303. Addendum: check block directory logic not correct when there 
(ayushsaxena: rev ce99cc31e9c34504669c30b160eb55c7cacd9966)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDirectoryScanner.java


> check block directory logic not correct when there is only meta file, print 
> no meaning warn log
> ---
>
> Key: HDFS-14303
> URL: https://issues.apache.org/jira/browse/HDFS-14303
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Affects Versions: 2.7.3, 3.2.0, 2.9.2, 2.8.5
> Environment: env free
>Reporter: qiang Liu
>Assignee: qiang Liu
>Priority: Minor
>  Labels: easy-fix
> Fix For: 2.10.0, 3.0.4, 3.3.0, 2.8.6, 3.2.1, 2.9.3, 3.1.3
>
> Attachments: HDFS-14303-addendum-01.patch, 
> HDFS-14303-addendum-02.patch, HDFS-14303-branch-2.005.patch, 
> HDFS-14303-branch-2.009.patch, HDFS-14303-branch-2.010.patch, 
> HDFS-14303-branch-2.015.patch, HDFS-14303-branch-2.017.patch, 
> HDFS-14303-branch-2.7.001.patch, HDFS-14303-branch-2.7.004.patch, 
> HDFS-14303-branch-2.7.006.patch, HDFS-14303-branch-2.9.011.patch, 
> HDFS-14303-branch-2.9.012.patch, HDFS-14303-branch-2.9.013.patch, 
> HDFS-14303-trunk.014.patch, HDFS-14303-trunk.015.patch, 
> HDFS-14303-trunk.016.patch, HDFS-14303-trunk.016.path, 
> HDFS-14303.branch-3.2.017.patch
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> The check-block-directory logic is not correct when there is only a meta 
> file; it prints a meaningless warn log, e.g.:
>  WARN DirectoryScanner:? - Block: 1101939874 has to be upgraded to block 
> ID-based layout. Actual block file path: 
> /data14/hadoop/data/current/BP-1461038173-10.8.48.152-1481686842620/current/finalized/subdir174/subdir68,
>  expected block file path: 
> /data14/hadoop/data/current/BP-1461038173-10.8.48.152-1481686842620/current/finalized/subdir174/subdir68/subdir68



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1809) Ozone Read fails with StatusRunTimeExceptions after 2 datanode fail in Ratis pipeline

2019-07-25 Thread Sammi Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen resolved HDDS-1809.
--
Resolution: Fixed

> Ozone Read fails with StatusRunTimeExceptions after 2 datanode fail in Ratis 
> pipeline
> -
>
> Key: HDDS-1809
> URL: https://issues.apache.org/jira/browse/HDDS-1809
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Affects Versions: 0.5.0
>Reporter: Shashikant Banerjee
>Assignee: Sammi Chen
>Priority: Major
> Fix For: 0.5.0
>
>
> {code:java}
> java.io.IOException: Unexpected OzoneException: java.io.IOException: 
> java.util.concurrent.ExecutionException: 
> org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io 
> exception
> at 
> org.apache.hadoop.hdds.scm.storage.ChunkInputStream.readChunk(ChunkInputStream.java:342)
> at 
> org.apache.hadoop.hdds.scm.storage.ChunkInputStream.readChunkFromContainer(ChunkInputStream.java:307)
> at 
> org.apache.hadoop.hdds.scm.storage.ChunkInputStream.prepareRead(ChunkInputStream.java:259)
> at 
> org.apache.hadoop.hdds.scm.storage.ChunkInputStream.read(ChunkInputStream.java:144)
> at 
> org.apache.hadoop.hdds.scm.storage.BlockInputStream.read(BlockInputStream.java:239)
> at 
> org.apache.hadoop.ozone.client.io.KeyInputStream.read(KeyInputStream.java:171)
> at 
> org.apache.hadoop.ozone.client.io.OzoneInputStream.read(OzoneInputStream.java:47)
> at java.io.InputStream.read(InputStream.java:101)
> at 
> org.apache.hadoop.ozone.container.ContainerTestHelper.validateData(ContainerTestHelper.java:709)
> at 
> org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClient.validateData(TestFailureHandlingByClient.java:458)
> at 
> org.apache.hadoop.ozone.client.rpc.TestFailureHandlingByClient.testBlockWritesWithDnFailures(TestFailureHandlingByClient.java:158)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
> at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
> at 
> com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
> at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
> at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1855) TestStorageContainerManager#testScmProcessDatanodeHeartbeat is failing

2019-07-25 Thread Sammi Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-1855:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> TestStorageContainerManager#testScmProcessDatanodeHeartbeat is failing
> --
>
> Key: HDDS-1855
> URL: https://issues.apache.org/jira/browse/HDDS-1855
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {{TestStorageContainerManager#testScmProcessDatanodeHeartbeat}} is failing 
> with the following exception
> {noformat}
> [ERROR] Tests run: 9, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 106.315 s <<< FAILURE! - in 
> org.apache.hadoop.ozone.TestStorageContainerManager
> [ERROR] 
> testScmProcessDatanodeHeartbeat(org.apache.hadoop.ozone.TestStorageContainerManager)
>   Time elapsed: 21.97 s  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.ozone.TestStorageContainerManager.testScmProcessDatanodeHeartbeat(TestStorageContainerManager.java:531)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1855) TestStorageContainerManager#testScmProcessDatanodeHeartbeat is failing

2019-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893275#comment-16893275
 ] 

Hudson commented on HDDS-1855:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16987 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16987/])
HDDS-1855. TestStorageContainerManager#testScmProcessDatanodeHeartbeat 
(sammichen: rev a2cc961086928168f8149273d9d5bcb66055b138)
* (edit) 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/TestStorageContainerManager.java


> TestStorageContainerManager#testScmProcessDatanodeHeartbeat is failing
> --
>
> Key: HDDS-1855
> URL: https://issues.apache.org/jira/browse/HDDS-1855
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {{TestStorageContainerManager#testScmProcessDatanodeHeartbeat}} is failing 
> with the following exception
> {noformat}
> [ERROR] Tests run: 9, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 106.315 s <<< FAILURE! - in 
> org.apache.hadoop.ozone.TestStorageContainerManager
> [ERROR] 
> testScmProcessDatanodeHeartbeat(org.apache.hadoop.ozone.TestStorageContainerManager)
>   Time elapsed: 21.97 s  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.ozone.TestStorageContainerManager.testScmProcessDatanodeHeartbeat(TestStorageContainerManager.java:531)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1855) TestStorageContainerManager#testScmProcessDatanodeHeartbeat is failing

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1855?focusedWorklogId=283057=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283057
 ]

ASF GitHub Bot logged work on HDDS-1855:


Author: ASF GitHub Bot
Created on: 26/Jul/19 02:54
Start Date: 26/Jul/19 02:54
Worklog Time Spent: 10m 
  Work Description: ChenSammi commented on pull request #1153: HDDS-1855. 
TestStorageContainerManager#testScmProcessDatanodeHeartbeat is failing.
URL: https://github.com/apache/hadoop/pull/1153
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283057)
Time Spent: 0.5h  (was: 20m)

> TestStorageContainerManager#testScmProcessDatanodeHeartbeat is failing
> --
>
> Key: HDDS-1855
> URL: https://issues.apache.org/jira/browse/HDDS-1855
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {{TestStorageContainerManager#testScmProcessDatanodeHeartbeat}} is failing 
> with the following exception
> {noformat}
> [ERROR] Tests run: 9, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 106.315 s <<< FAILURE! - in 
> org.apache.hadoop.ozone.TestStorageContainerManager
> [ERROR] 
> testScmProcessDatanodeHeartbeat(org.apache.hadoop.ozone.TestStorageContainerManager)
>   Time elapsed: 21.97 s  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.ozone.TestStorageContainerManager.testScmProcessDatanodeHeartbeat(TestStorageContainerManager.java:531)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14670) RBF: Create secret manager instance using FederationUtil#newInstance.

2019-07-25 Thread Ayush Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-14670:

Status: Patch Available  (was: Open)

> RBF: Create secret manager instance using FederationUtil#newInstance.
> -
>
> Key: HDFS-14670
> URL: https://issues.apache.org/jira/browse/HDFS-14670
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: CR Hota
>Assignee: CR Hota
>Priority: Major
>
> Since HDFS-14577 is done, as discussed in tha ticket, security and isolation 
> work will use this. This ticket is tracking the work around security class 
> instantiation.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1839) Change topology sorting related logs in Pipeline from INFO to DEBUG

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1839?focusedWorklogId=283042=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283042
 ]

ASF GitHub Bot logged work on HDDS-1839:


Author: ASF GitHub Bot
Created on: 26/Jul/19 01:24
Start Date: 26/Jul/19 01:24
Worklog Time Spent: 10m 
  Work Description: chenjunjiedada commented on pull request #1158: 
HDDS-1839: Change topology sorting related logs in Pipeline from INFO to DEBUG
URL: https://github.com/apache/hadoop/pull/1158#discussion_r307557943
 
 

 ##
 File path: hadoop-hdds/container-service/src/main/resources/ozone-site.xml
 ##
 @@ -0,0 +1,11 @@
+
 
 Review comment:
   Oh, my bad. Will delete it. Thanks
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283042)
Time Spent: 50m  (was: 40m)

> Change topology sorting related logs in Pipeline from INFO to DEBUG
> ---
>
> Key: HDDS-1839
> URL: https://issues.apache.org/jira/browse/HDDS-1839
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Affects Versions: 0.4.1
>Reporter: Xiaoyu Yao
>Assignee: Junjie Chen
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> This will avoid output like 
> {code}
> 2019-07-19 22:36:40 INFO  Pipeline:342 - Serialize nodesInOrder 
> [610d4084-7cce-4691-b43a-f9dd5cdb8809\{ip: 192.168.144.3, host: 
> ozonesecure-mr_datanode_1.ozonesecure-mr_default, networkLocation: 
> /default-rack, certSerialId: null}] in pipeline 
> PipelineID=f9ba269c-aba9-4a42-946c-4048d02cb7d1
> 2019-07-19 22:36:40 INFO  Pipeline:342 - Deserialize nodesInOrder 
> [610d4084-7cce-4691-b43a-f9dd5cdb8809\{ip: 192.168.144.3, host: 
> ozonesecure-mr_datanode_1.ozonesecure-mr_default, networkLocation: 
> /default-rack, certSerialId: null}] in pipeline 
> PipelineID=f9ba269c-aba9-4a42-946c-4048d02cb7d1
> {code}
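
To make the change concrete, here is a minimal sketch of the idiom discussed in this thread: demote the serialization messages to DEBUG and, where building the arguments is costly, guard the call with isDebugEnabled(). This assumes an slf4j-style logger; the class and variable names below are made up for the example and are not the actual Pipeline code.
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class PipelineLoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(PipelineLoggingSketch.class);

  void logNodesInOrder(Object nodesWithOrder, Object pipelineId) {
    // Parameterized logging already defers string formatting; the guard also
    // skips preparing the arguments when DEBUG logging is disabled.
    if (LOG.isDebugEnabled()) {
      LOG.debug("Serialize nodesInOrder {} in pipeline {}",
          nodesWithOrder, pipelineId);
    }
  }
}
{code}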



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1839) Change topology sorting related logs in Pipeline from INFO to DEBUG

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1839?focusedWorklogId=283041=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283041
 ]

ASF GitHub Bot logged work on HDDS-1839:


Author: ASF GitHub Bot
Created on: 26/Jul/19 01:24
Start Date: 26/Jul/19 01:24
Worklog Time Spent: 10m 
  Work Description: chenjunjiedada commented on pull request #1158: 
HDDS-1839: Change topology sorting related logs in Pipeline from INFO to DEBUG
URL: https://github.com/apache/hadoop/pull/1158#discussion_r307557939
 
 

 ##
 File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/pipeline/Pipeline.java
 ##
 @@ -339,7 +339,7 @@ public Pipeline build() {
 nodeIndex--;
   }
 }
-LOG.info("Deserialize nodesInOrder {} in pipeline {}", nodesWithOrder,
+LOG.debug("Deserialize nodesInOrder {} in pipeline {}", nodesWithOrder,
 
 Review comment:
   Sure, np.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283041)
Time Spent: 40m  (was: 0.5h)

> Change topology sorting related logs in Pipeline from INFO to DEBUG
> ---
>
> Key: HDDS-1839
> URL: https://issues.apache.org/jira/browse/HDDS-1839
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Affects Versions: 0.4.1
>Reporter: Xiaoyu Yao
>Assignee: Junjie Chen
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> This will avoid output like 
> {code}
> 2019-07-19 22:36:40 INFO  Pipeline:342 - Serialize nodesInOrder 
> [610d4084-7cce-4691-b43a-f9dd5cdb8809\{ip: 192.168.144.3, host: 
> ozonesecure-mr_datanode_1.ozonesecure-mr_default, networkLocation: 
> /default-rack, certSerialId: null}] in pipeline 
> PipelineID=f9ba269c-aba9-4a42-946c-4048d02cb7d1
> 2019-07-19 22:36:40 INFO  Pipeline:342 - Deserialize nodesInOrder 
> [610d4084-7cce-4691-b43a-f9dd5cdb8809\{ip: 192.168.144.3, host: 
> ozonesecure-mr_datanode_1.ozonesecure-mr_default, networkLocation: 
> /default-rack, certSerialId: null}] in pipeline 
> PipelineID=f9ba269c-aba9-4a42-946c-4048d02cb7d1
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1863) Freon RandomKeyGenerator even if keySize is set to 0, it returns some random data to key

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-1863:
-
Labels: pull-request-available  (was: )

> Freon RandomKeyGenerator even if keySize is set to 0, it returns some random 
> data to key
> 
>
> Key: HDDS-1863
> URL: https://issues.apache.org/jira/browse/HDDS-1863
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>
>  
> {code:java}
> ***
> Status: Success
> Git Base Revision: e97acb3bd8f3befd27418996fa5d4b50bf2e17bf
> Number of Volumes created: 1
> Number of Buckets created: 1
> Number of Keys added: 1
> Ratis replication factor: THREE
> Ratis replication type: STAND_ALONE
> Average Time spent in volume creation: 00:00:00,002
> Average Time spent in bucket creation: 00:00:00,000
> Average Time spent in key creation: 00:00:00,002
> Average Time spent in key write: 00:00:00,101
> Total bytes written: 0
> Total Execution time: 00:00:05,699
>  
> {code}
> ***
> [root@ozoneha-2 ozone-0.5.0-SNAPSHOT]# bin/ozone sh key list 
> /vol-0-28271/bucket-0-95211
> [
> {   "version" : 0,   "md5hash" : null,   "createdOn" : "Fri, 26 Jul 2019 
> 01:02:08 GMT",   "modifiedOn" : "Fri, 26 Jul 2019 01:02:09 GMT",   "size" : 
> 36,   "keyName" : "key-0-98235",   "type" : null }
> ]
>  
> This is because of the below code in RandomKeyGenerator:
> {code:java}
> for (long nrRemaining = keySize - randomValue.length;
>  nrRemaining > 0; nrRemaining -= bufferSize) {
>  int curSize = (int) Math.min(bufferSize, nrRemaining);
>  os.write(keyValueBuffer, 0, curSize);
> }
> os.write(randomValue);
> os.close();{code}
>  
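
As a rough sketch of one possible fix, reusing the variable names from the quoted snippet (the actual patch for HDDS-1863 may take a different approach): write bytes only while the requested keySize leaves room for them, so a keySize of 0 produces an empty key.
{code:java}
// Sketch only: keySize, randomValue, keyValueBuffer, bufferSize and os are the
// names from the quoted RandomKeyGenerator snippet, not a verified patch.
if (keySize > 0) {
  for (long nrRemaining = keySize - randomValue.length;
       nrRemaining > 0; nrRemaining -= bufferSize) {
    int curSize = (int) Math.min(bufferSize, nrRemaining);
    os.write(keyValueBuffer, 0, curSize);
  }
  // Trim the trailing random bytes when the requested size is smaller than
  // randomValue.length, so exactly keySize bytes are written in total.
  os.write(randomValue, 0, (int) Math.min(randomValue.length, keySize));
}
os.close();
{code}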



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14135) TestWebHdfsTimeouts Fails intermittently in trunk

2019-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893243#comment-16893243
 ] 

Hudson commented on HDFS-14135:
---

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16986 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16986/])
HDFS-14135. TestWebHdfsTimeouts Fails intermittently in trunk. (iwasakims: rev 
6b8107ad97251267253fa045ba03c4749f95f530)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsTimeouts.java


> TestWebHdfsTimeouts Fails intermittently in trunk
> -
>
> Key: HDFS-14135
> URL: https://issues.apache.org/jira/browse/HDFS-14135
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-14135-01.patch, HDFS-14135-02.patch, 
> HDFS-14135-03.patch, HDFS-14135-04.patch, HDFS-14135-05.patch, 
> HDFS-14135-06.patch, HDFS-14135-07.patch, HDFS-14135-08.patch, 
> HDFS-14135.009.patch, HDFS-14135.010.patch, HDFS-14135.011.patch, 
> HDFS-14135.012.patch, HDFS-14135.013.patch
>
>
> Reference to failure
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/982/testReport/junit/org.apache.hadoop.hdfs.web/TestWebHdfsTimeouts/



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1863) Freon RandomKeyGenerator even if keySize is set to 0, it returns some random data to key

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1863?focusedWorklogId=283038=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283038
 ]

ASF GitHub Bot logged work on HDDS-1863:


Author: ASF GitHub Bot
Created on: 26/Jul/19 01:13
Start Date: 26/Jul/19 01:13
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #1167: 
HDDS-1863. Freon RandomKeyGenerator even if keySize is set to 0, it returns 
some random data to key.
URL: https://github.com/apache/hadoop/pull/1167
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283038)
Time Spent: 10m
Remaining Estimate: 0h

> Freon RandomKeyGenerator even if keySize is set to 0, it returns some random 
> data to key
> 
>
> Key: HDDS-1863
> URL: https://issues.apache.org/jira/browse/HDDS-1863
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  
> {code:java}
> ***
> Status: Success
> Git Base Revision: e97acb3bd8f3befd27418996fa5d4b50bf2e17bf
> Number of Volumes created: 1
> Number of Buckets created: 1
> Number of Keys added: 1
> Ratis replication factor: THREE
> Ratis replication type: STAND_ALONE
> Average Time spent in volume creation: 00:00:00,002
> Average Time spent in bucket creation: 00:00:00,000
> Average Time spent in key creation: 00:00:00,002
> Average Time spent in key write: 00:00:00,101
> Total bytes written: 0
> Total Execution time: 00:00:05,699
>  
> {code}
> ***
> [root@ozoneha-2 ozone-0.5.0-SNAPSHOT]# bin/ozone sh key list 
> /vol-0-28271/bucket-0-95211
> [
> {   "version" : 0,   "md5hash" : null,   "createdOn" : "Fri, 26 Jul 2019 
> 01:02:08 GMT",   "modifiedOn" : "Fri, 26 Jul 2019 01:02:09 GMT",   "size" : 
> 36,   "keyName" : "key-0-98235",   "type" : null }
> ]
>  
> This is because of the below code in RandomKeyGenerator:
> {code:java}
> for (long nrRemaining = keySize - randomValue.length;
>  nrRemaining > 0; nrRemaining -= bufferSize) {
>  int curSize = (int) Math.min(bufferSize, nrRemaining);
>  os.write(keyValueBuffer, 0, curSize);
> }
> os.write(randomValue);
> os.close();{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1863) Freon RandomKeyGenerator even if keySize is set to 0, it returns some random data to key

2019-07-25 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-1863:
-
Status: Patch Available  (was: Open)

> Freon RandomKeyGenerator even if keySize is set to 0, it returns some random 
> data to key
> 
>
> Key: HDDS-1863
> URL: https://issues.apache.org/jira/browse/HDDS-1863
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>
>  
> {code:java}
> ***
> Status: Success
> Git Base Revision: e97acb3bd8f3befd27418996fa5d4b50bf2e17bf
> Number of Volumes created: 1
> Number of Buckets created: 1
> Number of Keys added: 1
> Ratis replication factor: THREE
> Ratis replication type: STAND_ALONE
> Average Time spent in volume creation: 00:00:00,002
> Average Time spent in bucket creation: 00:00:00,000
> Average Time spent in key creation: 00:00:00,002
> Average Time spent in key write: 00:00:00,101
> Total bytes written: 0
> Total Execution time: 00:00:05,699
>  
> {code}
> ***
> [root@ozoneha-2 ozone-0.5.0-SNAPSHOT]# bin/ozone sh key list 
> /vol-0-28271/bucket-0-95211
> [
> {   "version" : 0,   "md5hash" : null,   "createdOn" : "Fri, 26 Jul 2019 
> 01:02:08 GMT",   "modifiedOn" : "Fri, 26 Jul 2019 01:02:09 GMT",   "size" : 
> 36,   "keyName" : "key-0-98235",   "type" : null }
> ]
>  
> This is because of the below code in RandomKeyGenerator:
> {code:java}
> for (long nrRemaining = keySize - randomValue.length;
>  nrRemaining > 0; nrRemaining -= bufferSize) {
>  int curSize = (int) Math.min(bufferSize, nrRemaining);
>  os.write(keyValueBuffer, 0, curSize);
> }
> os.write(randomValue);
> os.close();{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1863) Freon RandomKeyGenerator even if keySize is set to 0, it returns some random data to key

2019-07-25 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-1863:
-
Description: 
 
{code:java}
***
Status: Success
Git Base Revision: e97acb3bd8f3befd27418996fa5d4b50bf2e17bf
Number of Volumes created: 1
Number of Buckets created: 1
Number of Keys added: 1
Ratis replication factor: THREE
Ratis replication type: STAND_ALONE
Average Time spent in volume creation: 00:00:00,002
Average Time spent in bucket creation: 00:00:00,000
Average Time spent in key creation: 00:00:00,002
Average Time spent in key write: 00:00:00,101
Total bytes written: 0
Total Execution time: 00:00:05,699
 
{code}
***

[root@ozoneha-2 ozone-0.5.0-SNAPSHOT]# bin/ozone sh key list 
/vol-0-28271/bucket-0-95211

[

{   "version" : 0,   "md5hash" : null,   "createdOn" : "Fri, 26 Jul 2019 
01:02:08 GMT",   "modifiedOn" : "Fri, 26 Jul 2019 01:02:09 GMT",   "size" : 36, 
  "keyName" : "key-0-98235",   "type" : null }

]

 

This is because of the below code in RandomKeyGenerator:
{code:java}
for (long nrRemaining = keySize - randomValue.length;
 nrRemaining > 0; nrRemaining -= bufferSize) {
 int curSize = (int) Math.min(bufferSize, nrRemaining);
 os.write(keyValueBuffer, 0, curSize);
}
os.write(randomValue);
os.close();{code}
 

  was:
 
{code:java}
***
Status: Success
Git Base Revision: e97acb3bd8f3befd27418996fa5d4b50bf2e17bf
Number of Volumes created: 1
Number of Buckets created: 1
Number of Keys added: 1
Ratis replication factor: THREE
Ratis replication type: STAND_ALONE
Average Time spent in volume creation: 00:00:00,002
Average Time spent in bucket creation: 00:00:00,000
Average Time spent in key creation: 00:00:00,002
Average Time spent in key write: 00:00:00,101
Total bytes written: 0
Total Execution time: 00:00:05,699
 
{code}
***

[root@ozoneha-2 ozone-0.5.0-SNAPSHOT]# bin/ozone sh key list 
/vol-0-28271/bucket-0-95211

[ {

  "version" : 0,

  "md5hash" : null,

  "createdOn" : "Fri, 26 Jul 2019 01:02:08 GMT",

  "modifiedOn" : "Fri, 26 Jul 2019 01:02:09 GMT",

  "size" : 36,

  "keyName" : "key-0-98235",

  "type" : null

} ]


> Freon RandomKeyGenerator even if keySize is set to 0, it returns some random 
> data to key
> 
>
> Key: HDDS-1863
> URL: https://issues.apache.org/jira/browse/HDDS-1863
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>
>  
> {code:java}
> ***
> Status: Success
> Git Base Revision: e97acb3bd8f3befd27418996fa5d4b50bf2e17bf
> Number of Volumes created: 1
> Number of Buckets created: 1
> Number of Keys added: 1
> Ratis replication factor: THREE
> Ratis replication type: STAND_ALONE
> Average Time spent in volume creation: 00:00:00,002
> Average Time spent in bucket creation: 00:00:00,000
> Average Time spent in key creation: 00:00:00,002
> Average Time spent in key write: 00:00:00,101
> Total bytes written: 0
> Total Execution time: 00:00:05,699
>  
> {code}
> ***
> [root@ozoneha-2 ozone-0.5.0-SNAPSHOT]# bin/ozone sh key list 
> /vol-0-28271/bucket-0-95211
> [
> {   "version" : 0,   "md5hash" : null,   "createdOn" : "Fri, 26 Jul 2019 
> 01:02:08 GMT",   "modifiedOn" : "Fri, 26 Jul 2019 01:02:09 GMT",   "size" : 
> 36,   "keyName" : "key-0-98235",   "type" : null }
> ]
>  
> This is because of the below code in RandomKeyGenerator:
> {code:java}
> for (long nrRemaining = keySize - randomValue.length;
>  nrRemaining > 0; nrRemaining -= bufferSize) {
>  int curSize = (int) Math.min(bufferSize, nrRemaining);
>  os.write(keyValueBuffer, 0, curSize);
> }
> os.write(randomValue);
> os.close();{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-1863) Freon RandomKeyGenerator even if keySize is set to 0, it returns some random data to key

2019-07-25 Thread Bharat Viswanadham (JIRA)
Bharat Viswanadham created HDDS-1863:


 Summary: Freon RandomKeyGenerator even if keySize is set to 0, it 
returns some random data to key
 Key: HDDS-1863
 URL: https://issues.apache.org/jira/browse/HDDS-1863
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Bharat Viswanadham
Assignee: Bharat Viswanadham


 
{code:java}
***
Status: Success
Git Base Revision: e97acb3bd8f3befd27418996fa5d4b50bf2e17bf
Number of Volumes created: 1
Number of Buckets created: 1
Number of Keys added: 1
Ratis replication factor: THREE
Ratis replication type: STAND_ALONE
Average Time spent in volume creation: 00:00:00,002
Average Time spent in bucket creation: 00:00:00,000
Average Time spent in key creation: 00:00:00,002
Average Time spent in key write: 00:00:00,101
Total bytes written: 0
Total Execution time: 00:00:05,699
 
{code}
***

[root@ozoneha-2 ozone-0.5.0-SNAPSHOT]# bin/ozone sh key list 
/vol-0-28271/bucket-0-95211

[ {

  "version" : 0,

  "md5hash" : null,

  "createdOn" : "Fri, 26 Jul 2019 01:02:08 GMT",

  "modifiedOn" : "Fri, 26 Jul 2019 01:02:09 GMT",

  "size" : 36,

  "keyName" : "key-0-98235",

  "type" : null

} ]



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14669) TestDirectoryScanner#testDirectoryScannerInFederatedCluster fails intermittently in trunk

2019-07-25 Thread qiang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893242#comment-16893242
 ] 

qiang Liu commented on HDFS-14669:
--

[~ayushtkn] could you please help me check why this Jira is not triggering a 
[~hadoopqa] build on it?

> TestDirectoryScanner#testDirectoryScannerInFederatedCluster fails 
> intermittently in trunk
> -
>
> Key: HDFS-14669
> URL: https://issues.apache.org/jira/browse/HDFS-14669
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: datanode
>Affects Versions: 3.2.0
> Environment: env free
>Reporter: qiang Liu
>Assignee: qiang Liu
>Priority: Minor
>  Labels: scanner, test
> Attachments: HDFS-14669-trunk-001.patch, HDFS-14669-trunk.002.patch
>
>
> org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner#testDirectoryScannerInFederatedCluster
>  randomly fails because it writes files with the same name: the intent is to 
> write 2 files, but both end up with the same name, which causes a race 
> condition between the datanode deleting the block and the scan counting the block.
>  
> Ref :: 
> [https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1207/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDirectoryScanner/testDirectoryScannerInFederatedCluster/]
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13322) fuse dfs - uid persists when switching between ticket caches

2019-07-25 Thread Istvan Fajth (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893236#comment-16893236
 ] 

Istvan Fajth commented on HDFS-13322:
-

To document a new learning point regarding this change, I would like to add the 
following information:

FUSE figures out the environment of the caller based on its pid, which is 
passed to the FUSE code from the kernel in the FuseContext structure. With 
that pid, fuse-dfs reads the /proc/(context->pid)/environ file to find the 
KRB5CCNAME environment variable.

After the change, the connection cache keys consist of a (username, 
kerberos ticket cache path) pair if authentication is set to KERBEROS, so for 
every pair we hold a different connection built with that ticket cache path, and 
we use it later on. With SIMPLE authentication, the ticket cache path in the 
pair is always the \0 character.

Because of this, the ticket cache path in the KERBEROS case is read 
from /proc/(context->pid)/environ on every access.

In the Linux [proc file system man 
page|http://man7.org/linux/man-pages/man5/proc.5.html] the following is written 
for /proc/[pid]/environ:
{quote}This file contains the *initial* environment that was set when the 
currently executing program was started via execve(2).{quote}

This can lead to odd behaviour when the access does not happen in a new 
process but inside the very process that exported the KRB5CCNAME environment 
variable. For example, when executing the following commands in a shell, FUSE 
will not be able to read the KRB5CCNAME variable from the 
/proc/(context->pid)/environ file:
{code:java}
$ export KRB5CCNAME=/tmp/myticketcache
$ echo "foo" > /mnt/hdfs/tmp/foo.txt{code}
This is because in this case echo is happening in the shell, and the shell's 
process id will be there in context->pid, and the /proc/(context->pid)/environ 
file will not contain the environment variable KRB5CCNAME as it is not part of 
the initial environment.

In the meantime the following will work because cp will be a new process which 
inherits the environment from the current shell:
{code:java}
$ export KRB5CCNAME=/tmp/myticketcache
$ echo "foo" > /tmp/foo.txt
$ cp /tmp/foo.txt /mnt/hdfs/tmp/foo.txt{code}
 

To work around this behaviour, the caller has to ensure that the initial 
environment of every accessing process has the correct KRB5CCNAME set. For 
example, the echo case works correctly the following way:
{code:java}
$ export KRB5CCNAME=/tmp/myticketcache
$ /bin/sh
$ echo "foo" > /mnt/hdfs/tmp/foo.txt
$ exit{code}
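
To make the mechanism easier to follow, here is a rough Java rendering of the lookup fuse-dfs performs in C: read the target pid's initial environment from /proc/<pid>/environ (NUL-separated KEY=VALUE entries) and pick out KRB5CCNAME. This is a sketch for understanding the behaviour described above, not code from fuse-dfs itself.
{code:java}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class EnvironLookupSketch {
  /** Returns the KRB5CCNAME value from the initial environment of the given
   *  pid, or null when it is absent, which is exactly the case hit by the
   *  plain "echo" example above. */
  static String krb5CcNameOf(long pid) throws IOException {
    byte[] raw = Files.readAllBytes(Paths.get("/proc/" + pid + "/environ"));
    // Entries in /proc/<pid>/environ are separated by NUL characters.
    for (String entry : new String(raw, StandardCharsets.UTF_8).split("\0")) {
      if (entry.startsWith("KRB5CCNAME=")) {
        return entry.substring("KRB5CCNAME=".length());
      }
    }
    return null;
  }
}
{code}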

> fuse dfs - uid persists when switching between ticket caches
> 
>
> Key: HDFS-13322
> URL: https://issues.apache.org/jira/browse/HDFS-13322
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs
>Affects Versions: 2.6.0
> Environment: Linux xx.xx.xx.xxx 3.10.0-514.el7.x86_64 #1 SMP Wed 
> Oct 19 11:24:13 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
>  
>Reporter: Shoeb Sheyx
>Assignee: Istvan Fajth
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: HDFS-13322.001.patch, HDFS-13322.002.patch, 
> HDFS-13322.003.patch, TestFuse.java, TestFuse2.java, catter.sh, catter2.sh, 
> perftest_new_behaviour_10k_different_1KB.txt, perftest_new_behaviour_1B.txt, 
> perftest_new_behaviour_1KB.txt, perftest_new_behaviour_1MB.txt, 
> perftest_old_behaviour_10k_different_1KB.txt, perftest_old_behaviour_1B.txt, 
> perftest_old_behaviour_1KB.txt, perftest_old_behaviour_1MB.txt, 
> testHDFS-13322.sh, test_after_patch.out, test_before_patch.out
>
>
> The symptoms of this issue are the same as described in HDFS-3608 except the 
> workaround that was applied (detect changes in UID ticket cache) doesn't 
> resolve the issue when multiple ticket caches are in use by the same user.
> Our use case requires that a job scheduler running as a specific uid obtain 
> separate kerberos sessions per job and that each of these sessions use a 
> separate cache. When switching sessions this way, no change is made to the 
> original ticket cache so the cached filesystem instance doesn't get 
> regenerated.
>  
> {{$ export KRB5CCNAME=/tmp/krb5cc_session1}}
> {{$ kinit user_a@domain}}
> {{$ touch /fuse_mount/tmp/testfile1}}
> {{$ ls -l /fuse_mount/tmp/testfile1}}
> {{ *-rwxrwxr-x 1 user_a user_a 0 Mar 21 13:37 /fuse_mount/tmp/testfile1*}}
> {{$ export KRB5CCNAME=/tmp/krb5cc_session2}}
> {{$ kinit user_b@domain}}
> {{$ touch /fuse_mount/tmp/testfile2}}
> {{$ ls -l /fuse_mount/tmp/testfile2}}
> {{ *-rwxrwxr-x 1 user_a user_a 0 Mar 21 13:37 /fuse_mount/tmp/testfile2*}}
> {{   }}{color:#d04437}*{{** expected owner to be user_b **}}*{color}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (HDFS-14135) TestWebHdfsTimeouts Fails intermittently in trunk

2019-07-25 Thread Masatake Iwasaki (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893232#comment-16893232
 ] 

Masatake Iwasaki commented on HDFS-14135:
-

Thanks [~elgoiri]. I'm committing this.

> TestWebHdfsTimeouts Fails intermittently in trunk
> -
>
> Key: HDFS-14135
> URL: https://issues.apache.org/jira/browse/HDFS-14135
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-14135-01.patch, HDFS-14135-02.patch, 
> HDFS-14135-03.patch, HDFS-14135-04.patch, HDFS-14135-05.patch, 
> HDFS-14135-06.patch, HDFS-14135-07.patch, HDFS-14135-08.patch, 
> HDFS-14135.009.patch, HDFS-14135.010.patch, HDFS-14135.011.patch, 
> HDFS-14135.012.patch, HDFS-14135.013.patch
>
>
> Reference to failure
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/982/testReport/junit/org.apache.hadoop.hdfs.web/TestWebHdfsTimeouts/



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14657) Refine NameSystem lock usage during processing FBR

2019-07-25 Thread Konstantin Shvachko (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893228#comment-16893228
 ] 

Konstantin Shvachko edited comment on HDFS-14657 at 7/26/19 12:29 AM:
--

Hi [~zhangchen].
Looking at your patch v2. I am not sure I understand how your approach can 
work. Suppose you release {{namesystem.writeUnlock()}} in 
{{reportDiffSorted()}}, then somebody (unrelated to block report processing) 
can modify blocks belonging to the storage, which will invalidates 
{{storageBlocksIterator}}, meaning calling {{next()}} will cause 
{{ConcurrentModificationException}}.


was (Author: shv):
Hi [~zhangchen].
Looking at your patch v2. I am not sure I understand how your approach works. 
Suppose you release {{namesystem.writeUnlock()}} in {{reportDiffSorted()}}, 
then somebody (unrelated to block report processing) can modify blocks 
belonging to the storage, which will invalidate {{storageBlocksIterator}}, 
meaning calling {{next()}} will cause {{ConcurrentModificationException}}.

> Refine NameSystem lock usage during processing FBR
> --
>
> Key: HDFS-14657
> URL: https://issues.apache.org/jira/browse/HDFS-14657
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Chen Zhang
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14657-001.patch, HDFS-14657.002.patch
>
>
> Disks with 12TB capacity are very common today, which means the FBR size is 
> much larger than before. The Namenode holds the NameSystemLock while processing 
> the block report for each storage, which might take quite a long time.
> In our production environment, processing a large FBR usually causes a longer 
> RPC queue time, which impacts client latency, so we did some simple work on 
> refining the lock usage, which improved the p99 latency significantly.
> In our solution, BlockManager releases the NameSystem write lock and requests 
> it again every 5000 blocks (by default) while processing the FBR; with the 
> fair lock, all pending RPC requests can be processed before BlockManager 
> re-acquires the write lock.
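
A minimal sketch of the release/re-acquire idiom described above, and of why it is delicate: any iterator over namesystem state (such as the storage's block list) may be invalidated while the lock is released, which is the ConcurrentModificationException concern raised in this thread. The class and names below are made up; this is not the HDFS-14657 patch.
{code:java}
import java.util.Iterator;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReportDiffSketch {
  // Fair mode lets queued RPC handlers run each time the write lock is dropped.
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
  private static final int BLOCKS_PER_LOCK_HOLD = 5000; // proposal's default

  void processReport(Iterable<Object> reportedBlocks) {
    lock.writeLock().lock();
    try {
      int processed = 0;
      Iterator<Object> it = reportedBlocks.iterator();
      while (it.hasNext()) {
        applyDiff(it.next());
        if (++processed % BLOCKS_PER_LOCK_HOLD == 0) {
          // Let waiting RPCs in. Iterators over shared namesystem state are
          // not safe across this gap; processing must resume from a stable
          // cursor instead of calling next() on a possibly stale iterator.
          lock.writeLock().unlock();
          lock.writeLock().lock();
        }
      }
    } finally {
      lock.writeLock().unlock();
    }
  }

  private void applyDiff(Object reportedBlock) {
    // Apply the diff for one reported block while holding the write lock.
  }
}
{code}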



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14657) Refine NameSystem lock usage during processing FBR

2019-07-25 Thread Konstantin Shvachko (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893228#comment-16893228
 ] 

Konstantin Shvachko edited comment on HDFS-14657 at 7/26/19 12:29 AM:
--

Hi [~zhangchen].
Looking at your patch v2. I am not sure I understand how your approach can 
work. Suppose you release {{namesystem.writeUnlock()}} in 
{{reportDiffSorted()}}, then somebody (unrelated to block report processing) 
can modify blocks belonging to the storage, which invalidates 
{{storageBlocksIterator}}, meaning calling {{next()}} will cause 
{{ConcurrentModificationException}}.


was (Author: shv):
Hi [~zhangchen].
Looking at your patch v2. I am not sure I understand how your approach can 
work. Suppose you release {{namesystem.writeUnlock()}} in 
{{reportDiffSorted()}}, then somebody (unrelated to block report processing) 
can modify blocks belonging to the storage, which will invalidates 
{{storageBlocksIterator}}, meaning calling {{next()}} will cause 
{{ConcurrentModificationException}}.

> Refine NameSystem lock usage during processing FBR
> --
>
> Key: HDFS-14657
> URL: https://issues.apache.org/jira/browse/HDFS-14657
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Chen Zhang
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14657-001.patch, HDFS-14657.002.patch
>
>
> Disks with 12TB capacity are very common today, which means the FBR size is 
> much larger than before. The Namenode holds the NameSystemLock while processing 
> the block report for each storage, which might take quite a long time.
> In our production environment, processing a large FBR usually causes a longer 
> RPC queue time, which impacts client latency, so we did some simple work on 
> refining the lock usage, which improved the p99 latency significantly.
> In our solution, BlockManager releases the NameSystem write lock and requests 
> it again every 5000 blocks (by default) while processing the FBR; with the 
> fair lock, all pending RPC requests can be processed before BlockManager 
> re-acquires the write lock.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1856) Merge HA and Non-HA code in OM

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1856?focusedWorklogId=283034=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283034
 ]

ASF GitHub Bot logged work on HDDS-1856:


Author: ASF GitHub Bot
Created on: 26/Jul/19 00:27
Start Date: 26/Jul/19 00:27
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on issue #1166: HDDS-1856. 
Merge HA and Non-HA code in OM.
URL: https://github.com/apache/hadoop/pull/1166#issuecomment-515263195
 
 
   /retest
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283034)
Time Spent: 0.5h  (was: 20m)

> Merge HA and Non-HA code in OM
> --
>
> Key: HDDS-1856
> URL: https://issues.apache.org/jira/browse/HDDS-1856
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In this Jira the following things will be implemented:
>  # Make the non-HA code path use Cache and DoubleBuffer.
>  # Use the OMClientRequest/OMClientResponse classes implemented as part of HA 
> in the non-HA code path as well.
>  
> Removal of the old code will not be done in this Jira; it will be done in 
> further Jiras.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14657) Refine NameSystem lock usage during processing FBR

2019-07-25 Thread Konstantin Shvachko (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893228#comment-16893228
 ] 

Konstantin Shvachko commented on HDFS-14657:


Hi [~zhangchen].
Looking at your patch v2. I am not sure I understand how your approach works. 
Suppose you release {{namesystem.writeUnlock()}} in {{reportDiffSorted()}}, 
then somebody (unrelated to block report processing) can modify blocks 
belonging to the storage, which will invalidate {{storageBlocksIterator}}, 
meaning calling {{next()}} will cause {{ConcurrentModificationException}}.

> Refine NameSystem lock usage during processing FBR
> --
>
> Key: HDFS-14657
> URL: https://issues.apache.org/jira/browse/HDFS-14657
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Chen Zhang
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14657-001.patch, HDFS-14657.002.patch
>
>
> Disks with 12TB capacity are very common today, which means the FBR size is 
> much larger than before. The Namenode holds the NameSystemLock while processing 
> the block report for each storage, which might take quite a long time.
> In our production environment, processing a large FBR usually causes a longer 
> RPC queue time, which impacts client latency, so we did some simple work on 
> refining the lock usage, which improved the p99 latency significantly.
> In our solution, BlockManager releases the NameSystem write lock and requests 
> it again every 5000 blocks (by default) while processing the FBR; with the 
> fair lock, all pending RPC requests can be processed before BlockManager 
> re-acquires the write lock.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1856) Merge HA and Non-HA code in OM

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1856?focusedWorklogId=283019=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283019
 ]

ASF GitHub Bot logged work on HDDS-1856:


Author: ASF GitHub Bot
Created on: 26/Jul/19 00:08
Start Date: 26/Jul/19 00:08
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1166: HDDS-1856. Merge 
HA and Non-HA code in OM.
URL: https://github.com/apache/hadoop/pull/1166#issuecomment-515258588
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 0 | Docker mode activated. |
   | -1 | patch | 13 | https://github.com/apache/hadoop/pull/1166 does not 
apply to trunk. Rebase required? Wrong Branch? See 
https://wiki.apache.org/hadoop/HowToContribute for help. |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/1166 |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1166/1/console |
   | versions | git=2.7.4 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283019)
Time Spent: 20m  (was: 10m)

> Merge HA and Non-HA code in OM
> --
>
> Key: HDDS-1856
> URL: https://issues.apache.org/jira/browse/HDDS-1856
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In this Jira the following things will be implemented:
>  # Make the non-HA code path use Cache and DoubleBuffer.
>  # Use the OMClientRequest/OMClientResponse classes implemented as part of HA 
> in the non-HA code path as well.
>  
> Removal of the old code will not be done in this Jira; it will be done in 
> further Jiras.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1833) RefCountedDB printing of stacktrace should be moved to trace logging

2019-07-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893223#comment-16893223
 ] 

Hadoop QA commented on HDDS-1833:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
42s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 35s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
43s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  7m 
38s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 11m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 52s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 11m 
34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
52s{color} | {color:green} hadoop-hdds in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 31m 45s{color} 
| {color:red} hadoop-ozone in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
50s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}136m 41s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis |
|   | hadoop.ozone.client.rpc.TestOzoneAtRestEncryption |
|   | hadoop.ozone.TestStorageContainerManager |
|   | hadoop.ozone.client.rpc.TestOzoneRpcClient |
|   | hadoop.ozone.client.rpc.TestSecureOzoneRpcClient |
|   | hadoop.ozone.client.rpc.TestMultiBlockWritesWithDnFailures |
|   | hadoop.ozone.om.TestScmSafeMode |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.0 Server=19.03.0 base: 
https://builds.apache.org/job/PreCommit-HDDS-Build/2760/artifact/out/Dockerfile 
|
| JIRA Issue | HDDS-1833 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12975896/HDDS-1833.03.patch |
| Optional Tests | dupname 

[jira] [Work logged] (HDDS-1829) On OM reload/restart OmMetrics#numKeys should be updated

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1829?focusedWorklogId=283017=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283017
 ]

ASF GitHub Bot logged work on HDDS-1829:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:52
Start Date: 25/Jul/19 23:52
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #1164: 
HDDS-1829 On OM reload/restart OmMetrics#numKeys should be updated
URL: https://github.com/apache/hadoop/pull/1164#discussion_r307543280
 
 

 ##
 File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/utils/db/TypedTable.java
 ##
 @@ -205,6 +205,16 @@ public String getName() throws IOException {
 return rawTable.getName();
   }
 
+  @Override
+  public long getEstimatedKeyCount() throws IOException {
+if (rawTable instanceof RDBTable) {
+  return rawTable.getEstimatedKeyCount();
+}
+throw new IllegalArgumentException(
+"Unsupported operation getEstimatedKeyCount() on table type " +
+rawTable.getClass().getCanonicalName());
 
 Review comment:
   I don't think we need a check here, as the underlying table for TypedTable 
is a RocksDB-backed table.
   Even in the future, if TypedTable uses a new DB for rawTable, that DB should 
support getEstimatedKeyCount() as long as it implements Table.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283017)
Time Spent: 1h 10m  (was: 1h)

> On OM reload/restart OmMetrics#numKeys should be updated
> 
>
> Key: HDDS-1829
> URL: https://issues.apache.org/jira/browse/HDDS-1829
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When OM is restarted or the state is reloaded, OM Metrics is re-initialized. 
> The saved numKeys value might not be valid as the DB state could have 
> changed. Hence, the numKeys metric must be updated with the correct value on 
> metrics re-initialization.
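
As a sketch of how the numKeys metric could be re-seeded on reload using the getEstimatedKeyCount() accessor discussed in the pull request above: the interfaces below are stand-ins for the OM key table and OMMetrics; only the getEstimatedKeyCount()/setNumKeys() idea is taken from the review, the rest is made up for the example.
{code:java}
import java.io.IOException;

class OmMetricsReloadSketch {
  interface KeyTableLike { long getEstimatedKeyCount() throws IOException; }
  interface MetricsLike { void setNumKeys(long numKeys); }

  void reseedNumKeys(KeyTableLike keyTable, MetricsLike metrics)
      throws IOException {
    // The previously saved counter may be stale after a restart or state
    // reload, so take the estimate straight from the underlying DB.
    metrics.setNumKeys(keyTable.getEstimatedKeyCount());
  }
}
{code}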



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1862) Verify result of exceptional completion of notifyInstallSnapshotFromLeader

2019-07-25 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDDS-1862:

Environment: (was: What happens if the future returned to Ratis from 
{{notifyInstallSnapshotFromLeader}} is completed exceptionally.

The safest option sounds like Ratis should kill the process. Or potentially it 
can retry after a short time.

This jira is to investigate the answer.)

> Verify result of exceptional completion of notifyInstallSnapshotFromLeader
> --
>
> Key: HDDS-1862
> URL: https://issues.apache.org/jira/browse/HDDS-1862
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Arpit Agarwal
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1862) Verify result of exceptional completion of notifyInstallSnapshotFromLeader

2019-07-25 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDDS-1862:

Description: 
What happens if the future returned to Ratis from 
{{notifyInstallSnapshotFromLeader}} is completed exceptionally.

The safest option sounds like Ratis should kill the process. Or potentially it 
can retry after a short time.

This jira is to investigate the answer.

> Verify result of exceptional completion of notifyInstallSnapshotFromLeader
> --
>
> Key: HDDS-1862
> URL: https://issues.apache.org/jira/browse/HDDS-1862
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Arpit Agarwal
>Priority: Major
>
> What happens if the future returned to Ratis from 
> {{notifyInstallSnapshotFromLeader}} is completed exceptionally.
> The safest option sounds like Ratis should kill the process. Or potentially 
> it can retry after a short time.
> This jira is to investigate the answer.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1829) On OM reload/restart OmMetrics#numKeys should be updated

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1829?focusedWorklogId=283016=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283016
 ]

ASF GitHub Bot logged work on HDDS-1829:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:51
Start Date: 25/Jul/19 23:51
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #1164: 
HDDS-1829 On OM reload/restart OmMetrics#numKeys should be updated
URL: https://github.com/apache/hadoop/pull/1164#discussion_r307543280
 
 

 ##
 File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/utils/db/TypedTable.java
 ##
 @@ -205,6 +205,16 @@ public String getName() throws IOException {
 return rawTable.getName();
   }
 
+  @Override
+  public long getEstimatedKeyCount() throws IOException {
+if (rawTable instanceof RDBTable) {
+  return rawTable.getEstimatedKeyCount();
+}
+throw new IllegalArgumentException(
+"Unsupported operation getEstimatedKeyCount() on table type " +
+rawTable.getClass().getCanonicalName());
 
 Review comment:
   I don't think we need a check here, as the underlying table for TypedTable 
is rocksdbTable.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283016)
Time Spent: 1h  (was: 50m)

> On OM reload/restart OmMetrics#numKeys should be updated
> 
>
> Key: HDDS-1829
> URL: https://issues.apache.org/jira/browse/HDDS-1829
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When OM is restarted or the state is reloaded, OM Metrics is re-initialized. 
> The saved numKeys value might not be valid as the DB state could have 
> changed. Hence, the numKeys metric must be updated with the correct value on 
> metrics re-initialization.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-1862) Verify result of exceptional completion of notifyInstallSnapshotFromLeader

2019-07-25 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1862:
---

 Summary: Verify result of exceptional completion of 
notifyInstallSnapshotFromLeader
 Key: HDDS-1862
 URL: https://issues.apache.org/jira/browse/HDDS-1862
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
 Environment: What happens if the future returned to Ratis from 
{{notifyInstallSnapshotFromLeader}} is completed exceptionally.

The safest option sounds like Ratis should kill the process. Or potentially it 
can retry after a short time.

This jira is to investigate the answer.
Reporter: Arpit Agarwal






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1829) On OM reload/restart OmMetrics#numKeys should be updated

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1829?focusedWorklogId=283007=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283007
 ]

ASF GitHub Bot logged work on HDDS-1829:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:31
Start Date: 25/Jul/19 23:31
Worklog Time Spent: 10m 
  Work Description: arp7 commented on pull request #1164: HDDS-1829 On OM 
reload/restart OmMetrics#numKeys should be updated
URL: https://github.com/apache/hadoop/pull/1164#discussion_r307539625
 
 

 ##
 File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/utils/db/RDBTable.java
 ##
 @@ -183,4 +183,14 @@ public String getName() throws IOException {
   public void close() throws Exception {
 // Nothing do for a Column Family.
   }
+
+  @Override
+  public long getEstimatedKeyCount() throws IOException {
+try {
+  return db.getLongProperty(handle, "rocksdb.estimate-num-keys");
+} catch (RocksDBException e) {
+  throw new IOException(
+  "Failed to get estimated key count of table.");
 
 Review comment:
   If we throw an exception here, Ratis will get a failure via the future 
returned from `notifyInstallSnapshotFromLeader`. I am not sure what the impact 
of that is. Will Ratis terminate the process? cc @hanishakoneru 
   
   I'd say for now this is okay. I filed 
https://issues.apache.org/jira/browse/HDDS-1862 to investigate further.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283007)
Time Spent: 40m  (was: 0.5h)

> On OM reload/restart OmMetrics#numKeys should be updated
> 
>
> Key: HDDS-1829
> URL: https://issues.apache.org/jira/browse/HDDS-1829
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When OM is restarted or the state is reloaded, OM Metrics is re-initialized. 
> The saved numKeys value might not be valid as the DB state could have 
> changed. Hence, the numKeys metric must be updated with the correct value on 
> metrics re-initialization.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1830) OzoneManagerDoubleBuffer#stop should wait for daemon thread to die

2019-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893218#comment-16893218
 ] 

Hudson commented on HDDS-1830:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16985 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16985/])
HDDS-1830 OzoneManagerDoubleBuffer#stop should wait for daemon thread to (arp7: 
rev b7fba78fb63a0971835db87292822fd8cd4aa7ad)
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerDoubleBuffer.java


> OzoneManagerDoubleBuffer#stop should wait for daemon thread to die
> --
>
> Key: HDDS-1830
> URL: https://issues.apache.org/jira/browse/HDDS-1830
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Based on [~arp]'s comment on HDDS-1649, OzoneManagerDoubleBuffer#stop() calls 
> interrupt() on daemon thread but not join(). The thread might still be 
> running when the call returns. 
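
A minimal sketch of the stop() pattern being asked for here: interrupt the daemon thread and then join() it, so the caller only returns once the flusher has actually terminated. The names below are made up; this is not the OzoneManagerDoubleBuffer code.
{code:java}
class DaemonStopSketch {
  private final Thread daemon;

  DaemonStopSketch(Runnable flushLoop) {
    daemon = new Thread(flushLoop, "double-buffer-flush");
    daemon.setDaemon(true);
    daemon.start();
  }

  /** Stop the background thread and wait for it to die. */
  void stop() throws InterruptedException {
    daemon.interrupt(); // wake the thread if it is blocked
    daemon.join();      // without join() it may still be running when we return
  }
}
{code}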



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1850) ReplicationManager should consider inflight replication and deletion while picking datanode for re-replication

2019-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893217#comment-16893217
 ] 

Hudson commented on HDDS-1850:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16985 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16985/])
HDDS-1850. ReplicationManager should consider inflight replication and 
(aengineer: rev 2b1d8aedbb669cf412465bf7a5762c8aeda52faa)
* (edit) 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java


> ReplicationManager should consider inflight replication and deletion while 
> picking datanode for re-replication
> --
>
> Key: HDDS-1850
> URL: https://issues.apache.org/jira/browse/HDDS-1850
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
>  Labels: blockade, pull-request-available
> Fix For: 0.5.0, 0.4.1
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When choosing the target datanode for re-replication {{ReplicationManager}} 
> should consider the datanodes which are in inflight replication and deletion 
> for the same container.
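
A small sketch of the selection rule described here: before asking the placement policy for new targets, exclude datanodes that already have an inflight replication or deletion for the same container. The generic types and names are made up; this is not the ReplicationManager code.
{code:java}
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class TargetFilterSketch {
  <D> List<D> eligibleTargets(List<D> healthyNodes,
      Set<D> inflightReplicationTargets, Set<D> inflightDeletionTargets) {
    // Nodes already involved in an inflight action for this container are
    // skipped so the same replica is not placed (or removed) twice.
    return healthyNodes.stream()
        .filter(dn -> !inflightReplicationTargets.contains(dn))
        .filter(dn -> !inflightDeletionTargets.contains(dn))
        .collect(Collectors.toList());
  }
}
{code}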



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1829) On OM reload/restart OmMetrics#numKeys should be updated

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1829?focusedWorklogId=283009=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283009
 ]

ASF GitHub Bot logged work on HDDS-1829:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:36
Start Date: 25/Jul/19 23:36
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #1164: 
HDDS-1829 On OM reload/restart OmMetrics#numKeys should be updated
URL: https://github.com/apache/hadoop/pull/1164#discussion_r307540522
 
 

 ##
 File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/utils/db/RDBTable.java
 ##
 @@ -183,4 +183,14 @@ public String getName() throws IOException {
   public void close() throws Exception {
 // Nothing do for a Column Family.
   }
+
+  @Override
+  public long getEstimatedKeyCount() throws IOException {
+try {
+  return db.getLongProperty(handle, "rocksdb.estimate-num-keys");
+} catch (RocksDBException e) {
+  throw new IOException(
+  "Failed to get estimated key count of the table.");
 
 Review comment:
   Minor Nit: Can we add the table name here.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283009)
Time Spent: 50m  (was: 40m)

> On OM reload/restart OmMetrics#numKeys should be updated
> 
>
> Key: HDDS-1829
> URL: https://issues.apache.org/jira/browse/HDDS-1829
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When OM is restarted or the state is reloaded, OM Metrics is re-initialized. 
> The saved numKeys value might not be valid as the DB state could have 
> changed. Hence, the numKeys metric must be updated with the correct value on 
> metrics re-initialization.
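A rough sketch of the idea; setNumKeys(...) is an assumed setter and the estimated-count call mirrors the getEstimatedKeyCount() method added in this PR:

{code:java}
// Hypothetical sketch: after the OM state is reloaded, refresh numKeys from the DB.
void refreshNumKeys(OMMetadataManager metadataManager, OMMetrics omMetrics)
    throws IOException {
  omMetrics.setNumKeys(metadataManager.getKeyTable().getEstimatedKeyCount());
}
{code}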



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1200) Ozone Data Scrubbing : Checksum verification for chunks

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1200?focusedWorklogId=283005=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283005
 ]

ASF GitHub Bot logged work on HDDS-1200:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:29
Start Date: 25/Jul/19 23:29
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1154: 
[HDDS-1200] Add support for checksum verification in data scrubber
URL: https://github.com/apache/hadoop/pull/1154#discussion_r307539277
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueContainerCheck.java
 ##
 @@ -120,10 +132,70 @@ public TestKeyValueContainerCheck(String metadataImpl) {
 container.close();
 
 // next run checks on a Closed Container
-valid = kvCheck.fullCheck();
+valid = kvCheck.fullCheck(new DataTransferThrottler(
+HddsConfigKeys.HDDS_CONTAINER_SCANNER_VOLUME_BYTES_PER_SECOND_DEFAULT),
+null);
 assertTrue(valid);
   }
 
+  /**
+   * Sanity test, when there are corruptions induced.
+   * @throws Exception
+   */
+  @Test
+  public void testKeyValueContainerCheckCorruption() throws Exception {
 
 Review comment:
   If you add a label called ozone, then this pull request will be processed by 
Jenkins. Just FYI. I have done that for this patch.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283005)
Time Spent: 1h 50m  (was: 1h 40m)

> Ozone Data Scrubbing : Checksum verification for chunks
> ---
>
> Key: HDDS-1200
> URL: https://issues.apache.org/jira/browse/HDDS-1200
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Supratim Deka
>Assignee: Hrishikesh Gadre
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Background scrubber should read each chunk and verify the checksum.
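A simplified sketch of what per-chunk verification amounts to, using plain JDK classes rather than Ozone's own Checksum types, purely for illustration:

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical sketch: recompute a chunk file's checksum and compare it with
// the checksum recorded in the block metadata.
final class ChunkChecksumSketch {
  static boolean verifyChunk(Path chunkFile, byte[] expectedChecksum)
      throws IOException, NoSuchAlgorithmException {
    byte[] data = Files.readAllBytes(chunkFile);
    byte[] actual = MessageDigest.getInstance("SHA-256").digest(data);
    return Arrays.equals(expectedChecksum, actual);
  }
}
{code}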



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1200) Ozone Data Scrubbing : Checksum verification for chunks

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1200?focusedWorklogId=283004=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283004
 ]

ASF GitHub Bot logged work on HDDS-1200:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:29
Start Date: 25/Jul/19 23:29
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1154: 
[HDDS-1200] Add support for checksum verification in data scrubber
URL: https://github.com/apache/hadoop/pull/1154#discussion_r307539083
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueContainerCheck.java
 ##
 @@ -120,10 +132,70 @@ public TestKeyValueContainerCheck(String metadataImpl) {
 container.close();
 
 // next run checks on a Closed Container
-valid = kvCheck.fullCheck();
+valid = kvCheck.fullCheck(new DataTransferThrottler(
+HddsConfigKeys.HDDS_CONTAINER_SCANNER_VOLUME_BYTES_PER_SECOND_DEFAULT),
+null);
 assertTrue(valid);
   }
 
+  /**
+   * Sanity test, when there are corruptions induced.
+   * @throws Exception
+   */
+  @Test
+  public void testKeyValueContainerCheckCorruption() throws Exception {
 
 Review comment:
   /label ozone
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283004)
Time Spent: 1h 40m  (was: 1.5h)

> Ozone Data Scrubbing : Checksum verification for chunks
> ---
>
> Key: HDDS-1200
> URL: https://issues.apache.org/jira/browse/HDDS-1200
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Supratim Deka
>Assignee: Hrishikesh Gadre
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Background scrubber should read each chunk and verify the checksum.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1200) Ozone Data Scrubbing : Checksum verification for chunks

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1200?focusedWorklogId=282998=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-282998
 ]

ASF GitHub Bot logged work on HDDS-1200:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:28
Start Date: 25/Jul/19 23:28
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1154: 
[HDDS-1200] Add support for checksum verification in data scrubber
URL: https://github.com/apache/hadoop/pull/1154#discussion_r307538835
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueContainerCheck.java
 ##
 @@ -120,10 +132,70 @@ public TestKeyValueContainerCheck(String metadataImpl) {
 container.close();
 
 // next run checks on a Closed Container
-valid = kvCheck.fullCheck();
+valid = kvCheck.fullCheck(new DataTransferThrottler(
+HddsConfigKeys.HDDS_CONTAINER_SCANNER_VOLUME_BYTES_PER_SECOND_DEFAULT),
+null);
 assertTrue(valid);
   }
 
+  /**
+   * Sanity test, when there are corruptions induced.
+   * @throws Exception
+   */
+  @Test
+  public void testKeyValueContainerCheckCorruption() throws Exception {
 
 Review comment:
   You can run this test under a memory profiler to make sure there are no leaks. Thanks.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 282998)
Time Spent: 50m  (was: 40m)

> Ozone Data Scrubbing : Checksum verification for chunks
> ---
>
> Key: HDDS-1200
> URL: https://issues.apache.org/jira/browse/HDDS-1200
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Supratim Deka
>Assignee: Hrishikesh Gadre
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Background scrubber should read each chunk and verify the checksum.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1200) Ozone Data Scrubbing : Checksum verification for chunks

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1200?focusedWorklogId=283000=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283000
 ]

ASF GitHub Bot logged work on HDDS-1200:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:28
Start Date: 25/Jul/19 23:28
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1154: 
[HDDS-1200] Add support for checksum verification in data scrubber
URL: https://github.com/apache/hadoop/pull/1154#discussion_r307538039
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueContainerCheck.java
 ##
 @@ -220,43 +229,66 @@ private void checkBlockDB() throws IOException {
   throw new IOException(dbFileErrorMsg);
 }
 
-
 onDiskContainerData.setDbFile(dbFile);
 try(ReferenceCountedDB db =
-BlockUtils.getDB(onDiskContainerData, checkConfig)) {
-  iterateBlockDB(db);
-}
-  }
+BlockUtils.getDB(onDiskContainerData, checkConfig);
+KeyValueBlockIterator kvIter = new KeyValueBlockIterator(containerID,
+new File(onDiskContainerData.getContainerPath()))) {
 
-  private void iterateBlockDB(ReferenceCountedDB db)
-  throws IOException {
-Preconditions.checkState(db != null);
-
-// get "normal" keys from the Block DB
-try(KeyValueBlockIterator kvIter = new KeyValueBlockIterator(containerID,
-new File(onDiskContainerData.getContainerPath()))) {
-
-  // ensure there is a chunk file for each key in the DB
-  while (kvIter.hasNext()) {
+  while(kvIter.hasNext()) {
 BlockData block = kvIter.nextBlock();
-
-List<ContainerProtos.ChunkInfo> chunkInfoList = block.getChunks();
-for (ContainerProtos.ChunkInfo chunk : chunkInfoList) {
-  File chunkFile;
-  chunkFile = ChunkUtils.getChunkFile(onDiskContainerData,
+for(ContainerProtos.ChunkInfo chunk : block.getChunks()) {
+  File chunkFile = ChunkUtils.getChunkFile(onDiskContainerData,
   ChunkInfo.getFromProtoBuf(chunk));
-
   if (!chunkFile.exists()) {
 // concurrent mutation in Block DB? lookup the block again.
 byte[] bdata = db.getStore().get(
 Longs.toByteArray(block.getBlockID().getLocalID()));
-if (bdata == null) {
-  LOG.trace("concurrency with delete, ignoring deleted block");
-  break; // skip to next block from kvIter
-} else {
-  String errorStr = "Missing chunk file "
-  + chunkFile.getAbsolutePath();
-  throw new IOException(errorStr);
+if (bdata != null) {
+  throw new IOException("Missing chunk file "
+  + chunkFile.getAbsolutePath());
+}
+  } else if (chunk.getChecksumData().getType()
+  != ContainerProtos.ChecksumType.NONE){
 
 Review comment:
   Also, what happens if the checksum type is NONE? Do we silently skip? That makes sense; just asking to make sure my understanding is correct.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283000)

> Ozone Data Scrubbing : Checksum verification for chunks
> ---
>
> Key: HDDS-1200
> URL: https://issues.apache.org/jira/browse/HDDS-1200
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Supratim Deka
>Assignee: Hrishikesh Gadre
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Background scrubber should read each chunk and verify the checksum.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1200) Ozone Data Scrubbing : Checksum verification for chunks

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1200?focusedWorklogId=283001=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283001
 ]

ASF GitHub Bot logged work on HDDS-1200:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:28
Start Date: 25/Jul/19 23:28
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1154: 
[HDDS-1200] Add support for checksum verification in data scrubber
URL: https://github.com/apache/hadoop/pull/1154#discussion_r307538592
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/ContainerDataScanner.java
 ##
 @@ -0,0 +1,108 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.container.ozoneimpl;
+
+import java.io.IOException;
+import java.util.Iterator;
+
+import org.apache.hadoop.hdfs.util.Canceler;
+import org.apache.hadoop.hdfs.util.DataTransferThrottler;
+import org.apache.hadoop.ozone.container.common.interfaces.Container;
+import org.apache.hadoop.ozone.container.common.volume.HddsVolume;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * VolumeScanner scans a single volume.  Each VolumeScanner has its own thread.
+ * They are all managed by the DataNode's BlockScanner.
+ */
+public class ContainerDataScanner extends Thread {
+  public static final Logger LOG =
+  LoggerFactory.getLogger(ContainerDataScanner.class);
+
+  /**
+   * The volume that we're scanning.
+   */
+  private final HddsVolume volume;
+  private final ContainerController controller;
+  private final DataTransferThrottler throttler;
+  private final Canceler canceler;
+
+  /**
+   * True if the thread is stopping.
+   * Protected by this object's lock.
+   */
+  private volatile boolean stopping = false;
+
+
+  public ContainerDataScanner(ContainerController controller,
+  HddsVolume volume, long bytesPerSec) {
+this.controller = controller;
+this.volume = volume;
+this.throttler = new DataTransferThrottler(bytesPerSec);
+this.canceler = new Canceler();
+setName("ContainerDataScanner(" + volume + ")");
+setDaemon(true);
+  }
+
+  @Override
+  public void run() {
+LOG.trace("{}: thread starting.", this);
+try {
+  while (!stopping) {
+Iterator itr = controller.getContainers(volume);
+while (!stopping && itr.hasNext()) {
+  Container c = itr.next();
+  try {
+if (c.shouldScanData()) {
+  if(!c.scanData(throttler, canceler)) {
+controller.markContainerUnhealthy(
+c.getContainerData().getContainerID());
+  }
+}
+  } catch (IOException ex) {
+long containerId = c.getContainerData().getContainerID();
+LOG.warn("Unexpected exception while scanning container "
++ containerId, ex);
 
 Review comment:
   If we are not able to read the container, should we mark the container as unhealthy even if we got an exception? I am not sure that all exceptions mean the container is unhealthy, but for some exceptions it certainly is.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283001)
Time Spent: 1h 10m  (was: 1h)

> Ozone Data Scrubbing : Checksum verification for chunks
> ---
>
> Key: HDDS-1200
> URL: https://issues.apache.org/jira/browse/HDDS-1200
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Supratim Deka
>Assignee: Hrishikesh Gadre
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>

[jira] [Work logged] (HDDS-1200) Ozone Data Scrubbing : Checksum verification for chunks

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1200?focusedWorklogId=283003=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283003
 ]

ASF GitHub Bot logged work on HDDS-1200:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:28
Start Date: 25/Jul/19 23:28
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1154: 
[HDDS-1200] Add support for checksum verification in data scrubber
URL: https://github.com/apache/hadoop/pull/1154#discussion_r307537516
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueContainerCheck.java
 ##
 @@ -220,43 +229,66 @@ private void checkBlockDB() throws IOException {
   throw new IOException(dbFileErrorMsg);
 }
 
-
 onDiskContainerData.setDbFile(dbFile);
 try(ReferenceCountedDB db =
-BlockUtils.getDB(onDiskContainerData, checkConfig)) {
-  iterateBlockDB(db);
-}
-  }
+BlockUtils.getDB(onDiskContainerData, checkConfig);
+KeyValueBlockIterator kvIter = new KeyValueBlockIterator(containerID,
+new File(onDiskContainerData.getContainerPath()))) {
 
 Review comment:
   Can you please run this under a profiler and make sure there are no memory leaks in this code path? Nothing to do with your patch at all; we have just found some issues here earlier. Run it under something like VisualVM and see whether we release all memory when we get out of the loop.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283003)
Time Spent: 1.5h  (was: 1h 20m)

> Ozone Data Scrubbing : Checksum verification for chunks
> ---
>
> Key: HDDS-1200
> URL: https://issues.apache.org/jira/browse/HDDS-1200
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Supratim Deka
>Assignee: Hrishikesh Gadre
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Background scrubber should read each chunk and verify the checksum.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1200) Ozone Data Scrubbing : Checksum verification for chunks

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1200?focusedWorklogId=283002=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-283002
 ]

ASF GitHub Bot logged work on HDDS-1200:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:28
Start Date: 25/Jul/19 23:28
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1154: 
[HDDS-1200] Add support for checksum verification in data scrubber
URL: https://github.com/apache/hadoop/pull/1154#discussion_r307536162
 
 

 ##
 File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/HddsConfigKeys.java
 ##
 @@ -68,11 +68,16 @@
   public static final String HDDS_CONTAINERSCRUB_ENABLED =
   "hdds.containerscrub.enabled";
   public static final boolean HDDS_CONTAINERSCRUB_ENABLED_DEFAULT = false;
+
   public static final boolean HDDS_SCM_SAFEMODE_ENABLED_DEFAULT = true;
   public static final String HDDS_SCM_SAFEMODE_MIN_DATANODE =
   "hdds.scm.safemode.min.datanode";
   public static final int HDDS_SCM_SAFEMODE_MIN_DATANODE_DEFAULT = 1;
 
+  public static final String HDDS_CONTAINER_SCANNER_VOLUME_BYTES_PER_SECOND =
+  "hdds.container.scanner.volume.bytes.per.second";
+  public static final long
+  HDDS_CONTAINER_SCANNER_VOLUME_BYTES_PER_SECOND_DEFAULT = 1048576L;
 
 
 Review comment:
   1048576L: Ozone configuration supports writing values like 1MB, if that is useful.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 283002)
Time Spent: 1h 20m  (was: 1h 10m)

> Ozone Data Scrubbing : Checksum verification for chunks
> ---
>
> Key: HDDS-1200
> URL: https://issues.apache.org/jira/browse/HDDS-1200
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Supratim Deka
>Assignee: Hrishikesh Gadre
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Background scrubber should read each chunk and verify the checksum.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1200) Ozone Data Scrubbing : Checksum verification for chunks

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1200?focusedWorklogId=282999=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-282999
 ]

ASF GitHub Bot logged work on HDDS-1200:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:28
Start Date: 25/Jul/19 23:28
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1154: 
[HDDS-1200] Add support for checksum verification in data scrubber
URL: https://github.com/apache/hadoop/pull/1154#discussion_r307537762
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueContainerCheck.java
 ##
 @@ -220,43 +229,66 @@ private void checkBlockDB() throws IOException {
   throw new IOException(dbFileErrorMsg);
 }
 
-
 onDiskContainerData.setDbFile(dbFile);
 try(ReferenceCountedDB db =
-BlockUtils.getDB(onDiskContainerData, checkConfig)) {
-  iterateBlockDB(db);
-}
-  }
+BlockUtils.getDB(onDiskContainerData, checkConfig);
+KeyValueBlockIterator kvIter = new KeyValueBlockIterator(containerID,
+new File(onDiskContainerData.getContainerPath()))) {
 
-  private void iterateBlockDB(ReferenceCountedDB db)
-  throws IOException {
-Preconditions.checkState(db != null);
-
-// get "normal" keys from the Block DB
-try(KeyValueBlockIterator kvIter = new KeyValueBlockIterator(containerID,
-new File(onDiskContainerData.getContainerPath()))) {
-
-  // ensure there is a chunk file for each key in the DB
-  while (kvIter.hasNext()) {
+  while(kvIter.hasNext()) {
 BlockData block = kvIter.nextBlock();
-
-List<ContainerProtos.ChunkInfo> chunkInfoList = block.getChunks();
-for (ContainerProtos.ChunkInfo chunk : chunkInfoList) {
-  File chunkFile;
-  chunkFile = ChunkUtils.getChunkFile(onDiskContainerData,
+for(ContainerProtos.ChunkInfo chunk : block.getChunks()) {
+  File chunkFile = ChunkUtils.getChunkFile(onDiskContainerData,
   ChunkInfo.getFromProtoBuf(chunk));
-
   if (!chunkFile.exists()) {
 // concurrent mutation in Block DB? lookup the block again.
 byte[] bdata = db.getStore().get(
 Longs.toByteArray(block.getBlockID().getLocalID()));
-if (bdata == null) {
-  LOG.trace("concurrency with delete, ignoring deleted block");
-  break; // skip to next block from kvIter
-} else {
-  String errorStr = "Missing chunk file "
-  + chunkFile.getAbsolutePath();
-  throw new IOException(errorStr);
+if (bdata != null) {
+  throw new IOException("Missing chunk file "
+  + chunkFile.getAbsolutePath());
+}
+  } else if (chunk.getChecksumData().getType()
+  != ContainerProtos.ChecksumType.NONE){
 
 Review comment:
   Care to break this if-branch out into a function? I am OK if it is not possible or too much work; do it only if it is easy for you.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 282999)
Time Spent: 1h  (was: 50m)

> Ozone Data Scrubbing : Checksum verification for chunks
> ---
>
> Key: HDDS-1200
> URL: https://issues.apache.org/jira/browse/HDDS-1200
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Supratim Deka
>Assignee: Hrishikesh Gadre
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Background scrubber should read each chunk and verify the checksum.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1618) Merge code for HA and Non-HA OM requests for bucket

2019-07-25 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-1618.
--
   Resolution: Fixed
Fix Version/s: 0.5.0

Merging all OM requests (bucket/file/key/volume/s3) with the HA code is taken care of as part of HDDS-1856.

> Merge code for HA and Non-HA OM requests for bucket
> ---
>
> Key: HDDS-1618
> URL: https://issues.apache.org/jira/browse/HDDS-1618
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Fix For: 0.5.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> In this Jira, we shall use the new code added in HDDS-1551 for the non-HA flow.
>  
> This Jira modifies the bucket requests only; further requests will be handled 
> in subsequent Jiras.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1856) Merge HA and Non-HA code in OM

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-1856:
-
Labels: pull-request-available  (was: )

> Merge HA and Non-HA code in OM
> --
>
> Key: HDDS-1856
> URL: https://issues.apache.org/jira/browse/HDDS-1856
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>
> In this Jira the following things will be implemented:
>  # Make the non-HA code path use Cache and DoubleBuffer.
>  # Use the OMClientRequest/OMClientResponse classes implemented as part of HA in 
> the non-HA code path as well.
>  
> Removal of the old code will not be done in this Jira; it will be done in 
> further Jiras.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1856) Merge HA and Non-HA code in OM

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1856?focusedWorklogId=282996=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-282996
 ]

ASF GitHub Bot logged work on HDDS-1856:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:23
Start Date: 25/Jul/19 23:23
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #1166: 
HDDS-1856. Merge HA and Non-HA code in OM.
URL: https://github.com/apache/hadoop/pull/1166
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 282996)
Time Spent: 10m
Remaining Estimate: 0h

> Merge HA and Non-HA code in OM
> --
>
> Key: HDDS-1856
> URL: https://issues.apache.org/jira/browse/HDDS-1856
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In this Jira the following things will be implemented:
>  # Make the non-HA code path use Cache and DoubleBuffer.
>  # Use the OMClientRequest/OMClientResponse classes implemented as part of HA in 
> the non-HA code path as well.
>  
> Removal of the old code will not be done in this Jira; it will be done in 
> further Jiras.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1833) RefCountedDB printing of stacktrace should be moved to trace logging

2019-07-25 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893208#comment-16893208
 ] 

Eric Yang commented on HDDS-1833:
-

[~swagle] Thank you for the patch. Generating the full stack trace may take more 
compute cycles. If this is a frequently called API, I would recommend keeping 
the if statement so that the stack trace is computed only when trace logging is 
turned on. If this is not a frequently called API, patch 3 looks good to me.
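In code, the guard being discussed is essentially the following sketch, assuming LOG is the class's slf4j logger:

{code:java}
// Only pay for building the stack trace when trace logging is actually enabled.
if (LOG.isTraceEnabled()) {
  LOG.trace("Reference count update", new Throwable("stack trace"));
}
{code}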

> RefCountedDB printing of stacktrace should be moved to trace logging
> 
>
> Key: HDDS-1833
> URL: https://issues.apache.org/jira/browse/HDDS-1833
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Siddharth Wagle
>Priority: Major
>  Labels: newbie
> Attachments: HDDS-1833.01.patch, HDDS-1833.02.patch, 
> HDDS-1833.03.patch
>
>
> RefCountedDB logs the stack trace for both increment and decrement, which 
> pollutes the logs.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1830) OzoneManagerDoubleBuffer#stop should wait for daemon thread to die

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1830?focusedWorklogId=282995=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-282995
 ]

ASF GitHub Bot logged work on HDDS-1830:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:20
Start Date: 25/Jul/19 23:20
Worklog Time Spent: 10m 
  Work Description: arp7 commented on issue #1156: HDDS-1830 
OzoneManagerDoubleBuffer#stop should wait for daemon thread to die
URL: https://github.com/apache/hadoop/pull/1156#issuecomment-515249526
 
 
   I am not sure how revert will work with the PR. Will leave it alone for now. 
Let's see if subsequent CI runs uncover any issue with this change.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 282995)
Time Spent: 1h 20m  (was: 1h 10m)

> OzoneManagerDoubleBuffer#stop should wait for daemon thread to die
> --
>
> Key: HDDS-1830
> URL: https://issues.apache.org/jira/browse/HDDS-1830
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Based on [~arp]'s comment on HDDS-1649, OzoneManagerDoubleBuffer#stop() calls 
> interrupt() on daemon thread but not join(). The thread might still be 
> running when the call returns. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1830) OzoneManagerDoubleBuffer#stop should wait for daemon thread to die

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1830?focusedWorklogId=282993=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-282993
 ]

ASF GitHub Bot logged work on HDDS-1830:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:18
Start Date: 25/Jul/19 23:18
Worklog Time Spent: 10m 
  Work Description: arp7 commented on issue #1156: HDDS-1830 
OzoneManagerDoubleBuffer#stop should wait for daemon thread to die
URL: https://github.com/apache/hadoop/pull/1156#issuecomment-515248940
 
 
   I merged this before CI had a chance to complete. Sorry about that. Let me 
revert and retest.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 282993)
Time Spent: 1h 10m  (was: 1h)

> OzoneManagerDoubleBuffer#stop should wait for daemon thread to die
> --
>
> Key: HDDS-1830
> URL: https://issues.apache.org/jira/browse/HDDS-1830
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Based on [~arp]'s comment on HDDS-1649, OzoneManagerDoubleBuffer#stop() calls 
> interrupt() on daemon thread but not join(). The thread might still be 
> running when the call returns. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1391) Add ability in OM to serve delta updates through an API.

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1391?focusedWorklogId=282991=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-282991
 ]

ASF GitHub Bot logged work on HDDS-1391:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:17
Start Date: 25/Jul/19 23:17
Worklog Time Spent: 10m 
  Work Description: avijayanhwx commented on pull request #1033: HDDS-1391 
: Add ability in OM to serve delta updates through an API.
URL: https://github.com/apache/hadoop/pull/1033#discussion_r307536825
 
 

 ##
 File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/utils/db/RDBStore.java
 ##
 @@ -327,6 +329,44 @@ public CodecRegistry getCodecRegistry() {
 return codecRegistry;
   }
 
+  @Override
+  public DBUpdatesWrapper getUpdatesSince(long sequenceNumber)
+  throws SequenceNumberNotFoundException {
+
+DBUpdatesWrapper dbUpdatesWrapper = new DBUpdatesWrapper();
+try {
+  TransactionLogIterator transactionLogIterator =
+  db.getUpdatesSince(sequenceNumber);
+
+  boolean flag = true;
+
+  while (transactionLogIterator.isValid()) {
+TransactionLogIterator.BatchResult result =
+transactionLogIterator.getBatch();
+long currSequenceNumber = result.sequenceNumber();
+if (flag && currSequenceNumber > 1 + sequenceNumber) {
+  throw new SequenceNumberNotFoundException("Unable to read data from" +
+  " RocksDB wal to get delta updates. It may have already been" +
+  " flushed to SSTs.");
+}
+flag = false;
 
 Review comment:
   Explained and renamed.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 282991)
Time Spent: 4h 20m  (was: 4h 10m)

> Add ability in OM to serve delta updates through an API.
> 
>
> Key: HDDS-1391
> URL: https://issues.apache.org/jira/browse/HDDS-1391
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Added an RPC end point to serve the set of updates in OM RocksDB from a given 
> sequence number.
> This will be used by Recon (HDDS-1105) to push the data to all the tasks that 
> will keep their aggregate data up to date. 
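A hedged sketch of how a consumer such as Recon might drive this endpoint; the request/response types follow the DBUpdatesRequest and DBUpdatesWrapper classes visible in the diffs above, while the polling step and applyToLocalRocksDB(...) are assumptions for illustration:

{code:java}
// Hypothetical sketch: ask the OM for WAL updates newer than the last applied
// sequence number and apply each serialized WriteBatch locally.
long lastSequenceNumber = 0;
DBUpdatesRequest request = DBUpdatesRequest.newBuilder()
    .setSequenceNumber(lastSequenceNumber)
    .build();
DBUpdatesWrapper updates = ozoneManager.getDBUpdates(request);
for (byte[] writeBatchData : updates.getData()) {
  applyToLocalRocksDB(writeBatchData);  // hypothetical helper on the Recon side
}
{code}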



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1391) Add ability in OM to serve delta updates through an API.

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1391?focusedWorklogId=282990=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-282990
 ]

ASF GitHub Bot logged work on HDDS-1391:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:17
Start Date: 25/Jul/19 23:17
Worklog Time Spent: 10m 
  Work Description: avijayanhwx commented on pull request #1033: HDDS-1391 
: Add ability in OM to serve delta updates through an API.
URL: https://github.com/apache/hadoop/pull/1033#discussion_r307536807
 
 

 ##
 File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/utils/db/RDBStore.java
 ##
 @@ -327,6 +329,44 @@ public CodecRegistry getCodecRegistry() {
 return codecRegistry;
   }
 
+  @Override
+  public DBUpdatesWrapper getUpdatesSince(long sequenceNumber)
+  throws SequenceNumberNotFoundException {
+
+DBUpdatesWrapper dbUpdatesWrapper = new DBUpdatesWrapper();
+try {
+  TransactionLogIterator transactionLogIterator =
+  db.getUpdatesSince(sequenceNumber);
+
+  boolean flag = true;
+
+  while (transactionLogIterator.isValid()) {
+TransactionLogIterator.BatchResult result =
+transactionLogIterator.getBatch();
+long currSequenceNumber = result.sequenceNumber();
+if (flag && currSequenceNumber > 1 + sequenceNumber) {
+  throw new SequenceNumberNotFoundException("Unable to read data from" +
+  " RocksDB wal to get delta updates. It may have already been" +
+  " flushed to SSTs.");
+}
+flag = false;
+if (currSequenceNumber == sequenceNumber) {
+  transactionLogIterator.next();
+  continue;
+}
+WriteBatch writeBatch = result.writeBatch();
+byte[] writeBatchData = writeBatch.data();
 
 Review comment:
   done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 282990)
Time Spent: 4h 10m  (was: 4h)

> Add ability in OM to serve delta updates through an API.
> 
>
> Key: HDDS-1391
> URL: https://issues.apache.org/jira/browse/HDDS-1391
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Added an RPC end point to serve the set of updates in OM RocksDB from a given 
> sequence number.
> This will be used by Recon (HDDS-1105) to push the data to all the tasks that 
> will keep their aggregate data up to date. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1830) OzoneManagerDoubleBuffer#stop should wait for daemon thread to die

2019-07-25 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-1830.
-
   Resolution: Fixed
Fix Version/s: 0.5.0

+1

Merged via GitHub. Thanks for the contribution [~smeng] and thanks for the 
review [~bharatviswa]!

> OzoneManagerDoubleBuffer#stop should wait for daemon thread to die
> --
>
> Key: HDDS-1830
> URL: https://issues.apache.org/jira/browse/HDDS-1830
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Based on [~arp]'s comment on HDDS-1649, OzoneManagerDoubleBuffer#stop() calls 
> interrupt() on daemon thread but not join(). The thread might still be 
> running when the call returns. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1830) OzoneManagerDoubleBuffer#stop should wait for daemon thread to die

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1830?focusedWorklogId=282988=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-282988
 ]

ASF GitHub Bot logged work on HDDS-1830:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:14
Start Date: 25/Jul/19 23:14
Worklog Time Spent: 10m 
  Work Description: arp7 commented on pull request #1156: HDDS-1830 
OzoneManagerDoubleBuffer#stop should wait for daemon thread to die
URL: https://github.com/apache/hadoop/pull/1156
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 282988)
Time Spent: 1h  (was: 50m)

> OzoneManagerDoubleBuffer#stop should wait for daemon thread to die
> --
>
> Key: HDDS-1830
> URL: https://issues.apache.org/jira/browse/HDDS-1830
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Based on [~arp]'s comment on HDDS-1649, OzoneManagerDoubleBuffer#stop() calls 
> interrupt() on daemon thread but not join(). The thread might still be 
> running when the call returns. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1391) Add ability in OM to serve delta updates through an API.

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1391?focusedWorklogId=282987=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-282987
 ]

ASF GitHub Bot logged work on HDDS-1391:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:13
Start Date: 25/Jul/19 23:13
Worklog Time Spent: 10m 
  Work Description: avijayanhwx commented on pull request #1033: HDDS-1391 
: Add ability in OM to serve delta updates through an API.
URL: https://github.com/apache/hadoop/pull/1033#discussion_r307536091
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOzoneManager.java
 ##
 @@ -1395,8 +1397,41 @@ public void testDBKeyMayExist() throws Exception {
 RDBStore rdbStore = (RDBStore) cluster.getOzoneManager()
 .getMetadataManager().getStore();
 RocksDB db = rdbStore.getDb();
-UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
 
+OmKeyInfo keyInfo = getNewOmKeyInfo();
+OmKeyInfoCodec omKeyInfoCodec = new OmKeyInfoCodec();
+
+db.put(StringUtils.getBytesUtf16("OMKey1"),
+omKeyInfoCodec.toPersistedFormat(keyInfo));
+
+StringBuilder sb = new StringBuilder();
+Assert.assertTrue(db.keyMayExist(StringUtils.getBytesUtf16("OMKey1"),
+sb));
+Assert.assertTrue(sb.length() > 0);
+  }
+
+
+  @Test
+  public void testGetOMDBUpdates() throws IOException {
+
+DBUpdatesRequest dbUpdatesRequest =
+DBUpdatesRequest.newBuilder().setSequenceNumber(0).build();
+
+DBUpdatesWrapper dbUpdates =
+cluster.getOzoneManager().getDBUpdates(dbUpdatesRequest);
+Assert.assertTrue(dbUpdates.getData().isEmpty());
+
+//Write data to OM.
+OmKeyInfo keyInfo = getNewOmKeyInfo();
+Assert.assertNotNull(keyInfo);
+dbUpdates =
+cluster.getOzoneManager().getDBUpdates(dbUpdatesRequest);
+Assert.assertFalse(dbUpdates.getData().isEmpty());
 
 Review comment:
   Yes, that will be difficult. The only way to do that is to write a custom 
iterator handler for RocksDB, which we have done in 
org.apache.hadoop.ozone.recon.tasks.OMDBUpdatesHandler. The tests in 
org.apache.hadoop.ozone.recon.tasks.TestOMDBUpdatesHandler cover the data 
correctness use cases.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 282987)
Time Spent: 4h  (was: 3h 50m)

> Add ability in OM to serve delta updates through an API.
> 
>
> Key: HDDS-1391
> URL: https://issues.apache.org/jira/browse/HDDS-1391
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Added an RPC end point to serve the set of updates in OM RocksDB from a given 
> sequence number.
> This will be used by Recon (HDDS-1105) to push the data to all the tasks that 
> will keep their aggregate data up to date. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1200) Ozone Data Scrubbing : Checksum verification for chunks

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1200?focusedWorklogId=282985=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-282985
 ]

ASF GitHub Bot logged work on HDDS-1200:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:12
Start Date: 25/Jul/19 23:12
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1154: 
[HDDS-1200] Add support for checksum verification in data scrubber
URL: https://github.com/apache/hadoop/pull/1154#discussion_r307535953
 
 

 ##
 File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/HddsConfigKeys.java
 ##
 @@ -68,11 +68,16 @@
   public static final String HDDS_CONTAINERSCRUB_ENABLED =
   "hdds.containerscrub.enabled";
   public static final boolean HDDS_CONTAINERSCRUB_ENABLED_DEFAULT = false;
+
   public static final boolean HDDS_SCM_SAFEMODE_ENABLED_DEFAULT = true;
   public static final String HDDS_SCM_SAFEMODE_MIN_DATANODE =
   "hdds.scm.safemode.min.datanode";
   public static final int HDDS_SCM_SAFEMODE_MIN_DATANODE_DEFAULT = 1;
 
+  public static final String HDDS_CONTAINER_SCANNER_VOLUME_BYTES_PER_SECOND =
+  "hdds.container.scanner.volume.bytes.per.second";
 
 Review comment:
   Sorry, my standard Ozone comment: can we please use the configuration-based 
API for these changes?
   
https://cwiki.apache.org/confluence/display/HADOOP/Java-based+configuration+API
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 282985)
Time Spent: 40m  (was: 0.5h)

> Ozone Data Scrubbing : Checksum verification for chunks
> ---
>
> Key: HDDS-1200
> URL: https://issues.apache.org/jira/browse/HDDS-1200
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Supratim Deka
>Assignee: Hrishikesh Gadre
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Background scrubber should read each chunk and verify the checksum.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1495) Create hadoop/ozone docker images with inline build process

2019-07-25 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-1495:
---
Status: Open  (was: Patch Available)

> Create hadoop/ozone docker images with inline build process
> ---
>
> Key: HDDS-1495
> URL: https://issues.apache.org/jira/browse/HDDS-1495
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Eric Yang
>Priority: Major
> Attachments: HADOOP-16091.001.patch, HADOOP-16091.002.patch, 
> HDDS-1495.003.patch, HDDS-1495.004.patch, HDDS-1495.005.patch, 
> HDDS-1495.006.patch, HDDS-1495.007.patch, HDDS-1495.008.patch, Hadoop Docker 
> Image inline build process.pdf
>
>
> This is proposed by [~eyang] in 
> [this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
>  mailing thread.
> {quote}1, 3. There are 38 Apache projects hosting docker images on Docker hub 
> using Apache Organization. By browsing Apache github mirror. There are only 7 
> projects using a separate repository for docker image build. Popular projects 
> official images are not from Apache organization, such as zookeeper, tomcat, 
> httpd. We may not disrupt what other Apache projects are doing, but it looks 
> like inline build process is widely employed by majority of projects such as 
> Nifi, Brooklyn, thrift, karaf, syncope and others. The situation seems a bit 
> chaotic for Apache as a whole. However, Hadoop community can decide what is 
> best for Hadoop. My preference is to remove ozone from source tree naming, if 
> Ozone is intended to be subproject of Hadoop for long period of time. This 
> enables Hadoop community to host docker images for various subproject without 
> having to check out several source tree to trigger a grand build. However, 
> inline build process seems more popular than separated process. Hence, I 
> highly recommend making docker build inline if possible.
> {quote}
> The main challenges are also discussed in the thread:
> {code:java}
> 3. Technically it would be possible to add the Dockerfile to the source
> tree and publish the docker image together with the release by the
> release manager but it's also problematic:
> {code}
> a) there is no easy way to stage the images for the vote
>  c) it couldn't be flagged as automated on dockerhub
>  d) It couldn't support the critical updates.
>  * Updating existing images (for example in case of an ssl bug, rebuild
>  all the existing images with exactly the same payload but updated base
>  image/os environment)
>  * Creating image for older releases (We would like to provide images,
>  for hadoop 2.6/2.7/2.7/2.8/2.9. Especially for doing automatic testing
>  with different versions).
> {code:java}
>  {code}
> Point a) can be solved (as [~eyang] suggested) by using a personal docker 
> image during the vote and publishing it to the dockerhub after the vote (in case 
> the permission can be set by the INFRA)
> Note: based on LEGAL-270 and linked discussion both approaches (inline build 
> process / external build process) are compatible with the apache release.
> Note: HDDS-851 and HADOOP-14898 contains more information about these 
> problems.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14671) WebHDFS: Add erasureCodingPolicy to ContentSummary

2019-07-25 Thread Chao Sun (JIRA)
Chao Sun created HDFS-14671:
---

 Summary: WebHDFS: Add erasureCodingPolicy to ContentSummary
 Key: HDFS-14671
 URL: https://issues.apache.org/jira/browse/HDFS-14671
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: webhdfs
Reporter: Chao Sun
Assignee: Chao Sun


HDFS-11647 added {{erasureCodingPolicy}} to {{ContentSummary}}. We should add 
this info to the result from WebHDFS {{getContentSummary}} call as well.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1391) Add ability in OM to serve delta updates through an API.

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1391?focusedWorklogId=282981=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-282981
 ]

ASF GitHub Bot logged work on HDDS-1391:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:08
Start Date: 25/Jul/19 23:08
Worklog Time Spent: 10m 
  Work Description: avijayanhwx commented on pull request #1033: HDDS-1391 
: Add ability in OM to serve delta updates through an API.
URL: https://github.com/apache/hadoop/pull/1033#discussion_r307535001
 
 

 ##
 File path: 
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java
 ##
 @@ -199,6 +199,7 @@ public static boolean isReadOnly(
 case LookupFile:
 case ListStatus:
 case GetAcl:
+case DBUpdates:
 
 Review comment:
   Yes, for now this is OK. Handling of OM HA in recon will be done later. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 282981)
Time Spent: 3h 50m  (was: 3h 40m)

> Add ability in OM to serve delta updates through an API.
> 
>
> Key: HDDS-1391
> URL: https://issues.apache.org/jira/browse/HDDS-1391
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Added an RPC end point to serve the set of updates in OM RocksDB from a given 
> sequence number.
> This will be used by Recon (HDDS-1105) to push the data to all the tasks that 
> will keep their aggregate data up to date. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1734) Use maven assembly to create ozone tarball image

2019-07-25 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-1734:
---
Status: Open  (was: Patch Available)

> Use maven assembly to create ozone tarball image
> 
>
> Key: HDDS-1734
> URL: https://issues.apache.org/jira/browse/HDDS-1734
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: HDDS-1734.001.patch, HDDS-1734.002.patch, 
> HDDS-1734.003.patch
>
>
> Ozone is using tar stitching to create the ozone tarball.  This prevents 
> downstream projects from using the Ozone tarball as a dependency.  It would be 
> nice to create the Ozone tarball with the maven assembly plugin, so that the 
> tarball can be cached in the maven repository.  This would allow the docker 
> build to be a separate sub-module that references the Ozone tarball, and it can 
> help docker development be more agile without requiring a full project build.
> Test procedure:
> {code:java}
> mvn -f pom.ozone.xml clean install -DskipTests -DskipShade 
> -Dmaven.javadoc.skip -Pdist{code}
> Expected result:
> This will install tarball into:
> {code:java}
> ~/.m2/repository/org/apache/hadoop/hadoop-ozone-dist/0.5.0-SNAPSHOT/hadoop-ozone-dist-0.5.0-SNAPSHOT.tar.gz{code}
> Test procedure 2:
> {code:java}
> mvn -f pom.ozone.xml clean package -DskipTests -DskipShade 
> -Dmaven.javadoc.skip -Pdist{code}
>  
> Expected result:
> hadoop/hadoop-ozone/dist/target directory contains 
> ozone-0.5.0-SNAPSHOT.tar.gz file.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1861) Fix TableCacheImpl cleanup logic

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-1861:
-
Labels: pull-request-available  (was: )

> Fix TableCacheImpl cleanup logic
> 
>
> Key: HDDS-1861
> URL: https://issues.apache.org/jira/browse/HDDS-1861
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>
> Currently in cleanup, we iterate over epochEntries and clean up the entries 
> from the cache and the epochEntries set.
>  
> epochEntries is a TreeSet<>, which is not a concurrent Java data structure. 
> We may occasionally see issues when cleanup tries to remove entries while 
> another thread tries to add entries to the cache. So, we need to use a 
> concurrent set there.
>  
> During cluster testing, we have seen this occasionally, at random:
>  
> {code:java}
> 019-07-25 15:28:41,087 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 5 on 9862, call Call#8974 Retry#0 
> org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 
> 10.65.15.233:35222 java.lang.NullPointerException at 
> java.util.TreeMap.fixAfterInsertion(TreeMap.java:2295) at 
> java.util.TreeMap.put(TreeMap.java:582) at 
> java.util.TreeSet.add(TreeSet.java:255) at 
> org.apache.hadoop.utils.db.cache.TableCacheImpl.put(TableCacheImpl.java:75) 
> at org.apache.hadoop.utils.db.TypedTable.addCacheEntry(TypedTable.java:218) 
> at 
> org.apache.hadoop.ozone.om.request.key.OMKeyRequest.prepareCreateKeyResponse(OMKeyRequest.java:292)
>  at 
> org.apache.hadoop.ozone.om.request.key.OMKeyCreateRequest.validateAndUpdateCache(OMKeyCreateRequest.java:188)
>  at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:134)
>  at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) at 
> java.security.AccessController.doPrivileged(Native Method){code}
>  
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1861) Fix TableCacheImpl cleanup logic

2019-07-25 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-1861:
-
Status: Patch Available  (was: Open)

> Fix TableCacheImpl cleanup logic
> 
>
> Key: HDDS-1861
> URL: https://issues.apache.org/jira/browse/HDDS-1861
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently in cleanup, we iterate over epochEntries and clean up the entries 
> from the cache and the epochEntries set.
>  
> epochEntries is a TreeSet<>, which is not a concurrent Java data structure. 
> We may occasionally see issues when cleanup tries to remove entries while 
> another thread tries to add entries to the cache. So, we need to use a 
> concurrent set there.
>  
> During cluster testing, we have seen this occasionally, at random:
>  
> {code:java}
> 019-07-25 15:28:41,087 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 5 on 9862, call Call#8974 Retry#0 
> org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 
> 10.65.15.233:35222 java.lang.NullPointerException at 
> java.util.TreeMap.fixAfterInsertion(TreeMap.java:2295) at 
> java.util.TreeMap.put(TreeMap.java:582) at 
> java.util.TreeSet.add(TreeSet.java:255) at 
> org.apache.hadoop.utils.db.cache.TableCacheImpl.put(TableCacheImpl.java:75) 
> at org.apache.hadoop.utils.db.TypedTable.addCacheEntry(TypedTable.java:218) 
> at 
> org.apache.hadoop.ozone.om.request.key.OMKeyRequest.prepareCreateKeyResponse(OMKeyRequest.java:292)
>  at 
> org.apache.hadoop.ozone.om.request.key.OMKeyCreateRequest.validateAndUpdateCache(OMKeyCreateRequest.java:188)
>  at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:134)
>  at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) at 
> java.security.AccessController.doPrivileged(Native Method){code}
>  
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1861) Fix TableCacheImpl cleanup logic

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1861?focusedWorklogId=282982&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-282982
 ]

ASF GitHub Bot logged work on HDDS-1861:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:08
Start Date: 25/Jul/19 23:08
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #1165: 
HDDS-1861. Fix TableCacheImpl cleanup logic.
URL: https://github.com/apache/hadoop/pull/1165
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 282982)
Time Spent: 10m
Remaining Estimate: 0h

> Fix TableCacheImpl cleanup logic
> 
>
> Key: HDDS-1861
> URL: https://issues.apache.org/jira/browse/HDDS-1861
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently in cleanup, we iterate over epochEntries and clean up the entries 
> from the cache and the epochEntries set.
>  
> epochEntries is a TreeSet<>, which is not a concurrent Java data structure. 
> We may occasionally see issues when cleanup tries to remove entries while 
> another thread tries to add entries to the cache. So, we need to use a 
> concurrent set there.
>  
> During cluster testing, we have seen this occasionally, at random:
>  
> {code:java}
> 019-07-25 15:28:41,087 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 5 on 9862, call Call#8974 Retry#0 
> org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 
> 10.65.15.233:35222 java.lang.NullPointerException at 
> java.util.TreeMap.fixAfterInsertion(TreeMap.java:2295) at 
> java.util.TreeMap.put(TreeMap.java:582) at 
> java.util.TreeSet.add(TreeSet.java:255) at 
> org.apache.hadoop.utils.db.cache.TableCacheImpl.put(TableCacheImpl.java:75) 
> at org.apache.hadoop.utils.db.TypedTable.addCacheEntry(TypedTable.java:218) 
> at 
> org.apache.hadoop.ozone.om.request.key.OMKeyRequest.prepareCreateKeyResponse(OMKeyRequest.java:292)
>  at 
> org.apache.hadoop.ozone.om.request.key.OMKeyCreateRequest.validateAndUpdateCache(OMKeyCreateRequest.java:188)
>  at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:134)
>  at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) at 
> java.security.AccessController.doPrivileged(Native Method){code}
>  
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1861) Fix TableCacheImpl cleanup logic

2019-07-25 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-1861:
-
Target Version/s: 0.5.0

> Fix TableCacheImpl cleanup logic
> 
>
> Key: HDDS-1861
> URL: https://issues.apache.org/jira/browse/HDDS-1861
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently in cleanup, we iterate over epochEntries and clean up the entries 
> from the cache and the epochEntries set.
>  
> epochEntries is a TreeSet<>, which is not a concurrent Java data structure. 
> We may occasionally see issues when cleanup tries to remove entries while 
> another thread tries to add entries to the cache. So, we need to use a 
> concurrent set there.
>  
> During cluster testing, we have seen this occasionally, at random:
>  
> {code:java}
> 019-07-25 15:28:41,087 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 5 on 9862, call Call#8974 Retry#0 
> org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 
> 10.65.15.233:35222 java.lang.NullPointerException at 
> java.util.TreeMap.fixAfterInsertion(TreeMap.java:2295) at 
> java.util.TreeMap.put(TreeMap.java:582) at 
> java.util.TreeSet.add(TreeSet.java:255) at 
> org.apache.hadoop.utils.db.cache.TableCacheImpl.put(TableCacheImpl.java:75) 
> at org.apache.hadoop.utils.db.TypedTable.addCacheEntry(TypedTable.java:218) 
> at 
> org.apache.hadoop.ozone.om.request.key.OMKeyRequest.prepareCreateKeyResponse(OMKeyRequest.java:292)
>  at 
> org.apache.hadoop.ozone.om.request.key.OMKeyCreateRequest.validateAndUpdateCache(OMKeyCreateRequest.java:188)
>  at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:134)
>  at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) at 
> java.security.AccessController.doPrivileged(Native Method){code}
>  
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1861) Fix TableCacheImpl cleanup logic

2019-07-25 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-1861:
-
Summary: Fix TableCacheImpl cleanup logic  (was: Fix TableCacheImpl logic)

> Fix TableCacheImpl cleanup logic
> 
>
> Key: HDDS-1861
> URL: https://issues.apache.org/jira/browse/HDDS-1861
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>
> Currently in cleanup, we iterate over epochEntries and clean up the entries 
> from the cache and the epochEntries set.
>  
> epochEntries is a TreeSet<>, which is not a concurrent Java data structure. 
> We may occasionally see issues when cleanup tries to remove entries while 
> another thread tries to add entries to the cache. So, we need to use a 
> concurrent set there.
>  
> During cluster testing, we have seen this occasionally, at random:
>  
> {code:java}
> 019-07-25 15:28:41,087 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 5 on 9862, call Call#8974 Retry#0 
> org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 
> 10.65.15.233:35222 java.lang.NullPointerException at 
> java.util.TreeMap.fixAfterInsertion(TreeMap.java:2295) at 
> java.util.TreeMap.put(TreeMap.java:582) at 
> java.util.TreeSet.add(TreeSet.java:255) at 
> org.apache.hadoop.utils.db.cache.TableCacheImpl.put(TableCacheImpl.java:75) 
> at org.apache.hadoop.utils.db.TypedTable.addCacheEntry(TypedTable.java:218) 
> at 
> org.apache.hadoop.ozone.om.request.key.OMKeyRequest.prepareCreateKeyResponse(OMKeyRequest.java:292)
>  at 
> org.apache.hadoop.ozone.om.request.key.OMKeyCreateRequest.validateAndUpdateCache(OMKeyCreateRequest.java:188)
>  at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:134)
>  at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) at 
> java.security.AccessController.doPrivileged(Native Method){code}
>  
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14034) Support getQuotaUsage API in WebHDFS

2019-07-25 Thread Chao Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893196#comment-16893196
 ] 

Chao Sun commented on HDFS-14034:
-

Thanks [~xkrogen] for the comments! Attached patch v4 to address them. 

[~jojochuang]: it would be great if you can also take a look. Thanks!

> Support getQuotaUsage API in WebHDFS
> 
>
> Key: HDFS-14034
> URL: https://issues.apache.org/jira/browse/HDFS-14034
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, webhdfs
>Reporter: Erik Krogen
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-14034.000.patch, HDFS-14034.001.patch, 
> HDFS-14034.002.patch, HDFS-14034.004.patch
>
>
> HDFS-8898 added support for a new API, {{getQuotaUsage}}, which can fetch 
> quota usage on a directory with significantly lower impact than the similar 
> {{getContentSummary}}. This JIRA tracks adding support for this API to 
> WebHDFS. 
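A hedged client-side sketch of what the call looks like once WebHDFS supports the operation, reusing the {{FileSystem#getQuotaUsage}} API introduced by HDFS-8898; the NameNode host, HTTP port and path below are placeholders, and the webhdfs:// scheme requires hadoop-hdfs-client on the classpath.

{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.QuotaUsage;

public class QuotaUsageOverWebHdfsSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Placeholder NameNode HTTP address; 9870 is the usual Hadoop 3 default.
    URI webhdfs = URI.create("webhdfs://namenode.example.com:9870");
    try (FileSystem fs = FileSystem.get(webhdfs, conf)) {
      QuotaUsage usage = fs.getQuotaUsage(new Path("/user/example"));
      // Unlike getContentSummary, this does not have to walk the subtree.
      System.out.println("namespace used:  " + usage.getFileAndDirectoryCount());
      System.out.println("namespace quota: " + usage.getQuota());
      System.out.println("space consumed:  " + usage.getSpaceConsumed());
      System.out.println("space quota:     " + usage.getSpaceQuota());
    }
  }
}
{code}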



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14034) Support getQuotaUsage API in WebHDFS

2019-07-25 Thread Chao Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-14034:

Attachment: HDFS-14034.004.patch

> Support getQuotaUsage API in WebHDFS
> 
>
> Key: HDFS-14034
> URL: https://issues.apache.org/jira/browse/HDFS-14034
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, webhdfs
>Reporter: Erik Krogen
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-14034.000.patch, HDFS-14034.001.patch, 
> HDFS-14034.002.patch, HDFS-14034.004.patch
>
>
> HDFS-8898 added support for a new API, {{getQuotaUsage}}, which can fetch 
> quota usage on a directory with significantly lower impact than the similar 
> {{getContentSummary}}. This JIRA tracks adding support for this API to 
> WebHDFS. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1830) OzoneManagerDoubleBuffer#stop should wait for daemon thread to die

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1830?focusedWorklogId=282980&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-282980
 ]

ASF GitHub Bot logged work on HDDS-1830:


Author: ASF GitHub Bot
Created on: 25/Jul/19 23:04
Start Date: 25/Jul/19 23:04
Worklog Time Spent: 10m 
  Work Description: smengcl commented on pull request #1156: HDDS-1830 
OzoneManagerDoubleBuffer#stop should wait for daemon thread to die
URL: https://github.com/apache/hadoop/pull/1156#discussion_r307534380
 
 

 ##
 File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerDoubleBuffer.java
 ##
 @@ -64,7 +65,7 @@
   private final OMMetadataManager omMetadataManager;
   private final AtomicLong flushedTransactionCount = new AtomicLong(0);
   private final AtomicLong flushIterations = new AtomicLong(0);
-  private volatile boolean isRunning;
+  private final AtomicBoolean isRunning = new AtomicBoolean(true);
 
 Review comment:
   Thanks for pointing this out. Resolved.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 282980)
Time Spent: 50m  (was: 40m)

> OzoneManagerDoubleBuffer#stop should wait for daemon thread to die
> --
>
> Key: HDDS-1830
> URL: https://issues.apache.org/jira/browse/HDDS-1830
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Based on [~arp]'s comment on HDDS-1649, OzoneManagerDoubleBuffer#stop() calls 
> interrupt() on the daemon thread but does not call join(). The thread might 
> still be running when stop() returns. 
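A minimal sketch (plain JDK, not the OzoneManagerDoubleBuffer source) of the suggested fix: stop() flips the running flag, interrupts the daemon thread and then joins it, so the thread has actually exited before the call returns.

{code:java}
import java.util.concurrent.atomic.AtomicBoolean;

public class DoubleBufferStopSketch {

  private final AtomicBoolean isRunning = new AtomicBoolean(true);
  private final Thread daemon;

  public DoubleBufferStopSketch() {
    daemon = new Thread(this::flushLoop, "flush-daemon-sketch");
    daemon.setDaemon(true);
    daemon.start();
  }

  private void flushLoop() {
    while (isRunning.get()) {
      try {
        Thread.sleep(100); // stand-in for flushing a batch of transactions
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break;
      }
    }
  }

  public void stop() throws InterruptedException {
    if (isRunning.compareAndSet(true, false)) {
      daemon.interrupt();
      // Without this join(), the daemon may still be mid-flush when stop()
      // returns, which is the race the issue describes.
      daemon.join();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    DoubleBufferStopSketch buffer = new DoubleBufferStopSketch();
    Thread.sleep(250);
    buffer.stop();
    System.out.println("daemon alive after stop(): " + buffer.daemon.isAlive());
  }
}
{code}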



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-1861) Fix TableCacheImpl logic

2019-07-25 Thread Bharat Viswanadham (JIRA)
Bharat Viswanadham created HDDS-1861:


 Summary: Fix TableCacheImpl logic
 Key: HDDS-1861
 URL: https://issues.apache.org/jira/browse/HDDS-1861
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Bharat Viswanadham
Assignee: Bharat Viswanadham


Currently in cleanup, we iterate over epochEntries and clean up the entries from 
the cache and the epochEntries set.

 

epochEntries is a TreeSet<>, which is not a concurrent Java data structure. We 
may occasionally see issues when cleanup tries to remove entries while another 
thread tries to add entries to the cache. So, we need to use a concurrent set 
there.

 

During cluster testing, we have seen this occasionally, at random:
 
{code:java}
019-07-25 15:28:41,087 WARN org.apache.hadoop.ipc.Server: IPC Server handler 5 
on 9862, call Call#8974 Retry#0 
org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 
10.65.15.233:35222 java.lang.NullPointerException at 
java.util.TreeMap.fixAfterInsertion(TreeMap.java:2295) at 
java.util.TreeMap.put(TreeMap.java:582) at 
java.util.TreeSet.add(TreeSet.java:255) at 
org.apache.hadoop.utils.db.cache.TableCacheImpl.put(TableCacheImpl.java:75) at 
org.apache.hadoop.utils.db.TypedTable.addCacheEntry(TypedTable.java:218) at 
org.apache.hadoop.ozone.om.request.key.OMKeyRequest.prepareCreateKeyResponse(OMKeyRequest.java:292)
 at 
org.apache.hadoop.ozone.om.request.key.OMKeyCreateRequest.validateAndUpdateCache(OMKeyCreateRequest.java:188)
 at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:134)
 at 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
 at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) at 
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) at 
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) at 
java.security.AccessController.doPrivileged(Native Method){code}
 
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
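
A minimal sketch of the direction described above, using only the JDK: keep the epoch-ordered set, but back it with a concurrent sorted set so cleanup can iterate and remove entries while handler threads keep adding them. Class and member names here are illustrative, not the actual TableCacheImpl.

{code:java}
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

public class EpochEntryCacheSketch {

  /** Entry ordered by epoch, mirroring the epochEntries idea. */
  private static final class EpochEntry implements Comparable<EpochEntry> {
    final long epoch;
    final String cacheKey;

    EpochEntry(long epoch, String cacheKey) {
      this.epoch = epoch;
      this.cacheKey = cacheKey;
    }

    @Override
    public int compareTo(EpochEntry other) {
      int c = Long.compare(this.epoch, other.epoch);
      return c != 0 ? c : this.cacheKey.compareTo(other.cacheKey);
    }
  }

  // ConcurrentSkipListSet is sorted like TreeSet, but is safe for concurrent
  // add/remove, avoiding the NullPointerException seen under contention.
  private final NavigableSet<EpochEntry> epochEntries =
      new ConcurrentSkipListSet<>();

  public void put(long epoch, String cacheKey) {
    epochEntries.add(new EpochEntry(epoch, cacheKey));
  }

  /** Remove every entry up to and including the given epoch. */
  public void cleanup(long upToEpoch) {
    epochEntries.removeIf(entry -> entry.epoch <= upToEpoch);
  }

  public int size() {
    return epochEntries.size();
  }
}
{code}

Whether the fix ends up as a ConcurrentSkipListSet or as external synchronization is a design choice for the pull request; the sketch only shows the concurrent-set option named in the description.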
 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1829) On OM reload/restart OmMetrics#numKeys should be updated

2019-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1829?focusedWorklogId=282973&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-282973
 ]

ASF GitHub Bot logged work on HDDS-1829:


Author: ASF GitHub Bot
Created on: 25/Jul/19 22:50
Start Date: 25/Jul/19 22:50
Worklog Time Spent: 10m 
  Work Description: smengcl commented on pull request #1164: HDDS-1829 On 
OM reload/restart OmMetrics#numKeys should be updated
URL: https://github.com/apache/hadoop/pull/1164#discussion_r307531341
 
 

 ##
 File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/utils/db/TypedTable.java
 ##
 @@ -205,6 +205,16 @@ public String getName() throws IOException {
 return rawTable.getName();
   }
 
+  @Override
+  public long getEstimatedKeyCount() throws IOException {
+if (rawTable instanceof RDBTable) {
+  return rawTable.getEstimatedKeyCount();
+}
+throw new IllegalArgumentException(
+"Unsupported operation getEstimatedKeyCount() on table type " +
+rawTable.getClass().getCanonicalName());
 
 Review comment:
   Does IllegalArgumentException look good here?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 282973)
Time Spent: 0.5h  (was: 20m)

> On OM reload/restart OmMetrics#numKeys should be updated
> 
>
> Key: HDDS-1829
> URL: https://issues.apache.org/jira/browse/HDDS-1829
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When OM is restarted or the state is reloaded, OM Metrics is re-initialized. 
> The saved numKeys value might not be valid as the DB state could have 
> changed. Hence, the numKeys metric must be updated with the correct value on 
> metrics re-initialization.
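A minimal sketch (plain JDK, not the actual OM metrics code) of the idea: when metrics are re-initialized after a restart or reload, seed the numKeys gauge from the key table's estimated key count instead of starting from zero. The KeyTable interface below is a hypothetical stand-in for whatever exposes getEstimatedKeyCount().

{code:java}
import java.util.concurrent.atomic.AtomicLong;

public class NumKeysMetricSketch {

  /** Stand-in for a table that can estimate how many keys it holds. */
  interface KeyTable {
    long getEstimatedKeyCount();
  }

  private final AtomicLong numKeys = new AtomicLong();

  /** Called whenever metrics are (re-)initialized after a restart/reload. */
  public void initFrom(KeyTable keyTable) {
    // RocksDB can only estimate the count, so the gauge is approximate until
    // subsequent key create/delete events adjust it incrementally.
    numKeys.set(keyTable.getEstimatedKeyCount());
  }

  public void incNumKeys() {
    numKeys.incrementAndGet();
  }

  public void decNumKeys() {
    numKeys.decrementAndGet();
  }

  public long getNumKeys() {
    return numKeys.get();
  }
}
{code}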



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


