[jira] [Commented] (HDFS-13674) Improve documentation on Metrics

2018-07-04 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533248#comment-16533248
 ] 

Yiqun Lin commented on HDFS-13674:
--

Hi [~csun], I don't think it's an improvement to split the quantile metric from 
one row into five. Since they are almost the same, we combine them in one line. 
That should be okay, I think.
{noformat}
-| `EditLogFetchTime`*num*`s(50/75/90/95/99)thPercentileLatency` | The 
50/75/90/95/99th percentile of time spent in fetching edit streams from journal 
nodes by standby NameNode, in milliseconds. Percentile measurement is off by 
default, by watching no intervals. The intervals are specified by 
`dfs.metrics.percentiles.intervals`. |
+| `EditLogFetchTime`*num*`s50thPercentileLatency` | The 50th percentile of 
time spent in fetching edit streams from journal nodes by standby NameNode in 
milliseconds (*num* seconds granularity) if `rpc.metrics.quantile.enable` is 
set to true. *num* is specified by `rpc.metrics.percentiles.intervals`. |
+| `EditLogFetchTime`*num*`s75thPercentileLatency` | The 75th percentile of 
time spent in fetching edit streams from journal nodes by standby NameNode in 
milliseconds (*num* seconds granularity) if `rpc.metrics.quantile.enable` is 
set to true. *num* is specified by `rpc.metrics.percentiles.intervals`. |
+| `EditLogFetchTime`*num*`s90thPercentileLatency` | The 90th percentile of 
time spent in fetching edit streams from journal nodes by standby NameNode in 
milliseconds (*num* seconds granularity) if `rpc.metrics.quantile.enable` is 
set to true. *num* is specified by `rpc.metrics.percentiles.intervals`. |
+| `EditLogFetchTime`*num*`s95thPercentileLatency` | The 95th percentile of 
time spent in fetching edit streams from journal nodes by standby NameNode in 
milliseconds (*num* seconds granularity) if `rpc.metrics.quantile.enable` is 
set to true. *num* is specified by `rpc.metrics.percentiles.intervals`. |
+| `EditLogFetchTime`*num*`s99thPercentileLatency` | The 99th percentile of 
time spent in fetching edit streams from journal nodes by standby NameNode in 
milliseconds (*num* seconds granularity) if `rpc.metrics.quantile.enable` is 
set to true. *num* is specified by `rpc.metrics.percentiles.intervals`. |
{noformat}
The other changes look good to me.
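
For illustration only, here is a minimal Java sketch of how the combined row expands into the five concrete metric names shown in the diff above. The class name QuantileMetricNames and the interval value 60 are assumptions made for this example; neither is part of Hadoop or of the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: expands one combined
// documentation row into the five metric names it stands for.
class QuantileMetricNames {
    // Percentiles listed in the combined row (50/75/90/95/99).
    static final int[] PERCENTILES = {50, 75, 90, 95, 99};

    // Builds names of the form <prefix><num>s<P>thPercentileLatency,
    // where num is the interval length in seconds.
    static List<String> expand(String prefix, int intervalSeconds) {
        List<String> names = new ArrayList<>();
        for (int p : PERCENTILES) {
            names.add(prefix + intervalSeconds + "s" + p + "thPercentileLatency");
        }
        return names;
    }

    public static void main(String[] args) {
        // 60 is an assumed example interval, not a documented default.
        for (String name : expand("EditLogFetchTime", 60)) {
            System.out.println(name);
        }
    }
}
```

Since the five names differ only in the percentile number, one combined row carries the same information as five separate rows.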

> Improve documentation on Metrics
> 
>
> Key: HDFS-13674
> URL: https://issues.apache.org/jira/browse/HDFS-13674
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation, metrics
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Minor
> Attachments: HDFS-13674.000.patch
>
>
> There are a few confusing places in the [Hadoop Metrics 
> page|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Metrics.html].
>  For instance, there are duplicated entries such as {{FsImageLoadTime}}; some 
> quantile metrics do not have corresponding entries; the descriptions of some 
> quantile metrics are not very specific about what the {{num}} variable in the 
> metric name is, etc.
> This JIRA aims to improve this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12716) 'dfs.datanode.failed.volumes.tolerated' to support minimum number of volumes to be available

2018-07-04 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533244#comment-16533244
 ] 

Yiqun Lin commented on HDFS-12716:
--

LGTM, +1. Please wait for [~brahmareddy] to do a final check. :)

>  'dfs.datanode.failed.volumes.tolerated' to support minimum number of volumes 
> to be available
> -
>
> Key: HDFS-12716
> URL: https://issues.apache.org/jira/browse/HDFS-12716
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: usharani
>Assignee: Ranith Sardar
>Priority: Major
> Attachments: HDFS-12716.002.patch, HDFS-12716.003.patch, 
> HDFS-12716.patch
>
>
>   Currently 'dfs.datanode.failed.volumes.tolerated' specifies the number of 
> failed volumes to tolerate. Changing this configuration requires a 
> restart of the datanode. Since datanode volumes can be changed dynamically, 
> keeping this configuration the same for all may not be a good idea.
> Support 'dfs.datanode.failed.volumes.tolerated' accepting a special 
> negative value 'x' to tolerate failures of up to "n-x"






[jira] [Comment Edited] (HDFS-13710) RBF: setQuota and getQuotaUsage should check the dfs.federation.router.quota.enable

2018-07-04 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533218#comment-16533218
 ] 

Yiqun Lin edited comment on HDFS-13710 at 7/5/18 2:58 AM:
--

Two comments:

* Could you add a space after 'Exception' in {{public void testSetQuota() 
throws Exception{}} and {{public void testGetQuotaUsage() throws Exception{}}?
* We can use {{@BeforeClass}} and {{@AfterClass}} to start the Router only once. 
Then the null check is not needed.

Any other comments, [~elgoiri]?


was (Author: linyiqun):
Two comments:

* Could you add a space before 'Exception' in {{public void testSetQuota() 
throws Exception{}} and {{public void testGetQuotaUsage() throws Exception{}}?
* We can use {{@BeforeClass}} and {{@AfterClass}} to only start Router once. 
Don't need to do the null checking.

Any other comments, [~elgoiri]?

> RBF:  setQuota and getQuotaUsage should check the 
> dfs.federation.router.quota.enable
> 
>
> Key: HDFS-13710
> URL: https://issues.apache.org/jira/browse/HDFS-13710
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation, hdfs
>Affects Versions: 2.9.1, 3.0.3
>Reporter: yanghuafeng
>Priority: Major
> Attachments: HDFS-13710.002.patch, HDFS-13710.003.patch, 
> HDFS-13710.004.patch, HDFS-13710.005.patch, HDFS-13710.006.patch, 
> HDFS-13710.patch
>
>
> When I use the command below, some exceptions happen.
>  
> {code:java}
> hdfs dfsrouteradmin -setQuota /tmp -ssQuota 1G 
> {code}
>  The logs follow.
> {code:java}
> Successfully set quota for mount point /tmp
> {code}
> It looks like the quota is set successfully, but exceptions appear in 
> the RBF server log.
> {code:java}
> java.io.IOException: No remote locations available
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeConcurrent(RouterRpcClient.java:1002)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeConcurrent(RouterRpcClient.java:967)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeConcurrent(RouterRpcClient.java:940)
> at 
> org.apache.hadoop.hdfs.server.federation.router.Quota.setQuota(Quota.java:84)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterAdminServer.synchronizeQuota(RouterAdminServer.java:255)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterAdminServer.updateMountTableEntry(RouterAdminServer.java:238)
> at 
> org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolServerSideTranslatorPB.updateMountTableEntry(RouterAdminProtocolServerSideTranslatorPB.java:179)
> at 
> org.apache.hadoop.hdfs.protocol.proto.RouterProtocolProtos$RouterAdminProtocolService$2.callBlockingMethod(RouterProtocolProtos.java:259)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1867)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2111)
> {code}
> I found that dfs.federation.router.quota.enable is false by default, and that 
> causes the problem. I think we should check the parameter when we call 
> setQuota and getQuotaUsage. 
>  
>  
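
The check proposed in the description could look roughly like the sketch below. It is only an illustration under assumptions: the class QuotaGuardSketch, its method signatures, and the error message are hypothetical and do not reproduce the real Quota or RouterAdminServer classes from the stack trace.

```java
import java.io.IOException;

// Hypothetical stand-in for the router-side quota entry points.
class QuotaGuardSketch {
    // Mirrors the value of dfs.federation.router.quota.enable.
    private final boolean quotaEnabled;

    QuotaGuardSketch(boolean quotaEnabled) {
        this.quotaEnabled = quotaEnabled;
    }

    // Fails fast instead of reporting success while the server logs an error.
    private void checkQuotaEnabled() throws IOException {
        if (!quotaEnabled) {
            throw new IOException(
                "Quota is disabled; set dfs.federation.router.quota.enable to true");
        }
    }

    // Placeholder for the real operation; only the guard is of interest here.
    void setQuota(String path, long nsQuota, long ssQuota) throws IOException {
        checkQuotaEnabled();
    }

    // Placeholder return value; a real implementation would query usage.
    long getQuotaUsage(String path) throws IOException {
        checkQuotaEnabled();
        return 0L;
    }
}
```

With such a guard in place, the admin command would report an error to the caller rather than printing a success message while the exception only shows up in the server log.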






[jira] [Commented] (HDFS-13710) RBF: setQuota and getQuotaUsage should check the dfs.federation.router.quota.enable

2018-07-04 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533218#comment-16533218
 ] 

Yiqun Lin commented on HDFS-13710:
--

Two comments:

* Could you add a space before 'Exception' in {{public void testSetQuota() 
throws Exception{}} and {{public void testGetQuotaUsage() throws Exception{}}?
* We can use {{@BeforeClass}} and {{@AfterClass}} to start the Router only once. 
Then the null check is not needed.

Any other comments, [~elgoiri]?







[jira] [Commented] (HDDS-167) Rename KeySpaceManager to OzoneManager

2018-07-04 Thread Nanda kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533069#comment-16533069
 ] 

Nanda kumar commented on HDDS-167:
--

Uploaded the rebased patch v10 with latest changes.

> Rename KeySpaceManager to OzoneManager
> --
>
> Key: HDDS-167
> URL: https://issues.apache.org/jira/browse/HDDS-167
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>  Components: Ozone Manager
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-167.01.patch, HDDS-167.02.patch, HDDS-167.03.patch, 
> HDDS-167.04.patch, HDDS-167.05.patch, HDDS-167.06.patch, HDDS-167.07.patch, 
> HDDS-167.08.patch, HDDS-167.09.patch, HDDS-167.10.patch
>
>
> The Ozone KeySpaceManager daemon was renamed to OzoneManager. There are some 
> more changes needed to complete the rename everywhere, e.g.
> - command-line
> - documentation
> - unit tests
> - Acceptance tests






[jira] [Updated] (HDDS-167) Rename KeySpaceManager to OzoneManager

2018-07-04 Thread Nanda kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar updated HDDS-167:
-
Attachment: HDDS-167.10.patch







[jira] [Commented] (HDDS-212) Introduce NodeStateManager to manage the state of Datanodes in SCM

2018-07-04 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533063#comment-16533063
 ] 

Hudson commented on HDDS-212:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14525 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14525/])
HDDS-212. Introduce NodeStateManager to manage the state of Datanodes in 
(nanda: rev 71df8c27c9a0e326232d3baf16414a63b5ea5a4b)
* (add) 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
* (edit) 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeManager.java
* (edit) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/StorageContainerNodeProtocol.java
* (add) 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/states/NodeException.java
* (edit) 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMClientProtocolServer.java
* (edit) 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/protocolPB/StorageContainerLocationProtocolClientSideTranslatorPB.java
* (add) 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/DatanodeInfo.java
* (add) 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/states/NodeStateMap.java
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/MockNodeManager.java
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/node/TestContainerPlacement.java
* (edit) 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/protocol/DatanodeDetails.java
* (edit) 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/ScmConfigKeys.java
* (add) 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/states/NodeNotFoundException.java
* (edit) hadoop-hdds/common/src/main/resources/ozone-default.xml
* (edit) 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/client/ScmClient.java
* (edit) 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/SCMNodeManager.java
* (edit) 
hadoop-hdds/common/src/main/proto/StorageContainerLocationProtocol.proto
* (edit) 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMDatanodeProtocolServer.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/ksm/KeySpaceManager.java
* (edit) hadoop-hdds/common/src/main/proto/hdds.proto
* (edit) 
hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/client/ContainerOperationClient.java
* (delete) 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/HeartbeatQueueItem.java
* (edit) 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/TestStorageContainerManager.java
* (edit) 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMDatanodeHeartbeatDispatcher.java
* (edit) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/hdds/scm/HddsServerUtil.java
* (add) 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/states/NodeAlreadyExistsException.java
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/node/TestNodeManager.java
* (edit) 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/protocol/StorageContainerLocationProtocol.java
* (edit) 
hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/protocolPB/StorageContainerLocationProtocolServerSideTranslatorPB.java
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/ozone/container/testutils/ReplicationNodeManagerMock.java
* (edit) 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/scm/node/TestQueryNode.java


> Introduce NodeStateManager to manage the state of Datanodes in SCM
> --
>
> Key: HDDS-212
> URL: https://issues.apache.org/jira/browse/HDDS-212
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Affects Versions: 0.2.1
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-212.000.patch, HDDS-212.001.patch
>
>
> Introducing {{NodeStateManager}} will make the lifecycle management of 
> datanodes in SCM easy. NodeStateManager will be responsible for marking the 
> datanodes as stale or dead when heartbeat is not received and it will 
> maintain the current state of all the datanodes in the cluster. 
> NodeStateManager should be the only place we maintain node state information, 
> everyone else should use NodeStateManager to know about the state information.
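
The idea in the description can be sketched as follows. This is a minimal illustration, not the committed NodeStateManager: the class NodeStateSketch, the HEALTHY state name, and the two thresholds are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one place that tracks heartbeats and derives state.
class NodeStateSketch {
    // STALE and DEAD come from the description; HEALTHY is an assumed name.
    enum State { HEALTHY, STALE, DEAD }

    private final long staleAfterMs;
    private final long deadAfterMs;
    // Last heartbeat time per datanode id, in milliseconds.
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    NodeStateSketch(long staleAfterMs, long deadAfterMs) {
        this.staleAfterMs = staleAfterMs;
        this.deadAfterMs = deadAfterMs;
    }

    // Records that a heartbeat from the node arrived at nowMs.
    void heartbeat(String nodeId, long nowMs) {
        lastHeartbeat.put(nodeId, nowMs);
    }

    // Derives the state from how long the node has been silent.
    State stateOf(String nodeId, long nowMs) {
        Long last = lastHeartbeat.get(nodeId);
        if (last == null) {
            throw new IllegalArgumentException("unknown node: " + nodeId);
        }
        long silence = nowMs - last;
        if (silence >= deadAfterMs) {
            return State.DEAD;
        }
        if (silence >= staleAfterMs) {
            return State.STALE;
        }
        return State.HEALTHY;
    }
}
```

Keeping the heartbeat map and the state derivation in one class reflects the stated goal that node state is maintained in a single place and queried by everyone else.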




[jira] [Commented] (HDDS-217) Move all SCMEvents to a package

2018-07-04 Thread Nanda kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533052#comment-16533052
 ] 

Nanda kumar commented on HDDS-217:
--

Thanks [~anu] for working on this. The patch no longer applies; can you 
rebase it on top of the latest changes?

> Move all SCMEvents to a package
> ---
>
> Key: HDDS-217
> URL: https://issues.apache.org/jira/browse/HDDS-217
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Anu Engineer
>Assignee: Anu Engineer
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-217.001.patch
>
>
> Moving all SCM internal events to a single package makes it easy to write 
> event producers and consumers. Also, we have a single location for all 
> the events. This patch is a simple refactoring patch.






[jira] [Updated] (HDDS-212) Introduce NodeStateManager to manage the state of Datanodes in SCM

2018-07-04 Thread Nanda kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar updated HDDS-212:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)







[jira] [Commented] (HDDS-212) Introduce NodeStateManager to manage the state of Datanodes in SCM

2018-07-04 Thread Nanda kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533051#comment-16533051
 ] 

Nanda kumar commented on HDDS-212:
--

I have committed this to trunk.







[jira] [Commented] (HDDS-212) Introduce NodeStateManager to manage the state of Datanodes in SCM

2018-07-04 Thread Nanda kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533043#comment-16533043
 ] 

Nanda kumar commented on HDDS-212:
--

Thanks [~anu] for the review, I will commit this shortly.







[jira] [Commented] (HDDS-214) HDDS/Ozone First Release

2018-07-04 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532973#comment-16532973
 ] 

Anu Engineer commented on HDDS-214:
---

+1, the release plan looks good to me. 


> HDDS/Ozone First Release
> 
>
> Key: HDDS-214
> URL: https://issues.apache.org/jira/browse/HDDS-214
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Anu Engineer
>Assignee: Elek, Marton
>Priority: Major
> Attachments: Ozone 0.2.1 release plan.pdf
>
>
> This is an umbrella JIRA that collects all work items, design discussions, 
> etc. for Ozone's release. We will post a design document soon to open the 
> discussion and nail down the details of the release.
> cc: [~xyao] , [~elek], [~arpitagarwal] [~jnp] , [~msingh] [~nandakumar131], 
> [~bharatviswa]






[jira] [Commented] (HDFS-13310) [PROVIDED Phase 2] The DatanodeProtocol should be have DNA_BACKUP to backup blocks

2018-07-04 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532915#comment-16532915
 ] 

genericqa commented on HDFS-13310:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 30s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 12 new or modified test files. {color} |
|| || || || {color:brown} HDFS-12090 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 28s{color} | {color:green} HDFS-12090 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 21s{color} | {color:green} HDFS-12090 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 11s{color} | {color:green} HDFS-12090 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 45s{color} | {color:green} HDFS-12090 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 15s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 49s{color} | {color:green} HDFS-12090 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 14s{color} | {color:green} HDFS-12090 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 17m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 17m 14s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  1m 12s{color} | {color:orange} hadoop-hdfs-project: The patch generated 53 new + 693 unchanged - 1 fixed = 746 total (was 694) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 59s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 56s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 59s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 28s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}128m  3s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m  5s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}240m 16s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs-client |
|  |  
org.apache.hadoop.hdfs.server.protocol.SyncTaskExecutionResult.getResult() may 
expose internal representation by returning SyncTaskExecutionResult.result  At 
SyncTaskExecutionResult.java:by returning SyncTaskExecutionResult.result  At 
SyncTaskExecutionResult.java:[line 38] |
|  |  new 
org.apache.hadoop.hdfs.server.protocol.SyncTaskExecutionResult(byte[], Long) 
may expose internal representation by 

[jira] [Commented] (HDFS-12716) 'dfs.datanode.failed.volumes.tolerated' to support minimum number of volumes to be available

2018-07-04 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532894#comment-16532894
 ] 

genericqa commented on HDFS-12716:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 52s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 46s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m  9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 36s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 89m 38s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}151m 50s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.client.impl.TestBlockReaderLocal |
|   | hadoop.hdfs.server.namenode.TestFsck |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HDFS-12716 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12930290/HDFS-12716.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 6b14d8650025 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3b63715 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24555/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24555/testReport/ |
| Max. process+thread count | 2825 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 

[jira] [Commented] (HDFS-13719) Docs around dfs.image.transfer.timeout are misleading

2018-07-04 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532890#comment-16532890
 ] 

genericqa commented on HDFS-13719:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
31s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
34m 33s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 59s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 95m 18s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}145m 50s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.client.impl.TestBlockReaderLocal |
|   | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.web.TestWebHDFSForHA |
|   | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HDFS-13719 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12930288/HDFS-13719.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  xml  |
| uname | Linux be67e65ce42c 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 
19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3b63715 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24554/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24554/testReport/ |
| Max. process+thread count | 3817 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24554/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Docs around dfs.image.transfer.timeout are misleading

[jira] [Commented] (HDDS-224) Create metrics for Event Watcher

2018-07-04 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532851#comment-16532851
 ] 

genericqa commented on HDDS-224:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 40s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 32s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 26s{color} 
| {color:red} framework in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 59m  1s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdds.server.events.TestEventWatcher |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HDDS-224 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12930297/HDDS-224.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  xml  findbugs  checkstyle  |
| uname | Linux d95069a72fe5 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3b63715 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDDS-Build/432/artifact/out/whitespace-eol.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDDS-Build/432/artifact/out/patch-unit-hadoop-hdds_framework.txt
 |
|  Test Results | 

[jira] [Commented] (HDDS-212) Introduce NodeStateManager to manage the state of Datanodes in SCM

2018-07-04 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532838#comment-16532838
 ] 

genericqa commented on HDDS-212:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
52s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
42s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 35m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 42m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  4m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m  0s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-ozone/integration-test {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
54s{color} | {color:red} hadoop-ozone/ozone-manager in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
16s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
30s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 35m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 35m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 35m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  4m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  1s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-ozone/integration-test {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
50s{color} | {color:red} hadoop-hdds/server-scm generated 3 new + 0 unchanged - 
0 fixed = 3 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  1m 21s{color} 
| {color:red} common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
44s{color} | {color:green} container-service in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
28s{color} | {color:green} client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
21s{color} | {color:green} server-scm in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
37s{color} | {color:green} ozone-manager in the patch passed. {color} |
| 

[jira] [Updated] (HDDS-224) Create metrics for Event Watcher

2018-07-04 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDDS-224:
--
Status: Patch Available  (was: Open)

> Create metrics for Event Watcher
> 
>
> Key: HDDS-224
> URL: https://issues.apache.org/jira/browse/HDDS-224
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-224.001.patch
>
>
> EventWatcher is a common way to track the state of ongoing commands. To 
> make it easier to track the current in-flight commands and the average 
> message processing times, the messages should be monitored by Hadoop metrics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-224) Create metrics for Event Watcher

2018-07-04 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDDS-224:
--
Attachment: HDDS-224.001.patch

> Create metrics for Event Watcher
> 
>
> Key: HDDS-224
> URL: https://issues.apache.org/jira/browse/HDDS-224
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-224.001.patch
>
>
> EventWatcher is a common way to track the state of ongoing commands. To 
> make it easier to track the current in-flight commands and the average 
> message processing times, the messages should be monitored by Hadoop metrics.






[jira] [Created] (HDDS-224) Create metrics for Event Watcher

2018-07-04 Thread Elek, Marton (JIRA)
Elek, Marton created HDDS-224:
-

 Summary: Create metrics for Event Watcher
 Key: HDDS-224
 URL: https://issues.apache.org/jira/browse/HDDS-224
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: SCM
Reporter: Elek, Marton
Assignee: Elek, Marton
 Fix For: 0.2.1


EventWatcher is a common way to track the state of ongoing commands. To 
make it easier to track the current in-flight commands and the average message 
processing times, the messages should be monitored by Hadoop metrics.
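
As a rough illustration of the two quantities mentioned above (the number of in-flight commands and the average processing time), here is a minimal sketch in plain Java. It is not the HDDS-224 patch and does not use the Hadoop metrics API; the class and method names (`WatcherMetricsSketch`, `onTracked`, `onCompleted`) are hypothetical and exist only for this example.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration only; not the HDDS-224 patch, not the Hadoop metrics API. */
final class WatcherMetricsSketch {
  private final AtomicLong inFlight = new AtomicLong();
  private final AtomicLong completed = new AtomicLong();
  private final AtomicLong totalMillis = new AtomicLong();

  /** Call when a command starts being watched. */
  void onTracked() {
    inFlight.incrementAndGet();
  }

  /** Call when a watched command completes; elapsedMillis is its measured duration. */
  void onCompleted(long elapsedMillis) {
    inFlight.decrementAndGet();
    completed.incrementAndGet();
    totalMillis.addAndGet(elapsedMillis);
  }

  /** Number of commands currently being watched. */
  long inFlightCount() {
    return inFlight.get();
  }

  /** Average duration of completed commands in milliseconds, or 0.0 if none completed. */
  double averageMillis() {
    long n = completed.get();
    return n == 0 ? 0.0 : (double) totalMillis.get() / n;
  }

  public static void main(String[] args) {
    WatcherMetricsSketch m = new WatcherMetricsSketch();
    m.onTracked();
    m.onTracked();
    m.onTracked();
    m.onCompleted(10);
    m.onCompleted(30);
    System.out.println("in flight: " + m.inFlightCount());   // prints "in flight: 1"
    System.out.println("average ms: " + m.averageMillis());  // prints "average ms: 20.0"
  }
}
```

A real implementation would presumably register these values with the Hadoop metrics system rather than keep them in a standalone class.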






[jira] [Created] (HDDS-223) Create acceptance test for using datanode plugin

2018-07-04 Thread Elek, Marton (JIRA)
Elek, Marton created HDDS-223:
-

 Summary: Create acceptance test for using datanode plugin
 Key: HDDS-223
 URL: https://issues.apache.org/jira/browse/HDDS-223
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Elek, Marton
 Fix For: 0.2.1


In the current docker-compose files (both in hadoop-dist and 
acceptance-test) we use simplified ozone clusters: there is no namenode and we 
use standalone hdds datanode processes.

To test ozone/hdds as a datanode plugin, we need to create separate acceptance 
tests which use hadoop:3.1 and hadoop:3.0 + the ozone hdds datanode plugin artifact.






[jira] [Created] (HDDS-222) Remove hdfs command line from ozone distribution.

2018-07-04 Thread Elek, Marton (JIRA)
Elek, Marton created HDDS-222:
-

 Summary: Remove hdfs command line from ozone distribution.
 Key: HDDS-222
 URL: https://issues.apache.org/jira/browse/HDDS-222
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Elek, Marton
 Fix For: 0.2.1


As the ozone release artifact doesn't contain stable namenode/datanode code, 
the hdfs command should be removed from the ozone artifact.

ozone-dist-layout-stitching could also be simplified to copy only the required 
jar files (we don't need to copy the namenode/datanode server-side jars, just 
the common artifacts).






[jira] [Created] (HDDS-221) Create acceptance test to test ./start-all.sh for ozone/hdds

2018-07-04 Thread Elek, Marton (JIRA)
Elek, Marton created HDDS-221:
-

 Summary: Create acceptance test to test ./start-all.sh for 
ozone/hdds
 Key: HDDS-221
 URL: https://issues.apache.org/jira/browse/HDDS-221
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Elek, Marton
 Fix For: 0.2.1


Usually we use the 'ozone' shell command to test our ozone/hdds cluster.

We need to create different acceptance test compose files to test the 
./start-all.sh and ./hadoop-daemon.sh functionality.






[jira] [Created] (HDDS-220) Create maven artifacts with the hdds/ozone client proto files

2018-07-04 Thread Elek, Marton (JIRA)
Elek, Marton created HDDS-220:
-

 Summary: Create maven artifacts with the hdds/ozone client proto 
files
 Key: HDDS-220
 URL: https://issues.apache.org/jira/browse/HDDS-220
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Elek, Marton
 Fix For: 0.2.1


It would be great to upload all the proto files required to connect to an 
ozone/hdds cluster to the maven repository as separate artifacts.






[jira] [Created] (HDDS-219) Generate version-info.properties for hadoop and ozone

2018-07-04 Thread Elek, Marton (JIRA)
Elek, Marton created HDDS-219:
-

 Summary: Generate version-info.properties for hadoop and ozone
 Key: HDDS-219
 URL: https://issues.apache.org/jira/browse/HDDS-219
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Elek, Marton
 Fix For: 0.2.1


org.apache.hadoop.util.VersionInfo provides an API to show the actual version 
information.

We need to generate hdds-version-info.properties and 
ozone-version-info.properties as part of the build process (most probably in the 
hdds/common and ozone/common projects) and print out the available versions 
when the 'ozone version' command is run.
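
To make the idea concrete, the following is a minimal sketch of reading a version out of a generated properties file with `java.util.Properties`. It is only an illustration: the property key `version`, the sample values, and the class name `VersionInfoSketch` are assumptions, and the text is parsed from in-memory strings instead of the real hdds-version-info.properties and ozone-version-info.properties files so that the example stays self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical illustration only; key names and values are assumptions. */
final class VersionInfoSketch {

  /** Parses properties text and returns the "version" entry, or "Unknown" if absent. */
  static String versionOf(String propertiesText) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(propertiesText));
    } catch (IOException e) {
      // Not expected for an in-memory reader; rethrown unchecked to keep callers simple.
      throw new UncheckedIOException(e);
    }
    return props.getProperty("version", "Unknown");
  }

  public static void main(String[] args) {
    // In a real build these texts would come from the generated properties files.
    String hddsText = "version=0.0.0-example\n";
    String ozoneText = "version=0.0.0-example\n";
    System.out.println("hdds: " + versionOf(hddsText));
    System.out.println("ozone: " + versionOf(ozoneText));
  }
}
```

Falling back to "Unknown" when the key is missing keeps a version command usable even if a properties file was not generated.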






[jira] [Created] (HDDS-218) add existing docker-compose files to the ozone release artifact

2018-07-04 Thread Elek, Marton (JIRA)
Elek, Marton created HDDS-218:
-

 Summary: add existing docker-compose files to the ozone release 
artifact
 Key: HDDS-218
 URL: https://issues.apache.org/jira/browse/HDDS-218
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Elek, Marton
 Fix For: 0.2.1


Currently we use docker-compose files to run an ozone pseudo cluster locally. 
After a full build, they can be found under hadoop-dist/target/compose.

As they are very useful, I propose to make them part of the ozone release to 
make it easier to try out ozone locally.

I propose to create a new folder (docker/) in the ozone.tar.gz which contains 
all the docker-compose subdirectories plus a basic README explaining how they 
can be used.

We should explain in the README that the docker-compose files are not for 
production, only for local experiments.






[jira] [Commented] (HDDS-214) HDDS/Ozone First Release

2018-07-04 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532764#comment-16532764
 ] 

Elek, Marton commented on HDDS-214:
---

I uploaded a PDF about the proposed features of the Ozone 0.2.1 release. It's 
just a proposal, so please comment on it.



> HDDS/Ozone First Release
> 
>
> Key: HDDS-214
> URL: https://issues.apache.org/jira/browse/HDDS-214
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Anu Engineer
>Assignee: Elek, Marton
>Priority: Major
> Attachments: Ozone 0.2.1 release plan.pdf
>
>
> This is an umbrella JIRA that collects all work items, design discussions, 
> etc. for Ozone's release. We will post a design document soon to open the 
> discussion and nail down the details of the release.
> cc: [~xyao] , [~elek], [~arpitagarwal] [~jnp] , [~msingh] [~nandakumar131], 
> [~bharatviswa]






[jira] [Updated] (HDDS-214) HDDS/Ozone First Release

2018-07-04 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDDS-214:
--
Attachment: Ozone 0.2.1 release plan.pdf

> HDDS/Ozone First Release
> 
>
> Key: HDDS-214
> URL: https://issues.apache.org/jira/browse/HDDS-214
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Anu Engineer
>Assignee: Elek, Marton
>Priority: Major
> Attachments: Ozone 0.2.1 release plan.pdf
>
>
> This is an umbrella JIRA that collects all work items, design discussions, 
> etc. for Ozone's release. We will post a design document soon to open the 
> discussion and nail down the details of the release.
> cc: [~xyao] , [~elek], [~arpitagarwal] [~jnp] , [~msingh] [~nandakumar131], 
> [~bharatviswa]






[jira] [Commented] (HDFS-12716) 'dfs.datanode.failed.volumes.tolerated' to support minimum number of volumes to be available

2018-07-04 Thread Ranith Sardar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532730#comment-16532730
 ] 

Ranith Sardar commented on HDFS-12716:
--

Hi [~brahmareddy] and [~linyiqun], I have changed the patch according to your 
review comments and uploaded it. Please check it.

>  'dfs.datanode.failed.volumes.tolerated' to support minimum number of volumes 
> to be available
> -
>
> Key: HDFS-12716
> URL: https://issues.apache.org/jira/browse/HDFS-12716
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: usharani
>Assignee: Ranith Sardar
>Priority: Major
> Attachments: HDFS-12716.002.patch, HDFS-12716.003.patch, 
> HDFS-12716.patch
>
>
>   Currently 'dfs.datanode.failed.volumes.tolerated' specifies the number of 
> failed volumes to tolerate. Changing this configuration requires a 
> restart of the datanode. Since datanode volumes can be changed dynamically, 
> keeping this configuration the same for all may not be a good idea.
> Support 'dfs.datanode.failed.volumes.tolerated' to accept a special 
> negative value 'x' to tolerate failures of up to "n-x".
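
One possible reading of the proposal above, sketched in Java: a non-negative setting keeps its current meaning (a fixed number of tolerated failures), while a negative setting -x is taken to mean "at least x volumes must stay available", so up to n - x failures are tolerated for n configured volumes. This interpretation is an assumption based on the issue title, not the behaviour of the attached patch, and the class and method names are hypothetical.

```java
/** Hypothetical illustration only; the negative-value semantics are an assumed reading. */
final class FailedVolumesSketch {

  /**
   * @param configured   value of dfs.datanode.failed.volumes.tolerated
   * @param totalVolumes n, the number of volumes currently configured
   * @return the number of volume failures tolerated under the assumed reading
   */
  static int toleratedFailures(int configured, int totalVolumes) {
    if (configured >= 0) {
      // Existing meaning: a fixed count of tolerated failures.
      return configured;
    }
    // Assumed meaning of -x: at least x volumes must remain available.
    int minAvailable = -configured;
    return Math.max(0, totalVolumes - minAvailable);
  }

  public static void main(String[] args) {
    System.out.println(toleratedFailures(1, 4));   // prints 1
    System.out.println(toleratedFailures(-1, 4));  // prints 3
    System.out.println(toleratedFailures(-2, 6));  // prints 4
  }
}
```

Under this reading the tolerated count follows n automatically when volumes are added or removed, which is the dynamic case the description is concerned with.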






[jira] [Updated] (HDFS-12716) 'dfs.datanode.failed.volumes.tolerated' to support minimum number of volumes to be available

2018-07-04 Thread Ranith Sardar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ranith Sardar updated HDFS-12716:
-
Attachment: HDFS-12716.003.patch

>  'dfs.datanode.failed.volumes.tolerated' to support minimum number of volumes 
> to be available
> -
>
> Key: HDFS-12716
> URL: https://issues.apache.org/jira/browse/HDFS-12716
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: usharani
>Assignee: Ranith Sardar
>Priority: Major
> Attachments: HDFS-12716.002.patch, HDFS-12716.003.patch, 
> HDFS-12716.patch
>
>
>   Currently 'dfs.datanode.failed.volumes.tolerated' only accepts the number 
> of failed volumes to tolerate. Changing this configuration requires a 
> restart of the datanode. Since datanode volumes can be changed dynamically, 
> keeping this configuration the same for all may not be a good idea.
> Support 'dfs.datanode.failed.volumes.tolerated' accepting a special 
> negative value 'x' to tolerate failures of up to "n-x" volumes.






[jira] [Updated] (HDFS-13719) Docs around dfs.image.transfer.timeout are misleading

2018-07-04 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-13719:

Status: Patch Available  (was: Open)

> Docs around dfs.image.transfer.timeout are misleading
> -
>
> Key: HDFS-13719
> URL: https://issues.apache.org/jira/browse/HDFS-13719
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.1.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
>  Labels: hdfs
> Attachments: HDFS-13719.001.patch
>
>
> The Jira https://issues.apache.org/jira/browse/HDFS-1490 added the parameter 
> dfs.image.transfer.timeout to HDFS. From the patch (and checking the current 
> code), we can see this parameter governs a socket timeout on a 
> java.net.HttpURLConnection object:
> {code:java}
> +if (timeout <= 0) {
> +  // Set the ping interval as timeout
> +  Configuration conf = new HdfsConfiguration();
> +  timeout = conf.getInt(DFSConfigKeys.DFS_IMAGE_TRANSFER_TIMEOUT_KEY,
> +  DFSConfigKeys.DFS_IMAGE_TRANSFER_TIMEOUT_DEFAULT);
> +}
> +
> +if (timeout > 0) {
> +  connection.setConnectTimeout(timeout);
> +  connection.setReadTimeout(timeout);
> +}
> +
> {code}
> In the above 'connection' is a java.net.HttpURLConnection.
> There is a general belief in the community that dfs.image.transfer.timeout 
> is the time within which the entire image must transfer; however, that does 
> not appear to be the case. The timeout is actually the maximum time the client 
> will block on the socket before giving up if it cannot get data to read. I 
> guess the idea here is to protect the client from hanging forever if the 
> server hangs.
> The docs in hdfs-site.xml are partly what causes this confusion, as they are 
> very misleading:
> {code:xml}
> <property>
>   <name>dfs.image.transfer.timeout</name>
>   <value>60000</value>
>   <description>
> Socket timeout for image transfer in milliseconds. This timeout and the related
> dfs.image.transfer.bandwidthPerSec parameter should be configured such
> that normal image transfer can complete successfully.
> This timeout prevents client hangs when the sender fails during
> image transfer. This is socket timeout during image transfer.
>   </description>
> </property>
> {code}
> The start and end of the statement are accurate, but the part "This timeout 
> and the related dfs.image.transfer.bandwidthPerSec parameter should be 
> configured such that normal image transfer can complete successfully." is 
> misleading. There is almost never a reason to change the above in conjunction 
> with the bandwidth setting.






[jira] [Updated] (HDFS-13719) Docs around dfs.image.transfer.timeout are misleading

2018-07-04 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-13719:

Attachment: HDFS-13719.001.patch

> Docs around dfs.image.transfer.timeout are misleading
> -
>
> Key: HDFS-13719
> URL: https://issues.apache.org/jira/browse/HDFS-13719
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.1.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
>  Labels: hdfs
> Attachments: HDFS-13719.001.patch
>
>
> The Jira https://issues.apache.org/jira/browse/HDFS-1490 added the parameter 
> dfs.image.transfer.timeout to HDFS. From the patch (and checking the current 
> code), we can see this parameter governs a socket timeout on a 
> java.net.HttpURLConnection object:
> {code:java}
> +if (timeout <= 0) {
> +  // Set the ping interval as timeout
> +  Configuration conf = new HdfsConfiguration();
> +  timeout = conf.getInt(DFSConfigKeys.DFS_IMAGE_TRANSFER_TIMEOUT_KEY,
> +  DFSConfigKeys.DFS_IMAGE_TRANSFER_TIMEOUT_DEFAULT);
> +}
> +
> +if (timeout > 0) {
> +  connection.setConnectTimeout(timeout);
> +  connection.setReadTimeout(timeout);
> +}
> +
> {code}
> In the above 'connection' is a java.net.HttpURLConnection.
> There is a general belief in the community that dfs.image.transfer.timeout 
> is the time within which the entire image must transfer; however, that does 
> not appear to be the case. The timeout is actually the maximum time the client 
> will block on the socket before giving up if it cannot get data to read. I 
> guess the idea here is to protect the client from hanging forever if the 
> server hangs.
> The docs in hdfs-site.xml are partly what causes this confusion, as they are 
> very misleading:
> {code:xml}
> <property>
>   <name>dfs.image.transfer.timeout</name>
>   <value>60000</value>
>   <description>
> Socket timeout for image transfer in milliseconds. This timeout and the related
> dfs.image.transfer.bandwidthPerSec parameter should be configured such
> that normal image transfer can complete successfully.
> This timeout prevents client hangs when the sender fails during
> image transfer. This is socket timeout during image transfer.
>   </description>
> </property>
> {code}
> The start and end of the statement are accurate, but the part "This timeout 
> and the related dfs.image.transfer.bandwidthPerSec parameter should be 
> configured such that normal image transfer can complete successfully." is 
> misleading. There is almost never a reason to change the above in conjunction 
> with the bandwidth setting.






[jira] [Created] (HDFS-13719) Docs around dfs.image.transfer.timeout are misleading

2018-07-04 Thread Kitti Nanasi (JIRA)
Kitti Nanasi created HDFS-13719:
---

 Summary: Docs around dfs.image.transfer.timeout are misleading
 Key: HDFS-13719
 URL: https://issues.apache.org/jira/browse/HDFS-13719
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.1.0
Reporter: Kitti Nanasi
Assignee: Kitti Nanasi


The Jira https://issues.apache.org/jira/browse/HDFS-1490 added the parameter 
dfs.image.transfer.timeout to HDFS. From the patch (and checking the current 
code), we can see this parameter governs a socket timeout on a 
java.net.HttpURLConnection object:
{code:java}
+if (timeout <= 0) {
+  // Set the ping interval as timeout
+  Configuration conf = new HdfsConfiguration();
+  timeout = conf.getInt(DFSConfigKeys.DFS_IMAGE_TRANSFER_TIMEOUT_KEY,
+  DFSConfigKeys.DFS_IMAGE_TRANSFER_TIMEOUT_DEFAULT);
+}
+
+if (timeout > 0) {
+  connection.setConnectTimeout(timeout);
+  connection.setReadTimeout(timeout);
+}
+
{code}
In the above 'connection' is a java.net.HttpURLConnection.

There is a general belief in the community that dfs.image.transfer.timeout 
is the time within which the entire image must transfer; however, that does not 
appear to be the case. The timeout is actually the maximum time the client will 
block on the socket before giving up if it cannot get data to read. I guess the 
idea here is to protect the client from hanging forever if the server hangs.

The docs in hdfs-site.xml are partly what causes this confusion, as they are 
very misleading:
{code:xml}
<property>
  <name>dfs.image.transfer.timeout</name>
  <value>60000</value>
  <description>
Socket timeout for image transfer in milliseconds. This timeout and the related
dfs.image.transfer.bandwidthPerSec parameter should be configured such
that normal image transfer can complete successfully.
This timeout prevents client hangs when the sender fails during
image transfer. This is socket timeout during image transfer.
  </description>
</property>
{code}
The start and end of the statement are accurate, but the part "This timeout and 
the related dfs.image.transfer.bandwidthPerSec parameter should be configured 
such that normal image transfer can complete successfully." is misleading. 
There is almost never a reason to change the above in conjunction with the 
bandwidth setting.
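
To make the distinction concrete, here is a minimal, self-contained sketch (not taken from the HDFS code). It uses a plain java.net.Socket with setSoTimeout as a stand-in for the read timeout that HttpURLConnection.setReadTimeout configures; the class name, method name, and numbers are purely illustrative. The sender trickles out one byte per gap, so the whole transfer takes longer than the read timeout, yet no timeout fires because each individual read returns in time.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class ReadTimeoutDemo {

    /** Returns {bytesRead, elapsedMillis} for a slow but steady transfer. */
    public static long[] transfer(int chunks, long gapMillis, int readTimeoutMillis)
            throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread sender = new Thread(() -> {
                try (Socket s = server.accept(); OutputStream out = s.getOutputStream()) {
                    for (int i = 0; i < chunks; i++) {
                        Thread.sleep(gapMillis); // pause between bytes
                        out.write(i);
                        out.flush();
                    }
                } catch (IOException | InterruptedException e) {
                    // The sender simply stops; the reader then sees EOF or an error.
                }
            });
            sender.setDaemon(true);
            sender.start();

            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                // Limits how long a single read() may block, not the whole transfer.
                client.setSoTimeout(readTimeoutMillis);
                InputStream in = client.getInputStream();
                long start = System.nanoTime();
                long bytes = 0;
                while (in.read() != -1) {
                    bytes++;
                }
                long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
                return new long[] {bytes, elapsedMillis};
            }
        }
    }

    public static void main(String[] args) throws Exception {
        long[] result = transfer(6, 100, 400);
        System.out.println("read " + result[0] + " bytes in " + result[1] + " ms");
    }
}
```

If the gap between bytes were longer than the timeout instead, read() would throw java.net.SocketTimeoutException, which is the "sender hangs" case the setting is meant to guard against.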






[jira] [Commented] (HDFS-13310) [PROVIDED Phase 2] The DatanodeProtocol should be have DNA_BACKUP to backup blocks

2018-07-04 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532656#comment-16532656
 ] 

Ewan Higgs commented on HDFS-13310:
---

{quote}Can we add javadoc to all the new messages introduced in 
DatanodeProtocol.proto, and all newly added classes (SyncTask).
 Any particular reason for static imports in PBHelper.java? If not, I would 
prefer not declaring these as static imports.
{quote}
004

 - Add Javadoc

 - Remove static imports

 

[~virajith] shall I remove PUT_FILE in this ticket or in a followup?

> [PROVIDED Phase 2] The DatanodeProtocol should be have DNA_BACKUP to backup 
> blocks
> --
>
> Key: HDFS-13310
> URL: https://issues.apache.org/jira/browse/HDFS-13310
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13310-HDFS-12090.001.patch, 
> HDFS-13310-HDFS-12090.002.patch, HDFS-13310-HDFS-12090.003.patch, 
> HDFS-13310-HDFS-12090.004.patch
>
>
> As part of HDFS-12090, Datanodes should be able to receive DatanodeCommands 
> in the heartbeat response that instruct them to back up a block.
> This should take the form of two sub commands: PUT_FILE (when the file is <=1 
> block in size) and MULTIPART_PUT_PART when part of a Multipart Upload (see 
> HDFS-13186).
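
As a rough illustration of the split described above, the sketch below picks a sub command from the file length and block size. The enum and method are hypothetical and not taken from the attached patches; it also assumes that any file larger than one block is uploaded as a Multipart Upload.

```java
public class BackupCommandSketch {

    /** Hypothetical mirror of the two sub commands named in the description. */
    public enum SubCommand { PUT_FILE, MULTIPART_PUT_PART }

    /**
     * Chooses the sub command for a file. Assumption: files that fit in a
     * single block go out as one PUT_FILE, everything else as multipart parts.
     */
    public static SubCommand choose(long fileLength, long blockSize) {
        if (blockSize <= 0 || fileLength < 0) {
            throw new IllegalArgumentException("Invalid file length or block size");
        }
        return fileLength <= blockSize
            ? SubCommand.PUT_FILE
            : SubCommand.MULTIPART_PUT_PART;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024;
        System.out.println(choose(blockSize, blockSize));
        System.out.println(choose(blockSize + 1, blockSize));
    }
}
```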






[jira] [Updated] (HDFS-13310) [PROVIDED Phase 2] The DatanodeProtocol should be have DNA_BACKUP to backup blocks

2018-07-04 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13310:
--
Status: Open  (was: Patch Available)

> [PROVIDED Phase 2] The DatanodeProtocol should be have DNA_BACKUP to backup 
> blocks
> --
>
> Key: HDFS-13310
> URL: https://issues.apache.org/jira/browse/HDFS-13310
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13310-HDFS-12090.001.patch, 
> HDFS-13310-HDFS-12090.002.patch, HDFS-13310-HDFS-12090.003.patch, 
> HDFS-13310-HDFS-12090.004.patch
>
>
> As part of HDFS-12090, Datanodes should be able to receive DatanodeCommands 
> in the heartbeat response that instruct them to back up a block.
> This should take the form of two sub commands: PUT_FILE (when the file is <=1 
> block in size) and MULTIPART_PUT_PART when part of a Multipart Upload (see 
> HDFS-13186).






[jira] [Updated] (HDFS-13310) [PROVIDED Phase 2] The DatanodeProtocol should be have DNA_BACKUP to backup blocks

2018-07-04 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13310:
--
Attachment: HDFS-13310-HDFS-12090.004.patch

> [PROVIDED Phase 2] The DatanodeProtocol should be have DNA_BACKUP to backup 
> blocks
> --
>
> Key: HDFS-13310
> URL: https://issues.apache.org/jira/browse/HDFS-13310
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13310-HDFS-12090.001.patch, 
> HDFS-13310-HDFS-12090.002.patch, HDFS-13310-HDFS-12090.003.patch, 
> HDFS-13310-HDFS-12090.004.patch
>
>
> As part of HDFS-12090, Datanodes should be able to receive DatanodeCommands 
> in the heartbeat response that instruct them to back up a block.
> This should take the form of two sub commands: PUT_FILE (when the file is <=1 
> block in size) and MULTIPART_PUT_PART when part of a Multipart Upload (see 
> HDFS-13186).






[jira] [Updated] (HDFS-13310) [PROVIDED Phase 2] The DatanodeProtocol should be have DNA_BACKUP to backup blocks

2018-07-04 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13310:
--
Status: Patch Available  (was: Open)

> [PROVIDED Phase 2] The DatanodeProtocol should be have DNA_BACKUP to backup 
> blocks
> --
>
> Key: HDFS-13310
> URL: https://issues.apache.org/jira/browse/HDFS-13310
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13310-HDFS-12090.001.patch, 
> HDFS-13310-HDFS-12090.002.patch, HDFS-13310-HDFS-12090.003.patch, 
> HDFS-13310-HDFS-12090.004.patch
>
>
> As part of HDFS-12090, Datanodes should be able to receive DatanodeCommands 
> in the heartbeat response that instruct them to back up a block.
> This should take the form of two sub commands: PUT_FILE (when the file is <=1 
> block in size) and MULTIPART_PUT_PART when part of a Multipart Upload (see 
> HDFS-13186).






[jira] [Updated] (HDDS-212) Introduce NodeStateManager to manage the state of Datanodes in SCM

2018-07-04 Thread Nanda kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar updated HDDS-212:
-
Attachment: HDDS-212.001.patch

> Introduce NodeStateManager to manage the state of Datanodes in SCM
> --
>
> Key: HDDS-212
> URL: https://issues.apache.org/jira/browse/HDDS-212
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Affects Versions: 0.2.1
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-212.000.patch, HDDS-212.001.patch
>
>
> Introducing {{NodeStateManager}} will make the lifecycle management of 
> datanodes in SCM easy. NodeStateManager will be responsible for marking the 
> datanodes as stale or dead when heartbeat is not received and it will 
> maintain the current state of all the datanodes in the cluster. 
> NodeStateManager should be the only place where we maintain node state 
> information; everyone else should use NodeStateManager to learn about it.
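
A minimal sketch of the idea follows, assuming a simple time-since-last-heartbeat rule. The class, its methods, and the thresholds are hypothetical and are not taken from the attached patches; the caller passes in the current time so the logic stays easy to test.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class NodeStateSketch {

    public enum NodeState { HEALTHY, STALE, DEAD }

    private final long staleAfterMillis;
    private final long deadAfterMillis;
    // Single source of truth: last heartbeat time per datanode id.
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();

    public NodeStateSketch(long staleAfterMillis, long deadAfterMillis) {
        if (staleAfterMillis <= 0 || deadAfterMillis <= staleAfterMillis) {
            throw new IllegalArgumentException("Expected 0 < stale < dead");
        }
        this.staleAfterMillis = staleAfterMillis;
        this.deadAfterMillis = deadAfterMillis;
    }

    /** Records a heartbeat received from the given datanode at time nowMillis. */
    public void heartbeat(String datanodeId, long nowMillis) {
        lastHeartbeat.put(datanodeId, nowMillis);
    }

    /** Derives the state from how long the datanode has been silent. */
    public NodeState getState(String datanodeId, long nowMillis) {
        Long last = lastHeartbeat.get(datanodeId);
        if (last == null) {
            throw new IllegalArgumentException("Unknown datanode: " + datanodeId);
        }
        long silence = nowMillis - last;
        if (silence >= deadAfterMillis) {
            return NodeState.DEAD;
        }
        if (silence >= staleAfterMillis) {
            return NodeState.STALE;
        }
        return NodeState.HEALTHY;
    }
}
```

Other components would then ask this one object for a node's state instead of keeping their own copies.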






[jira] [Comment Edited] (HDDS-62) Cleanup error messages

2018-07-04 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-62?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532583#comment-16532583
 ] 

Elek, Marton edited comment on HDDS-62 at 7/4/18 10:12 AM:
---

If there is no HEALTHY datanode, the REST client doesn't work: 

{code}
docker-compose exec ksm ozone oz -listVolume http://ksm/
{code}

{code}
2018-07-04 08:28:21 WARN  NativeCodeLoader:60 - Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
2018-07-04 08:28:22 ERROR OzoneClientFactory:295 - Couldn't create protocol 
class org.apache.hadoop.ozone.client.rest.RestClient exception: 
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:292)
at 
org.apache.hadoop.ozone.client.OzoneClientFactory.getRestClient(OzoneClientFactory.java:248)
at 
org.apache.hadoop.ozone.client.OzoneClientFactory.getRestClient(OzoneClientFactory.java:232)
at 
org.apache.hadoop.ozone.client.OzoneClientFactory.getRestClient(OzoneClientFactory.java:206)
at 
org.apache.hadoop.ozone.client.OzoneClientFactory.getRestClient(OzoneClientFactory.java:188)
at 
org.apache.hadoop.ozone.web.ozShell.Handler.verifyURI(Handler.java:85)
at 
org.apache.hadoop.ozone.web.ozShell.volume.ListVolumeHandler.execute(ListVolumeHandler.java:80)
at org.apache.hadoop.ozone.web.ozShell.Shell.dispatch(Shell.java:395)
at org.apache.hadoop.ozone.web.ozShell.Shell.run(Shell.java:135)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.ozone.web.ozShell.Shell.main(Shell.java:114)
Caused by: java.lang.IllegalArgumentException: bound must be positive
at java.util.Random.nextInt(Random.java:388)
at 
org.apache.hadoop.ozone.client.rest.DefaultRestServerSelector.getRestServer(DefaultRestServerSelector.java:34)
at 
org.apache.hadoop.ozone.client.rest.RestClient.getOzoneRestServerAddress(RestClient.java:199)
at 
org.apache.hadoop.ozone.client.rest.RestClient.<init>(RestClient.java:160)
... 16 more
Command Failed : Couldn't create protocol class 
org.apache.hadoop.ozone.client.rest.RestClient
{code}

The error message should be cleaner, IMHO.

Do we need a separate jira for this?


was (Author: elek):
If there is no HEALTHY datanode, the rest client doesn't work: 

{code}
docker-compose exec ksm ozone oz -listVolume http://ksm/
{code}

{code}
2018-07-04 08:28:21 WARN  NativeCodeLoader:60 - Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
2018-07-04 08:28:22 ERROR OzoneClientFactory:295 - Couldn't create protocol 
class org.apache.hadoop.ozone.client.rest.RestClient exception: 
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:292)
at 
org.apache.hadoop.ozone.client.OzoneClientFactory.getRestClient(OzoneClientFactory.java:248)
at 
org.apache.hadoop.ozone.client.OzoneClientFactory.getRestClient(OzoneClientFactory.java:232)
at 
org.apache.hadoop.ozone.client.OzoneClientFactory.getRestClient(OzoneClientFactory.java:206)
at 
org.apache.hadoop.ozone.client.OzoneClientFactory.getRestClient(OzoneClientFactory.java:188)
at 
org.apache.hadoop.ozone.web.ozShell.Handler.verifyURI(Handler.java:85)
at 
org.apache.hadoop.ozone.web.ozShell.volume.ListVolumeHandler.execute(ListVolumeHandler.java:80)
at org.apache.hadoop.ozone.web.ozShell.Shell.dispatch(Shell.java:395)
at org.apache.hadoop.ozone.web.ozShell.Shell.run(Shell.java:135)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.ozone.web.ozShell.Shell.main(Shell.java:114)
Caused by: java.lang.IllegalArgumentException: bound must be positive
at java.util.Random.nextInt(Random.java:388)
at 

[jira] [Commented] (HDDS-62) Cleanup error messages

2018-07-04 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-62?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532583#comment-16532583
 ] 

Elek, Marton commented on HDDS-62:
--

If there is no HEALTHY datanode, the REST client doesn't work: 

{code}
docker-compose exec ksm ozone oz -listVolume http://ksm/
{code}

{code}
2018-07-04 08:28:21 WARN  NativeCodeLoader:60 - Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
2018-07-04 08:28:22 ERROR OzoneClientFactory:295 - Couldn't create protocol 
class org.apache.hadoop.ozone.client.rest.RestClient exception: 
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:292)
at 
org.apache.hadoop.ozone.client.OzoneClientFactory.getRestClient(OzoneClientFactory.java:248)
at 
org.apache.hadoop.ozone.client.OzoneClientFactory.getRestClient(OzoneClientFactory.java:232)
at 
org.apache.hadoop.ozone.client.OzoneClientFactory.getRestClient(OzoneClientFactory.java:206)
at 
org.apache.hadoop.ozone.client.OzoneClientFactory.getRestClient(OzoneClientFactory.java:188)
at 
org.apache.hadoop.ozone.web.ozShell.Handler.verifyURI(Handler.java:85)
at 
org.apache.hadoop.ozone.web.ozShell.volume.ListVolumeHandler.execute(ListVolumeHandler.java:80)
at org.apache.hadoop.ozone.web.ozShell.Shell.dispatch(Shell.java:395)
at org.apache.hadoop.ozone.web.ozShell.Shell.run(Shell.java:135)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.ozone.web.ozShell.Shell.main(Shell.java:114)
Caused by: java.lang.IllegalArgumentException: bound must be positive
at java.util.Random.nextInt(Random.java:388)
at 
org.apache.hadoop.ozone.client.rest.DefaultRestServerSelector.getRestServer(DefaultRestServerSelector.java:34)
at 
org.apache.hadoop.ozone.client.rest.RestClient.getOzoneRestServerAddress(RestClient.java:199)
at 
org.apache.hadoop.ozone.client.rest.RestClient.<init>(RestClient.java:160)
... 16 more
Command Failed : Couldn't create protocol class 
org.apache.hadoop.ozone.client.rest.RestClient
{code}

Do we need a separate jira for this?
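
For context, the "bound must be positive" message in the trace comes from java.util.Random.nextInt being called with a bound of 0, which is consistent with selecting from an empty server list. A minimal sketch of a guard that would give a clearer message is below; the class is hypothetical, it is not the actual DefaultRestServerSelector code, and the message wording is only a suggestion.

```java
import java.util.List;
import java.util.Random;

public class RestServerSelectorSketch {

    private static final Random RANDOM = new Random();

    /** Picks a random server, failing with a readable message when none exist. */
    public static String select(List<String> healthyDatanodes) {
        if (healthyDatanodes == null || healthyDatanodes.isEmpty()) {
            // Without this check, RANDOM.nextInt(0) would throw
            // IllegalArgumentException("bound must be positive").
            throw new IllegalStateException(
                "No healthy datanode is available to serve REST requests; "
                + "check that at least one datanode has registered.");
        }
        return healthyDatanodes.get(RANDOM.nextInt(healthyDatanodes.size()));
    }
}
```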

>  Cleanup error messages
> ---
>
> Key: HDDS-62
> URL: https://issues.apache.org/jira/browse/HDDS-62
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Affects Versions: 0.2.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
>Priority: Major
>  Labels: OzonePostMerge
>
> Many error messages thrown from ozone are written for developers by 
> developers. We need to review all publicly visible error messages to make 
> sure each one is correct, includes enough context (stack traces do not count), 
> and makes sense to the reader.






[jira] [Commented] (HDFS-13583) RBF: Router admin clrQuota is not synchronized with nameservice

2018-07-04 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532577#comment-16532577
 ] 

Yiqun Lin commented on HDFS-13583:
--

Sorry for the delayed response, [~dibyendu_hadoop]. I agree with your 
comment. Some comments from me:

*RouterQuotaUsage*
{code}
-if (getQuota() == HdfsConstants.QUOTA_DONT_SET) {
+if (getQuota() == HdfsConstants.QUOTA_DONT_SET ||
+getQuota() == HdfsConstants.QUOTA_RESET) {
   nsQuota = "-";
   nsCount = "-";
 }
{code}
Actually {{HdfsConstants.QUOTA_DONT_SET}} won't be set into the mount table 
after this patch. We can just use {{HdfsConstants.QUOTA_RESET}} to replace 
{{HdfsConstants.QUOTA_DONT_SET}}.
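
A standalone sketch of that suggestion, for illustration only: the class, the method, and the output format are hypothetical, and the local constant is assumed to mirror the value of {{HdfsConstants.QUOTA_RESET}}.

```java
public class QuotaDisplaySketch {

    // Assumption: mirrors HdfsConstants.QUOTA_RESET.
    static final long QUOTA_RESET = -1L;

    /** Renders quota and usage, showing "-" for both when the quota is cleared. */
    public static String format(long quota, long usage) {
        String nsQuota = String.valueOf(quota);
        String nsCount = String.valueOf(usage);
        if (quota == QUOTA_RESET) {
            // Only the reset value needs special handling here.
            nsQuota = "-";
            nsCount = "-";
        }
        return "[NsQuota: " + nsQuota + "/" + nsCount + "]";
    }
}
```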

*RouterAdmin*
{code}
-if (nsQuota <= 0 || ssQuota <= 0) {
+if ((nsQuota <= 0 && nsQuota != HdfsConstants.QUOTA_DONT_SET
+&& nsQuota != HdfsConstants.QUOTA_RESET)
+|| ssQuota <= 0 && ssQuota != HdfsConstants.QUOTA_DONT_SET
+&& ssQuota != HdfsConstants.QUOTA_RESET) {
   throw new IllegalArgumentException(
   "Input quota value should be a positive number.");
 }
{code}
The current quota check logic should be okay and does not need to change. We 
assume users must set a positive number; setting -1 is also not allowed.
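
In other words, the unchanged check behaves as in the standalone sketch below. The class name is hypothetical; the condition and message are copied from the diff context above.

```java
public class QuotaInputCheckSketch {

    /** Rejects any non-positive quota value, including -1. */
    public static void validate(long nsQuota, long ssQuota) {
        if (nsQuota <= 0 || ssQuota <= 0) {
            throw new IllegalArgumentException(
                "Input quota value should be a positive number.");
        }
    }
}
```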

*TestRouterAdmin*
The change in this class is not related to the current issue. Can we remove 
this change?

*TestRouterAdminCLI*
{code}
+cluster.startCluster();
 // Start routers
 cluster.startRouters();
+cluster.waitClusterUp();
+
+cluster.registerNamenodes();
+cluster.waitNamenodeRegistration();
 
 routerContext = cluster.getRandomRouter();
 Router router = routerContext.getRouter();
@@ -100,12 +102,6 @@ public static void globalSetUp() throws Exception {
 admin = new RouterAdmin(routerConf);
 client = routerContext.getAdminClient();
 
-// Add two fake name services to testing disabling them
-ActiveNamenodeResolver membership = router.getNamenodeResolver();
-membership.registerNamenode(
-createNamenodeReport("ns0", "nn1", HAServiceState.ACTIVE));
-membership.registerNamenode(
-createNamenodeReport("ns1", "nn1", HAServiceState.ACTIVE));
 stateStore.refreshCaches(true);
{code}
This change can also be removed. Using fake name services should be okay. I mean 
we don't need to start a full cluster, since that makes the unit test take 
longer to run.

> RBF: Router admin clrQuota is not synchronized with nameservice
> ---
>
> Key: HDFS-13583
> URL: https://issues.apache.org/jira/browse/HDFS-13583
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Dibyendu Karmakar
>Priority: Major
> Attachments: HDFS-13583-000.patch, HDFS-13583-001.patch, 
> HDFS-13583-branch-2-001.patch
>
>
> The Router admin -clrQuota command removes the quota from the mount table 
> only; it is not synchronized with the nameservice.
>  
>  






[jira] [Commented] (HDFS-13710) RBF: setQuota and getQuotaUsage should check the dfs.federation.router.quota.enable

2018-07-04 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532422#comment-16532422
 ] 

genericqa commented on HDFS-13710:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 22s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m 
33s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 75m 38s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HDFS-13710 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12930246/HDFS-13710.006.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 58fd90bfb26a 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3b63715 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24552/testReport/ |
| Max. process+thread count | 950 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24552/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> RBF:  setQuota and getQuotaUsage should check the 
> dfs.federation.router.quota.enable
> 

[jira] [Updated] (HDFS-13528) RBF: If a directory exceeds quota limit then quota usage is not refreshed for other mount entries

2018-07-04 Thread Yiqun Lin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-13528:
-
Fix Version/s: 3.2.0

> RBF: If a directory exceeds quota limit then quota usage is not refreshed for 
> other mount entries 
> --
>
> Key: HDFS-13528
> URL: https://issues.apache.org/jira/browse/HDFS-13528
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.1.0
>Reporter: Dibyendu Karmakar
>Assignee: Dibyendu Karmakar
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.1.1
>
> Attachments: HDFS-13528-000.patch, HDFS-13528-001.patch, 
> HDFS-13528.002.patch
>
>
> If the quota limit is exceeded, RouterQuotaUpdateService#periodicInvoke gets a 
> QuotaExceededException and does not update the quota usage for the rest of 
> the mount table entries.
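
A minimal sketch of the behaviour this issue asks for, not the committed patch: the refresh loop handles a quota failure per mount entry, so the remaining entries are still updated. The names `UsageSource`, `SketchQuotaExceededException` and `refreshAll` are hypothetical stand-ins for illustration and are not Hadoop classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Hadoop's QuotaExceededException.
class SketchQuotaExceededException extends Exception {
  SketchQuotaExceededException(String msg) {
    super(msg);
  }
}

// Hypothetical lookup of the current usage for one mount point.
interface UsageSource {
  long getUsage(String mountPoint) throws SketchQuotaExceededException;
}

class QuotaRefreshSketch {
  /**
   * Refreshes the usage of every mount point. A failure on one entry is
   * recorded in {@code failed} and the loop moves on to the next entry.
   */
  static Map<String, Long> refreshAll(List<String> mountPoints,
      UsageSource source, List<String> failed) {
    Map<String, Long> usage = new LinkedHashMap<>();
    for (String mount : mountPoints) {
      try {
        usage.put(mount, source.getUsage(mount));
      } catch (SketchQuotaExceededException e) {
        // Assumption: remember the failing entry instead of aborting the pass.
        failed.add(mount);
      }
    }
    return usage;
  }

  public static void main(String[] args) {
    UsageSource source = mount -> {
      if (mount.equals("/over")) {
        throw new SketchQuotaExceededException("quota exceeded for " + mount);
      }
      return 42L;
    };
    List<String> failed = new ArrayList<>();
    Map<String, Long> usage =
        refreshAll(Arrays.asList("/a", "/over", "/b"), source, failed);
    System.out.println("refreshed=" + usage + " failed=" + failed);
  }
}
```

The point of the sketch is only the placement of the try/catch inside the loop; how the real service reports or retries the failing entry is not shown here.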



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-13528) RBF: If a directory exceeds quota limit then quota usage is not refreshed for other mount entries

2018-07-04 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532363#comment-16532363
 ] 

Yiqun Lin edited comment on HDFS-13528 at 7/4/18 7:42 AM:
--

The result looks good. 
I have committed this to trunk, branch-3.1 and branch-2. Thanks 
[~dibyendu_hadoop] for the contribution. It's a nice fix, :).


was (Author: linyiqun):
The result looks good. 
I have committed this to trunk and branch-2. Thanks [~dibyendu_hadoop] for the 
contribution. It's a nice fix, :).

> RBF: If a directory exceeds quota limit then quota usage is not refreshed for 
> other mount entries 
> --
>
> Key: HDFS-13528
> URL: https://issues.apache.org/jira/browse/HDFS-13528
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.1.0
>Reporter: Dibyendu Karmakar
>Assignee: Dibyendu Karmakar
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.1.1
>
> Attachments: HDFS-13528-000.patch, HDFS-13528-001.patch, 
> HDFS-13528.002.patch
>
>
> If the quota limit is exceeded, RouterQuotaUpdateService#periodicInvoke gets a 
> QuotaExceededException and does not update the quota usage for the rest of 
> the mount table entries.






[jira] [Comment Edited] (HDFS-13710) RBF: setQuota and getQuotaUsage should check the dfs.federation.router.quota.enable

2018-07-04 Thread yanghuafeng (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532365#comment-16532365
 ] 

yanghuafeng edited comment on HDFS-13710 at 7/4/18 7:17 AM:


Thanks for your nits.
{quote}I'm not sure there is a point starting a full cluster just for this.

I think we should just start a Router and mock whatever is needed.
{quote}
You are right. I have replaced the cluster with the router to test the case, 
and the test now takes much less time than before.
{quote}I'm not sure there's a point having this comment
{quote}
The comment indeed does not matter. To keep the code clean and simple, I 
removed it.
{quote}We should do the checkOperation as the first operation even if the quota 
is not enabled.
{quote}
I have adjusted the order of the operations.

 

Please review the code again, [~linyiqun] [~elgoiri]

 


was (Author: hfyang20071):
Thanks for your nits.

{quote}
I'm not sure there is a point starting a full cluster just for this.
I think we should just start a Router and mock whatever is needed.
{quote}

You are right. I have replaced the cluster with the router to test the case, 
and the test now takes much less time than before.

{quote}
I'm not sure there's a point having this comment
{quote}

The comment indeed does not matter. To keep the code clean and simple, I 
removed it.

{quote}
We should do the checkOperation as the first operation even if the quota is not 
enabled.
{quote}

I have adjusted the order of the operations.

 

Please review the code again, [~linyiqun] [~elgoiri]

 

> RBF:  setQuota and getQuotaUsage should check the 
> dfs.federation.router.quota.enable
> 
>
> Key: HDFS-13710
> URL: https://issues.apache.org/jira/browse/HDFS-13710
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation, hdfs
>Affects Versions: 2.9.1, 3.0.3
>Reporter: yanghuafeng
>Priority: Major
> Attachments: HDFS-13710.002.patch, HDFS-13710.003.patch, 
> HDFS-13710.004.patch, HDFS-13710.005.patch, HDFS-13710.006.patch, 
> HDFS-13710.patch
>
>
> When I use the command below, some exceptions happen.
>  
> {code:java}
> hdfs dfsrouteradmin -setQuota /tmp -ssQuota 1G 
> {code}
> The command output follows.
> {code:java}
> Successfully set quota for mount point /tmp
> {code}
> It looks like the quota is set successfully, but some exceptions happen in 
> the rbf server log.
> {code:java}
> java.io.IOException: No remote locations available
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeConcurrent(RouterRpcClient.java:1002)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeConcurrent(RouterRpcClient.java:967)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeConcurrent(RouterRpcClient.java:940)
> at 
> org.apache.hadoop.hdfs.server.federation.router.Quota.setQuota(Quota.java:84)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterAdminServer.synchronizeQuota(RouterAdminServer.java:255)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterAdminServer.updateMountTableEntry(RouterAdminServer.java:238)
> at 
> org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolServerSideTranslatorPB.updateMountTableEntry(RouterAdminProtocolServerSideTranslatorPB.java:179)
> at 
> org.apache.hadoop.hdfs.protocol.proto.RouterProtocolProtos$RouterAdminProtocolService$2.callBlockingMethod(RouterProtocolProtos.java:259)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1867)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2111)
> {code}
> I found that dfs.federation.router.quota.enable is false by default, and that 
> causes the problem. I think we should check the parameter when we call 
> setQuota and getQuotaUsage.
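
The check proposed above, combined with the review remark that checkOperation should run first even when quota is disabled, could look roughly like the following. This is a hedged sketch under stated assumptions, not the attached patch: `RouterQuotaGuardSketch`, its two boolean fields and both error messages are hypothetical, and the real Router classes are not used.

```java
import java.io.IOException;

class RouterQuotaGuardSketch {
  // Assumption: mirrors the value of dfs.federation.router.quota.enable.
  private final boolean quotaEnabled;
  // Assumption: stands in for whatever state checkOperation verifies.
  private final boolean operationAllowed;

  RouterQuotaGuardSketch(boolean quotaEnabled, boolean operationAllowed) {
    this.quotaEnabled = quotaEnabled;
    this.operationAllowed = operationAllowed;
  }

  private void checkOperation() throws IOException {
    if (!operationAllowed) {
      throw new IOException("Operation is not allowed in the current state");
    }
  }

  private void checkQuotaEnabled() throws IOException {
    if (!quotaEnabled) {
      throw new IOException(
          "Quota is disabled (dfs.federation.router.quota.enable=false)");
    }
  }

  void setQuota(String path, long nsQuota, long ssQuota) throws IOException {
    checkOperation();     // always first, even if quota is disabled
    checkQuotaEnabled();  // fail fast instead of calling the subclusters
    // A real implementation would forward the quota to the subclusters here.
  }

  String getQuotaUsage(String path) throws IOException {
    checkOperation();
    checkQuotaEnabled();
    return "usage of " + path;  // placeholder result for the sketch
  }

  public static void main(String[] args) {
    try {
      new RouterQuotaGuardSketch(false, true).setQuota("/tmp", -1L, 1L << 30);
    } catch (IOException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Failing fast in both methods means the client sees an error right away, rather than a success message while the exception only appears in the server log.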
>  
>  






[jira] [Updated] (HDFS-13528) RBF: If a directory exceeds quota limit then quota usage is not refreshed for other mount entries

2018-07-04 Thread Yiqun Lin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-13528:
-
  Resolution: Fixed
Hadoop Flags: Reviewed
   Fix Version/s: 3.1.1
  2.10.0
Target Version/s: 2.10.0, 3.1.1
  Status: Resolved  (was: Patch Available)

> RBF: If a directory exceeds quota limit then quota usage is not refreshed for 
> other mount entries 
> --
>
> Key: HDFS-13528
> URL: https://issues.apache.org/jira/browse/HDFS-13528
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.1.0
>Reporter: Dibyendu Karmakar
>Assignee: Dibyendu Karmakar
>Priority: Major
> Fix For: 2.10.0, 3.1.1
>
> Attachments: HDFS-13528-000.patch, HDFS-13528-001.patch, 
> HDFS-13528.002.patch
>
>
> If the quota limit is exceeded, RouterQuotaUpdateService#periodicInvoke gets a 
> QuotaExceededException and does not update the quota usage for the rest of 
> the mount table entries.






[jira] [Updated] (HDFS-13528) RBF: If a directory exceeds quota limit then quota usage is not refreshed for other mount entries

2018-07-04 Thread Yiqun Lin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-13528:
-
Affects Version/s: 3.1.0

> RBF: If a directory exceeds quota limit then quota usage is not refreshed for 
> other mount entries 
> --
>
> Key: HDFS-13528
> URL: https://issues.apache.org/jira/browse/HDFS-13528
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.1.0
>Reporter: Dibyendu Karmakar
>Assignee: Dibyendu Karmakar
>Priority: Major
> Fix For: 2.10.0, 3.1.1
>
> Attachments: HDFS-13528-000.patch, HDFS-13528-001.patch, 
> HDFS-13528.002.patch
>
>
> If the quota limit is exceeded, RouterQuotaUpdateService#periodicInvoke gets a 
> QuotaExceededException and does not update the quota usage for the rest of 
> the mount table entries.






[jira] [Commented] (HDFS-13710) RBF: setQuota and getQuotaUsage should check the dfs.federation.router.quota.enable

2018-07-04 Thread yanghuafeng (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532365#comment-16532365
 ] 

yanghuafeng commented on HDFS-13710:


Thanks for your nits.

{quote}
I'm not sure there is a point starting a full cluster just for this.
I think we should just start a Router and mock whatever is needed.
{quote}

You are right. I have replaced the cluster with the router to test the case, 
and the test now takes much less time than before.

{quote}
I'm not sure there's a point having this comment
{quote}

The comment indeed does not matter. To keep the code clean and simple, I 
removed it.

{quote}
We should do the checkOperation as the first operation even if the quota is not 
enabled.
{quote}

I have adjusted the order of the operations.

 

Please review the code again, [~linyiqun] [~elgoiri]

 







[jira] [Commented] (HDFS-13528) RBF: If a directory exceeds quota limit then quota usage is not refreshed for other mount entries

2018-07-04 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532363#comment-16532363
 ] 

Yiqun Lin commented on HDFS-13528:
--

The result looks good. 
I have committed this to trunk and branch-2. Thanks [~dibyendu_hadoop] for the 
contribution. It's a nice fix, :).

> RBF: If a directory exceeds quota limit then quota usage is not refreshed for 
> other mount entries 
> --
>
> Key: HDFS-13528
> URL: https://issues.apache.org/jira/browse/HDFS-13528
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Dibyendu Karmakar
>Priority: Major
> Attachments: HDFS-13528-000.patch, HDFS-13528-001.patch, 
> HDFS-13528.002.patch
>
>
> If the quota limit is exceeded, RouterQuotaUpdateService#periodicInvoke gets a 
> QuotaExceededException and does not update the quota usage for the rest of 
> the mount table entries.






[jira] [Commented] (HDFS-13528) RBF: If a directory exceeds quota limit then quota usage is not refreshed for other mount entries

2018-07-04 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532359#comment-16532359
 ] 

Hudson commented on HDFS-13528:
---

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #14524 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14524/])
HDFS-13528. RBF: If a directory exceeds quota limit then quota usage is (yqlin: 
rev 3b637155a47d2aa93284969a96208347a647083d)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterQuota.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterQuotaUpdateService.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/Quota.java


> RBF: If a directory exceeds quota limit then quota usage is not refreshed for 
> other mount entries 
> --
>
> Key: HDFS-13528
> URL: https://issues.apache.org/jira/browse/HDFS-13528
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Dibyendu Karmakar
>Priority: Major
> Attachments: HDFS-13528-000.patch, HDFS-13528-001.patch, 
> HDFS-13528.002.patch
>
>
> If the quota limit is exceeded, RouterQuotaUpdateService#periodicInvoke gets a 
> QuotaExceededException and does not update the quota usage for the rest of 
> the mount table entries.






[jira] [Updated] (HDFS-13710) RBF: setQuota and getQuotaUsage should check the dfs.federation.router.quota.enable

2018-07-04 Thread yanghuafeng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yanghuafeng updated HDFS-13710:
---
Attachment: HDFS-13710.006.patch







[jira] [Commented] (HDFS-13236) Standby NN down with error encountered while tailing edits

2018-07-04 Thread Yuriy Malygin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532302#comment-16532302
 ] 

Yuriy Malygin commented on HDFS-13236:
--

Hi [~linyiqun], I created an additional issue about 
_NotEnoughReplicasException_: HDFS-13718.

> Standby NN down with error encountered while tailing edits
> --
>
> Key: HDFS-13236
> URL: https://issues.apache.org/jira/browse/HDFS-13236
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: journal-node, namenode
>Affects Versions: 3.0.0
>Reporter: Yuriy Malygin
>Priority: Major
>
> After updating Hadoop from 2.7.3 to 3.0.0, the standby NN goes down with an 
> error encountered while tailing edits from JN:
> {code:java}
> Feb 28 01:58:31 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:31,594 
> INFO [FSImageSaver for /one/hadoop-data/dfs of type IMAGE_AND_EDITS] 
> FSImageFormatProtobuf - Image file 
> /one/hadoop-data/dfs/current/fsimage.ckpt_012748979
> 98 of size 4595971949 bytes saved in 93 seconds.
> Feb 28 01:58:33 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:33,445 
> INFO [Standby State Checkpointer] NNStorageRetentionManager - Going to retain 
> 2 images with txid >= 1274897935
> Feb 28 01:58:33 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:33,445 
> INFO [Standby State Checkpointer] NNStorageRetentionManager - Purging old 
> image 
> FSImageFile(file=/one/hadoop-data/dfs/current/fsimage_01274897875, 
> cpktTxId
> =01274897875)
> Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:34,660 
> INFO [Edit log tailer] FSImage - Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@6a168e6f 
> expecting start txid #1274897999
> Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:34,660 
> INFO [Edit log tailer] FSImage - Start loading edits file 
> http://srvd87.local:8480/getJournal?jid=datalab-hadoop-backup=1274897999=-64%3A10
> 56233980%3A0%3ACID-1fba08aa-c8bd-4217-aef5-6ed206893848=true, 
> http://srve2916.local:8480/getJournal?jid=datalab-hadoop-backup=1274897999=-64%3A1056233980%3A0%3ACID-1fba08aa-c8bd-4217-aef5-6ed206893848;
> inProgressOk=true
> Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:34,661 
> INFO [Edit log tailer] RedundantEditLogInputStream - Fast-forwarding stream 
> 'http://srvd87.local:8480/getJournal?jid=datalab-hadoop-backup=1274897999
> torageInfo=-64%3A1056233980%3A0%3ACID-1fba08aa-c8bd-4217-aef5-6ed206893848=true,
>  
> http://srve2916.local:8480/getJournal?jid=datalab-hadoop-backup=1274897999=-64%3A1056233980%3A0%3ACID-1fba08aa-c8bd-4217
> -aef5-6ed206893848=true' to transaction ID 1274897999
> Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:34,661 
> INFO [Edit log tailer] RedundantEditLogInputStream - Fast-forwarding stream 
> 'http://srvd87.local:8480/getJournal?jid=datalab-hadoop-backup=1274897999=-64%3A1056233980%3A0%3ACID-1fba08aa-c8bd-4217-aef5-6ed206893848=true'
>  to transaction ID 1274897999
> Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:34,680 
> ERROR [Edit log tailer] FSEditLogLoader - Encountered exception on operation 
> AddOp [length=0, inodeId=145550319, 
> path=/kafka/parquet/infrastructureGrace/date=2018-02-28/_temporary/1/_temporary/attempt_1516181147167_20856_r_98_0/part-r-00098.gz.parquet,
>  replication=3, mtime=1519772206615, atime=1519772206615, 
> blockSize=134217728, blocks=[], permissions=root:supergroup:rw-r--r--, 
> aclEntries=null, 
> clientName=DFSClient_attempt_1516181147167_20856_r_98_0_1523538799_1, 
> clientMachine=10.137.2.142, overwrite=false, RpcClientId=, 
> RpcCallId=271996603, storagePolicyId=0, erasureCodingPolicyId=0, 
> opCode=OP_ADD, txid=1274898002]
> Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 
> java.lang.IllegalArgumentException: Invalid clientId - length is 0 expected 
> length 16
> Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:92)
> Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
> org.apache.hadoop.ipc.RetryCache$CacheEntry.(RetryCache.java:74)
> Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
> org.apache.hadoop.ipc.RetryCache$CacheEntry.(RetryCache.java:86)
> Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
> org.apache.hadoop.ipc.RetryCache$CacheEntryWithPayload.(RetryCache.java:163)
> Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
> org.apache.hadoop.ipc.RetryCache.addCacheEntryWithPayload(RetryCache.java:322)
> Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheEntryWithPayload(FSNamesystem.java:946)
> Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
> 

[jira] [Created] (HDFS-13718) So many NotEnoughReplicasException in active NN logs

2018-07-04 Thread Yuriy Malygin (JIRA)
Yuriy Malygin created HDFS-13718:


 Summary: So many NotEnoughReplicasException in active NN logs
 Key: HDFS-13718
 URL: https://issues.apache.org/jira/browse/HDFS-13718
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0
Reporter: Yuriy Malygin


After updating Hadoop from 2.7.3 to 3.0.0, I see many messages about 
replication errors (caused by Rack Awareness) in the active NN logs:
{code:java}
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: 2018-02-28 01:57:20,804 WARN 
[IPC Server handler 10 on 8020] PmsRackMapping - Got empty rack for 
10.136.2.149, reverting to default.
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: 2018-02-28 01:57:20,806 DEBUG 
[IPC Server handler 10 on 8020] BlockPlacementPolicy - Failed to choose from 
local rack (location = /default-rack); the second replica is not found, retry 
choosing randomly
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:792)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:691)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:598)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalStorage(BlockPlacementPolicyDefault.java:558)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:461)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:392)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:268)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:121)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:137)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2093)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:287)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2602)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:864)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:549)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
java.security.AccessController.doPrivileged(Native Method)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
javax.security.auth.Subject.doAs(Subject.java:422)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
Feb 28 01:57:20 srvg671 datalab-namenode[1807]: at 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
{code}
{code:java}
$ zgrep -c NotEnoughReplicasException 
archive/datalab-namenode.log-20180{22*,3*}.gz
archive/datalab-namenode.log-20180228.gz:0   #hadoop 2.7.3 
archive/datalab-namenode.log-20180301.gz:173492  #hadoop 3.0.0
archive/datalab-namenode.log-20180302.gz:153192  #hadoop 3.0.0

[jira] [Updated] (HDFS-13236) Standby NN down with error encountered while tailing edits

2018-07-04 Thread Yuriy Malygin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuriy Malygin updated HDFS-13236:
-
Description: 
After updating Hadoop from 2.7.3 to 3.0.0, the standby NN goes down with an 
error encountered while tailing edits from JN:
{code:java}
Feb 28 01:58:31 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:31,594 INFO 
[FSImageSaver for /one/hadoop-data/dfs of type IMAGE_AND_EDITS] 
FSImageFormatProtobuf - Image file 
/one/hadoop-data/dfs/current/fsimage.ckpt_012748979
98 of size 4595971949 bytes saved in 93 seconds.
Feb 28 01:58:33 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:33,445 INFO 
[Standby State Checkpointer] NNStorageRetentionManager - Going to retain 2 
images with txid >= 1274897935
Feb 28 01:58:33 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:33,445 INFO 
[Standby State Checkpointer] NNStorageRetentionManager - Purging old image 
FSImageFile(file=/one/hadoop-data/dfs/current/fsimage_01274897875, 
cpktTxId
=01274897875)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:34,660 INFO 
[Edit log tailer] FSImage - Reading 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@6a168e6f 
expecting start txid #1274897999
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:34,660 INFO 
[Edit log tailer] FSImage - Start loading edits file 
http://srvd87.local:8480/getJournal?jid=datalab-hadoop-backup=1274897999=-64%3A10
56233980%3A0%3ACID-1fba08aa-c8bd-4217-aef5-6ed206893848=true, 
http://srve2916.local:8480/getJournal?jid=datalab-hadoop-backup=1274897999=-64%3A1056233980%3A0%3ACID-1fba08aa-c8bd-4217-aef5-6ed206893848;
inProgressOk=true
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:34,661 INFO 
[Edit log tailer] RedundantEditLogInputStream - Fast-forwarding stream 
'http://srvd87.local:8480/getJournal?jid=datalab-hadoop-backup=1274897999
torageInfo=-64%3A1056233980%3A0%3ACID-1fba08aa-c8bd-4217-aef5-6ed206893848=true,
 
http://srve2916.local:8480/getJournal?jid=datalab-hadoop-backup=1274897999=-64%3A1056233980%3A0%3ACID-1fba08aa-c8bd-4217
-aef5-6ed206893848=true' to transaction ID 1274897999
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:34,661 INFO 
[Edit log tailer] RedundantEditLogInputStream - Fast-forwarding stream 
'http://srvd87.local:8480/getJournal?jid=datalab-hadoop-backup=1274897999=-64%3A1056233980%3A0%3ACID-1fba08aa-c8bd-4217-aef5-6ed206893848=true'
 to transaction ID 1274897999
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:34,680 ERROR 
[Edit log tailer] FSEditLogLoader - Encountered exception on operation AddOp 
[length=0, inodeId=145550319, 
path=/kafka/parquet/infrastructureGrace/date=2018-02-28/_temporary/1/_temporary/attempt_1516181147167_20856_r_98_0/part-r-00098.gz.parquet,
 replication=3, mtime=1519772206615, atime=1519772206615, blockSize=134217728, 
blocks=[], permissions=root:supergroup:rw-r--r--, aclEntries=null, 
clientName=DFSClient_attempt_1516181147167_20856_r_98_0_1523538799_1, 
clientMachine=10.137.2.142, overwrite=false, RpcClientId=, RpcCallId=271996603, 
storagePolicyId=0, erasureCodingPolicyId=0, opCode=OP_ADD, txid=1274898002]
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 
java.lang.IllegalArgumentException: Invalid clientId - length is 0 expected 
length 16
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:92)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:74)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:86)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.ipc.RetryCache$CacheEntryWithPayload.<init>(RetryCache.java:163)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.ipc.RetryCache.addCacheEntryWithPayload(RetryCache.java:322)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheEntryWithPayload(FSNamesystem.java:946)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:397)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 

[jira] [Commented] (HDFS-13710) RBF: setQuota and getQuotaUsage should check the dfs.federation.router.quota.enable

2018-07-04 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532297#comment-16532297
 ] 

Yiqun Lin commented on HDFS-13710:
--

{quote}
I'm not sure there is a point starting a full cluster just for this.
I think we should just start a Router and mock whatever is needed.
{quote}
Good idea. It looks like {{cluster.startRouters();}} is enough, so there is no 
need to start the DFS cluster. The following line can also be removed.
{code}
 routerConf.set(RBFConfigKeys.DFS_ROUTER_QUOTA_CACHE_UPATE_INTERVAL, "2s");
{code}
[~hfyang20071], could you address this and the minor nits [~elgoiri] mentioned?

> RBF:  setQuota and getQuotaUsage should check the 
> dfs.federation.router.quota.enable
> 
>
> Key: HDFS-13710
> URL: https://issues.apache.org/jira/browse/HDFS-13710
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation, hdfs
>Affects Versions: 2.9.1, 3.0.3
>Reporter: yanghuafeng
>Priority: Major
> Attachments: HDFS-13710.002.patch, HDFS-13710.003.patch, 
> HDFS-13710.004.patch, HDFS-13710.005.patch, HDFS-13710.patch
>
>
> When I use the command below, some exceptions occur.
>  
> {code:java}
> hdfs dfsrouteradmin -setQuota /tmp -ssQuota 1G 
> {code}
> The command prints the following:
> {code:java}
> Successfully set quota for mount point /tmp
> {code}
> It looks like the quota is set successfully, but exceptions appear in 
> the RBF server log.
> {code:java}
> java.io.IOException: No remote locations available
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeConcurrent(RouterRpcClient.java:1002)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeConcurrent(RouterRpcClient.java:967)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeConcurrent(RouterRpcClient.java:940)
> at 
> org.apache.hadoop.hdfs.server.federation.router.Quota.setQuota(Quota.java:84)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterAdminServer.synchronizeQuota(RouterAdminServer.java:255)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterAdminServer.updateMountTableEntry(RouterAdminServer.java:238)
> at 
> org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolServerSideTranslatorPB.updateMountTableEntry(RouterAdminProtocolServerSideTranslatorPB.java:179)
> at 
> org.apache.hadoop.hdfs.protocol.proto.RouterProtocolProtos$RouterAdminProtocolService$2.callBlockingMethod(RouterProtocolProtos.java:259)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1867)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2111)
> {code}
> I found that dfs.federation.router.quota.enable is false by default, which 
> causes the problem. I think we should check this parameter when setQuota 
> and getQuotaUsage are called.
>  
>  
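
The check proposed in the description can be sketched roughly as below. This is only a minimal illustration, not the attached patch: the class {{QuotaGuardSketch}}, its helper {{checkQuotaEnabled}}, and the plain {{Map}} standing in for the Hadoop configuration are all hypothetical. Only the key name and its reported default of false come from the issue.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a router quota module; not the actual Hadoop class. */
class QuotaGuardSketch {
  // Key named in this issue; the description reports that it defaults to false.
  static final String QUOTA_ENABLE_KEY = "dfs.federation.router.quota.enable";

  private final boolean quotaEnabled;
  // In-memory store used only so the sketch has something to read back.
  private final Map<String, Long> spaceQuotas = new HashMap<>();

  /** A plain Map stands in for the Hadoop Configuration object (assumption). */
  QuotaGuardSketch(Map<String, String> conf) {
    this.quotaEnabled =
        Boolean.parseBoolean(conf.getOrDefault(QUOTA_ENABLE_KEY, "false"));
  }

  /** Fails fast with a clear message instead of a later, unrelated error. */
  private void checkQuotaEnabled(String op) throws IOException {
    if (!quotaEnabled) {
      throw new IOException(
          op + " rejected: " + QUOTA_ENABLE_KEY + " is not enabled");
    }
  }

  void setQuota(String path, long ssQuota) throws IOException {
    checkQuotaEnabled("setQuota");
    spaceQuotas.put(path, ssQuota);
  }

  /** Returns the stored space quota, or -1 if unset; no real usage is tracked. */
  long getQuotaUsage(String path) throws IOException {
    checkQuotaEnabled("getQuotaUsage");
    return spaceQuotas.getOrDefault(path, -1L);
  }
}
```

With a guard of this kind, a call made while the quota feature is disabled would be rejected up front with a message naming the configuration key, rather than reporting success on the client while the server logs "No remote locations available".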


