[jira] [Commented] (HDFS-12288) Fix DataNode's xceiver count calculation

2019-11-07 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969903#comment-16969903
 ] 

Hadoop QA commented on HDFS-12288:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 23s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
15s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 32s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}102m 13s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}173m 47s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
|   | hadoop.hdfs.TestGetFileChecksum |
|   | hadoop.hdfs.server.namenode.TestFSImage |
|   | hadoop.hdfs.server.datanode.TestDataNodeMetrics |
|   | hadoop.hdfs.server.namenode.ha.TestHAAppend |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-12288 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12985311/HDFS-12288.006.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux b61cb7db2075 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 42fc888 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28276/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28276/testReport/ |
| Max. process+thread count | 2614 

[jira] [Commented] (HDFS-14928) UI: unifying the WebUI across different components.

2019-11-07 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969889#comment-16969889
 ] 

Hadoop QA commented on HDFS-14928:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
37m 46s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m  3s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 58m 26s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14928 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12985319/HDFS-14928.004.patch |
| Optional Tests |  dupname  asflicense  shadedclient  |
| uname | Linux 072f1bd170f8 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 42fc888 |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 295 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs 
hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28278/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> UI: unifying the WebUI across different components.
> ---
>
> Key: HDFS-14928
> URL: https://issues.apache.org/jira/browse/HDFS-14928
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Trivial
> Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, 
> HDFS-14892-2.jpg, HDFS-14928.001.patch, HDFS-14928.002.patch, 
> HDFS-14928.003.patch, HDFS-14928.004.patch, HDFS-14928.jpg, NN_orig.png, 
> NN_with_legend.png, NN_wo_legend.png, RBF_orig.png, RBF_with_legend.png, 
> RBF_wo_legend.png
>
>
> The WebUI of different components could be unified.
> *Router:*
> |Current|  !RBF_orig.png|width=500! | 
> |Proposed 1 (With Icon) |  !RBF_wo_legend.png|width=500! | 
> |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500!  | 
> *NameNode:*
> |Current| !NN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! |
> *DataNode:*
> |Current| !DN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! |



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-07 Thread Li Cheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Cheng updated HDDS-2356:
---
Attachment: 2019-11-06_18_13_57_422_ERROR

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: 2019-11-06_18_13_57_422_ERROR, hs_err_pid9340.log, 
> image-2019-10-31-18-56-56-177.png
>
>
> Env: 4 VMs in total: 3 DataNodes on 3 VMs, and 1 OM & 1 SCM on a separate VM; 
> call it VM0.
> I use goofys as a FUSE client, with the Ozone S3 gateway enabled, to mount 
> Ozone on a path on VM0, reading data from VM0's local disk and writing to the 
> mount path. The dataset contains ~50,000 files of various sizes, from 0 bytes 
> to GB-level.
> Writing is slow (~10 minutes per GB) and stops after around 4 GB. Looking at 
> the hadoop-root-om-VM_50_210_centos.out log, I see the OM throwing errors 
> related to multipart upload. These errors eventually cause the writing to 
> terminate and the OM to shut down.
>  
> Updated on 11/06/2019:
> See the new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR; full logs 
> are in the attachment.
>  2019-11-05 18:12:37,766 ERROR org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest: MultipartUpload Commit is failed for Key:./20191012/plc_1570863541668_9278 in Volume/Bucket s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload is with specified uploadId fcda8608-b431-48b7-8386-0a332f1a709a-103084683261641950
> at org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:156)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.java:217)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  
> Updated on 10/28/2019:
> See MISMATCH_MULTIPART_LIST error.
>  
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete Multipart Upload Request for bucket: ozone-test, key: 20191012/plc_1570863541668_9278
>  MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: Complete Multipart Upload Failed: volume: s3c89e813c80ffcea9543004d57b2a1239bucket: ozone-testkey: 20191012/plc_1570863541668_9278
>  at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732)
>  at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB.java:1104)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:497)
>  at org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:66)
>  at com.sun.proxy.$Proxy82.completeMultipartUpload(Unknown Source)
>  at 
> 

[jira] [Commented] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-07 Thread Li Cheng (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969888#comment-16969888
 ] 

Li Cheng commented on HDDS-2356:


[~bharat] Please check the attachment one more time. I re-uploaded the logs.

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: 2019-11-06_18_13_57_422_ERROR, hs_err_pid9340.log, 
> image-2019-10-31-18-56-56-177.png
>
>
> Env: 4 VMs in total: 3 DataNodes on 3 VMs, and 1 OM & 1 SCM on a separate VM; 
> call it VM0.
> I use goofys as a FUSE client, with the Ozone S3 gateway enabled, to mount 
> Ozone on a path on VM0, reading data from VM0's local disk and writing to the 
> mount path. The dataset contains ~50,000 files of various sizes, from 0 bytes 
> to GB-level.
> Writing is slow (~10 minutes per GB) and stops after around 4 GB. Looking at 
> the hadoop-root-om-VM_50_210_centos.out log, I see the OM throwing errors 
> related to multipart upload. These errors eventually cause the writing to 
> terminate and the OM to shut down.
>  
> Updated on 11/06/2019:
> See the new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR; full logs 
> are in the attachment.
>  2019-11-05 18:12:37,766 ERROR org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest: MultipartUpload Commit is failed for Key:./20191012/plc_1570863541668_9278 in Volume/Bucket s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload is with specified uploadId fcda8608-b431-48b7-8386-0a332f1a709a-103084683261641950
> at org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:156)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.java:217)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  
> Updated on 10/28/2019:
> See MISMATCH_MULTIPART_LIST error.
>  
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete Multipart Upload Request for bucket: ozone-test, key: 20191012/plc_1570863541668_9278
>  MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: Complete Multipart Upload Failed: volume: s3c89e813c80ffcea9543004d57b2a1239bucket: ozone-testkey: 20191012/plc_1570863541668_9278
>  at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732)
>  at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB.java:1104)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:497)
>  at org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:66)
>  at com.sun.proxy.$Proxy82.completeMultipartUpload(Unknown Source)
>  at 
> 
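For context, the failing operations above are the tail of the standard S3 
multipart-upload sequence that goofys drives through the S3 gateway. A minimal 
sketch with the AWS SDK for Java v1 (bucket, key, and file names are 
placeholders, not taken from this cluster) shows where the commit-part and 
complete calls sit:
{code:java}
// Hedged sketch of the S3 multipart-upload sequence; names are placeholders.
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.*;
import java.io.File;
import java.util.Arrays;

public class MultipartUploadSketch {
  static void upload(AmazonS3 s3) {
    InitiateMultipartUploadResult init = s3.initiateMultipartUpload(
        new InitiateMultipartUploadRequest("example-bucket", "example-key"));

    // Each part commit is handled by S3MultipartUploadCommitPartRequest on the
    // OM side; an unknown uploadId surfaces as NO_SUCH_MULTIPART_UPLOAD_ERROR.
    File part = new File("part1.bin");
    UploadPartResult part1 = s3.uploadPart(new UploadPartRequest()
        .withBucketName("example-bucket")
        .withKey("example-key")
        .withUploadId(init.getUploadId())
        .withPartNumber(1)
        .withFile(part)
        .withPartSize(part.length()));

    // Completion reconciles the client's part list with the OM's record; a
    // disagreement is reported as MISMATCH_MULTIPART_LIST (as in the 10/28 log).
    s3.completeMultipartUpload(new CompleteMultipartUploadRequest(
        "example-bucket", "example-key", init.getUploadId(),
        Arrays.asList(part1.getPartETag())));
  }
}
{code}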

[jira] [Updated] (HDFS-14969) Fix HDFS client unnecessary failover log printing

2019-11-07 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14969:
--
Description: 
In a multi-NameNode scenario, suppose there are 3 NNs and the 3rd is the ANN. 
When a client starts an RPC with the 1st NN, it stays silent while failing 
over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
the 3rd NN it prints some unnecessary logs; in some scenarios these logs can 
be very numerous:
{code:java}
2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
 at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
 ...{code}

> Fix HDFS client unnecessary failover log printing
> -
>
> Key: HDFS-14969
> URL: https://issues.apache.org/jira/browse/HDFS-14969
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> In a multi-NameNode scenario, suppose there are 3 NNs and the 3rd is the ANN. 
> When a client starts an RPC with the 1st NN, it stays silent while failing 
> over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
> the 3rd NN it prints some unnecessary logs; in some scenarios these logs can 
> be very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14969) Fix HDFS client unnecessary failover log printing

2019-11-07 Thread Xudong Cao (Jira)
Xudong Cao created HDFS-14969:
-

 Summary: Fix HDFS client unnecessary failover log printing
 Key: HDFS-14969
 URL: https://issues.apache.org/jira/browse/HDFS-14969
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 3.1.3
Reporter: Xudong Cao
Assignee: Xudong Cao






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969795#comment-16969795
 ] 

Xudong Cao edited comment on HDFS-14963 at 11/8/19 6:16 AM:


cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For ease of review, I have uploaded an additional patch besides the GitHub PR 
(they are exactly the same patch). Based on this patch:
 # The cache directory is configurable via a newly introduced item 
"dfs.client.failover.cache-active.dir"; its default value is 
${java.io.tmpdir}, which is /tmp on Linux platforms.
 # Writing/reading a cache file is protected by a file lock, and we use 
tryLock() instead of lock(), so in a high-concurrency scenario reading/writing 
the cache file will not become a bottleneck. If tryLock() fails while reading, 
we just fall back to what we have today: simply return index 0. If tryLock() 
fails while writing, the write is skipped and execution continues. In 
practice, both situations should be very rare. (A sketch follows below.)
 # All cache files' mode is manually set to "666", meaning every process can 
read/write them.
 # The cache mechanism is robust: whether the cache file is accidentally 
deleted or its content is maliciously modified, readActiveCache() always 
returns a legal index, and writeActiveCache() automatically rebuilds the cache 
file on the next failover. In all abnormal situations a WARN log is emitted.
 # We do have dfs.client.failover.random.order; in fact I used it in the unit 
test. ZKFC does know which NN is currently active, but it exposes no RPC 
interface for querying that, and an RPC call is much more expensive than 
reading/writing local files.
 # cc [~xkrogen], I will tackle the logging issue discussed in (2) in a 
separate JIRA.
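
As a rough illustration of item 2 above, here is a minimal sketch (not the 
patch itself; the single-byte index encoding and the method shapes are 
assumptions): FileChannel.tryLock() returns null immediately instead of 
blocking when another process holds the lock, so both paths degrade 
gracefully:
{code:java}
// Minimal sketch of a tryLock()-protected active-index cache.
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

public class ActiveIndexCacheSketch {
  // Returns the cached index, or 0 (today's default) if the lock is busy
  // or the file is missing/empty/corrupt.
  static int readActiveCache(String path) {
    try (RandomAccessFile f = new RandomAccessFile(path, "r");
         FileChannel ch = f.getChannel();
         FileLock lock = ch.tryLock(0, Long.MAX_VALUE, true)) { // shared lock
      if (lock == null) return 0;   // lock busy: fall back to index 0
      int idx = f.read();           // single-byte index (an assumption)
      return idx >= 0 ? idx : 0;    // empty file: return the legal default
    } catch (Exception e) {
      return 0;                     // missing file etc.: legal default
    }
  }

  // Best-effort write: skipped when another process holds the lock.
  static void writeActiveCache(String path, int index) {
    try (RandomAccessFile f = new RandomAccessFile(path, "rw");
         FileChannel ch = f.getChannel();
         FileLock lock = ch.tryLock()) { // exclusive lock
      if (lock == null) return;     // lock busy: skip and continue failover
      f.setLength(0);
      f.write(index);
    } catch (Exception e) {
      // abnormal situation: the real patch would emit a WARN log here
    }
  }
}
{code}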


was (Author: xudongcao):
cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides 
github PR (they are exactly a same patch). Based on this patch:
 # The cache directory is configurable by a newly introduced item 
"dfs.client.failover.cache-active.dir",  its default value is 
${java.io.tmpdir}, which is /tmp on Linux platform.
 # Writing/Reading a cache file is under file lock protection, and we use 
trylock() instead of lock(), so in a high-concurrency scenario, reading/writing 
cache file will not become the bottleneck. if trylock() failed while reading, 
it just fall back to what we have today: simply return an index 0. And if 
trylock() failed while writing, it simply returns and continues. In fact, I 
think both these situations should be very rare.
 # All cache files' mode are manually set to  "666", meaning every process can 
read/write them.
 # This cache mechanism is robust, regardless of whether the cache file was 
accidentally deleted or the content was maliciously modified, the 
readActiveCache() always returns a legal index, and writeActiveCache() will 
automatically rebuild the cache file on next failover. Of course in all 
abnormal situations there will be a WARN log.
 # We surely have dfs.client.failover.random.order, actually I have used it in 
the unit test, Zkfc does know who is active NN right now, but it does not have 
an rpc interface allowing us to get it.  and I think an rpc call is much more 
expensive than reading/writing local files.
 # cc [~xkrogen] , I will then tacle the logging issue discussed in (2) in a 
separate JIRA.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch, HDFS-14963.001.patch
>
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> with the 1st NameNode, simply polls, and finally determines the current 
> Active NameNode.
> This brings at least two problems:
>  # Extra failover overhead, especially when clients are created frequently.
>  # Unnecessary log printing: suppose there are 3 NNs and the 3rd is the ANN. 
> When a client starts an RPC with the 1st NN, it stays silent while failing 
> over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
> the 3rd NN it prints some unnecessary logs; in some scenarios these logs can 
> be very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state 

[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Attachment: HDFS-14963.001.patch

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch, HDFS-14963.001.patch
>
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> with the 1st NameNode, simply polls, and finally determines the current 
> Active NameNode.
> This brings at least two problems:
>  # Extra failover overhead, especially when clients are created frequently.
>  # Unnecessary log printing: suppose there are 3 NNs and the 3rd is the ANN. 
> When a client starts an RPC with the 1st NN, it stays silent while failing 
> over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
> the 3rd NN it prints some unnecessary logs; in some scenarios these logs can 
> be very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for 
> every HDFS cluster, cache its current Active NameNode index in a separate 
> cache file named by its URI. *Note these cache files are shared by all HDFS 
> client processes on the machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine's cache file directory is /tmp; then:
>  # the ns1 cluster's cache file is /tmp/ns1
>  # the ns2 cluster's cache file is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target HDFS URI, and then directly 
> makes an RPC call to the right ANN.
>  # After each failover, the client writes the latest Active NameNode index 
> to the corresponding cache file based on the target HDFS URI.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14928) UI: unifying the WebUI across different components.

2019-11-07 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14928:
--
Status: Open  (was: Patch Available)

> UI: unifying the WebUI across different components.
> ---
>
> Key: HDFS-14928
> URL: https://issues.apache.org/jira/browse/HDFS-14928
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Trivial
> Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, 
> HDFS-14892-2.jpg, HDFS-14928.001.patch, HDFS-14928.002.patch, 
> HDFS-14928.003.patch, HDFS-14928.004.patch, HDFS-14928.jpg, NN_orig.png, 
> NN_with_legend.png, NN_wo_legend.png, RBF_orig.png, RBF_with_legend.png, 
> RBF_wo_legend.png
>
>
> The WebUI of different components could be unified.
> *Router:*
> |Current|  !RBF_orig.png|width=500! | 
> |Proposed 1 (With Icon) |  !RBF_wo_legend.png|width=500! | 
> |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500!  | 
> *NameNode:*
> |Current| !NN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! |
> *DataNode:*
> |Current| !DN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! |



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14928) UI: unifying the WebUI across different components.

2019-11-07 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14928:
--
Attachment: HDFS-14928.004.patch
Status: Patch Available  (was: Open)

> UI: unifying the WebUI across different components.
> ---
>
> Key: HDFS-14928
> URL: https://issues.apache.org/jira/browse/HDFS-14928
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Trivial
> Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, 
> HDFS-14892-2.jpg, HDFS-14928.001.patch, HDFS-14928.002.patch, 
> HDFS-14928.003.patch, HDFS-14928.004.patch, HDFS-14928.jpg, NN_orig.png, 
> NN_with_legend.png, NN_wo_legend.png, RBF_orig.png, RBF_with_legend.png, 
> RBF_wo_legend.png
>
>
> The WebUI of different components could be unified.
> *Router:*
> |Current|  !RBF_orig.png|width=500! | 
> |Proposed 1 (With Icon) |  !RBF_wo_legend.png|width=500! | 
> |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500!  | 
> *NameNode:*
> |Current| !NN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! |
> *DataNode:*
> |Current| !DN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! |



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14928) UI: unifying the WebUI across different components.

2019-11-07 Thread Xieming Li (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969865#comment-16969865
 ] 

Xieming Li commented on HDFS-14928:
---

It seems that the JPG was downloaded as the patch.
```
HDFS-14928 patch is being downloaded at Fri Nov  8 05:50:42 UTC 2019 from
  https://issues.apache.org/jira/secure/attachment/12985317/HDFS-14892-2.jpg -> 
Downloaded
```

> UI: unifying the WebUI across different components.
> ---
>
> Key: HDFS-14928
> URL: https://issues.apache.org/jira/browse/HDFS-14928
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Trivial
> Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, 
> HDFS-14892-2.jpg, HDFS-14928.001.patch, HDFS-14928.002.patch, 
> HDFS-14928.003.patch, HDFS-14928.jpg, NN_orig.png, NN_with_legend.png, 
> NN_wo_legend.png, RBF_orig.png, RBF_with_legend.png, RBF_wo_legend.png
>
>
> The WebUI of different components could be unified.
> *Router:*
> |Current|  !RBF_orig.png|width=500! | 
> |Proposed 1 (With Icon) |  !RBF_wo_legend.png|width=500! | 
> |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500!  | 
> *NameNode:*
> |Current| !NN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! |
> *DataNode:*
> |Current| !DN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! |



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14928) UI: unifying the WebUI across different components.

2019-11-07 Thread Xieming Li (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969865#comment-16969865
 ] 

Xieming Li edited comment on HDFS-14928 at 11/8/19 6:04 AM:


It seems that the JPG was downloaded as the patch.
{code}
HDFS-14928 patch is being downloaded at Fri Nov  8 05:50:42 UTC 2019 from
  https://issues.apache.org/jira/secure/attachment/12985317/HDFS-14892-2.jpg -> 
Downloaded
{code}


was (Author: risyomei):
Seems like that JPG was downloaded as a patch.
```
HDFS-14928 patch is being downloaded at Fri Nov  8 05:50:42 UTC 2019 from
  https://issues.apache.org/jira/secure/attachment/12985317/HDFS-14892-2.jpg -> 
Downloaded
```

> UI: unifying the WebUI across different components.
> ---
>
> Key: HDFS-14928
> URL: https://issues.apache.org/jira/browse/HDFS-14928
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Trivial
> Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, 
> HDFS-14892-2.jpg, HDFS-14928.001.patch, HDFS-14928.002.patch, 
> HDFS-14928.003.patch, HDFS-14928.jpg, NN_orig.png, NN_with_legend.png, 
> NN_wo_legend.png, RBF_orig.png, RBF_with_legend.png, RBF_wo_legend.png
>
>
> The WebUI of different components could be unified.
> *Router:*
> |Current|  !RBF_orig.png|width=500! | 
> |Proposed 1 (With Icon) |  !RBF_wo_legend.png|width=500! | 
> |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500!  | 
> *NameNode:*
> |Current| !NN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! |
> *DataNode:*
> |Current| !DN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! |



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2443) Python client/interface for Ozone

2019-11-07 Thread YiSheng Lien (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YiSheng Lien updated HDDS-2443:
---
Description: 
Original ideas: item#25 in 
[https://cwiki.apache.org/confluence/display/HADOOP/Ozone+project+ideas+for+new+contributors]

Ozone Client (Python) for data-science notebooks such as Jupyter.
 # Size: Large
 # PyArrow: [https://pypi.org/project/pyarrow/]
 # Python -> libhdfs (the HDFS JNI library; HDFS, S3, ...) -> the Java client 
API; Impala uses libhdfs
 
Paths to try:
# S3 interface: Ozone S3 gateway (already supported) + the AWS Python client 
(boto3)
# Python native RPC
# pyarrow + libhdfs, which uses the Java client under the hood.

  was:
Original ideas:

Ozone Client(Python) for Data Science Notebook such as Jupyter.
 # Size: Large
 # PyArrow: [https://pypi.org/project/pyarrow/]
 # Python -> libhdfs HDFS JNI library (HDFS, S3,...) -> Java client API Impala 
uses  libhdfs
 
Path to try:

1. s3 interface: Ozone s3 gateway(already supported) + AWS python client (boto3)
2. python native RPC
3. pyarrow + libhdfs, which use the Java client under the hood.


> Python client/interface for Ozone
> -
>
> Key: HDDS-2443
> URL: https://issues.apache.org/jira/browse/HDDS-2443
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>  Components: Ozone Client
>Reporter: Li Cheng
>Priority: Major
>
> Original ideas: item#25 in 
> [https://cwiki.apache.org/confluence/display/HADOOP/Ozone+project+ideas+for+new+contributors]
> Ozone Client (Python) for data-science notebooks such as Jupyter.
>  # Size: Large
>  # PyArrow: [https://pypi.org/project/pyarrow/]
>  # Python -> libhdfs (the HDFS JNI library; HDFS, S3, ...) -> the Java 
> client API; Impala uses libhdfs
>  
> Paths to try:
> # S3 interface: Ozone S3 gateway (already supported) + the AWS Python client 
> (boto3); see the sketch below
> # Python native RPC
> # pyarrow + libhdfs, which uses the Java client under the hood.
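
For the S3-interface path, any stock S3 SDK can already talk to the Ozone S3 
gateway. Below is a minimal hedged sketch using the AWS SDK for Java v1 (the 
endpoint port, region, and dummy credentials are assumptions for a local, 
security-disabled cluster); boto3 on the Python side would use the same two 
calls (put_object/get_object):
{code:java}
// Hedged sketch: exercise the Ozone S3 gateway with a stock S3 client.
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.client.builder.AwsClientBuilder;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

public class OzoneS3GatewaySketch {
  public static void main(String[] args) {
    AmazonS3 s3 = AmazonS3ClientBuilder.standard()
        // Assumed address of a locally running s3g; adjust to your cluster.
        .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(
            "http://localhost:9878", "us-east-1"))
        .withPathStyleAccessEnabled(true) // bucket in the path, not the host
        .withCredentials(new AWSStaticCredentialsProvider(
            new BasicAWSCredentials("anyAccessKey", "anySecretKey")))
        .build();

    s3.putObject("test-bucket", "hello.txt", "hello ozone");
    System.out.println(s3.getObjectAsString("test-bucket", "hello.txt"));
  }
}
{code}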



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2443) Python client/interface for Ozone

2019-11-07 Thread YiSheng Lien (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YiSheng Lien updated HDDS-2443:
---
Description: 
Original ideas:

Ozone Client (Python) for data-science notebooks such as Jupyter.
 # Size: Large
 # PyArrow: [https://pypi.org/project/pyarrow/]
 # Python -> libhdfs (the HDFS JNI library; HDFS, S3, ...) -> the Java client 
API; Impala uses libhdfs
 
Paths to try:

1. S3 interface: Ozone S3 gateway (already supported) + the AWS Python client 
(boto3)
2. Python native RPC
3. pyarrow + libhdfs, which uses the Java client under the hood.

  was:
Original ideas:

Ozone Client(Python) for Data Science Notebook such as Jupyter.
 # Size: Large
 # PyArrow: [https://pypi.org/project/pyarrow/]
 # Python -> libhdfs HDFS JNI library (HDFS, S3,...) -> Java client API Impala 
uses                      libhdfs
 # How Jupyter iPython work: 
[https://jupyter.readthedocs.io/en/latest/architecture/how_jupyter_ipython_work.html]
 # Eco, 
Architecture:[https://ipython-books.github.io/chapter-3-mastering-the-jupyter-notebook/]

 

Path to try:

1. s3 interface: Ozone s3 gateway(already supported) + AWS python client (boto3)
2. python native RPC
3. pyarrow + libhdfs, which use the Java client under the hood.


> Python client/interface for Ozone
> -
>
> Key: HDDS-2443
> URL: https://issues.apache.org/jira/browse/HDDS-2443
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>  Components: Ozone Client
>Reporter: Li Cheng
>Priority: Major
>
> Original ideas:
> Ozone Client (Python) for data-science notebooks such as Jupyter.
>  # Size: Large
>  # PyArrow: [https://pypi.org/project/pyarrow/]
>  # Python -> libhdfs (the HDFS JNI library; HDFS, S3, ...) -> the Java 
> client API; Impala uses libhdfs
>  
> Paths to try:
> 1. S3 interface: Ozone S3 gateway (already supported) + the AWS Python 
> client (boto3)
> 2. Python native RPC
> 3. pyarrow + libhdfs, which uses the Java client under the hood.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12288) Fix DataNode's xceiver count calculation

2019-11-07 Thread Chen Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969862#comment-16969862
 ] 

Chen Zhang commented on HDFS-12288:
---

Thanks [~elgoiri] for the comments; I updated the patch to v6 according to 
comments 1 & 2.
{quote} * What's the deal with TestNamenodeCapacityReport?{quote}
Before the patch, every live DataNode reported at least 1 xceiver to the 
NameNode, but that was not actually a real xceiver: it was the 
{{DataXceiverServer}} thread, which is added to {{threadGroup}} during 
DataNode initialization. After the patch, the xceiver count is the *real* 
number of data-transfer threads, so an idle DataNode reports an xceiver count 
of 0 to the NameNode (see the sketch below).

{{TestNamenodeCapacityReport}} assumed there would be at least 1 xceiver per 
DataNode and leveraged this to check the cluster's live-node count, so the 
test needs to be updated with this patch.
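
To make the difference concrete, here is a standalone sketch (not DataNode 
code; names are illustrative): ThreadGroup.activeCount() counts every live 
thread in the group, including a long-lived acceptor thread, while an explicit 
counter in the spirit of dataNodeActiveXceiversCount tracks only data-transfer 
workers:
{code:java}
// Standalone sketch: why ThreadGroup.activeCount() overstates the load.
import java.util.concurrent.atomic.AtomicInteger;

public class XceiverCountSketch {
  public static void main(String[] args) throws InterruptedException {
    ThreadGroup group = new ThreadGroup("dataXceiverServer");
    AtomicInteger activeWorkers = new AtomicInteger(); // bumped only by real workers

    // Acceptor thread: alive for the lifetime of the "server", does no transfer work.
    Thread acceptor = new Thread(group, () -> {
      try { Thread.sleep(Long.MAX_VALUE); } catch (InterruptedException ignored) { }
    }, "acceptor");
    acceptor.setDaemon(true);
    acceptor.start();
    Thread.sleep(100); // let the acceptor start

    // activeCount() reports 1 even though zero transfers are in progress.
    System.out.println("threadGroup.activeCount() = " + group.activeCount());
    System.out.println("explicit worker counter   = " + activeWorkers.get());
  }
}
{code}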

> Fix DataNode's xceiver count calculation
> 
>
> Key: HDFS-12288
> URL: https://issues.apache.org/jira/browse/HDFS-12288
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Reporter: Lukas Majercak
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-12288.001.patch, HDFS-12288.002.patch, 
> HDFS-12288.003.patch, HDFS-12288.004.patch, HDFS-12288.005.patch, 
> HDFS-12288.006.patch
>
>
> The problem with the ThreadGroup.activeCount() method is that the method is 
> only a very rough estimate, and in reality returns the total number of 
> threads in the thread group as opposed to the threads actually running.
> In some DNs, we saw this return ~50 for a long time, even though the actual 
> number of DataXceiver threads was close to zero.
> This is a big issue as we use the xceiverCount to make decisions on the NN 
> for choosing replication source DN or returning DNs to clients for R/W.
> The plan is to reuse the DataNodeMetrics.dataNodeActiveXceiversCount value 
> which only accounts for actual number of DataXcevier threads currently 
> running and thus represents the load on the DN much better.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14928) UI: unifying the WebUI across different components.

2019-11-07 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969854#comment-16969854
 ] 

Hadoop QA commented on HDFS-14928:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 10s{color} 
| {color:red} HDFS-14928 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-14928 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28277/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> UI: unifying the WebUI across different components.
> ---
>
> Key: HDFS-14928
> URL: https://issues.apache.org/jira/browse/HDFS-14928
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Trivial
> Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, 
> HDFS-14892-2.jpg, HDFS-14928.001.patch, HDFS-14928.002.patch, 
> HDFS-14928.003.patch, HDFS-14928.jpg, NN_orig.png, NN_with_legend.png, 
> NN_wo_legend.png, RBF_orig.png, RBF_with_legend.png, RBF_wo_legend.png
>
>
> The WebUI of different components could be unified.
> *Router:*
> |Current|  !RBF_orig.png|width=500! | 
> |Proposed 1 (With Icon) |  !RBF_wo_legend.png|width=500! | 
> |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500!  | 
> *NameNode:*
> |Current| !NN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! |
> *DataNode:*
> |Current| !DN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! |



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14928) UI: unifying the WebUI across different components.

2019-11-07 Thread Xieming Li (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969853#comment-16969853
 ] 

Xieming Li commented on HDFS-14928:
---

[~elgoiri]

Thank you for your detailed review. I have modified the variable names in 
hadoop.css to match the naming convention.

I have also conducted a manual test, which you can see in the following 
screenshot:

!HDFS-14892-2.jpg!

> UI: unifying the WebUI across different components.
> ---
>
> Key: HDFS-14928
> URL: https://issues.apache.org/jira/browse/HDFS-14928
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Trivial
> Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, 
> HDFS-14892-2.jpg, HDFS-14928.001.patch, HDFS-14928.002.patch, 
> HDFS-14928.003.patch, HDFS-14928.jpg, NN_orig.png, NN_with_legend.png, 
> NN_wo_legend.png, RBF_orig.png, RBF_with_legend.png, RBF_wo_legend.png
>
>
> The WebUI of different components could be unified.
> *Router:*
> |Current|  !RBF_orig.png|width=500! | 
> |Proposed 1 (With Icon) |  !RBF_wo_legend.png|width=500! | 
> |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500!  | 
> *NameNode:*
> |Current| !NN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! |
> *DataNode:*
> |Current| !DN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! |



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14928) UI: unifying the WebUI across different components.

2019-11-07 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14928:
--
Attachment: HDFS-14892-2.jpg

> UI: unifying the WebUI across different components.
> ---
>
> Key: HDFS-14928
> URL: https://issues.apache.org/jira/browse/HDFS-14928
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Trivial
> Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, 
> HDFS-14892-2.jpg, HDFS-14928.001.patch, HDFS-14928.002.patch, 
> HDFS-14928.003.patch, HDFS-14928.jpg, NN_orig.png, NN_with_legend.png, 
> NN_wo_legend.png, RBF_orig.png, RBF_with_legend.png, RBF_wo_legend.png
>
>
> The WebUI of different components could be unified.
> *Router:*
> |Current|  !RBF_orig.png|width=500! | 
> |Proposed 1 (With Icon) |  !RBF_wo_legend.png|width=500! | 
> |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500!  | 
> *NameNode:*
> |Current| !NN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! |
> *DataNode:*
> |Current| !DN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! |



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14928) UI: unifying the WebUI across different components.

2019-11-07 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14928:
--
Status: Open  (was: Patch Available)

> UI: unifying the WebUI across different components.
> ---
>
> Key: HDFS-14928
> URL: https://issues.apache.org/jira/browse/HDFS-14928
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Trivial
> Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, 
> HDFS-14928.001.patch, HDFS-14928.002.patch, HDFS-14928.003.patch, 
> HDFS-14928.jpg, NN_orig.png, NN_with_legend.png, NN_wo_legend.png, 
> RBF_orig.png, RBF_with_legend.png, RBF_wo_legend.png
>
>
> The WebUI of different components could be unified.
> *Router:*
> |Current|  !RBF_orig.png|width=500! | 
> |Proposed 1 (With Icon) |  !RBF_wo_legend.png|width=500! | 
> |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500!  | 
> *NameNode:*
> |Current| !NN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! |
> *DataNode:*
> |Current| !DN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! |



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14928) UI: unifying the WebUI across different components.

2019-11-07 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14928:
--
Attachment: HDFS-14928.003.patch
Status: Patch Available  (was: Open)

> UI: unifying the WebUI across different components.
> ---
>
> Key: HDFS-14928
> URL: https://issues.apache.org/jira/browse/HDFS-14928
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Trivial
> Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, 
> HDFS-14928.001.patch, HDFS-14928.002.patch, HDFS-14928.003.patch, 
> HDFS-14928.jpg, NN_orig.png, NN_with_legend.png, NN_wo_legend.png, 
> RBF_orig.png, RBF_with_legend.png, RBF_wo_legend.png
>
>
> The WebUI of different components could be unified.
> *Router:*
> |Current|  !RBF_orig.png|width=500! | 
> |Proposed 1 (With Icon) |  !RBF_wo_legend.png|width=500! | 
> |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500!  | 
> *NameNode:*
> |Current| !NN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! |
> *DataNode:*
> |Current| !DN_orig.png|width=500! |
> |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! |
> |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! |



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14966) In NameNode Web UI, Some values are shown as negative

2019-11-07 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969850#comment-16969850
 ] 

Ayush Saxena commented on HDFS-14966:
-

Thanx [~zgw] for the report.
It seems that not only are some values negative, but Block Pool Used and DFS 
Used are also showing as undefined.
Did you check the JMX values for capacity used and capacity remaining? What 
are they?
Did you check the Datanodes page to see whether the storage stats are correct 
there?
This could be an issue with the NameNode computing stats while combining 
reports from 10K DataNodes.
Is there any log evidence at the NameNode?
Is the NameNode able to perform all write operations?

>  In NameNode Web UI, Some values are shown as negative
> --
>
> Key: HDFS-14966
> URL: https://issues.apache.org/jira/browse/HDFS-14966
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 3.1.1
>Reporter: bright.zhou
>Priority: Minor
> Attachments: image1.png, image2.png
>
>
> While simulating a large 10k-node cluster to test HDFS, the NameNode web UI 
> wrongly shows negative values.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2399) Update mailing list information in CONTRIBUTION and README files

2019-11-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2399?focusedWorklogId=340359=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-340359
 ]

ASF GitHub Bot logged work on HDDS-2399:


Author: ASF GitHub Bot
Created on: 08/Nov/19 05:36
Start Date: 08/Nov/19 05:36
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #126: 
HDDS-2399. Update mailing list information.
URL: https://github.com/apache/hadoop-ozone/pull/126
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 340359)
Time Spent: 20m  (was: 10m)

> Update mailing list information in CONTRIBUTION and README files
> 
>
> Key: HDDS-2399
> URL: https://issues.apache.org/jira/browse/HDDS-2399
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Marton Elek
>Priority: Major
>  Labels: newbie, pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We have new mailing lists:
>  [ozone-...@hadoop.apache.org|mailto:ozone-...@hadoop.apache.org]
> [ozone-iss...@hadoop.apache.org|mailto:ozone-iss...@hadoop.apache.org]
> [ozone-comm...@hadoop.apache.org|mailto:ozone-comm...@hadoop.apache.org]
>  
> We need to update CONTRIBUTION.md and README.md to use ozone-dev instead of 
> hdfs-dev (optionally, we can mention the issues/commits lists, but only in 
> CONTRIBUTION.md).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2399) Update mailing list information in CONTRIBUTION and README files

2019-11-07 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2399.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Update mailing list information in CONTRIBUTION and README files
> 
>
> Key: HDDS-2399
> URL: https://issues.apache.org/jira/browse/HDDS-2399
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Marton Elek
>Priority: Major
>  Labels: newbie, pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We have new mailing lists:
>  [ozone-...@hadoop.apache.org|mailto:ozone-...@hadoop.apache.org]
> [ozone-iss...@hadoop.apache.org|mailto:ozone-iss...@hadoop.apache.org]
> [ozone-comm...@hadoop.apache.org|mailto:ozone-comm...@hadoop.apache.org]
>  
> We need to update CONTRIBUTION.md and README.md to use ozone-dev instead of 
> hdfs-dev (optionally, we can mention the issues/commits lists, but only in 
> CONTRIBUTION.md).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-07 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969847#comment-16969847
 ] 

Bharat Viswanadham edited comment on HDDS-2356 at 11/8/19 5:26 AM:
---

I think this error is not related to the NO_SUCH_MULTIPART_UPLOAD_ERROR.

I have fixed MISMATCH_MULTIPART_LIST in HDDS-2395.


was (Author: bharatviswa):
I think this error is not related to the NO_SUCH_MULTIPART_UPLOAD_ERROR.

I have fixed MISMATCH_ERROR in HDDS-2395.

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: hs_err_pid9340.log, image-2019-10-31-18-56-56-177.png
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say 
> it's VM0.
> I use goofys as a FUSE client and enable the Ozone S3 gateway to mount Ozone 
> to a path on VM0, while reading data from VM0's local disk and writing to the 
> mount path. The dataset has files of various sizes, from 0 bytes to GB-level, 
> and contains ~50,000 files.
> The writing is slow (1 GB in ~10 mins) and it stops after around 4 GB. When I 
> look at the hadoop-root-om-VM_50_210_centos.out log, I see the OM throwing 
> errors related to multipart upload. This error eventually causes the writing 
> to terminate and the OM to be closed.
>  
> Updated on 11/06/2019:
> See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs 
> are in the attachment.
>  2019-11-05 18:12:37,766 ERROR 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest:
>  MultipartUpload Commit is failed for Key:./20191012/plc_1570863541668_9278 
> in Volume/Bucket s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR 
> org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload 
> is with specified uploadId 
> fcda8608-b431-48b7-8386-0a332f1a709a-103084683261641950
> at org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:156)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.java:217)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  
> Updated on 10/28/2019:
> See MISMATCH_MULTIPART_LIST error.
>  
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete 
> Multipart Upload Request for bucket: ozone-test, key: 
> 20191012/plc_1570863541668_9278
>  MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: 
> Complete Multipart Upload Failed: volume: 
> s3c89e813c80ffcea9543004d57b2a1239bucket: ozone-testkey: 
> 20191012/plc_1570863541668_9278
>  at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732)
>  at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB.java:1104)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> 

[jira] [Commented] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-07 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969847#comment-16969847
 ] 

Bharat Viswanadham commented on HDDS-2356:
--

I think this error is not related to the NO_SUCH_MULTIPART_UPLOAD_ERROR.

I have fixed MISMATCH_ERROR in HDDS-2395.

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: hs_err_pid9340.log, image-2019-10-31-18-56-56-177.png
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say 
> it's VM0.
> I use goofys as a FUSE client and enable the Ozone S3 gateway to mount Ozone 
> to a path on VM0, while reading data from VM0's local disk and writing to the 
> mount path. The dataset has files of various sizes, from 0 bytes to GB-level, 
> and contains ~50,000 files.
> The writing is slow (1 GB in ~10 mins) and it stops after around 4 GB. When I 
> look at the hadoop-root-om-VM_50_210_centos.out log, I see the OM throwing 
> errors related to multipart upload. This error eventually causes the writing 
> to terminate and the OM to be closed.
>  
> Updated on 11/06/2019:
> See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs 
> are in the attachment.
>  2019-11-05 18:12:37,766 ERROR 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest:
>  MultipartUpload Commit is failed for Key:./20191012/plc_1570863541668_9278 
> in Volume/Bucket s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR 
> org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload 
> is with specified uploadId 
> fcda8608-b431-48b7-8386-0a332f1a709a-103084683261641950
> at org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:156)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.java:217)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  
> Updated on 10/28/2019:
> See MISMATCH_MULTIPART_LIST error.
>  
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete 
> Multipart Upload Request for bucket: ozone-test, key: 
> 20191012/plc_1570863541668_9278
>  MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: 
> Complete Multipart Upload Failed: volume: 
> s3c89e813c80ffcea9543004d57b2a1239bucket: ozone-testkey: 
> 20191012/plc_1570863541668_9278
>  at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732)
>  at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB.java:1104)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:497)
>  at org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:66)
>  at 

[jira] [Resolved] (HDDS-2395) Handle Ozone S3 completeMPU to match with aws s3 behavior.

2019-11-07 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2395.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Handle Ozone S3 completeMPU to match with aws s3 behavior.
> --
>
> Key: HDDS-2395
> URL: https://issues.apache.org/jira/browse/HDDS-2395
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> # When 2 parts are uploaded and the complete request specifies only 1 part, 
> there is no error.
>  # During complete multipart upload, when a part name/part number does not 
> match an uploaded part and part number, an InvalidPart error is returned.
>  # When parts are not specified in sorted order, InvalidPartOrder is returned 
> (sketched after this list).
>  # During complete multipart upload, when no parts were uploaded but some 
> parts are specified, InvalidPart is returned as well.
>  # When parts 1, 2, 3 were uploaded, completing with parts 1 and 3 succeeds 
> (no error).
>  # When only part 3 was uploaded, completing with part 3 succeeds.
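>
> A minimal client-side sketch of case 3 (endpoint, bucket/key names, ETags and 
> the upload ID are placeholders, not from the patch), using the AWS SDK for 
> Java against the S3 gateway:
> {code:java}
> import java.util.Arrays;
> import java.util.List;
> import com.amazonaws.client.builder.AwsClientBuilder;
> import com.amazonaws.services.s3.AmazonS3;
> import com.amazonaws.services.s3.AmazonS3ClientBuilder;
> import com.amazonaws.services.s3.model.CompleteMultipartUploadRequest;
> import com.amazonaws.services.s3.model.PartETag;
>
> public class CompleteMpuOrderCheck {
>   public static void main(String[] args) {
>     AmazonS3 s3 = AmazonS3ClientBuilder.standard()
>         .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(
>             "http://localhost:9878", "us-east-1")) // Ozone S3 gateway
>         .enablePathStyleAccess()
>         .build();
>     // Parts listed out of order: after this fix the complete call should
>     // fail with InvalidPartOrder, matching AWS S3 behavior.
>     List<PartETag> outOfOrder = Arrays.asList(
>         new PartETag(3, "etag-3"), new PartETag(1, "etag-1"));
>     s3.completeMultipartUpload(new CompleteMultipartUploadRequest(
>         "bucket1", "key1", "upload-id", outOfOrder));
>   }
> }
> {code}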



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2395) Handle Ozone S3 completeMPU to match with aws s3 behavior.

2019-11-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2395?focusedWorklogId=340356=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-340356
 ]

ASF GitHub Bot logged work on HDDS-2395:


Author: ASF GitHub Bot
Created on: 08/Nov/19 05:23
Start Date: 08/Nov/19 05:23
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #109: 
HDDS-2395. Handle completeMPU scenarios to match with aws s3 behavior.
URL: https://github.com/apache/hadoop-ozone/pull/109
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 340356)
Time Spent: 20m  (was: 10m)

> Handle Ozone S3 completeMPU to match with aws s3 behavior.
> --
>
> Key: HDDS-2395
> URL: https://issues.apache.org/jira/browse/HDDS-2395
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> # When 2 parts are uploaded and the complete request specifies only 1 part, 
> there is no error.
>  # During complete multipart upload, when a part name/part number does not 
> match an uploaded part and part number, an InvalidPart error is returned.
>  # When parts are not specified in sorted order, InvalidPartOrder is returned.
>  # During complete multipart upload, when no parts were uploaded but some 
> parts are specified, InvalidPart is returned as well.
>  # When parts 1, 2, 3 were uploaded, completing with parts 1 and 3 succeeds 
> (no error).
>  # When only part 3 was uploaded, completing with part 3 succeeds.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969841#comment-16969841
 ] 

Hadoop QA commented on HDFS-14963:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 33s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
40s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
51s{color} | {color:green} hadoop-hdfs-project: The patch generated 0 new + 29 
unchanged - 1 fixed = 29 total (was 30) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 36s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
55s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}103m 47s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}183m 33s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
|   | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14963 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12985297/HDFS-14963.000.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux e1189a530119 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 

[jira] [Commented] (HDFS-14967) TestWebHDFS - Many test cases are failing in Windows

2019-11-07 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969827#comment-16969827
 ] 

Ayush Saxena commented on HDFS-14967:
-

Thanx [~prasad-acit] for the report. There are two ways to go about it. Either 
we close the cluster in the tests that miss it, or we go a step further: pull 
the {{cluster}} variable up as a class variable and add an {{@After}} block, 
common to all tests, that closes the cluster there. I prefer the latter, as it 
also prevents the cluster from being left open when a test times out.
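
A minimal sketch of the latter approach (illustrative only, not the actual 
patch):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.junit.After;
import org.junit.Test;

public class TestWebHDFS {
  // Pulled up from the individual tests so the common @After block can reach it.
  private MiniDFSCluster cluster;

  @After
  public void shutdownCluster() {
    // Runs after every test, including ones that fail or time out, so the
    // data directory lock is always released before the next test starts.
    if (cluster != null) {
      cluster.shutdown();
      cluster = null;
    }
  }

  @Test
  public void testExample() throws Exception {
    cluster = new MiniDFSCluster.Builder(new Configuration()).build();
    // ... test body; no explicit cluster.shutdown() needed here anymore.
  }
}
{code}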

> TestWebHDFS - Many test cases are failing in Windows 
> -
>
> Key: HDFS-14967
> URL: https://issues.apache.org/jira/browse/HDFS-14967
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>
> In the TestWebHDFS test class, a few test cases do not close the 
> MiniDFSCluster, which makes the remaining tests fail on Windows. Once a 
> cluster is left open, all subsequent test cases fail to acquire the lock on 
> the data dir, which makes them fail as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14966) In NameNode Web UI, Some values are shown as negative

2019-11-07 Thread bright.zhou (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969824#comment-16969824
 ] 

bright.zhou commented on HDFS-14966:


[~hemanthboyina] I have updated.

>  In NameNode Web UI, Some values are shown as negative
> --
>
> Key: HDFS-14966
> URL: https://issues.apache.org/jira/browse/HDFS-14966
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 3.1.1
>Reporter: bright.zhou
>Priority: Minor
> Attachments: image1.png, image2.png
>
>
> Simulated a 10k-datanode cluster to test HDFS; the NameNode web UI wrongly 
> shows negative values.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14966) In NameNode Web UI, Some values are shown as negative

2019-11-07 Thread bright.zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bright.zhou updated HDFS-14966:
---
Attachment: image1.png

>  In NameNode Web UI, Some values are shown as negative
> --
>
> Key: HDFS-14966
> URL: https://issues.apache.org/jira/browse/HDFS-14966
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 3.1.1
>Reporter: bright.zhou
>Priority: Minor
> Attachments: image1.png, image2.png
>
>
> Simulated a 10k-datanode cluster to test HDFS; the NameNode web UI wrongly 
> shows negative values.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12288) Fix DataNode's xceiver count calculation

2019-11-07 Thread Chen Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Zhang updated HDFS-12288:
--
Attachment: HDFS-12288.006.patch

> Fix DataNode's xceiver count calculation
> 
>
> Key: HDFS-12288
> URL: https://issues.apache.org/jira/browse/HDFS-12288
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Reporter: Lukas Majercak
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-12288.001.patch, HDFS-12288.002.patch, 
> HDFS-12288.003.patch, HDFS-12288.004.patch, HDFS-12288.005.patch, 
> HDFS-12288.006.patch
>
>
> The problem with the ThreadGroup.activeCount() method is that it is only a 
> very rough estimate, and in reality returns the total number of threads in 
> the thread group as opposed to the threads actually running.
> In some DNs, we saw this return ~50 for a long time, even though the actual 
> number of DataXceiver threads was next to none.
> This is a big issue, as we use the xceiverCount to make decisions on the NN 
> when choosing a replication source DN or returning DNs to clients for 
> reads/writes.
> The plan is to reuse the DataNodeMetrics.dataNodeActiveXceiversCount value, 
> which only accounts for the actual number of DataXceiver threads currently 
> running and thus represents the load on the DN much better.
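>
> A rough illustration of the difference (class and method names here are 
> hypothetical, not from the patch):
> {code:java}
> import java.util.concurrent.atomic.AtomicInteger;
>
> class XceiverCountSketch {
>   // Rough estimate: counts every live thread in the group, including idle
>   // or unrelated ones, so it can stay high long after xceivers finish.
>   static int roughCount(ThreadGroup group) {
>     return group.activeCount();
>   }
>
>   // Accurate count: maintained by the xceiver lifecycle itself, which is
>   // what DataNodeMetrics.dataNodeActiveXceiversCount already tracks.
>   private final AtomicInteger active = new AtomicInteger();
>   void onXceiverStart() { active.incrementAndGet(); }
>   void onXceiverEnd()   { active.decrementAndGet(); }
>   int accurateCount()   { return active.get(); }
> }
> {code}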



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14966) In NameNode Web UI, Some values are shown as negative

2019-11-07 Thread bright.zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bright.zhou updated HDFS-14966:
---
Attachment: image2.png

>  In NameNode Web UI, Some values are shown as negative
> --
>
> Key: HDFS-14966
> URL: https://issues.apache.org/jira/browse/HDFS-14966
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 3.1.1
>Reporter: bright.zhou
>Priority: Minor
> Attachments: image2.png
>
>
> Simulated a 10k-datanode cluster to test HDFS; the NameNode web UI wrongly 
> shows negative values.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14966) In NameNode Web UI, Some values are shown as negative

2019-11-07 Thread bright.zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bright.zhou updated HDFS-14966:
---
Attachment: (was: image2.png)

>  In NameNode Web UI, Some values are shown as negative
> --
>
> Key: HDFS-14966
> URL: https://issues.apache.org/jira/browse/HDFS-14966
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 3.1.1
>Reporter: bright.zhou
>Priority: Minor
> Attachments: image2.png
>
>
> Simulated a 10k-datanode cluster to test HDFS; the NameNode web UI wrongly 
> shows negative values.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14966) In NameNode Web UI, Some values are shown as negative

2019-11-07 Thread bright.zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bright.zhou updated HDFS-14966:
---
Description: Simulated a 10k-datanode cluster to test HDFS; the NameNode web UI 
wrongly shows negative values.  (was: 
!https://wa.vision.huawei.com/vision-file-storage-query/upload/image/l00347347/0/d4a2a9aa9dd64a78a7ad6dc0247c0ff0/image.png!

!https://wa.vision.huawei.com/vision-file-storage-query/upload/image/l00347347/0/c931073513cf410588e16262543943e5/image.png!

 )

>  In NameNode Web UI, Some values are shown as negative
> --
>
> Key: HDFS-14966
> URL: https://issues.apache.org/jira/browse/HDFS-14966
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 3.1.1
>Reporter: bright.zhou
>Priority: Minor
> Attachments: image2.png
>
>
> Simulated a 10k-datanode cluster to test HDFS; the NameNode web UI wrongly 
> shows negative values.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14966) In NameNode Web UI, Some values are shown as negative

2019-11-07 Thread bright.zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bright.zhou updated HDFS-14966:
---
Attachment: image2.png

>  In NameNode Web UI, Some values are shown as negative
> --
>
> Key: HDFS-14966
> URL: https://issues.apache.org/jira/browse/HDFS-14966
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 3.1.1
>Reporter: bright.zhou
>Priority: Minor
> Attachments: image2.png
>
>
> !https://wa.vision.huawei.com/vision-file-storage-query/upload/image/l00347347/0/d4a2a9aa9dd64a78a7ad6dc0247c0ff0/image.png!
> !https://wa.vision.huawei.com/vision-file-storage-query/upload/image/l00347347/0/c931073513cf410588e16262543943e5/image.png!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14720) DataNode shouldn't report block as bad block if the block length is Long.MAX_VALUE.

2019-11-07 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969821#comment-16969821
 ] 

Surendra Singh Lilhore commented on HDFS-14720:
---

+1 LGTM

[~weichiu], [~hexiaoqiao] any other comments?

> DataNode shouldn't report block as bad block if the block length is 
> Long.MAX_VALUE.
> ---
>
> Key: HDFS-14720
> URL: https://issues.apache.org/jira/browse/HDFS-14720
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14720.001.patch, HDFS-14720.002.patch
>
>
> {noformat}
> 2019-08-11 09:15:58,092 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Can't replicate block 
> BP-725378529-10.0.0.8-1410027444173:blk_13276745777_1112363330268 because 
> on-disk length 175085 is shorter than NameNode recorded length 
> 9223372036854775807.{noformat}
> If the block length is Long.MAX_VALUE, it means the file this block belongs 
> to has already been deleted on the NameNode and the DN received the command 
> after the file's deletion. In this case the command should be ignored, as 
> sketched below.
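>
> A hypothetical guard illustrating the proposal (names are not from the patch):
> {code:java}
> class ReplicationCommandGuard {
>   // A NameNode-recorded length of Long.MAX_VALUE means the file was already
>   // deleted, so the stale command should be ignored rather than the block
>   // being reported as a bad/corrupt replica.
>   static boolean shouldIgnore(long nnRecordedBlockLength) {
>     return nnRecordedBlockLength == Long.MAX_VALUE;
>   }
> }
> {code}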



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14815) RBF: Update the quota in MountTable when calling setQuota on a MountTable src.

2019-11-07 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969812#comment-16969812
 ] 

Hudson commented on HDFS-14815:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17618 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17618/])
HDFS-14815. RBF: Update the quota in MountTable when calling setQuota on 
(ayushsaxena: rev 42fc8884ab9763e8778670f301896bf473ecf1d2)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterQuotaManager.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/Quota.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterQuota.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestDisableRouterQuota.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterAdminServer.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterAdminCLI.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterClientProtocol.java


> RBF: Update the quota in MountTable when calling setQuota on a MountTable src.
> --
>
> Key: HDFS-14815
> URL: https://issues.apache.org/jira/browse/HDFS-14815
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14815.001.patch, HDFS-14815.002.patch, 
> HDFS-14815.003.patch, HDFS-14815.004.patch, HDFS-14815.005.patch
>
>
> The method setQuota() can make the remote quota (the quota on the real 
> clusters) inconsistent with the MountTable. I think we have 3 ways to fix it:
>  # Reject all setQuota() RPCs that try to change the quota of a mount table.
>  # Let setQuota() change the mount table quota: update the quota on ZK first 
> and then update the remote quotas.
>  # Do nothing. The RouterQuotaUpdateService will eventually make all the 
> remote quotas right. We can tolerate short-term inconsistencies.
> I'd prefer option 1 because I want the RouterAdmin to be the only entrance 
> for updating the MountTable.
> With option 3 we don't need to change anything, but the quota will be 
> inconsistent for a short term. The remote quota will take effect immediately 
> and be auto-changed back after a while. Users might be confused by that 
> behavior.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14815) RBF: Update the quota in MountTable when calling setQuota on a MountTable src.

2019-11-07 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-14815:

Fix Version/s: 3.3.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> RBF: Update the quota in MountTable when calling setQuota on a MountTable src.
> --
>
> Key: HDFS-14815
> URL: https://issues.apache.org/jira/browse/HDFS-14815
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14815.001.patch, HDFS-14815.002.patch, 
> HDFS-14815.003.patch, HDFS-14815.004.patch, HDFS-14815.005.patch
>
>
> The method setQuota() can make the remote quota (the quota on the real 
> clusters) inconsistent with the MountTable. I think we have 3 ways to fix it:
>  # Reject all setQuota() RPCs that try to change the quota of a mount table.
>  # Let setQuota() change the mount table quota: update the quota on ZK first 
> and then update the remote quotas.
>  # Do nothing. The RouterQuotaUpdateService will eventually make all the 
> remote quotas right. We can tolerate short-term inconsistencies.
> I'd prefer option 1 because I want the RouterAdmin to be the only entrance 
> for updating the MountTable.
> With option 3 we don't need to change anything, but the quota will be 
> inconsistent for a short term. The remote quota will take effect immediately 
> and be auto-changed back after a while. Users might be confused by that 
> behavior.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969795#comment-16969795
 ] 

Xudong Cao edited comment on HDFS-14963 at 11/8/19 3:34 AM:


cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides the 
github PR (they are exactly the same patch). Based on this patch:
 # The cache directory is configurable via a newly introduced item 
"dfs.client.failover.cache-active.dir"; its default value is ${java.io.tmpdir}, 
which is /tmp on the Linux platform.
 # Writing/reading a cache file is done under file lock protection, and we use 
tryLock() instead of lock(), so in a high-concurrency scenario reading/writing 
the cache file will not become a bottleneck. If tryLock() fails while reading, 
we just fall back to what we have today: simply return index 0 (see the sketch 
below). And if tryLock() fails while writing, it simply returns and continues. 
In fact, I think both situations should be very rare.
 # All cache files' modes are manually set to "666", meaning every process can 
read/write them.
 # This cache mechanism is robust: regardless of whether the cache file was 
accidentally deleted or its content maliciously modified, readActiveCache() 
always returns a legal index, and writeActiveCache() will automatically rebuild 
the cache file on the next failover. Of course, in all abnormal situations 
there will be a WARN log.
 # We surely have dfs.client.failover.random.order; actually I have used it in 
the unit test. Zkfc does know who the active NN is right now, but it does not 
have an RPC interface allowing us to get it, and I think an RPC call is much 
more expensive than reading/writing local files.
 # cc [~xkrogen], I will tackle the logging issue discussed in (2) in a 
separate JIRA.
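
A minimal sketch of the non-blocking read path described in (2) (assumed names, 
not necessarily the exact patch code):
{code:java}
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

class ActiveNNCacheSketch {
  // If tryLock() fails or the file is missing/corrupt, degrade to index 0,
  // i.e. exactly what a client without the cache would do today.
  static int readActiveCache(String cacheFile) {
    try (RandomAccessFile raf = new RandomAccessFile(cacheFile, "r");
         FileChannel ch = raf.getChannel();
         FileLock lock = ch.tryLock(0, Long.MAX_VALUE, true)) { // shared lock
      if (lock == null) {
        return 0; // a writer holds the lock: fall back, don't wait
      }
      int first = raf.read();                  // single-digit index expected
      int idx = first - '0';
      return (idx >= 0 && idx <= 9) ? idx : 0; // unreadable content -> 0
    } catch (Exception e) {
      return 0; // missing or deleted cache file -> 0
    }
  }
}
{code}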


was (Author: xudongcao):
cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides the 
github PR (they are exactly the same patch). Based on this patch:
 # The cache directory is configurable via a newly introduced item 
"dfs.client.failover.cache-active.dir"; its default value is ${java.io.tmpdir}, 
which is /tmp on the Linux platform.
 # Writing/reading a cache file is done under file lock protection, and we use 
tryLock() instead of lock(), so in a high-concurrency scenario reading/writing 
the cache file will not become a bottleneck. If tryLock() fails while reading, 
we just fall back to what we have today: simply return index 0. And if 
tryLock() fails while writing, it simply returns and continues. In fact, I 
think both situations should be very rare.
 # All cache files' modes are manually set to "666", meaning every process can 
read/write them.
 # This cache mechanism is robust: regardless of whether the cache file was 
accidentally deleted or its content maliciously modified, readActiveCache() 
always returns a legal index, and writeActiveCache() will automatically rebuild 
the cache file on the next failover.
 # We surely have dfs.client.failover.random.order; actually I have used it in 
the unit test. Zkfc does know who the active NN is right now, but it does not 
have an RPC interface allowing us to get it, and I think an RPC call is much 
more expensive than reading/writing local files.
 # cc [~xkrogen], I will tackle the logging issue discussed in (2) in a 
separate JIRA.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch
>
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> with the 1st NameNode, simply polls, and finally determines the current 
> active NameNode. 
> This brings at least two problems:
>  # Extra failover cost, especially when clients are created frequently.
>  # Unnecessary log printing: suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts its RPC with the 1st NN. It stays silent when failing 
> over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
> the 3rd NN it prints some unnecessary logs; in some scenarios these logs can 
> be very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> 

[jira] [Commented] (HDFS-14815) RBF: Update the quota in MountTable when calling setQuota on a MountTable src.

2019-11-07 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969800#comment-16969800
 ] 

Ayush Saxena commented on HDFS-14815:
-

Committed to trunk.

Thanx [~LiJinglun] for the contribution and [~elgoiri] for the review!!!

> RBF: Update the quota in MountTable when calling setQuota on a MountTable src.
> --
>
> Key: HDFS-14815
> URL: https://issues.apache.org/jira/browse/HDFS-14815
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-14815.001.patch, HDFS-14815.002.patch, 
> HDFS-14815.003.patch, HDFS-14815.004.patch, HDFS-14815.005.patch
>
>
> The method setQuota() can make the remote quota (the quota on the real 
> clusters) inconsistent with the MountTable. I think we have 3 ways to fix it:
>  # Reject all setQuota() RPCs that try to change the quota of a mount table.
>  # Let setQuota() change the mount table quota: update the quota on ZK first 
> and then update the remote quotas.
>  # Do nothing. The RouterQuotaUpdateService will eventually make all the 
> remote quotas right. We can tolerate short-term inconsistencies.
> I'd prefer option 1 because I want the RouterAdmin to be the only entrance 
> for updating the MountTable.
> With option 3 we don't need to change anything, but the quota will be 
> inconsistent for a short term. The remote quota will take effect immediately 
> and be auto-changed back after a while. Users might be confused by that 
> behavior.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969795#comment-16969795
 ] 

Xudong Cao edited comment on HDFS-14963 at 11/8/19 3:28 AM:


cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides the 
github PR (they are exactly the same patch). Based on this patch:
 # The cache directory is configurable via a newly introduced item 
"dfs.client.failover.cache-active.dir"; its default value is ${java.io.tmpdir}, 
which is /tmp on the Linux platform.
 # Writing/reading a cache file is done under file lock protection, and we use 
tryLock() instead of lock(), so in a high-concurrency scenario reading/writing 
the cache file will not become a bottleneck. If tryLock() fails while reading, 
we just fall back to what we have today: simply return index 0. And if 
tryLock() fails while writing, it simply returns and continues. In fact, I 
think both situations should be very rare.
 # All cache files' modes are manually set to "666", meaning every process can 
read/write them.
 # This cache mechanism is robust: regardless of whether the cache file was 
accidentally deleted or its content maliciously modified, readActiveCache() 
always returns a legal index, and writeActiveCache() will automatically rebuild 
the cache file on the next failover.
 # We surely have dfs.client.failover.random.order; actually I have used it in 
the unit test. Zkfc does know who the active NN is right now, but it does not 
have an RPC interface allowing us to get it, and I think an RPC call is much 
more expensive than reading/writing local files.
 # cc [~xkrogen], I will tackle the logging issue discussed in (2) in a 
separate JIRA.


was (Author: xudongcao):
cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides the 
github PR (they are exactly the same patch). Based on this patch:
 # The cache directory is configurable via a newly introduced item 
"dfs.client.failover.cache-active.dir"; its default value is ${java.io.tmpdir}, 
which is /tmp on the Linux platform.
 # Writing/reading a cache file is done under file lock protection, and we use 
tryLock() instead of lock(), so in a high-concurrency scenario reading/writing 
the cache file will not become a bottleneck. If tryLock() fails while reading, 
we just fall back to what we have today: simply return index 0. And if 
tryLock() fails while writing, it simply returns and continues. In fact, I 
think both situations should be very rare.
 # All cache files' modes are manually set to "666", meaning every process can 
read/write them.
 # This cache mechanism is robust: regardless of whether the cache file was 
accidentally deleted or its content maliciously modified, readActiveCache() 
always returns a legal index, and writeActiveCache() will automatically rebuild 
the cache file on the next failover.
 # We surely have dfs.client.failover.random.order; actually I have used it in 
the unit test. Zkfc does know who the active NN is right now, but it does not 
have an RPC interface allowing us to get it, and I think an RPC call is much 
more expensive than reading/writing local files.
 # cc [~xkrogen], I will tackle the logging issue discussed in (2) in a 
separate JIRA.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch
>
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> with the 1st NameNode, simply polls, and finally determines the current 
> active NameNode. 
> This brings at least two problems:
>  # Extra failover cost, especially when clients are created frequently.
>  # Unnecessary log printing: suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts its RPC with the 1st NN. It stays silent when failing 
> over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
> the 3rd NN it prints some unnecessary logs; in some scenarios these logs can 
> be very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> 

[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969795#comment-16969795
 ] 

Xudong Cao edited comment on HDFS-14963 at 11/8/19 3:27 AM:


cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides the 
github PR (they are exactly the same patch). Based on this patch:
 # The cache directory is configurable via a newly introduced item 
"dfs.client.failover.cache-active.dir"; its default value is ${java.io.tmpdir}, 
which is /tmp on the Linux platform.
 # Writing/reading a cache file is done under file lock protection, and we use 
tryLock() instead of lock(), so in a high-concurrency scenario reading/writing 
the cache file will not become a bottleneck. If tryLock() fails while reading, 
we just fall back to what we have today: simply return index 0. And if 
tryLock() fails while writing, it simply returns and continues. In fact, I 
think both situations should be very rare.
 # All cache files' modes are manually set to "666", meaning every process can 
read/write them.
 # This cache mechanism is robust: regardless of whether the cache file was 
accidentally deleted or its content maliciously modified, readActiveCache() 
always returns a legal index, and writeActiveCache() will automatically rebuild 
the cache file on the next failover.
 # We surely have dfs.client.failover.random.order; actually I have used it in 
the unit test. Zkfc does know who the active NN is right now, but it does not 
have an RPC interface allowing us to get it, and I think an RPC call is much 
more expensive than reading/writing local files.
 # cc [~xkrogen], I will tackle the logging issue discussed in (2) in a 
separate JIRA.


was (Author: xudongcao):
cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides the 
github PR (they are exactly the same patch). Based on this patch:
 # The cache directory is configurable via a newly introduced item 
"dfs.client.failover.cache-active.dir"; its default value is ${java.io.tmpdir}, 
which is /tmp on the Linux platform.
 # Writing/reading a cache file is done under file lock protection, and we use 
tryLock() instead of lock(), so in a high-concurrency scenario reading/writing 
the cache file will not become a bottleneck. If tryLock() fails while reading, 
we just fall back to what we have today: simply return index 0. And if 
tryLock() fails while writing, it simply returns and continues. In fact, I 
think both situations should be very rare.
 # All cache files' modes are manually set to "666", meaning every process can 
read/write them.
 # This cache mechanism is robust: regardless of whether the cache file was 
accidentally deleted or its content maliciously modified, readActiveCache() 
always returns a legal index, and writeActiveCache() will automatically rebuild 
the cache file on the next failover.
 # We surely have dfs.client.failover.random.order; actually I have used it in 
the unit test. Zkfc does know who the active NN is right now, but it does not 
have an RPC interface allowing us to get it, and I think an RPC call is much 
more expensive than reading/writing local files.
 # cc [~elgoiri], I will tackle the logging issue discussed in (2) in a 
separate JIRA.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch
>
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> with the 1st NameNode, simply polls, and finally determines the current 
> active NameNode. 
> This brings at least two problems:
>  # Extra failover cost, especially when clients are created frequently.
>  # Unnecessary log printing: suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts its RPC with the 1st NN. It stays silent when failing 
> over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
> the 3rd NN it prints some unnecessary logs; in some scenarios these logs can 
> be very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> 

[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969795#comment-16969795
 ] 

Xudong Cao edited comment on HDFS-14963 at 11/8/19 3:26 AM:


cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides the 
github PR (they are exactly the same patch). Based on this patch:
 # The cache directory is configurable via a newly introduced item 
"dfs.client.failover.cache-active.dir"; its default value is ${java.io.tmpdir}, 
which is /tmp on the Linux platform.
 # Writing/reading a cache file is done under file lock protection, and we use 
tryLock() instead of lock(), so in a high-concurrency scenario reading/writing 
the cache file will not become a bottleneck. If tryLock() fails while reading, 
we just fall back to what we have today: simply return index 0. And if 
tryLock() fails while writing, it simply returns and continues. In fact, I 
think both situations should be very rare.
 # All cache files' modes are manually set to "666", meaning every process can 
read/write them.
 # This cache mechanism is robust: regardless of whether the cache file was 
accidentally deleted or its content maliciously modified, readActiveCache() 
always returns a legal index, and writeActiveCache() will automatically rebuild 
the cache file on the next failover.
 # We surely have dfs.client.failover.random.order; actually I have used it in 
the unit test. Zkfc does know who the active NN is right now, but it does not 
have an RPC interface allowing us to get it, and I think an RPC call is much 
more expensive than reading/writing local files.
 # cc [~elgoiri], I will tackle the logging issue discussed in (2) in a 
separate JIRA.


was (Author: xudongcao):
cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides the 
github PR (they are exactly the same patch). Based on this patch:
 # The cache directory is configurable via a newly introduced item 
"dfs.client.failover.cache-active.dir"; its default value is ${java.io.tmpdir}, 
which is /tmp on the Linux platform.
 # Writing/reading a cache file is done under file lock protection, and we use 
tryLock() instead of lock(), so in a high-concurrency scenario reading/writing 
the cache file will not become a bottleneck. If tryLock() fails while reading, 
we just fall back to what we have today: begin with the 1st NN. And if 
tryLock() fails while writing, it simply returns and continues. In fact, I 
think both situations should be very rare.
 # All cache files' modes are manually set to "666", meaning every process can 
read/write them.
 # This cache mechanism is robust: regardless of whether the cache file was 
accidentally deleted or its content maliciously modified, readActiveCache() 
always returns a legal index, and writeActiveCache() will automatically rebuild 
the cache file on the next failover.
 # We surely have dfs.client.failover.random.order; actually I have used it in 
the unit test. Zkfc does know who the active NN is right now, but it does not 
have an RPC interface allowing us to get it, and I think an RPC call is much 
more expensive than reading/writing local files.
 # cc [~elgoiri], I will tackle the logging issue discussed in (2) in a 
separate JIRA.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch
>
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> with the 1st NameNode, simply polls, and finally determines the current 
> active NameNode. 
> This brings at least two problems:
>  # Extra failover cost, especially when clients are created frequently.
>  # Unnecessary log printing: suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts its RPC with the 1st NN. It stays silent when failing 
> over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
> the 3rd NN it prints some unnecessary logs; in some scenarios these logs can 
> be very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 

[jira] [Commented] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969795#comment-16969795
 ] 

Xudong Cao commented on HDFS-14963:
---

cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For ease of reading, I have uploaded a standalone patch in addition to the 
GitHub PR (they are exactly the same patch). Based on this patch:
 # The cache directory is configurable via a newly introduced item 
"dfs.client.failover.cache-active.dir"; its default value is 
${java.io.tmpdir}, which is /tmp on Linux.
 # Reading/writing a cache file is protected by a file lock, and we use 
tryLock() instead of lock(), so in high-concurrency scenarios the cache file 
will not become a bottleneck. If tryLock() fails while reading, we simply 
fall back to what we have today: begin with the 1st NN. If tryLock() fails 
while writing, the write simply returns and the client continues. In 
practice, both situations should be very rare (see the sketch after this 
list).
 # All cache files' mode is manually set to "666", so every process can 
read/write them.
 # The cache mechanism is robust: whether the cache file is accidentally 
deleted or its content is maliciously modified, readActiveCache() always 
returns a legal index, and writeActiveCache() will automatically rebuild the 
cache file on the next failover.
 # We do have dfs.client.failover.random.order; in fact, I used it in the 
unit test. ZKFC does know which NN is currently active, but it exposes no 
RPC interface that would let us query it, and I think an RPC call is much 
more expensive than reading/writing a local file.
 # cc [~elgoiri], I will tackle the logging issue discussed in (2) in a 
separate JIRA.
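
A minimal, self-contained sketch of the tryLock() read/write behavior 
described in item 2 above. The method names mirror the comment, but the 
class, file format, and error handling here are illustrative assumptions, 
not the patch's actual code:
{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ActiveNNCacheSketch {
  private final Path cacheFile;
  private final int numNameNodes;

  public ActiveNNCacheSketch(String nsId, int numNameNodes) {
    // One cache file per nameservice, named after its URI, under java.io.tmpdir.
    this.cacheFile = Paths.get(System.getProperty("java.io.tmpdir"), nsId);
    this.numNameNodes = numNameNodes;
  }

  /** Returns the cached active NN index, or 0 (the 1st NN) on any failure. */
  public int readActiveCache() {
    try (FileChannel ch = FileChannel.open(cacheFile, StandardOpenOption.READ)) {
      FileLock lock = ch.tryLock(0, Long.MAX_VALUE, true); // shared, non-blocking
      if (lock == null) {
        return 0; // lock busy: fall back to what we have today
      }
      try {
        ByteBuffer buf = ByteBuffer.allocate(16);
        ch.read(buf);
        int idx = Integer.parseInt(new String(buf.array(), 0,
            buf.position(), StandardCharsets.UTF_8).trim());
        // Corrupted or out-of-range content still degrades to a legal index.
        return (idx >= 0 && idx < numNameNodes) ? idx : 0;
      } finally {
        lock.release();
      }
    } catch (IOException | NumberFormatException e) {
      return 0; // missing/unreadable cache file: begin with the 1st NN
    }
  }

  /** Best-effort write; gives up silently if the lock is contended. */
  public void writeActiveCache(int activeIndex) {
    try (FileChannel ch = FileChannel.open(cacheFile,
        StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
      FileLock lock = ch.tryLock(); // exclusive, non-blocking
      if (lock == null) {
        return; // another process is writing: skip, next failover retries
      }
      try {
        ch.truncate(0);
        ch.write(ByteBuffer.wrap(
            String.valueOf(activeIndex).getBytes(StandardCharsets.UTF_8)));
      } finally {
        lock.release();
      }
    } catch (IOException e) {
      // Best effort only: the cache is rebuilt on the next failover.
    }
  }
}
{code}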

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch
>
>
> In a multi-NameNode setup, a new HDFS client always begins its RPC calls 
> with the 1st NameNode, simply polling until it finally determines the 
> current Active NameNode.
> This brings at least two problems:
>  # Extra failover cost, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN: 
> a client that starts its RPC with the 1st NN stays silent when failing over 
> from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 
> 3rd NN it prints some unnecessary logs, which in some scenarios can be very 
> numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for 
> every HDFS cluster, cache its current Active NameNode index in a separate 
> cache file named after its URI. *Note these cache files are shared by all 
> HDFS client processes on the machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine's cache file directory is /tmp; then:
>  # the ns1 cluster's cache file is /tmp/ns1
>  # the ns2 cluster's cache file is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file (based on the target HDFS URI) and directly makes 
> its RPC call to the right ANN.
>  # After each failover, the client writes the latest Active NameNode index 
> to the corresponding cache file (based on the target HDFS URI).
>  






[jira] [Commented] (HDFS-14590) [SBN Read] Add the document link to the top page

2019-11-07 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969787#comment-16969787
 ] 

Konstantin Shvachko commented on HDFS-14590:


Thanks, this is great.

> [SBN Read] Add the document link to the top page
> 
>
> Key: HDFS-14590
> URL: https://issues.apache.org/jira/browse/HDFS-14590
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1, 2.11.0
>
> Attachments: HDFS-14590.001.patch, HDFS-14590.002.patch
>
>







[jira] [Assigned] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service

2019-11-07 Thread Jinglun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinglun reassigned HDFS-13811:
--

Assignee: Jinglun  (was: Dibyendu Karmakar)

> RBF: Race condition between router admin quota update and periodic quota 
> update service
> ---
>
> Key: HDFS-13811
> URL: https://issues.apache.org/jira/browse/HDFS-13811
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch
>
>
> If we try to update the quota of an existing mount entry while the periodic 
> quota update service is running on the same mount entry, the mount table is 
> left in an _inconsistent state._
> The transactions here are:
> A - Quota update service fetches the mount table entries.
> B - Quota update service updates the mount table with the current usage.
> A' - User updates the quota using the admin cmd.
> With the transaction sequence [ A A' B ], the quota update service 
> overwrites the mount table with the old quota value (see the sketch below).
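
For illustration, a minimal sketch (all names hypothetical, not the actual 
Router code) of serializing the admin update with the periodic refresh; with 
both writers under one lock, and the refresher updating only the usage field, 
the sequence [ A A' B ] can no longer overwrite a newer admin-set quota:
{code:java}
import java.util.concurrent.locks.ReentrantLock;

class MountEntryQuotaSync {
  private final ReentrantLock lock = new ReentrantLock();
  private long nsQuota;  // stands in for the mount entry's quota field
  private long usage;    // stands in for the cached usage field

  void adminSetQuota(long newQuota) {        // transaction A'
    lock.lock();
    try {
      nsQuota = newQuota;
    } finally {
      lock.unlock();
    }
  }

  void periodicRefresh(long currentUsage) {  // transactions A + B, now atomic
    lock.lock();
    try {
      usage = currentUsage;  // writes usage only; never writes back stale quota
    } finally {
      lock.unlock();
    }
  }
}
{code}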






[jira] [Commented] (HDDS-2443) Python client/interface for Ozone

2019-11-07 Thread Li Cheng (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969771#comment-16969771
 ] 

Li Cheng commented on HDDS-2443:


Prototyping with the S3 gateway + boto3 now. Reads, writes, and deletes all 
work; large-object reads may need some tweaking.

The only concern is that when simply uploading files to S3, it shows a read 
timeout towards the Ozone endpoint:

ReadTimeoutError: Read timeout on endpoint URL: 
"http://localhost:9878/ozone-test/./20191011/plc_1570784946653_2774"

> Python client/interface for Ozone
> -
>
> Key: HDDS-2443
> URL: https://issues.apache.org/jira/browse/HDDS-2443
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>  Components: Ozone Client
>Reporter: Li Cheng
>Priority: Major
>
> Original ideas:
> An Ozone client (Python) for data science notebooks such as Jupyter.
>  # Size: Large
>  # PyArrow: [https://pypi.org/project/pyarrow/]
>  # Python -> libhdfs HDFS JNI library (HDFS, S3, ...) -> Java client API; 
> Impala uses libhdfs
>  # How Jupyter/IPython works: 
> [https://jupyter.readthedocs.io/en/latest/architecture/how_jupyter_ipython_work.html]
>  # Ecosystem, architecture: 
> [https://ipython-books.github.io/chapter-3-mastering-the-jupyter-notebook/]
> Paths to try:
> 1. S3 interface: Ozone S3 gateway (already supported) + the AWS Python 
> client (boto3)
> 2. Python-native RPC
> 3. PyArrow + libhdfs, which uses the Java client under the hood.






[jira] [Created] (HDDS-2443) Python client/interface for Ozone

2019-11-07 Thread Li Cheng (Jira)
Li Cheng created HDDS-2443:
--

 Summary: Python client/interface for Ozone
 Key: HDDS-2443
 URL: https://issues.apache.org/jira/browse/HDDS-2443
 Project: Hadoop Distributed Data Store
  Issue Type: New Feature
  Components: Ozone Client
Reporter: Li Cheng


Original ideas:

An Ozone client (Python) for data science notebooks such as Jupyter.
 # Size: Large
 # PyArrow: [https://pypi.org/project/pyarrow/]
 # Python -> libhdfs HDFS JNI library (HDFS, S3, ...) -> Java client API; 
Impala uses libhdfs
 # How Jupyter/IPython works: 
[https://jupyter.readthedocs.io/en/latest/architecture/how_jupyter_ipython_work.html]
 # Ecosystem, architecture: 
[https://ipython-books.github.io/chapter-3-mastering-the-jupyter-notebook/]

Paths to try:

1. S3 interface: Ozone S3 gateway (already supported) + the AWS Python client 
(boto3)
2. Python-native RPC
3. PyArrow + libhdfs, which uses the Java client under the hood.






[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.

2019-11-07 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969756#comment-16969756
 ] 

Íñigo Goiri commented on HDFS-14908:


It would be nice to have somebody else double-check.
We can give it a week or so.

> LeaseManager should check parent-child relationship when filter open files.
> ---
>
> Key: HDFS-14908
> URL: https://issues.apache.org/jira/browse/HDFS-14908
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.1
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Minor
> Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch, 
> HDFS-14908.003.patch, HDFS-14908.004.patch, HDFS-14908.005.patch, Test.java, 
> TestV2.java, TestV3.java
>
>
> Currently, when doing listOpenFiles(), LeaseManager only checks whether the 
> filter path is a string prefix of the open file paths. We should check 
> whether the filter path is a parent/ancestor of the open files (illustrated 
> below).
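
An illustration of the prefix-vs-ancestor pitfall; the helper below is 
hypothetical, not the patch's code:
{code:java}
public class PrefixVsAncestor {
  /** True when dir is the path itself or one of its ancestors. */
  static boolean isAncestor(String dir, String path) {
    if (!path.startsWith(dir)) {
      return false;
    }
    return dir.equals("/") || path.length() == dir.length()
        || path.charAt(dir.length()) == '/';
  }

  public static void main(String[] args) {
    // "/user/a" is a string prefix of "/user/ab/file" but not its ancestor.
    System.out.println("/user/ab/file".startsWith("/user/a")); // true (wrong)
    System.out.println(isAncestor("/user/a", "/user/ab/file")); // false
    System.out.println(isAncestor("/user/a", "/user/a/file"));  // true
  }
}
{code}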






[jira] [Commented] (HDFS-12288) Fix DataNode's xceiver count calculation

2019-11-07 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969755#comment-16969755
 ] 

Íñigo Goiri commented on HDFS-12288:


This looks good and I think that it covers the initial concerns.

A couple of minor comments on the test:
* TestDataNodeMetrics#387 should have the expected value first.
* Can we assert something a little stronger in 410 and 411? If everything is 
0 this will still pass; we should assert some range.
* What's the deal with TestNamenodeCapacityReport?

I would like [~shahrs87] and [~lukmajercak] to double-check.

> Fix DataNode's xceiver count calculation
> 
>
> Key: HDFS-12288
> URL: https://issues.apache.org/jira/browse/HDFS-12288
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Reporter: Lukas Majercak
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-12288.001.patch, HDFS-12288.002.patch, 
> HDFS-12288.003.patch, HDFS-12288.004.patch, HDFS-12288.005.patch
>
>
> The problem with the ThreadGroup.activeCount() method is that it is only a 
> very rough estimate: in reality it returns the total number of threads in 
> the thread group, as opposed to the threads actually running (sketched 
> below).
> In some DNs, we saw it return ~50 for a long time, even though the actual 
> number of DataXceiver threads was next to none.
> This is a big issue, as the NN uses the xceiverCount to choose replication 
> source DNs and to return DNs to clients for R/W.
> The plan is to reuse the DataNodeMetrics.dataNodeActiveXceiversCount value, 
> which counts only the DataXceiver threads currently running and thus 
> represents the load on the DN much better.
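
A standalone sketch of the difference (illustrative names, not the patch 
itself): ThreadGroup.activeCount() counts every live thread in the group, 
idle or not, while an explicit counter bumped inside run() reflects only 
the xceivers actually serving a request:
{code:java}
import java.util.concurrent.atomic.AtomicInteger;

public class XceiverCountSketch {
  private static final AtomicInteger ACTIVE_XCEIVERS = new AtomicInteger();

  static final class DataXceiver implements Runnable {
    @Override
    public void run() {
      ACTIVE_XCEIVERS.incrementAndGet();
      try {
        // ... serve one block read/write op ...
      } finally {
        ACTIVE_XCEIVERS.decrementAndGet(); // finished threads never count
      }
    }
  }

  static int xceiverCount(ThreadGroup group) {
    return group.activeCount();   // rough estimate: all live threads in group
  }

  static int activeXceiverCount() {
    return ACTIVE_XCEIVERS.get(); // only threads doing real work
  }
}
{code}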






[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.

2019-11-07 Thread Jinglun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969752#comment-16969752
 ] 

Jinglun commented on HDFS-14908:


Hi [~elgoiri], would you help commit v05? Do we need another review?

> LeaseManager should check parent-child relationship when filter open files.
> ---
>
> Key: HDFS-14908
> URL: https://issues.apache.org/jira/browse/HDFS-14908
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.1
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Minor
> Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch, 
> HDFS-14908.003.patch, HDFS-14908.004.patch, HDFS-14908.005.patch, Test.java, 
> TestV2.java, TestV3.java
>
>
> Currently, when doing listOpenFiles(), LeaseManager only checks whether the 
> filter path is a string prefix of the open file paths. We should check 
> whether the filter path is a parent/ancestor of the open files.






[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Attachment: HDFS-14963.000.patch

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch
>
>
> In a multi-NameNode setup, a new HDFS client always begins its RPC calls 
> with the 1st NameNode, simply polling until it finally determines the 
> current Active NameNode.
> This brings at least two problems:
>  # Extra failover cost, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN: 
> a client that starts its RPC with the 1st NN stays silent when failing over 
> from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 
> 3rd NN it prints some unnecessary logs, which in some scenarios can be very 
> numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for 
> every HDFS cluster, cache its current Active NameNode index in a separate 
> cache file named after its URI. *Note these cache files are shared by all 
> HDFS client processes on the machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine's cache file directory is /tmp; then:
>  # the ns1 cluster's cache file is /tmp/ns1
>  # the ns2 cluster's cache file is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file (based on the target HDFS URI) and directly makes 
> its RPC call to the right ANN.
>  # After each failover, the client writes the latest Active NameNode index 
> to the corresponding cache file (based on the target HDFS URI).
>  






[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Attachment: (was: HDFS-14963.000.patch)

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> In a multi-NameNode setup, a new HDFS client always begins its RPC calls 
> with the 1st NameNode, simply polling until it finally determines the 
> current Active NameNode.
> This brings at least two problems:
>  # Extra failover cost, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN: 
> a client that starts its RPC with the 1st NN stays silent when failing over 
> from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 
> 3rd NN it prints some unnecessary logs, which in some scenarios can be very 
> numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for 
> every HDFS cluster, cache its current Active NameNode index in a separate 
> cache file named after its URI. *Note these cache files are shared by all 
> HDFS client processes on the machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine's cache file directory is /tmp; then:
>  # the ns1 cluster's cache file is /tmp/ns1
>  # the ns2 cluster's cache file is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file (based on the target HDFS URI) and directly makes 
> its RPC call to the right ANN.
>  # After each failover, the client writes the latest Active NameNode index 
> to the corresponding cache file (based on the target HDFS URI).
>  






[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Attachment: HDFS-14963.000.patch

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch
>
>
> In a multi-NameNode setup, a new HDFS client always begins its RPC calls 
> with the 1st NameNode, simply polling until it finally determines the 
> current Active NameNode.
> This brings at least two problems:
>  # Extra failover cost, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN: 
> a client that starts its RPC with the 1st NN stays silent when failing over 
> from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 
> 3rd NN it prints some unnecessary logs, which in some scenarios can be very 
> numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for 
> every HDFS cluster, cache its current Active NameNode index in a separate 
> cache file named after its URI. *Note these cache files are shared by all 
> HDFS client processes on the machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine's cache file directory is /tmp; then:
>  # the ns1 cluster's cache file is /tmp/ns1
>  # the ns2 cluster's cache file is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file (based on the target HDFS URI) and directly makes 
> its RPC call to the right ANN.
>  # After each failover, the client writes the latest Active NameNode index 
> to the corresponding cache file (based on the target HDFS URI).
>  






[jira] [Commented] (HDFS-13613) RegionServer log is flooded with "Execution rejected, Executing in current thread"

2019-11-07 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969719#comment-16969719
 ] 

Duo Zhang commented on HDFS-13613:
--

+1 on removing the log. I believe that if this happens it will always flood 
the log file, since it means we are overloaded, and flooding the log file 
with this message will only make things worse...

A better way is to implement a counter for this event and log a periodic 
summary with some numbers (a rough sketch follows). Can we do this in a 
follow-on issue?
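
A rough sketch of that counter-plus-periodic-summary idea (names and the 
60-second interval are assumptions, not an existing API):
{code:java}
import java.util.concurrent.atomic.AtomicLong;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class RejectionLogThrottle {
  private static final Logger LOG =
      LoggerFactory.getLogger(RejectionLogThrottle.class);
  private static final long LOG_INTERVAL_MS = 60_000;

  private final AtomicLong rejections = new AtomicLong();
  private volatile long lastLogTimeMs = 0;

  void onExecutionRejected() {
    rejections.incrementAndGet();
    long now = System.currentTimeMillis();
    if (now - lastLogTimeMs >= LOG_INTERVAL_MS) {
      lastLogTimeMs = now; // benign race: at worst one extra summary line
      LOG.info("Execution rejected {} times in the last {} ms; "
          + "executing in the current thread",
          rejections.getAndSet(0), LOG_INTERVAL_MS);
    }
  }
}
{code}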

> RegionServer log is flooded with "Execution rejected, Executing in current 
> thread"
> --
>
> Key: HDFS-13613
> URL: https://issues.apache.org/jira/browse/HDFS-13613
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.4.0
> Environment: CDH 5.13, HBase RegionServer, Kerberized, hedged read
>Reporter: Wei-Chiu Chuang
>Priority: Major
> Attachments: 
> 0001-HDFS-13613-RegionServer-log-is-flooded-with-Executio.patch
>
>
> In the log of an HBase RegionServer with hedged reads enabled, we saw the 
> following message flooding the log file.
> {noformat}
> 2018-05-19 17:22:55,691 INFO org.apache.hadoop.hdfs.DFSClient: Execution 
> rejected, Executing in current thread
> 2018-05-19 17:22:55,692 INFO org.apache.hadoop.hdfs.DFSClient: Execution 
> rejected, Executing in current thread
> 2018-05-19 17:22:55,695 INFO org.apache.hadoop.hdfs.DFSClient: Execution 
> rejected, Executing in current thread
> 2018-05-19 17:22:55,696 INFO org.apache.hadoop.hdfs.DFSClient: Execution 
> rejected, Executing in current thread
> 2018-05-19 17:22:55,696 INFO org.apache.hadoop.hdfs.DFSClient: Execution 
> rejected, Executing in current thread
> 
> {noformat}
> Sometimes the RS spits out tens of thousands of lines of this message in a 
> minute. We should do something to stop this message from flooding the log 
> file, and we should make it more actionable. Discussed with [~huaxiang]: 
> this message can appear if there are stale DataNodes.
> I believe this issue has existed since HDFS-5776.






[jira] [Commented] (HDFS-14854) Create improved decommission monitor implementation

2019-11-07 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969682#comment-16969682
 ] 

Hadoop QA commented on HDFS-14854:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
49s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 35s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 45s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 11 new + 464 unchanged - 5 fixed = 475 total (was 469) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 38s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}100m 44s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}163m 24s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMXBean |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14854 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12985278/HDFS-14854.014.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 75453069aebf 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 247584e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28274/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28274/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  

[jira] [Commented] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969677#comment-16969677
 ] 

Konstantin Shvachko commented on HDFS-14963:


We do have some randomization of the proxies on the client via 
{{dfs.client.failover.random.order}}, so a client does not always need to 
start from the first one.
Caching the last active NN on the client could be useful, but using /tmp is 
not good. With ZKFC you should have that already, right?
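
For illustration, enabling that existing property on the client (the "ns1" 
nameservice is an assumption):
{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class RandomOrderClient {
  public static void main(String[] args) throws Exception {
    Configuration conf = new HdfsConfiguration();
    // Randomize the order in which the configured NN proxies are tried,
    // so new clients do not all probe the 1st NameNode first.
    conf.setBoolean("dfs.client.failover.random.order", true);
    FileSystem fs = FileSystem.get(URI.create("hdfs://ns1"), conf);
    System.out.println(fs.getUri());
  }
}
{code}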

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> In a multi-NameNode setup, a new HDFS client always begins its RPC calls 
> with the 1st NameNode, simply polling until it finally determines the 
> current Active NameNode.
> This brings at least two problems:
>  # Extra failover cost, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN: 
> a client that starts its RPC with the 1st NN stays silent when failing over 
> from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 
> 3rd NN it prints some unnecessary logs, which in some scenarios can be very 
> numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for 
> every HDFS cluster, cache its current Active NameNode index in a separate 
> cache file named after its URI. *Note these cache files are shared by all 
> HDFS client processes on the machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine's cache file directory is /tmp; then:
>  # the ns1 cluster's cache file is /tmp/ns1
>  # the ns2 cluster's cache file is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file (based on the target HDFS URI) and directly makes 
> its RPC call to the right ANN.
>  # After each failover, the client writes the latest Active NameNode index 
> to the corresponding cache file (based on the target HDFS URI).
>  






[jira] [Commented] (HDFS-14529) NPE while Loading the Editlogs

2019-11-07 Thread Tsz-wo Sze (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969659#comment-16969659
 ] 

Tsz-wo Sze commented on HDFS-14529:
---

Not sure if HDFS-14807 could fix the NPE.

> NPE while Loading the Editlogs
> --
>
> Key: HDFS-14529
> URL: https://issues.apache.org/jira/browse/HDFS-14529
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.1.1
>Reporter: Harshakiran Reddy
>Assignee: Ayush Saxena
>Priority: Major
>
> {noformat}
> 2019-05-31 15:15:42,397 ERROR namenode.FSEditLogLoader: Encountered exception 
> on operation TimesOp [length=0, 
> path=/testLoadSpace/dir0/dir0/dir0/dir2/_file_9096763, mtime=-1, 
> atime=1559294343288, opCode=OP_TIMES, txid=18927893]
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetTimes(FSDirAttrOp.java:490)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:711)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:286)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:181)
> at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:924)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:771)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1105)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:726)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.doRecovery(NameNode.java:1558)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1640)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1725){noformat}






[jira] [Resolved] (HDDS-2404) Add support for Registered id as service identifier for CSR.

2019-11-07 Thread Anu Engineer (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer resolved HDDS-2404.

Fix Version/s: 0.5.0
   Resolution: Fixed

Committed to master.

> Add support for Registered id as service identifier for CSR.
> 
>
> Key: HDDS-2404
> URL: https://issues.apache.org/jira/browse/HDDS-2404
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Reporter: Anu Engineer
>Assignee: Abhishek Purohit
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The SCM HA feature needs the ability to represent a group as a single 
> entity, so that tokens from each OM that is part of an HA group can be 
> honored by the datanodes.
> This patch adds the notion of a service group ID to the certificate 
> infrastructure. In subsequent JIRAs, we will use this capability when 
> issuing certificates to OMs, especially when they are in HA mode.
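
At the X.509 level, a "registered ID" is a subject-alternative-name entry 
that carries an OID instead of a DNS name or IP, which is what lets a single 
identifier stand for a whole HA service group. A hedged sketch with Bouncy 
Castle (the OID below is a placeholder, not the one the patch uses):
{code:java}
import org.bouncycastle.asn1.ASN1ObjectIdentifier;
import org.bouncycastle.asn1.x509.Extension;
import org.bouncycastle.asn1.x509.ExtensionsGenerator;
import org.bouncycastle.asn1.x509.GeneralName;
import org.bouncycastle.asn1.x509.GeneralNames;

public class RegisteredIdSanSketch {
  public static void main(String[] args) throws Exception {
    GeneralName serviceId = new GeneralName(GeneralName.registeredID,
        new ASN1ObjectIdentifier("1.3.6.1.4.1.0.1")); // placeholder OID
    GeneralNames san = new GeneralNames(serviceId);

    ExtensionsGenerator gen = new ExtensionsGenerator();
    gen.addExtension(Extension.subjectAlternativeName, false, san);
    System.out.println(gen.generate());
  }
}
{code}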






[jira] [Created] (HDDS-2442) Add ServiceName support for Certificate Signing Request.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2442:
--

 Summary: Add ServiceName support for Certificate Signing Request.
 Key: HDDS-2442
 URL: https://issues.apache.org/jira/browse/HDDS-2442
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: SCM
Reporter: Anu Engineer
Assignee: Abhishek Purohit


We need to add support for adding Service name into the Certificate Signing 
Request.






[jira] [Work logged] (HDDS-2404) Add support for Registered id as service identifier for CSR.

2019-11-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2404?focusedWorklogId=340231=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-340231
 ]

ASF GitHub Bot logged work on HDDS-2404:


Author: ASF GitHub Bot
Created on: 07/Nov/19 23:32
Start Date: 07/Nov/19 23:32
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #116: HDDS-2404. 
Added support for Registered id as service identifier for …
URL: https://github.com/apache/hadoop-ozone/pull/116
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 340231)
Time Spent: 20m  (was: 10m)

> Add support for Registered id as service identifier for CSR.
> 
>
> Key: HDDS-2404
> URL: https://issues.apache.org/jira/browse/HDDS-2404
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Reporter: Anu Engineer
>Assignee: Abhishek Purohit
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The SCM HA feature needs the ability to represent a group as a single 
> entity, so that tokens from each OM that is part of an HA group can be 
> honored by the datanodes.
> This patch adds the notion of a service group ID to the certificate 
> infrastructure. In subsequent JIRAs, we will use this capability when 
> issuing certificates to OMs, especially when they are in HA mode.






[jira] [Updated] (HDDS-2442) Add ServiceName support for Certificate Signing Request.

2019-11-07 Thread Anu Engineer (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-2442:
---
Parent: HDDS-505
Issue Type: Sub-task  (was: Improvement)

> Add ServiceName support for Certificate Signing Request.
> 
>
> Key: HDDS-2442
> URL: https://issues.apache.org/jira/browse/HDDS-2442
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Reporter: Anu Engineer
>Assignee: Abhishek Purohit
>Priority: Major
>
> We need to add support for adding Service name into the Certificate Signing 
> Request.






[jira] [Commented] (HDFS-12288) Fix DataNode's xceiver count calculation

2019-11-07 Thread Chen Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969641#comment-16969641
 ] 

Chen Zhang commented on HDFS-12288:
---

The failed tests are unrelated. [~elgoiri] do you have time to help review? 
Thanks.

> Fix DataNode's xceiver count calculation
> 
>
> Key: HDFS-12288
> URL: https://issues.apache.org/jira/browse/HDFS-12288
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Reporter: Lukas Majercak
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-12288.001.patch, HDFS-12288.002.patch, 
> HDFS-12288.003.patch, HDFS-12288.004.patch, HDFS-12288.005.patch
>
>
> The problem with the ThreadGroup.activeCount() method is that it is only a 
> very rough estimate: in reality it returns the total number of threads in 
> the thread group, as opposed to the threads actually running.
> In some DNs, we saw it return ~50 for a long time, even though the actual 
> number of DataXceiver threads was next to none.
> This is a big issue, as the NN uses the xceiverCount to choose replication 
> source DNs and to return DNs to clients for R/W.
> The plan is to reuse the DataNodeMetrics.dataNodeActiveXceiversCount value, 
> which counts only the DataXceiver threads currently running and thus 
> represents the load on the DN much better.






[jira] [Assigned] (HDDS-2422) Add robot tests for list-trash command.

2019-11-07 Thread Matthew Sharp (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Sharp reassigned HDDS-2422:
---

Assignee: Matthew Sharp

> Add robot tests for list-trash command.
> ---
>
> Key: HDDS-2422
> URL: https://issues.apache.org/jira/browse/HDDS-2422
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: test
>Reporter: Anu Engineer
>Assignee: Matthew Sharp
>Priority: Major
>
> Add robot tests for list-trash command and add those tests to integration.sh 
> so these commands are run as part of CI.






[jira] [Assigned] (HDDS-2418) Add the list trash command server side handling.

2019-11-07 Thread Matthew Sharp (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Sharp reassigned HDDS-2418:
---

Assignee: Matthew Sharp

> Add the list trash command server side handling.
> 
>
> Key: HDDS-2418
> URL: https://issues.apache.org/jira/browse/HDDS-2418
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Anu Engineer
>Assignee: Matthew Sharp
>Priority: Major
>
> Add the standard code for any command handling in the server side.






[jira] [Assigned] (HDDS-2420) Add the Ozone shell support for list-trash command.

2019-11-07 Thread Matthew Sharp (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Sharp reassigned HDDS-2420:
---

Assignee: Matthew Sharp

> Add the Ozone shell support for list-trash command.
> ---
>
> Key: HDDS-2420
> URL: https://issues.apache.org/jira/browse/HDDS-2420
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone CLI
>Reporter: Anu Engineer
>Assignee: Matthew Sharp
>Priority: Major
>
> Add support for list-trash command in Ozone CLI. Please see the attached 
> design doc.






[jira] [Assigned] (HDDS-2419) Add the core logic to process list trash command.

2019-11-07 Thread Matthew Sharp (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Sharp reassigned HDDS-2419:
---

Assignee: Matthew Sharp

> Add the core logic to process list trash command.
> -
>
> Key: HDDS-2419
> URL: https://issues.apache.org/jira/browse/HDDS-2419
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Anu Engineer
>Assignee: Matthew Sharp
>Priority: Major
>
> Add the core logic of reading from the deleted table, and return the entries 
> that match the user query.






[jira] [Assigned] (HDDS-2382) Consider reducing number of file::exists() calls during write operation

2019-11-07 Thread Siddharth Wagle (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle reassigned HDDS-2382:
-

Assignee: Aravindan Vijayan  (was: Siddharth Wagle)

We do need to verify the behavior if the chunksPath is deleted. One way is to 
fail late and make sure the behavior is consistent.

> Consider reducing number of file::exists() calls during write operation
> ---
>
> Key: HDDS-2382
> URL: https://issues.apache.org/jira/browse/HDDS-2382
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Rajesh Balamohan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: performance
>
> When writing 100-200 MB files with multiple threads, we observed lots of 
> {{file::exists()}} checks.
> For every 16 MB chunk, it ends up checking whether the {{chunksLoc}} 
> directory exists or not. (ref: 
> [https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/ChunkUtils.java#L239])
> Also, this check ({{ChunkUtils.getChunkFile}}) happens from 2 places:
> 1. org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$handleWriteChunk
> 2. org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$applyTransaction
> Note that these are folders and not actual chunk filenames. It would help to 
> reduce these checks if we tracked create/delete of these folders (see the 
> sketch below).
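
A minimal sketch (hypothetical names, not ChunkUtils itself) of memoizing 
directory existence so each chunk write stops re-statting the same folder; 
the invalidate() hook is the create/delete tracking the report asks for, and 
skipping it is the consistency concern raised in the comment above:
{code:java}
import java.io.File;
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;

public class ChunkDirCache {
  private static final ConcurrentHashMap<String, Boolean> KNOWN_DIRS =
      new ConcurrentHashMap<>();

  static void ensureChunksDirExists(File dir) throws IOException {
    String key = dir.getAbsolutePath();
    if (KNOWN_DIRS.containsKey(key)) {
      return;                          // fast path: no file::exists() call
    }
    if (!dir.exists() && !dir.mkdirs()) {
      throw new IOException("Unable to create chunks directory " + dir);
    }
    KNOWN_DIRS.put(key, Boolean.TRUE);
  }

  /** Must be called when a container/folder is deleted, or the cache goes stale. */
  static void invalidate(File dir) {
    KNOWN_DIRS.remove(dir.getAbsolutePath());
  }
}
{code}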






[jira] [Assigned] (HDDS-2421) Add documentation for list trash command.

2019-11-07 Thread Matthew Sharp (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Sharp reassigned HDDS-2421:
---

Assignee: Matthew Sharp

> Add documentation for list trash command.
> -
>
> Key: HDDS-2421
> URL: https://issues.apache.org/jira/browse/HDDS-2421
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Anu Engineer
>Assignee: Matthew Sharp
>Priority: Major
>
> Add documentation about the list-trash command.






[jira] [Created] (HDDS-2441) Add documentation for Empty-Trash command.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2441:
--

 Summary: Add documentation for Empty-Trash command.
 Key: HDDS-2441
 URL: https://issues.apache.org/jira/browse/HDDS-2441
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: documentation
Reporter: Anu Engineer


Add documentation for empty-trash command.






[jira] [Created] (HDDS-2440) Add empty-trash to ozone shell.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2440:
--

 Summary: Add empty-trash to ozone shell.
 Key: HDDS-2440
 URL: https://issues.apache.org/jira/browse/HDDS-2440
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone CLI
Reporter: Anu Engineer


Add an empty-trash command to the Ozone shell. We should decide whether we 
want to add this to the admin shell or the normal shell.






[jira] [Created] (HDDS-2439) Add robot tests for empty-trash as owner.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2439:
--

 Summary: Add robot tests for empty-trash as owner.
 Key: HDDS-2439
 URL: https://issues.apache.org/jira/browse/HDDS-2439
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone Manager
Reporter: Anu Engineer


We need to make sure that only the owner or admins can execute the 
empty-trash command, and we need to verify this using end-to-end tests, for 
example robot tests.






[jira] [Created] (HDDS-2438) Add the core logic for empty-trash

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2438:
--

 Summary: Add the core logic for empty-trash
 Key: HDDS-2438
 URL: https://issues.apache.org/jira/browse/HDDS-2438
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone Manager
Reporter: Anu Engineer









[jira] [Created] (HDDS-2437) Restrict empty-trash to admins and owners only

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2437:
--

 Summary: Restrict empty-trash to admins and owners only
 Key: HDDS-2437
 URL: https://issues.apache.org/jira/browse/HDDS-2437
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone Manager
Reporter: Anu Engineer


Make sure that only the owner of a key or an administrator can empty-trash. 
The delete ACL is not enough for empty-trash, because a shared bucket can 
allow deletes while the owner should still be able to recover the deleted 
keys. Once empty-trash is executed, not even the owner will be able to 
recover them.







[jira] [Created] (HDDS-2436) Add security profile support for empty-trash command

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2436:
--

 Summary: Add security profile support for empty-trash command
 Key: HDDS-2436
 URL: https://issues.apache.org/jira/browse/HDDS-2436
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone Manager
Reporter: Anu Engineer


Add support for certain groups to have the ability to run empty-trash. It 
might be the case that we want this command to be run only by admins.






[jira] [Assigned] (HDDS-2417) Add the list trash command to the client side

2019-11-07 Thread Matthew Sharp (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Sharp reassigned HDDS-2417:
---

Assignee: Matthew Sharp

> Add the list trash command to the client side
> -
>
> Key: HDDS-2417
> URL: https://issues.apache.org/jira/browse/HDDS-2417
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Anu Engineer
>Assignee: Matthew Sharp
>Priority: Major
>
> Add the list-trash command to the protobuf files and to the client side 
> translator.






[jira] [Created] (HDDS-2435) Add the ability to disable empty-trash command.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2435:
--

 Summary: Add the ability to disable empty-trash command.
 Key: HDDS-2435
 URL: https://issues.apache.org/jira/browse/HDDS-2435
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Anu Engineer


Add a configuration key to disable the empty-trash command. We can discuss 
whether this should be a system-wide setting or per-bucket; it is easier to 
do it system-wide, I guess.






[jira] [Created] (HDDS-2434) Add server side support for empty-trash command.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2434:
--

 Summary: Add server side support for empty-trash command.
 Key: HDDS-2434
 URL: https://issues.apache.org/jira/browse/HDDS-2434
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone Manager
Reporter: Anu Engineer


Add server side support for empty-trash command.






[jira] [Created] (HDDS-2433) Add client side support for the empty-trash command.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2433:
--

 Summary: Add client side support for the empty-trash command.
 Key: HDDS-2433
 URL: https://issues.apache.org/jira/browse/HDDS-2433
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone Manager
Reporter: Anu Engineer


Add client side support for the empty-trash command.






[jira] [Created] (HDDS-2432) Add documentation for the recover-trash

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2432:
--

 Summary: Add documentation for the recover-trash
 Key: HDDS-2432
 URL: https://issues.apache.org/jira/browse/HDDS-2432
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: documentation
Reporter: Anu Engineer


Add documentation for the recover-trash command in Ozone Documentation.






[jira] [Created] (HDDS-2431) Add recover-trash command to the ozone shell.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2431:
--

 Summary: Add recover-trash command to the ozone shell.
 Key: HDDS-2431
 URL: https://issues.apache.org/jira/browse/HDDS-2431
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone CLI
Reporter: Anu Engineer


Add recover-trash command to the Ozone CLI.






[jira] [Assigned] (HDDS-2427) Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar

2019-11-07 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham reassigned HDDS-2427:


Assignee: Bharat Viswanadham

> Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar
> -
>
> Key: HDDS-2427
> URL: https://issues.apache.org/jira/browse/HDDS-2427
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This has caused an issue with DataNode UI loading.
> hadoop-ozone-filesystem-lib-current-xx.jar is on the classpath, which 
> accidentally loads the Ozone datanode web application instead of the Hadoop 
> datanode application, leading to the reported error.






[jira] [Updated] (HDDS-2427) Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar

2019-11-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-2427:
-
Labels: pull-request-available  (was: )

> Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar
> -
>
> Key: HDDS-2427
> URL: https://issues.apache.org/jira/browse/HDDS-2427
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>
> This has caused an issue with DataNode UI loading.
> hadoop-ozone-filesystem-lib-current-xx.jar is on the classpath, which 
> accidentally loads the Ozone datanode web application instead of the Hadoop 
> datanode application, leading to the reported error.






[jira] [Updated] (HDDS-2427) Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar

2019-11-07 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2427:
-
Status: Patch Available  (was: Open)

> Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar
> -
>
> Key: HDDS-2427
> URL: https://issues.apache.org/jira/browse/HDDS-2427
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This has caused an issue with DataNode UI loading.
> hadoop-ozone-filesystem-lib-current-xx.jar is on the classpath, which 
> accidentally loads the Ozone datanode web application instead of the Hadoop 
> datanode application, leading to the reported error.






[jira] [Work logged] (HDDS-2427) Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar

2019-11-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2427?focusedWorklogId=340211=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-340211
 ]

ASF GitHub Bot logged work on HDDS-2427:


Author: ASF GitHub Bot
Created on: 07/Nov/19 22:44
Start Date: 07/Nov/19 22:44
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #129: 
HDDS-2427. Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar
URL: https://github.com/apache/hadoop-ozone/pull/129
 
 
   ## What changes were proposed in this pull request?
   
   Exclude web apps from filesystem uber jar.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-2427
   
   
   ## How was this patch tested?
   
   This bug broke DataNode UI loading; with the fixed jar in place, the UI 
now loads correctly.
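
A side note on the mechanism (a diagnostic sketch, not part of the patch): the 
datanode HTTP server resolves its webapps directory from the classpath, so 
whichever jar provides "webapps/datanode" first wins. The sketch below prints 
the matches in classpath order.

{code:java}
import java.net.URL;
import java.util.Enumeration;

// Lists every "webapps/datanode" resource visible on the classpath, in
// order; the first match is the one that gets served, which is how the copy
// bundled in the Ozone uber jar can shadow the one shipped with Hadoop.
public class WebappShadowCheck {
  public static void main(String[] args) throws Exception {
    ClassLoader cl = WebappShadowCheck.class.getClassLoader();
    Enumeration<URL> matches = cl.getResources("webapps/datanode");
    while (matches.hasMoreElements()) {
      System.out.println("on classpath: " + matches.nextElement());
    }
  }
}
{code}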
   
 



Issue Time Tracking
---

Worklog Id: (was: 340211)
Remaining Estimate: 0h
Time Spent: 10m

> Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar
> -
>
> Key: HDDS-2427
> URL: https://issues.apache.org/jira/browse/HDDS-2427
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This has caused an issue with DataNode UI loading.
> hadoop-ozone-filesystem-lib-current-xx.jar is on the classpath, which 
> accidentally loads the Ozone datanode web application instead of the Hadoop 
> datanode application, leading to the reported error.






[jira] [Created] (HDDS-2430) Recover-trash should warn and skip if at-rest encryption is enabled and keys are missing.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2430:
--

 Summary: Recover-trash should warn and skip if at-rest encryption 
is enabled and keys are missing.
 Key: HDDS-2430
 URL: https://issues.apache.org/jira/browse/HDDS-2430
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone Manager
Reporter: Anu Engineer


If TDE is enabled, recovering a key is useful only if the encryption keys are 
still available. We should warn and fail the recovery if those keys are 
missing.
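
A minimal sketch of the guard, with KeyLookup standing in for the real 
key-provider client; names are illustrative:

{code:java}
// KeyLookup is a stand-in for the real key-provider client; only the check
// matters here: warn and fail when the encryption key is gone.
interface KeyLookup {
  boolean exists(String keyName);
}

public class TdeRecoveryGuardSketch {
  static void checkRecoverable(KeyLookup keyProvider, String encryptionKeyName) {
    if (encryptionKeyName != null && !keyProvider.exists(encryptionKeyName)) {
      throw new IllegalStateException("Encryption key '" + encryptionKeyName
          + "' is missing; cannot recover this key from trash.");
    }
  }
}
{code}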






[jira] [Created] (HDDS-2429) Recover-trash should warn and skip if the key is a GDPR-ed key, since recovery is pointless once the encryption keys are lost.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2429:
--

 Summary: Recover-trash should warn and skip if the key is a GDPR-ed 
key, since recovery is pointless once the encryption keys are lost.
 Key: HDDS-2429
 URL: https://issues.apache.org/jira/browse/HDDS-2429
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Anu Engineer


If a bucket has GDPR enabled, then the encryption keys needed to read the data 
in the blocks are irrecoverably lost once a key is deleted. In that case, 
recovering from trash is pointless. The recover-trash command should detect 
this case and let the user know.
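
A minimal sketch of the skip logic; the types and names are illustrative, not 
the real OM classes:

{code:java}
// Illustrative only: for a GDPR-enabled bucket the per-key encryption
// material is destroyed on delete, so recovery would restore unreadable
// blocks. Warn and skip instead.
public class GdprRecoverySkipSketch {
  static boolean shouldSkip(boolean bucketGdprEnabled, String keyName) {
    if (bucketGdprEnabled) {
      System.err.println("Skipping " + keyName + ": GDPR bucket, encryption "
          + "keys were destroyed on delete; recovery is pointless.");
      return true;
    }
    return false;
  }
}
{code}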






[jira] [Created] (HDDS-2428) Rename a recovered file as .recovered if the file already exists in the target bucket.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2428:
--

 Summary: Rename a recovered file as .recovered if the file already 
exists in the target bucket.
 Key: HDDS-2428
 URL: https://issues.apache.org/jira/browse/HDDS-2428
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Anu Engineer


During recovery, if the file name already exists in the bucket, the new key 
being recovered should be renamed automatically. The proposal is to rename it 
to key.recovered.
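
A minimal sketch of that collision rule, with a Set standing in for a real 
bucket-existence lookup:

{code:java}
import java.util.Set;

// Name-collision rule from this JIRA: if the key name is already taken in
// the target bucket, recover under "<key>.recovered" instead.
public class RecoveredNameSketch {
  static String resolveTargetName(Set<String> existingKeys, String key) {
    return existingKeys.contains(key) ? key + ".recovered" : key;
  }
}
{code}

A fuller implementation would also handle the case where key.recovered itself 
already exists in the bucket.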






[jira] [Created] (HDDS-2426) Support recover-trash to an existing bucket.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2426:
--

 Summary:  Support recover-trash to an existing bucket.
 Key: HDDS-2426
 URL: https://issues.apache.org/jira/browse/HDDS-2426
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone Manager
Reporter: Anu Engineer


Support recovering trash to an existing bucket. We should also add a config 
key that prevents this mode, so admins can always force recovery into a new 
bucket.
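
A hedged sketch of the gate; the config key name below is an assumption:

{code:java}
// With the flag off, recovery into a pre-existing bucket is rejected,
// forcing the recovery into a fresh bucket. Key name is illustrative.
public final class RecoverTrashPolicySketch {
  static final String OZONE_OM_RECOVER_TO_EXISTING_BUCKET =
      "ozone.om.recover.trash.to.existing.bucket";

  static void checkTarget(boolean allowExistingBucket, boolean bucketExists) {
    if (bucketExists && !allowExistingBucket) {
      throw new IllegalArgumentException("Recovery into an existing bucket is"
          + " disabled by " + OZONE_OM_RECOVER_TO_EXISTING_BUCKET
          + "; pick a new bucket name.");
    }
  }
}
{code}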






[jira] [Created] (HDDS-2427) Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar

2019-11-07 Thread Bharat Viswanadham (Jira)
Bharat Viswanadham created HDDS-2427:


 Summary: Exclude webapps from hadoop-ozone-filesystem-lib-current 
uber jar
 Key: HDDS-2427
 URL: https://issues.apache.org/jira/browse/HDDS-2427
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Bharat Viswanadham


This has caused an issue with DataNode UI loading.

hadoop-ozone-filesystem-lib-current-xx.jar is on the classpath, which 
accidentally loads the Ozone datanode web application instead of the Hadoop 
datanode application, leading to the reported error.






[jira] [Created] (HDDS-2425) Support the ability to recover-trash to a new bucket.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2425:
--

 Summary: Support the ability to recover-trash to a new bucket.
 Key: HDDS-2425
 URL: https://issues.apache.org/jira/browse/HDDS-2425
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone Manager
Reporter: Anu Engineer


recover-trash can be run to recover to an existing bucket or to a new bucket. 
If the target bucket does not exist, the recover-trash command should create 
it automatically.
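
A minimal sketch of the create-if-missing step, with BucketStore standing in 
for the real volume/bucket client API:

{code:java}
// BucketStore is a stand-in; the point is creating the target bucket on
// demand before the recovery starts.
interface BucketStore {
  boolean bucketExists(String volume, String bucket);
  void createBucket(String volume, String bucket);
}

public class RecoverTargetBucketSketch {
  static void ensureTargetBucket(BucketStore store, String volume,
      String bucket) {
    if (!store.bucketExists(volume, bucket)) {
      store.createBucket(volume, bucket);
    }
  }
}
{code}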






[jira] [Created] (HDDS-2424) Add the recover-trash command server side handling.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2424:
--

 Summary: Add the recover-trash command server side handling.
 Key: HDDS-2424
 URL: https://issues.apache.org/jira/browse/HDDS-2424
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone Manager
Reporter: Anu Engineer


Add the standard server-side code for command handling.






[jira] [Updated] (HDDS-2423) Add the recover-trash command client side code

2019-11-07 Thread Anu Engineer (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-2423:
---
Description: Add protobuf, RpcClient and ClientSideTranslator code for the 
recover-trash command.  (was: Add protobuf, RpcClient and ClientSideTranslator 
code for the Empty-trash command.)

> Add the recover-trash command client side code
> --
>
> Key: HDDS-2423
> URL: https://issues.apache.org/jira/browse/HDDS-2423
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Anu Engineer
>Priority: Major
>
> Add protobuf, RpcClient and ClientSideTranslator code for the recover-trash 
> command.






[jira] [Created] (HDDS-2423) Add the recover-trash command client side code

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2423:
--

 Summary: Add the recover-trash command client side code
 Key: HDDS-2423
 URL: https://issues.apache.org/jira/browse/HDDS-2423
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone Manager
Reporter: Anu Engineer


Add protobuf, RpcClient and ClientSideTranslator code for the Empty-trash 
command.






[jira] [Created] (HDDS-2422) Add robot tests for list-trash command.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2422:
--

 Summary: Add robot tests for list-trash command.
 Key: HDDS-2422
 URL: https://issues.apache.org/jira/browse/HDDS-2422
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: test
Reporter: Anu Engineer


Add robot tests for the list-trash command and add them to integration.sh so 
they run as part of CI.






[jira] [Created] (HDDS-2421) Add documentation for list trash command.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2421:
--

 Summary: Add documentation for list trash command.
 Key: HDDS-2421
 URL: https://issues.apache.org/jira/browse/HDDS-2421
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: documentation
Reporter: Anu Engineer


Add documentation about the list-trash command.






[jira] [Created] (HDDS-2420) Add the Ozone shell support for list-trash command.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2420:
--

 Summary: Add the Ozone shell support for list-trash command.
 Key: HDDS-2420
 URL: https://issues.apache.org/jira/browse/HDDS-2420
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone CLI
Reporter: Anu Engineer


Add support for the list-trash command in the Ozone CLI. Please see the 
attached design doc.






[jira] [Created] (HDDS-2418) Add the list trash command server side handling.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2418:
--

 Summary: Add the list trash command server side handling.
 Key: HDDS-2418
 URL: https://issues.apache.org/jira/browse/HDDS-2418
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone Manager
Reporter: Anu Engineer


Add the standard server-side code for handling this command.
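
For illustration, the usual validate-then-execute shape, sketched with 
stand-in types rather than the real OM request-handling classes:

{code:java}
// All types are stand-ins; the shape is the usual one: validate the
// request, run the query, return the result.
final class ListTrashArgs {
  final String volume;
  final String bucket;
  final int maxKeys;

  ListTrashArgs(String volume, String bucket, int maxKeys) {
    this.volume = volume;
    this.bucket = bucket;
    this.maxKeys = maxKeys;
  }
}

public class ListTrashHandlerSketch {
  java.util.List<String> handle(ListTrashArgs args) {
    if (args.volume == null || args.bucket == null) {
      throw new IllegalArgumentException("volume and bucket are required");
    }
    if (args.maxKeys <= 0) {
      throw new IllegalArgumentException("maxKeys must be positive");
    }
    return queryDeletedTable(args);
  }

  // Placeholder; the real scan is the core logic tracked in HDDS-2419.
  java.util.List<String> queryDeletedTable(ListTrashArgs args) {
    return java.util.Collections.emptyList();
  }
}
{code}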






[jira] [Created] (HDDS-2419) Add the core logic to process list trash command.

2019-11-07 Thread Anu Engineer (Jira)
Anu Engineer created HDDS-2419:
--

 Summary: Add the core logic to process list trash command.
 Key: HDDS-2419
 URL: https://issues.apache.org/jira/browse/HDDS-2419
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone Manager
Reporter: Anu Engineer


Add the core logic for reading from the deleted table and returning the 
entries that match the user query.
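
A minimal sketch of that scan, assuming a /volume/bucket/key layout and using 
a NavigableMap as a stand-in for the RocksDB-backed deleted-keys table:

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

// The real code would seek a table iterator to the prefix; tailMap plays
// that role here.
public class DeletedTableScanSketch {
  static List<String> listTrash(NavigableMap<String, Object> deletedTable,
      String volume, String bucket, int maxKeys) {
    String prefix = "/" + volume + "/" + bucket + "/";
    List<String> matches = new ArrayList<>();
    for (String key : deletedTable.tailMap(prefix, true).keySet()) {
      // Keys are sorted, so the first non-matching key ends the scan.
      if (!key.startsWith(prefix) || matches.size() >= maxKeys) {
        break;
      }
      matches.add(key);
    }
    return matches;
  }
}
{code}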





