[jira] [Commented] (HDFS-12288) Fix DataNode's xceiver count calculation
[ https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969903#comment-16969903 ] Hadoop QA commented on HDFS-12288: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 14s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 23s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 32s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}102m 13s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}173m 47s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | | | hadoop.hdfs.qjournal.server.TestJournalNodeSync | | | hadoop.hdfs.TestGetFileChecksum | | | hadoop.hdfs.server.namenode.TestFSImage | | | hadoop.hdfs.server.datanode.TestDataNodeMetrics | | | hadoop.hdfs.server.namenode.ha.TestHAAppend | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-12288 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985311/HDFS-12288.006.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux b61cb7db2075 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 42fc888 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28276/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28276/testReport/ | | Max. process+thread count | 2614
[jira] [Commented] (HDFS-14928) UI: unifying the WebUI across different components.
[ https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969889#comment-16969889 ] Hadoop QA commented on HDFS-14928: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 10s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 37m 46s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 3s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 58m 26s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-14928 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985319/HDFS-14928.004.patch | | Optional Tests | dupname asflicense shadedclient | | uname | Linux 072f1bd170f8 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 42fc888 | | maven | version: Apache Maven 3.3.9 | | Max. process+thread count | 295 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28278/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > UI: unifying the WebUI across different components. > --- > > Key: HDFS-14928 > URL: https://issues.apache.org/jira/browse/HDFS-14928 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ui >Reporter: Xieming Li >Assignee: Xieming Li >Priority: Trivial > Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, > HDFS-14892-2.jpg, HDFS-14928.001.patch, HDFS-14928.002.patch, > HDFS-14928.003.patch, HDFS-14928.004.patch, HDFS-14928.jpg, NN_orig.png, > NN_with_legend.png, NN_wo_legend.png, RBF_orig.png, RBF_with_legend.png, > RBF_wo_legend.png > > > The WebUI of different components could be unified. > *Router:* > |Current| !RBF_orig.png|width=500! 
| > |Proposed 1 (With Icon) | !RBF_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500! | > *NameNode:* > |Current| !NN_orig.png|width=500! | > |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! | > *DataNode:* > |Current| !DN_orig.png|width=500! | > |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline
[ https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Cheng updated HDDS-2356: --- Attachment: 2019-11-06_18_13_57_422_ERROR > Multipart upload report errors while writing to ozone Ratis pipeline > > > Key: HDDS-2356 > URL: https://issues.apache.org/jira/browse/HDDS-2356 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager > Affects Versions: 0.4.1 > Environment: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM > Reporter: Li Cheng > Assignee: Bharat Viswanadham > Priority: Blocker > Fix For: 0.5.0 > > Attachments: 2019-11-06_18_13_57_422_ERROR, hs_err_pid9340.log, image-2019-10-31-18-56-56-177.png > > > Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say it's VM0. > I use goofys as a FUSE client and enable the Ozone S3 gateway to mount Ozone on a path on VM0, then read data from VM0's local disk and write it to the mount path. The dataset contains ~50,000 files of various sizes, from 0 bytes to GB-level. > Writing is slow (~10 minutes per GB) and stops after around 4 GB. Looking at the hadoop-root-om-VM_50_210_centos.out log, I see the OM throwing errors related to multipart upload. These errors eventually cause the writing to terminate and the OM to shut down. > > Updated on 11/06/2019: > See the new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR; full logs are in the attachment. 
> 2019-11-05 18:12:37,766 ERROR org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest: MultipartUpload Commit is failed for Key:./20191012/plc_1570863541668_9278 in Volume/Bucket s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload is with specified uploadId fcda8608-b431-48b7-8386-0a332f1a709a-103084683261641950
> at org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:156)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.java:217)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>
> Updated on 10/28/2019:
> See MISMATCH_MULTIPART_LIST error. 
> > 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete Multipart Upload Request for bucket: ozone-test, key: 20191012/plc_1570863541668_9278
> MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: Complete Multipart Upload Failed: volume: s3c89e813c80ffcea9543004d57b2a1239 bucket: ozone-test key: 20191012/plc_1570863541668_9278
> at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732)
> at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB.java:1104)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:66)
> at com.sun.proxy.$Proxy82.completeMultipartUpload(Unknown Source)
> at
[jira] [Commented] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline
[ https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969888#comment-16969888 ] Li Cheng commented on HDDS-2356: [~bharat] I have re-uploaded the logs; please check the attachment again. > Multipart upload report errors while writing to ozone Ratis pipeline > > > Key: HDDS-2356 > URL: https://issues.apache.org/jira/browse/HDDS-2356 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager > Affects Versions: 0.4.1 > Environment: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM > Reporter: Li Cheng > Assignee: Bharat Viswanadham > Priority: Blocker > Fix For: 0.5.0 > > Attachments: 2019-11-06_18_13_57_422_ERROR, hs_err_pid9340.log, image-2019-10-31-18-56-56-177.png > > > Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say it's VM0. > I use goofys as a FUSE client and enable the Ozone S3 gateway to mount Ozone on a path on VM0, then read data from VM0's local disk and write it to the mount path. The dataset contains ~50,000 files of various sizes, from 0 bytes to GB-level. > Writing is slow (~10 minutes per GB) and stops after around 4 GB. Looking at the hadoop-root-om-VM_50_210_centos.out log, I see the OM throwing errors related to multipart upload. These errors eventually cause the writing to terminate and the OM to shut down. > > Updated on 11/06/2019: > See the new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR; full logs are in the attachment. 
> 2019-11-05 18:12:37,766 ERROR org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest: MultipartUpload Commit is failed for Key:./20191012/plc_1570863541668_9278 in Volume/Bucket s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload is with specified uploadId fcda8608-b431-48b7-8386-0a332f1a709a-103084683261641950
> at org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:156)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.java:217)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>
> Updated on 10/28/2019:
> See MISMATCH_MULTIPART_LIST error. 
> > 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete Multipart Upload Request for bucket: ozone-test, key: 20191012/plc_1570863541668_9278
> MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: Complete Multipart Upload Failed: volume: s3c89e813c80ffcea9543004d57b2a1239 bucket: ozone-test key: 20191012/plc_1570863541668_9278
> at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732)
> at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB.java:1104)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:66)
> at com.sun.proxy.$Proxy82.completeMultipartUpload(Unknown Source)
> at
[jira] [Updated] (HDFS-14969) Fix HDFS client unnecessary failover log printing
[ https://issues.apache.org/jira/browse/HDFS-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xudong Cao updated HDFS-14969: -- Description: In a multi-NameNode setup, suppose there are 3 NNs and the 3rd is the ANN. A client that starts an RPC with the 1st NN fails over silently from the 1st NN to the 2nd NN, but when it fails over from the 2nd NN to the 3rd NN it prints unnecessary logs; in some scenarios these logs can be very numerous: {code:java} 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98) at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459) ...{code} > Fix HDFS client unnecessary failover log printing > - > > Key: HDFS-14969 > URL: https://issues.apache.org/jira/browse/HDFS-14969 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client > Affects Versions: 3.1.3 > Reporter: Xudong Cao > Assignee: Xudong Cao > Priority: Minor > > In a multi-NameNode setup, suppose there are 3 NNs and the 3rd is the ANN. A client that starts an RPC with the 1st NN fails over silently from the 1st NN to the 2nd NN, but when it fails over from the 2nd NN to the 3rd NN it prints unnecessary logs; in some scenarios these logs can be very numerous: > {code:java} > 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): > Operation category READ is not supported in state standby. 
Visit https://s.apache.org/sbnn-error
> at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
> at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
> ...{code}
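The ordered-failover behavior described above can be sketched roughly as follows. This is an illustrative toy, not the actual RetryInvocationHandler/failover-proxy-provider code; the class and method names (FailoverSketch, findActive) are made up for the example, and active/standby state is simulated with booleans instead of real NameNode RPCs.

```java
// Hypothetical sketch of the client-side failover loop described in
// HDFS-14969. All names here are illustrative, not Hadoop's real code.
import java.util.List;

public class FailoverSketch {

    // Probe NameNodes in configured order and return the index of the first
    // active one, or -1 if none is active. The real client stays silent on
    // the first failover but prints an INFO log with a full StandbyException
    // stack trace on every subsequent one -- the noise this issue targets.
    static int findActive(List<Boolean> isActive) {
        int failovers = 0;
        for (int i = 0; i < isActive.size(); i++) {
            if (isActive.get(i)) {
                return i;
            }
            failovers++;
            if (failovers > 1) {
                // This is where the unnecessary log line appears today.
                System.out.println(
                    "INFO retry.RetryInvocationHandler: failing over, attempt #" + failovers);
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // 3 NNs, only the 3rd (index 2) active: one silent failover, then
        // one logged failover before the client reaches the ANN.
        System.out.println("active index = " + findActive(List.of(false, false, true)));
    }
}
```

With 3 NNs and the ANN at index 2, the sketch performs one silent failover and one logged one, matching the behavior the description complains about.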
[jira] [Created] (HDFS-14969) Fix HDFS client unnecessary failover log printing
Xudong Cao created HDFS-14969: - Summary: Fix HDFS client unnecessary failover log printing Key: HDFS-14969 URL: https://issues.apache.org/jira/browse/HDFS-14969 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.1.3 Reporter: Xudong Cao Assignee: Xudong Cao
[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.
[ https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969795#comment-16969795 ] Xudong Cao edited comment on HDFS-14963 at 11/8/19 6:16 AM: cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. For ease of reading, I have uploaded an additional patch besides the GitHub PR (they are exactly the same patch). Based on this patch: # The cache directory is configurable via a newly introduced item "dfs.client.failover.cache-active.dir"; its default value is ${java.io.tmpdir}, which is /tmp on Linux. # Writing/reading a cache file is done under file lock protection, and we use tryLock() instead of lock(), so in a high-concurrency scenario reading/writing the cache file will not become a bottleneck. If tryLock() fails while reading, we just fall back to what we have today: simply return index 0. If tryLock() fails while writing, the write simply returns and continues. In fact, I think both situations should be very rare. # All cache files' modes are manually set to "666", meaning every process can read/write them. # This cache mechanism is robust: regardless of whether the cache file was accidentally deleted or its content maliciously modified, readActiveCache() always returns a legal index, and writeActiveCache() automatically rebuilds the cache file on the next failover. Of course, all abnormal situations produce a WARN log. # We do have dfs.client.failover.random.order; I have actually used it in the unit test. ZKFC does know which NN is active right now, but it does not expose an RPC interface to get it, and I think an RPC call is much more expensive than reading/writing local files. # cc [~xkrogen], I will tackle the logging issue discussed in (2) in a separate JIRA. was (Author: xudongcao): cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides github PR (they are exactly a same patch). Based on this patch: # The cache directory is configurable by a newly introduced item "dfs.client.failover.cache-active.dir", its default value is ${java.io.tmpdir}, which is /tmp on Linux platform. # Writing/Reading a cache file is under file lock protection, and we use trylock() instead of lock(), so in a high-concurrency scenario, reading/writing cache file will not become the bottleneck. if trylock() failed while reading, it just fall back to what we have today: simply return an index 0. And if trylock() failed while writing, it simply returns and continues. In fact, I think both these situations should be very rare. # All cache files' mode are manually set to "666", meaning every process can read/write them. # This cache mechanism is robust, regardless of whether the cache file was accidentally deleted or the content was maliciously modified, the readActiveCache() always returns a legal index, and writeActiveCache() will automatically rebuild the cache file on next failover. Of course in all abnormal situations there will be a WARN log. # We surely have dfs.client.failover.random.order, actually I have used it in the unit test, Zkfc does know who is active NN right now, but it does not have an rpc interface allowing us to get it. and I think an rpc call is much more expensive than reading/writing local files. # cc [~xkrogen] , I will then tacle the logging issue discussed in (2) in a separate JIRA. > Add HDFS Client machine caching active namenode index mechanism. 
> > > Key: HDFS-14963 > URL: https://issues.apache.org/jira/browse/HDFS-14963 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client > Affects Versions: 3.1.3 > Reporter: Xudong Cao > Assignee: Xudong Cao > Priority: Minor > Attachments: HDFS-14963.000.patch, HDFS-14963.001.patch > > > In a multi-NameNode setup, a new HDFS client always begins its RPC calls from the 1st NameNode, simply polling until it determines the current Active NameNode. > This brings at least two problems: > # Extra failover cost, especially when clients are created frequently. > # Unnecessary log printing: suppose there are 3 NNs and the 3rd is the ANN; a client that starts an RPC with the 1st NN fails over silently from the 1st NN to the 2nd NN, but when it fails over from the 2nd NN to the 3rd NN it prints unnecessary logs; in some scenarios these logs can be very numerous: > {code:java} > 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): > Operation category READ is not supported in state
[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.
[ https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xudong Cao updated HDFS-14963: -- Attachment: HDFS-14963.001.patch > Add HDFS Client machine caching active namenode index mechanism. > > > Key: HDFS-14963 > URL: https://issues.apache.org/jira/browse/HDFS-14963 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client > Affects Versions: 3.1.3 > Reporter: Xudong Cao > Assignee: Xudong Cao > Priority: Minor > Attachments: HDFS-14963.000.patch, HDFS-14963.001.patch > > > In a multi-NameNode setup, a new HDFS client always begins its RPC calls from the 1st NameNode, simply polling until it determines the current Active NameNode. > This brings at least two problems: > # Extra failover cost, especially when clients are created frequently. > # Unnecessary log printing: suppose there are 3 NNs and the 3rd is the ANN; a client that starts an RPC with the 1st NN fails over silently from the 1st NN to the 2nd NN, but when it fails over from the 2nd NN to the 3rd NN it prints unnecessary logs; in some scenarios these logs can be very numerous: > {code:java} > 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): > Operation category READ is not supported in state standby. Visit > https://s.apache.org/sbnn-error > at > org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98) > at > org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459) > ...{code} > We can introduce a solution to this problem: on the client machine, for every HDFS cluster, cache its current Active NameNode index in a separate cache file named by its URI. *Note these cache files are shared by all HDFS client processes on this machine*. > For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client machine's cache file directory is /tmp; then: > # the ns1 cluster's cache file is /tmp/ns1 > # the ns2 cluster's cache file is /tmp/ns2 > And then: > # When a client starts, it reads the current Active NameNode index from the corresponding cache file (chosen by the target HDFS URI) and directly makes an RPC call to the right ANN. > # After each failover, the client writes the latest Active NameNode index to the corresponding cache file.
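The read-and-write-under-tryLock scheme described above can be sketched as below. This is a minimal illustration only: the class and method names (ActiveNNCache, readActiveIndex, writeActiveIndex) are hypothetical and may differ from the actual patch's readActiveCache()/writeActiveCache(); the chmod-to-"666" step and WARN logging from the comment are omitted.

```java
// Illustrative sketch (not the HDFS-14963 patch): one small file per
// nameservice holding the last known Active NameNode index, accessed under
// a non-blocking file lock so contention never stalls a client.
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ActiveNNCache {

    // Read the cached index; fall back to 0 (what clients do today) when the
    // file is missing, unreadable, corrupt, or the lock cannot be acquired.
    static int readActiveIndex(Path cacheFile) {
        try (FileChannel ch = FileChannel.open(cacheFile, StandardOpenOption.READ)) {
            FileLock lock = ch.tryLock(0, Long.MAX_VALUE, true); // shared lock
            if (lock == null) return 0;                          // contended: don't block
            ByteBuffer buf = ByteBuffer.allocate(16);
            int n = ch.read(buf);
            if (n <= 0) return 0;
            int idx = Integer.parseInt(
                new String(buf.array(), 0, n, StandardCharsets.UTF_8).trim());
            return idx >= 0 ? idx : 0;                           // always a legal index
        } catch (IOException | NumberFormatException e) {
            return 0;
        }
    }

    // Write the latest Active index after a failover; give up silently if
    // another process holds the lock (it is writing fresher data anyway).
    static void writeActiveIndex(Path cacheFile, int index) {
        try (FileChannel ch = FileChannel.open(cacheFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            FileLock lock = ch.tryLock();                        // exclusive lock
            if (lock == null) return;
            ch.write(ByteBuffer.wrap(
                Integer.toString(index).getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            // ignore: the cache is rebuilt on the next failover anyway
        }
    }

    public static void main(String[] args) throws IOException {
        Path cache = java.nio.file.Files.createTempFile("ns1", ".cache");
        writeActiveIndex(cache, 2);
        System.out.println("cached active index = " + readActiveIndex(cache));
    }
}
```

Because every failure mode degrades to index 0, a deleted or corrupted cache file merely reproduces today's behavior rather than breaking the client, which is the robustness property the comment claims.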
[jira] [Updated] (HDFS-14928) UI: unifying the WebUI across different components.
[ https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xieming Li updated HDFS-14928: -- Status: Open (was: Patch Available) > UI: unifying the WebUI across different components. > --- > > Key: HDFS-14928 > URL: https://issues.apache.org/jira/browse/HDFS-14928 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ui > Reporter: Xieming Li > Assignee: Xieming Li > Priority: Trivial > Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, HDFS-14892-2.jpg, HDFS-14928.001.patch, HDFS-14928.002.patch, HDFS-14928.003.patch, HDFS-14928.004.patch, HDFS-14928.jpg, NN_orig.png, NN_with_legend.png, NN_wo_legend.png, RBF_orig.png, RBF_with_legend.png, RBF_wo_legend.png > > > The WebUI of different components could be unified. > *Router:* > |Current| !RBF_orig.png|width=500! | > |Proposed 1 (With Icon) | !RBF_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500! | > *NameNode:* > |Current| !NN_orig.png|width=500! | > |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! | > *DataNode:* > |Current| !DN_orig.png|width=500! | > |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! |
[jira] [Updated] (HDFS-14928) UI: unifying the WebUI across different components.
[ https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xieming Li updated HDFS-14928: -- Attachment: HDFS-14928.004.patch Status: Patch Available (was: Open) > UI: unifying the WebUI across different components. > --- > > Key: HDFS-14928 > URL: https://issues.apache.org/jira/browse/HDFS-14928 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ui > Reporter: Xieming Li > Assignee: Xieming Li > Priority: Trivial > Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, HDFS-14892-2.jpg, HDFS-14928.001.patch, HDFS-14928.002.patch, HDFS-14928.003.patch, HDFS-14928.004.patch, HDFS-14928.jpg, NN_orig.png, NN_with_legend.png, NN_wo_legend.png, RBF_orig.png, RBF_with_legend.png, RBF_wo_legend.png > > > The WebUI of different components could be unified. > *Router:* > |Current| !RBF_orig.png|width=500! | > |Proposed 1 (With Icon) | !RBF_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500! | > *NameNode:* > |Current| !NN_orig.png|width=500! | > |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! | > *DataNode:* > |Current| !DN_orig.png|width=500! | > |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! |
[jira] [Commented] (HDFS-14928) UI: unifying the WebUI across different components.
[ https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969865#comment-16969865 ] Xieming Li commented on HDFS-14928: --- Seems like that JPG was downloaded as a patch. ``` HDFS-14928 patch is being downloaded at Fri Nov 8 05:50:42 UTC 2019 from https://issues.apache.org/jira/secure/attachment/12985317/HDFS-14892-2.jpg -> Downloaded ``` > UI: unifying the WebUI across different components. > --- > > Key: HDFS-14928 > URL: https://issues.apache.org/jira/browse/HDFS-14928 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ui >Reporter: Xieming Li >Assignee: Xieming Li >Priority: Trivial > Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, > HDFS-14892-2.jpg, HDFS-14928.001.patch, HDFS-14928.002.patch, > HDFS-14928.003.patch, HDFS-14928.jpg, NN_orig.png, NN_with_legend.png, > NN_wo_legend.png, RBF_orig.png, RBF_with_legend.png, RBF_wo_legend.png > > > The WebUI of different components could be unified. > *Router:* > |Current| !RBF_orig.png|width=500! | > |Proposed 1 (With Icon) | !RBF_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500! | > *NameNode:* > |Current| !NN_orig.png|width=500! | > |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! | > *DataNode:* > |Current| !DN_orig.png|width=500! | > |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14928) UI: unifying the WebUI across different components.
[ https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969865#comment-16969865 ] Xieming Li edited comment on HDFS-14928 at 11/8/19 6:04 AM: Seems like that JPG was downloaded as a patch. {code} HDFS-14928 patch is being downloaded at Fri Nov 8 05:50:42 UTC 2019 from https://issues.apache.org/jira/secure/attachment/12985317/HDFS-14892-2.jpg -> Downloaded {code} was (Author: risyomei): Seems like that JPG was downloaded as a patch. ``` HDFS-14928 patch is being downloaded at Fri Nov 8 05:50:42 UTC 2019 from https://issues.apache.org/jira/secure/attachment/12985317/HDFS-14892-2.jpg -> Downloaded ``` > UI: unifying the WebUI across different components. > --- > > Key: HDFS-14928 > URL: https://issues.apache.org/jira/browse/HDFS-14928 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ui >Reporter: Xieming Li >Assignee: Xieming Li >Priority: Trivial > Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, > HDFS-14892-2.jpg, HDFS-14928.001.patch, HDFS-14928.002.patch, > HDFS-14928.003.patch, HDFS-14928.jpg, NN_orig.png, NN_with_legend.png, > NN_wo_legend.png, RBF_orig.png, RBF_with_legend.png, RBF_wo_legend.png > > > The WebUI of different components could be unified. > *Router:* > |Current| !RBF_orig.png|width=500! | > |Proposed 1 (With Icon) | !RBF_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500! | > *NameNode:* > |Current| !NN_orig.png|width=500! | > |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! | > *DataNode:* > |Current| !DN_orig.png|width=500! | > |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! 
| -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2443) Python client/interface for Ozone
[ https://issues.apache.org/jira/browse/HDDS-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YiSheng Lien updated HDDS-2443: --- Description: Original ideas: item#25 in [https://cwiki.apache.org/confluence/display/HADOOP/Ozone+project+ideas+for+new+contributors] Ozone Client(Python) for Data Science Notebook such as Jupyter. # Size: Large # PyArrow: [https://pypi.org/project/pyarrow/] # Python -> libhdfs HDFS JNI library (HDFS, S3,...) -> Java client API Impala uses libhdfs Path to try: # s3 interface: Ozone s3 gateway(already supported) + AWS python client (boto3) # python native RPC # pyarrow + libhdfs, which use the Java client under the hood. was: Original ideas: Ozone Client(Python) for Data Science Notebook such as Jupyter. # Size: Large # PyArrow: [https://pypi.org/project/pyarrow/] # Python -> libhdfs HDFS JNI library (HDFS, S3,...) -> Java client API Impala uses libhdfs Path to try: 1. s3 interface: Ozone s3 gateway(already supported) + AWS python client (boto3) 2. python native RPC 3. pyarrow + libhdfs, which use the Java client under the hood. > Python client/interface for Ozone > - > > Key: HDDS-2443 > URL: https://issues.apache.org/jira/browse/HDDS-2443 > Project: Hadoop Distributed Data Store > Issue Type: New Feature > Components: Ozone Client >Reporter: Li Cheng >Priority: Major > > Original ideas: item#25 in > [https://cwiki.apache.org/confluence/display/HADOOP/Ozone+project+ideas+for+new+contributors] > Ozone Client(Python) for Data Science Notebook such as Jupyter. > # Size: Large > # PyArrow: [https://pypi.org/project/pyarrow/] > # Python -> libhdfs HDFS JNI library (HDFS, S3,...) -> Java client API > Impala uses libhdfs > > Path to try: > # s3 interface: Ozone s3 gateway(already supported) + AWS python client > (boto3) > # python native RPC > # pyarrow + libhdfs, which use the Java client under the hood. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2443) Python client/interface for Ozone
[ https://issues.apache.org/jira/browse/HDDS-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YiSheng Lien updated HDDS-2443: --- Description: Original ideas: Ozone Client(Python) for Data Science Notebook such as Jupyter. # Size: Large # PyArrow: [https://pypi.org/project/pyarrow/] # Python -> libhdfs HDFS JNI library (HDFS, S3,...) -> Java client API Impala uses libhdfs Path to try: 1. s3 interface: Ozone s3 gateway(already supported) + AWS python client (boto3) 2. python native RPC 3. pyarrow + libhdfs, which use the Java client under the hood. was: Original ideas: Ozone Client(Python) for Data Science Notebook such as Jupyter. # Size: Large # PyArrow: [https://pypi.org/project/pyarrow/] # Python -> libhdfs HDFS JNI library (HDFS, S3,...) -> Java client API Impala uses libhdfs # How Jupyter iPython work: [https://jupyter.readthedocs.io/en/latest/architecture/how_jupyter_ipython_work.html] # Eco, Architecture:[https://ipython-books.github.io/chapter-3-mastering-the-jupyter-notebook/] Path to try: 1. s3 interface: Ozone s3 gateway(already supported) + AWS python client (boto3) 2. python native RPC 3. pyarrow + libhdfs, which use the Java client under the hood. > Python client/interface for Ozone > - > > Key: HDDS-2443 > URL: https://issues.apache.org/jira/browse/HDDS-2443 > Project: Hadoop Distributed Data Store > Issue Type: New Feature > Components: Ozone Client >Reporter: Li Cheng >Priority: Major > > Original ideas: > Ozone Client(Python) for Data Science Notebook such as Jupyter. > # Size: Large > # PyArrow: [https://pypi.org/project/pyarrow/] > # Python -> libhdfs HDFS JNI library (HDFS, S3,...) -> Java client API > Impala uses libhdfs > > Path to try: > 1. s3 interface: Ozone s3 gateway(already supported) + AWS python client > (boto3) > 2. python native RPC > 3. pyarrow + libhdfs, which use the Java client under the hood. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
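"Path 1" in the HDDS-2443 description above (Ozone S3 gateway + the AWS Python client) can be sketched with plain boto3. Everything concrete here is an assumption for illustration: the endpoint uses the default s3g port 9878, the credentials are placeholders accepted by an unsecured gateway, and the bucket name is invented. Actually running the upload requires a live Ozone cluster with s3g enabled.

```python
# Placeholder endpoint and bucket (assumptions, not from the issue).
S3G_ENDPOINT = "http://localhost:9878"   # default Ozone s3g port
BUCKET = "ozone-test"

def make_client(endpoint=S3G_ENDPOINT):
    # boto3 imported lazily so this module loads without it installed.
    import boto3
    return boto3.client(
        "s3",
        endpoint_url=endpoint,
        aws_access_key_id="any",          # unsecured s3g accepts arbitrary keys
        aws_secret_access_key="any",
    )

def put_object(client, key, data):
    client.put_object(Bucket=BUCKET, Key=key, Body=data)

if __name__ == "__main__":
    c = make_client()
    put_object(c, "hello.txt", b"hello from python")
```

Because s3g speaks the S3 wire protocol, no Ozone-specific client code is needed for this path; the other two paths (native RPC, pyarrow + libhdfs) would need bindings of their own.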
[jira] [Commented] (HDFS-12288) Fix DataNode's xceiver count calculation
[ https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969862#comment-16969862 ] Chen Zhang commented on HDFS-12288: --- Thanks [~elgoiri] for the comments, updated the patch to v6 according to comments 1 & 2. {quote} * What's the deal with TestNamenodeCapacityReport?{quote} Before the patch, every live DataNode reported at least 1 xceiver to the NameNode, but it was not actually a real xceiver, just the {{DataXceiverServer}} thread, which is added to {{threadGroup}} during DataNode initialization. After the patch, the xceiver count is the *real* number of data-transfer threads, so an idle DataNode reports an xceiver count of 0 to the NameNode. {{TestNamenodeCapacityReport}} assumes there is at least 1 xceiver for each DataNode and leverages this to check the cluster's live node count, so the test needs to be updated with this patch. > Fix DataNode's xceiver count calculation > > > Key: HDFS-12288 > URL: https://issues.apache.org/jira/browse/HDFS-12288 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, hdfs >Reporter: Lukas Majercak >Assignee: Chen Zhang >Priority: Major > Attachments: HDFS-12288.001.patch, HDFS-12288.002.patch, > HDFS-12288.003.patch, HDFS-12288.004.patch, HDFS-12288.005.patch, > HDFS-12288.006.patch > > > The problem with the ThreadGroup.activeCount() method is that it is > only a very rough estimate, and in reality returns the total number of > threads in the thread group as opposed to the threads actually running. > In some DNs, we saw this return ~50 for a long time, even though the > actual number of DataXceiver threads was next to none. > This is a big issue as we use the xceiverCount to make decisions on the NN > for choosing a replication source DN or returning DNs to clients for R/W. 
> The plan is to reuse the DataNodeMetrics.dataNodeActiveXceiversCount value > which only accounts for actual number of DataXcevier threads currently > running and thus represents the load on the DN much better. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
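The distinction drawn above — a thread-group count that includes the long-lived {{DataXceiverServer}} acceptor thread versus an explicit counter of running data-transfer threads — can be sketched in miniature. This is a Python analogy for illustration only, not the actual (Java) DataNode code; the class and method names here are invented.

```python
import threading

class MiniXceiverServer:
    """Analogy: one long-lived acceptor thread plus short-lived worker threads."""

    def __init__(self):
        self.group = []                  # plays the role of the Java ThreadGroup
        self.active = 0                  # explicit counter (dataNodeActiveXceiversCount analogue)
        self._lock = threading.Lock()
        self._stop = threading.Event()
        # The acceptor is alive from startup, even when no transfers are running.
        acceptor = threading.Thread(target=self._stop.wait)
        acceptor.start()
        self.group.append(acceptor)

    def run_transfer(self, started, release):
        def work():
            with self._lock:
                self.active += 1         # counted only while actually transferring
            started.set()
            release.wait()
            with self._lock:
                self.active -= 1
        t = threading.Thread(target=work)
        t.start()
        self.group.append(t)

    def group_count(self):
        # The "buggy" metric: counts every live thread in the group.
        return sum(1 for t in self.group if t.is_alive())

    def xceiver_count(self):
        # The "fixed" metric: only real data-transfer threads.
        return self.active

    def shutdown(self):
        self._stop.set()
        for t in self.group:
            t.join()

srv = MiniXceiverServer()
idle_group = srv.group_count()    # 1: the acceptor is counted although the node is idle
idle_real = srv.xceiver_count()   # 0: no transfer is running
started, release = threading.Event(), threading.Event()
srv.run_transfer(started, release)
started.wait()
busy_real = srv.xceiver_count()   # 1 while the transfer runs
release.set()
srv.shutdown()
```

The idle case is exactly why TestNamenodeCapacityReport had to change: the old metric never drops to 0 on a live node, while the new one does.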
[jira] [Commented] (HDFS-14928) UI: unifying the WebUI across different components.
[ https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969854#comment-16969854 ] Hadoop QA commented on HDFS-14928: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 10s{color} | {color:red} HDFS-14928 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-14928 | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28277/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > UI: unifying the WebUI across different components. > --- > > Key: HDFS-14928 > URL: https://issues.apache.org/jira/browse/HDFS-14928 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ui >Reporter: Xieming Li >Assignee: Xieming Li >Priority: Trivial > Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, > HDFS-14892-2.jpg, HDFS-14928.001.patch, HDFS-14928.002.patch, > HDFS-14928.003.patch, HDFS-14928.jpg, NN_orig.png, NN_with_legend.png, > NN_wo_legend.png, RBF_orig.png, RBF_with_legend.png, RBF_wo_legend.png > > > The WebUI of different components could be unified. > *Router:* > |Current| !RBF_orig.png|width=500! | > |Proposed 1 (With Icon) | !RBF_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500! | > *NameNode:* > |Current| !NN_orig.png|width=500! | > |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! | > *DataNode:* > |Current| !DN_orig.png|width=500! | > |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! 
| > |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14928) UI: unifying the WebUI across different components.
[ https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969853#comment-16969853 ] Xieming Li commented on HDFS-14928: --- [~elgoiri] Thank you for your detailed review, I have modified the variable names in hadoop.css to match naming convention. I have also conducted manual test which you can find in the following screenshots: !HDFS-14892-2.jpg! > UI: unifying the WebUI across different components. > --- > > Key: HDFS-14928 > URL: https://issues.apache.org/jira/browse/HDFS-14928 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ui >Reporter: Xieming Li >Assignee: Xieming Li >Priority: Trivial > Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, > HDFS-14892-2.jpg, HDFS-14928.001.patch, HDFS-14928.002.patch, > HDFS-14928.003.patch, HDFS-14928.jpg, NN_orig.png, NN_with_legend.png, > NN_wo_legend.png, RBF_orig.png, RBF_with_legend.png, RBF_wo_legend.png > > > The WebUI of different components could be unified. > *Router:* > |Current| !RBF_orig.png|width=500! | > |Proposed 1 (With Icon) | !RBF_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500! | > *NameNode:* > |Current| !NN_orig.png|width=500! | > |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! | > *DataNode:* > |Current| !DN_orig.png|width=500! | > |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14928) UI: unifying the WebUI across different components.
[ https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xieming Li updated HDFS-14928: -- Attachment: HDFS-14892-2.jpg > UI: unifying the WebUI across different components. > --- > > Key: HDFS-14928 > URL: https://issues.apache.org/jira/browse/HDFS-14928 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ui >Reporter: Xieming Li >Assignee: Xieming Li >Priority: Trivial > Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, > HDFS-14892-2.jpg, HDFS-14928.001.patch, HDFS-14928.002.patch, > HDFS-14928.003.patch, HDFS-14928.jpg, NN_orig.png, NN_with_legend.png, > NN_wo_legend.png, RBF_orig.png, RBF_with_legend.png, RBF_wo_legend.png > > > The WebUI of different components could be unified. > *Router:* > |Current| !RBF_orig.png|width=500! | > |Proposed 1 (With Icon) | !RBF_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500! | > *NameNode:* > |Current| !NN_orig.png|width=500! | > |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! | > *DataNode:* > |Current| !DN_orig.png|width=500! | > |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14928) UI: unifying the WebUI across different components.
[ https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xieming Li updated HDFS-14928: -- Status: Open (was: Patch Available) > UI: unifying the WebUI across different components. > --- > > Key: HDFS-14928 > URL: https://issues.apache.org/jira/browse/HDFS-14928 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ui >Reporter: Xieming Li >Assignee: Xieming Li >Priority: Trivial > Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, > HDFS-14928.001.patch, HDFS-14928.002.patch, HDFS-14928.003.patch, > HDFS-14928.jpg, NN_orig.png, NN_with_legend.png, NN_wo_legend.png, > RBF_orig.png, RBF_with_legend.png, RBF_wo_legend.png > > > The WebUI of different components could be unified. > *Router:* > |Current| !RBF_orig.png|width=500! | > |Proposed 1 (With Icon) | !RBF_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500! | > *NameNode:* > |Current| !NN_orig.png|width=500! | > |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! | > *DataNode:* > |Current| !DN_orig.png|width=500! | > |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14928) UI: unifying the WebUI across different components.
[ https://issues.apache.org/jira/browse/HDFS-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xieming Li updated HDFS-14928: -- Attachment: HDFS-14928.003.patch Status: Patch Available (was: Open) > UI: unifying the WebUI across different components. > --- > > Key: HDFS-14928 > URL: https://issues.apache.org/jira/browse/HDFS-14928 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ui >Reporter: Xieming Li >Assignee: Xieming Li >Priority: Trivial > Attachments: DN_orig.png, DN_with_legend.png.png, DN_wo_legend.png, > HDFS-14928.001.patch, HDFS-14928.002.patch, HDFS-14928.003.patch, > HDFS-14928.jpg, NN_orig.png, NN_with_legend.png, NN_wo_legend.png, > RBF_orig.png, RBF_with_legend.png, RBF_wo_legend.png > > > The WebUI of different components could be unified. > *Router:* > |Current| !RBF_orig.png|width=500! | > |Proposed 1 (With Icon) | !RBF_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)|!RBF_with_legend.png|width=500! | > *NameNode:* > |Current| !NN_orig.png|width=500! | > |Proposed 1 (With Icon) | !NN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !NN_with_legend.png|width=500! | > *DataNode:* > |Current| !DN_orig.png|width=500! | > |Proposed 1 (With Icon) | !DN_wo_legend.png|width=500! | > |Proposed 2 (With Icon and Legend)| !DN_with_legend.png.png|width=500! | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14966) In NameNode Web UI, Some values are shown as negative
[ https://issues.apache.org/jira/browse/HDFS-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969850#comment-16969850 ] Ayush Saxena commented on HDFS-14966: - Thanks [~zgw] for the report. It seems not only are some values negative, but Block Pool Used and DFS Used are also showing as undefined. Did you check the JMX values for capacity used and capacity remaining? What are they? Did you check the Datanodes page to see whether the storage stats are correct there? This could be an issue with the NameNode calculating stats while combining reports from 10K DataNodes. Any log evidence at the NameNode? Is the NameNode able to perform all write operations? > In NameNode Web UI, Some values are shown as negative > -- > > Key: HDFS-14966 > URL: https://issues.apache.org/jira/browse/HDFS-14966 > Project: Hadoop HDFS > Issue Type: Bug > Components: ui >Affects Versions: 3.1.1 >Reporter: bright.zhou >Priority: Minor > Attachments: image1.png, image2.png > > > When simulating a large 10k-node cluster to test HDFS, the NameNode web UI > wrongly shows negative values. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2399) Update mailing list information in CONTRIBUTION and README files
[ https://issues.apache.org/jira/browse/HDDS-2399?focusedWorklogId=340359=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-340359 ] ASF GitHub Bot logged work on HDDS-2399: Author: ASF GitHub Bot Created on: 08/Nov/19 05:36 Start Date: 08/Nov/19 05:36 Worklog Time Spent: 10m Work Description: bharatviswa504 commented on pull request #126: HDDS-2399. Update mailing list information. URL: https://github.com/apache/hadoop-ozone/pull/126 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 340359) Time Spent: 20m (was: 10m) > Update mailing list information in CONTRIBUTION and README files > > > Key: HDDS-2399 > URL: https://issues.apache.org/jira/browse/HDDS-2399 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Marton Elek >Priority: Major > Labels: newbie, pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > We have new mailing lists: > [ozone-...@hadoop.apache.org|mailto:ozone-...@hadoop.apache.org] > [ozone-iss...@hadoop.apache.org|mailto:ozone-iss...@hadoop.apache.org] > [ozone-comm...@hadoop.apache.org|mailto:ozone-comm...@hadoop.apache.org] > > We need to update CONTRIBUTION.md and README.md to use ozone-dev instead of > hdfs-dev (optionally we can mention the issues/commits lists, but only in > CONTRIBUTION.md) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2399) Update mailing list information in CONTRIBUTION and README files
[ https://issues.apache.org/jira/browse/HDDS-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham resolved HDDS-2399. -- Fix Version/s: 0.5.0 Resolution: Fixed > Update mailing list information in CONTRIBUTION and README files > > > Key: HDDS-2399 > URL: https://issues.apache.org/jira/browse/HDDS-2399 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Marton Elek >Priority: Major > Labels: newbie, pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > We have new mailing lists: > [ozone-...@hadoop.apache.org|mailto:ozone-...@hadoop.apache.org] > [ozone-iss...@hadoop.apache.org|mailto:ozone-iss...@hadoop.apache.org] > [ozone-comm...@hadoop.apache.org|mailto:ozone-comm...@hadoop.apache.org] > > We need to update CONTRIBUTION.md and README.md to use ozone-dev instead of > hdfs-dev (optionally we can mention the issues/commits lists, but only in > CONTRIBUTION.md) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline
[ https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969847#comment-16969847 ] Bharat Viswanadham edited comment on HDDS-2356 at 11/8/19 5:26 AM: --- I think this error is not related to the NO_SUCH_MULTIPART_UPLOAD_ERROR. I have fixed MISMATCH_MULTIPART_LIST in HDDS-2395. was (Author: bharatviswa): I think this error is not related to the NO_SUCH_MULTIPART_UPLOAD_ERROR. I have fixed MISMATCH_ERROR in HDDS-2395. > Multipart upload report errors while writing to ozone Ratis pipeline > > > Key: HDDS-2356 > URL: https://issues.apache.org/jira/browse/HDDS-2356 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.4.1 > Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM > on a separate VM >Reporter: Li Cheng >Assignee: Bharat Viswanadham >Priority: Blocker > Fix For: 0.5.0 > > Attachments: hs_err_pid9340.log, image-2019-10-31-18-56-56-177.png > > > Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say > it's VM0. > I use goofys as a fuse and enable ozone S3 gateway to mount ozone to a path > on VM0, while reading data from VM0 local disk and write to mount path. The > dataset has various sizes of files from 0 byte to GB-level and it has a > number of ~50,000 files. > The writing is slow (1GB for ~10 mins) and it stops after around 4GB. As I > look at hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors > related with Multipart upload. This error eventually causes the writing to > terminate and OM to be closed. > > Updated on 11/06/2019: > See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs > are in the attachment. 
> 2019-11-05 18:12:37,766 ERROR > org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest: > MultipartUpload Commit is failed for Key:./2 > 0191012/plc_1570863541668_9278 in Volume/Bucket > s325d55ad283aa400af464c76d713c07ad/ozone-test > NO_SUCH_MULTIPART_UPLOAD_ERROR > org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload > is with specified uploadId fcda8608-b431-48b7-8386- > 0a332f1a709a-103084683261641950 > at > org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:1 > 56) > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB. > java:217) > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132) > at > org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72) > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100) > at > org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > > Updated on 10/28/2019: > See MISMATCH_MULTIPART_LIST error. 
> > 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete > Multipart Upload Request for bucket: ozone-test, key: > 20191012/plc_1570863541668_927 > 8 > MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: > Complete Multipart Upload Failed: volume: > s3c89e813c80ffcea9543004d57b2a1239bucket: > ozone-testkey: 20191012/plc_1570863541668_9278 > at > org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732) > at > org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB > .java:1104) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at >
[jira] [Commented] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline
[ https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969847#comment-16969847 ] Bharat Viswanadham commented on HDDS-2356: -- I think this error is not related to the NO_SUCH_MULTIPART_UPLOAD_ERROR. I have fixed MISMATCH_ERROR in HDDS-2395. > Multipart upload report errors while writing to ozone Ratis pipeline > > > Key: HDDS-2356 > URL: https://issues.apache.org/jira/browse/HDDS-2356 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.4.1 > Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM > on a separate VM >Reporter: Li Cheng >Assignee: Bharat Viswanadham >Priority: Blocker > Fix For: 0.5.0 > > Attachments: hs_err_pid9340.log, image-2019-10-31-18-56-56-177.png > > > Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say > it's VM0. > I use goofys as a fuse and enable ozone S3 gateway to mount ozone to a path > on VM0, while reading data from VM0 local disk and write to mount path. The > dataset has various sizes of files from 0 byte to GB-level and it has a > number of ~50,000 files. > The writing is slow (1GB for ~10 mins) and it stops after around 4GB. As I > look at hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors > related with Multipart upload. This error eventually causes the writing to > terminate and OM to be closed. > > Updated on 11/06/2019: > See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs > are in the attachment. 
> 2019-11-05 18:12:37,766 ERROR > org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest: > MultipartUpload Commit is failed for Key:./2 > 0191012/plc_1570863541668_9278 in Volume/Bucket > s325d55ad283aa400af464c76d713c07ad/ozone-test > NO_SUCH_MULTIPART_UPLOAD_ERROR > org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload > is with specified uploadId fcda8608-b431-48b7-8386- > 0a332f1a709a-103084683261641950 > at > org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:1 > 56) > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB. > java:217) > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132) > at > org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72) > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100) > at > org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > > Updated on 10/28/2019: > See MISMATCH_MULTIPART_LIST error. 
>
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete Multipart Upload Request for bucket: ozone-test, key: 20191012/plc_1570863541668_9278
> MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: Complete Multipart Upload Failed: volume: s3c89e813c80ffcea9543004d57b2a1239bucket: ozone-testkey: 20191012/plc_1570863541668_9278
> at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732)
> at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB.java:1104)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:66)
> at
[jira] [Resolved] (HDDS-2395) Handle Ozone S3 completeMPU to match with aws s3 behavior.
[ https://issues.apache.org/jira/browse/HDDS-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham resolved HDDS-2395. -- Fix Version/s: 0.5.0 Resolution: Fixed > Handle Ozone S3 completeMPU to match with aws s3 behavior. > -- > > Key: HDDS-2395 > URL: https://issues.apache.org/jira/browse/HDDS-2395 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > # When uploaded 2 parts, and when complete upload 1 part no error > # During complete multipart upload name/part number not matching with > uploaded part and part number then InvalidPart error > # When parts are not specified in sorted order InvalidPartOrder > # During complete multipart upload when no uploaded parts, and we specify > some parts then also InvalidPart > # Uploaded parts 1,2,3 and during complete we can do upload 1,3 (No error) > # When part 3 uploaded, complete with part 3 can be done -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
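The six completeMPU behaviors listed above amount to a single part-validation rule: the requested parts must be a strictly ascending subset of the uploaded parts, with matching part identifiers. A minimal, hypothetical sketch of that rule (illustrative names, not the actual Ozone implementation):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CompleteMpuCheck {
    // Validate a completeMultipartUpload request against the uploaded parts.
    // "uploaded" maps part number -> etag; "requested" is the part list from
    // the complete request, in the order the client supplied it.
    static String validate(Map<Integer, String> uploaded,
                           List<Map.Entry<Integer, String>> requested) {
        int prev = Integer.MIN_VALUE;
        for (Map.Entry<Integer, String> part : requested) {
            if (part.getKey() <= prev) {
                return "InvalidPartOrder"; // parts must be strictly ascending
            }
            prev = part.getKey();
            String etag = uploaded.get(part.getKey());
            if (etag == null || !etag.equals(part.getValue())) {
                return "InvalidPart"; // unknown part number or mismatched etag
            }
        }
        // Completing with only a subset (e.g. 1,3 of uploaded 1,2,3) is fine,
        // matching scenarios 1, 5 and 6 in the issue description.
        return "OK";
    }

    public static void main(String[] args) {
        Map<Integer, String> up = new HashMap<>();
        up.put(1, "e1"); up.put(2, "e2"); up.put(3, "e3");
        System.out.println(validate(up, List.of(Map.entry(1, "e1"), Map.entry(3, "e3")))); // OK
        System.out.println(validate(up, List.of(Map.entry(3, "e3"), Map.entry(1, "e1")))); // InvalidPartOrder
        System.out.println(validate(up, List.of(Map.entry(4, "e4")))); // InvalidPart
    }
}
```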
[jira] [Work logged] (HDDS-2395) Handle Ozone S3 completeMPU to match with aws s3 behavior.
[ https://issues.apache.org/jira/browse/HDDS-2395?focusedWorklogId=340356=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-340356 ] ASF GitHub Bot logged work on HDDS-2395: Author: ASF GitHub Bot Created on: 08/Nov/19 05:23 Start Date: 08/Nov/19 05:23 Worklog Time Spent: 10m Work Description: bharatviswa504 commented on pull request #109: HDDS-2395. Handle completeMPU scenarios to match with aws s3 behavior. URL: https://github.com/apache/hadoop-ozone/pull/109 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 340356) Time Spent: 20m (was: 10m) > Handle Ozone S3 completeMPU to match with aws s3 behavior. > -- > > Key: HDDS-2395 > URL: https://issues.apache.org/jira/browse/HDDS-2395 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > # When uploaded 2 parts, and when complete upload 1 part no error > # During complete multipart upload name/part number not matching with > uploaded part and part number then InvalidPart error > # When parts are not specified in sorted order InvalidPartOrder > # During complete multipart upload when no uploaded parts, and we specify > some parts then also InvalidPart > # Uploaded parts 1,2,3 and during complete we can do upload 1,3 (No error) > # When part 3 uploaded, complete with part 3 can be done -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.
[ https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969841#comment-16969841 ] Hadoop QA commented on HDFS-14963: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 13s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 33s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 40s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 51s{color} | {color:green} hadoop-hdfs-project: The patch generated 0 new + 29 unchanged - 1 fixed = 29 total (was 30) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 36s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 37s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 55s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}103m 47s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}183m 33s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean | | | hadoop.hdfs.tools.TestDFSZKFailoverController | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-14963 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985297/HDFS-14963.000.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux e1189a530119 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1
[jira] [Commented] (HDFS-14967) TestWebHDFS - Many test cases are failing in Windows
[ https://issues.apache.org/jira/browse/HDFS-14967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969827#comment-16969827 ] Ayush Saxena commented on HDFS-14967: - Thanx [~prasad-acit] for the report. There are two ways to go about it. Either we close the cluster in the test that misses it, or go a step further: pull the {{cluster}} variable up as a class variable and add an {{@After}} block, common to all tests, that closes the cluster there. I prefer the latter, as it will also prevent the cluster being left open when a test times out. > TestWebHDFS - Many test cases are failing in Windows > - > > Key: HDFS-14967 > URL: https://issues.apache.org/jira/browse/HDFS-14967 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Renukaprasad C >Assignee: Renukaprasad C >Priority: Major > > In the TestWebHDFS test class, a few test cases do not close the MiniDFSCluster, > which makes the remaining tests fail on Windows. Once a cluster is left > open, all subsequent test cases fail to acquire the lock on the data dir, which > results in test case failures. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
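The class-level {{cluster}} plus {{@After}} approach suggested above can be mimicked in plain Java. This is an illustrative sketch only: MiniCluster is a stand-in for MiniDFSCluster, and runTest() plays the role of the JUnit runner, which invokes @After in a finally block the same way, so the cluster is closed even when a test throws or times out.

```java
// Stand-in for MiniDFSCluster: holds a "data dir lock" while open.
class MiniCluster implements AutoCloseable {
    boolean open = true;
    @Override public void close() { open = false; }
}

public class AfterPattern {
    static MiniCluster cluster;              // class-level, as suggested

    static void tearDown() {                 // would carry @After in JUnit
        if (cluster != null) { cluster.close(); cluster = null; }
    }

    // Mimics the test runner: teardown runs even when the test body throws.
    static boolean runTest(Runnable test) {
        cluster = new MiniCluster();
        try { test.run(); return true; }
        catch (RuntimeException e) { return false; }
        finally { tearDown(); }
    }

    public static void main(String[] args) {
        MiniCluster[] handle = new MiniCluster[1];
        runTest(() -> { handle[0] = cluster; throw new RuntimeException("simulated timeout"); });
        // Even though the test failed, the cluster was closed:
        System.out.println(handle[0].open); // false
    }
}
```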
[jira] [Commented] (HDFS-14966) In NameNode Web UI, Some values are shown as negative
[ https://issues.apache.org/jira/browse/HDFS-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969824#comment-16969824 ] bright.zhou commented on HDFS-14966: [~hemanthboyina] have updated > In NameNode Web UI, Some values are shown as negative > -- > > Key: HDFS-14966 > URL: https://issues.apache.org/jira/browse/HDFS-14966 > Project: Hadoop HDFS > Issue Type: Bug > Components: ui >Affects Versions: 3.1.1 >Reporter: bright.zhou >Priority: Minor > Attachments: image1.png, image2.png > > > Simulated a 10k-node cluster to test HDFS; the NameNode web UI wrongly shows > negative values. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14966) In NameNode Web UI, Some values are shown as negative
[ https://issues.apache.org/jira/browse/HDFS-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bright.zhou updated HDFS-14966: --- Attachment: image1.png > In NameNode Web UI, Some values are shown as negative > -- > > Key: HDFS-14966 > URL: https://issues.apache.org/jira/browse/HDFS-14966 > Project: Hadoop HDFS > Issue Type: Bug > Components: ui >Affects Versions: 3.1.1 >Reporter: bright.zhou >Priority: Minor > Attachments: image1.png, image2.png > > > Simulated a 10k-node cluster to test HDFS; the NameNode web UI wrongly shows > negative values. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12288) Fix DataNode's xceiver count calculation
[ https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Zhang updated HDFS-12288: -- Attachment: HDFS-12288.006.patch > Fix DataNode's xceiver count calculation > > > Key: HDFS-12288 > URL: https://issues.apache.org/jira/browse/HDFS-12288 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, hdfs >Reporter: Lukas Majercak >Assignee: Chen Zhang >Priority: Major > Attachments: HDFS-12288.001.patch, HDFS-12288.002.patch, > HDFS-12288.003.patch, HDFS-12288.004.patch, HDFS-12288.005.patch, > HDFS-12288.006.patch > > > The problem with the ThreadGroup.activeCount() method is that it is > only a very rough estimate, and in reality returns the total number of > threads in the thread group as opposed to the threads actually running. > In some DNs, we saw this return ~50 for a long time, even though the > actual number of DataXceiver threads was next to none. > This is a big issue, as we use the xceiverCount on the NN to make decisions > when choosing a replication source DN or returning DNs to clients for R/W. > The plan is to reuse the DataNodeMetrics.dataNodeActiveXceiversCount value, > which only accounts for the number of DataXceiver threads currently > running and thus represents the load on the DN much better. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
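The point about ThreadGroup.activeCount() can be demonstrated in a few lines of plain Java (an illustrative sketch; the real counter, dataNodeActiveXceiversCount, lives in DataNodeMetrics): activeCount() reports every live thread in the group, even when none of them is serving a client.

```java
import java.util.concurrent.CountDownLatch;

public class XceiverCountDemo {
    // Start n threads that are alive but idle (blocked on "release"),
    // then report what ThreadGroup.activeCount() sees for the group.
    static int spawnIdleAndCount(int n, CountDownLatch release) throws InterruptedException {
        ThreadGroup group = new ThreadGroup("dataXceiverServer");
        CountDownLatch started = new CountDownLatch(n);
        for (int i = 0; i < n; i++) {
            Thread t = new Thread(group, () -> {
                started.countDown();
                try { release.await(); } catch (InterruptedException ignored) {}
            });
            t.setDaemon(true); // let the JVM exit even if they stay blocked
            t.start();
        }
        started.await();
        return group.activeCount(); // counts idle-but-alive threads too
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        int reported = spawnIdleAndCount(3, release);
        // activeCount() says 3; none of the threads is doing xceiver work.
        System.out.println(reported + " vs 0"); // "3 vs 0"
        release.countDown();
    }
}
```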
[jira] [Updated] (HDFS-14966) In NameNode Web UI, Some values are shown as negative
[ https://issues.apache.org/jira/browse/HDFS-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bright.zhou updated HDFS-14966: --- Attachment: image2.png > In NameNode Web UI, Some values are shown as negative > -- > > Key: HDFS-14966 > URL: https://issues.apache.org/jira/browse/HDFS-14966 > Project: Hadoop HDFS > Issue Type: Bug > Components: ui >Affects Versions: 3.1.1 >Reporter: bright.zhou >Priority: Minor > Attachments: image2.png > > > Simulated a 10k-node cluster to test HDFS; the NameNode web UI wrongly shows > negative values. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14966) In NameNode Web UI, Some values are shown as negative
[ https://issues.apache.org/jira/browse/HDFS-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bright.zhou updated HDFS-14966: --- Attachment: (was: image2.png) > In NameNode Web UI, Some values are shown as negative > -- > > Key: HDFS-14966 > URL: https://issues.apache.org/jira/browse/HDFS-14966 > Project: Hadoop HDFS > Issue Type: Bug > Components: ui >Affects Versions: 3.1.1 >Reporter: bright.zhou >Priority: Minor > Attachments: image2.png > > > Simulated a 10k-node cluster to test HDFS; the NameNode web UI wrongly shows > negative values. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14966) In NameNode Web UI, Some values are shown as negative
[ https://issues.apache.org/jira/browse/HDFS-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bright.zhou updated HDFS-14966: --- Description: Simulated a 10k-node cluster to test HDFS; the NameNode web UI wrongly shows negative values. (was: !https://wa.vision.huawei.com/vision-file-storage-query/upload/image/l00347347/0/d4a2a9aa9dd64a78a7ad6dc0247c0ff0/image.png! !https://wa.vision.huawei.com/vision-file-storage-query/upload/image/l00347347/0/c931073513cf410588e16262543943e5/image.png! ) > In NameNode Web UI, Some values are shown as negative > -- > > Key: HDFS-14966 > URL: https://issues.apache.org/jira/browse/HDFS-14966 > Project: Hadoop HDFS > Issue Type: Bug > Components: ui >Affects Versions: 3.1.1 >Reporter: bright.zhou >Priority: Minor > Attachments: image2.png > > > Simulated a 10k-node cluster to test HDFS; the NameNode web UI wrongly shows > negative values. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14966) In NameNode Web UI, Some values are shown as negative
[ https://issues.apache.org/jira/browse/HDFS-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bright.zhou updated HDFS-14966: --- Attachment: image2.png > In NameNode Web UI, Some values are shown as negative > -- > > Key: HDFS-14966 > URL: https://issues.apache.org/jira/browse/HDFS-14966 > Project: Hadoop HDFS > Issue Type: Bug > Components: ui >Affects Versions: 3.1.1 >Reporter: bright.zhou >Priority: Minor > Attachments: image2.png > > > !https://wa.vision.huawei.com/vision-file-storage-query/upload/image/l00347347/0/d4a2a9aa9dd64a78a7ad6dc0247c0ff0/image.png! > !https://wa.vision.huawei.com/vision-file-storage-query/upload/image/l00347347/0/c931073513cf410588e16262543943e5/image.png! > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14720) DataNode shouldn't report block as bad block if the block length is Long.MAX_VALUE.
[ https://issues.apache.org/jira/browse/HDFS-14720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969821#comment-16969821 ] Surendra Singh Lilhore commented on HDFS-14720: --- +1 LGTM [~weichiu], [~hexiaoqiao] any other comment ? > DataNode shouldn't report block as bad block if the block length is > Long.MAX_VALUE. > --- > > Key: HDFS-14720 > URL: https://issues.apache.org/jira/browse/HDFS-14720 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 3.1.1 >Reporter: Surendra Singh Lilhore >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14720.001.patch, HDFS-14720.002.patch > > > {noformat} > 2019-08-11 09:15:58,092 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Can't replicate block > BP-725378529-10.0.0.8-1410027444173:blk_13276745777_1112363330268 because > on-disk length 175085 is shorter than NameNode recorded length > 9223372036854775807.{noformat} > If the block length is Long.MAX_VALUE, it means the file this block belongs to > has been deleted on the namenode, and the DN got the command after the file was > deleted. In this case the command should be ignored. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
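The rule in the description can be sketched as a small predicate (hypothetical names, not the actual DataNode code): a NameNode-recorded length of Long.MAX_VALUE marks a file that was already deleted, so the replica should not be reported as bad.

```java
public class BlockLengthCheck {
    // Should the DN report this replica as a bad block? Per the issue:
    // Long.MAX_VALUE as the NN-recorded length means the file behind the
    // block was deleted, so the replication command should be ignored.
    static boolean shouldReportBadBlock(long onDiskLength, long nnRecordedLength) {
        if (nnRecordedLength == Long.MAX_VALUE) {
            return false; // file deleted on the NameNode; ignore the command
        }
        return onDiskLength < nnRecordedLength; // genuinely short replica
    }

    public static void main(String[] args) {
        // Values from the log in the description: 9223372036854775807 is
        // Long.MAX_VALUE, so no bad-block report despite the short length.
        System.out.println(shouldReportBadBlock(175085L, Long.MAX_VALUE)); // false
        System.out.println(shouldReportBadBlock(175085L, 200000L));        // true
    }
}
```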
[jira] [Commented] (HDFS-14815) RBF: Update the quota in MountTable when calling setQuota on a MountTable src.
[ https://issues.apache.org/jira/browse/HDFS-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969812#comment-16969812 ] Hudson commented on HDFS-14815: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17618 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17618/]) HDFS-14815. RBF: Update the quota in MountTable when calling setQuota on (ayushsaxena: rev 42fc8884ab9763e8778670f301896bf473ecf1d2) * (edit) hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterQuotaManager.java * (edit) hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/Quota.java * (edit) hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterQuota.java * (edit) hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestDisableRouterQuota.java * (edit) hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterAdminServer.java * (edit) hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterAdminCLI.java * (edit) hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterClientProtocol.java > RBF: Update the quota in MountTable when calling setQuota on a MountTable src. > -- > > Key: HDFS-14815 > URL: https://issues.apache.org/jira/browse/HDFS-14815 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14815.001.patch, HDFS-14815.002.patch, > HDFS-14815.003.patch, HDFS-14815.004.patch, HDFS-14815.005.patch > > > The method setQuota() can make the remote quota(the quota on real clusters) > inconsistent with the MountTable. 
I think we have 3 ways to fix it: > # Reject all setQuota() rpcs that try to change the quota of a mount > table. > # Let setQuota() change the mount table quota: update the quota on zk > first and then update the remote quotas. > # Do nothing. The RouterQuotaUpdateService will eventually make all the remote > quotas right. We can tolerate short-term inconsistencies. > I'd prefer option 1 because I want the RouterAdmin to be the only entry point for > updating the MountTable. > With option 3 we don't need to change anything, but the quota will be inconsistent > for a short time. The remote quota will be effective immediately and > auto-changed back after a while. Users might be confused by this behavior. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14815) RBF: Update the quota in MountTable when calling setQuota on a MountTable src.
[ https://issues.apache.org/jira/browse/HDFS-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena updated HDFS-14815: Fix Version/s: 3.3.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > RBF: Update the quota in MountTable when calling setQuota on a MountTable src. > -- > > Key: HDFS-14815 > URL: https://issues.apache.org/jira/browse/HDFS-14815 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14815.001.patch, HDFS-14815.002.patch, > HDFS-14815.003.patch, HDFS-14815.004.patch, HDFS-14815.005.patch > > > The method setQuota() can make the remote quota (the quota on real clusters) > inconsistent with the MountTable. I think we have 3 ways to fix it: > # Reject all setQuota() rpcs that try to change the quota of a mount > table. > # Let setQuota() change the mount table quota: update the quota on zk > first and then update the remote quotas. > # Do nothing. The RouterQuotaUpdateService will eventually make all the remote > quotas right. We can tolerate short-term inconsistencies. > I'd prefer option 1 because I want the RouterAdmin to be the only entry point for > updating the MountTable. > With option 3 we don't need to change anything, but the quota will be inconsistent > for a short time. The remote quota will be effective immediately and > auto-changed back after a while. Users might be confused by this behavior. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.
[ https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969795#comment-16969795 ] Xudong Cao edited comment on HDFS-14963 at 11/8/19 3:34 AM: cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. For the convenience of reading, I have uploaded an additional patch besides the github PR (they are exactly the same patch). Based on this patch: # The cache directory is configurable by a newly introduced item "dfs.client.failover.cache-active.dir", its default value is ${java.io.tmpdir}, which is /tmp on the Linux platform. # Writing/Reading a cache file is under file lock protection, and we use trylock() instead of lock(), so in a high-concurrency scenario, reading/writing the cache file will not become the bottleneck. If trylock() fails while reading, it just falls back to what we have today: simply return an index 0. And if trylock() fails while writing, it simply returns and continues. In fact, I think both these situations should be very rare. # All cache files' modes are manually set to "666", meaning every process can read/write them. # This cache mechanism is robust: regardless of whether the cache file was accidentally deleted or the content was maliciously modified, the readActiveCache() always returns a legal index, and writeActiveCache() will automatically rebuild the cache file on the next failover. Of course in all abnormal situations there will be a WARN log. # We surely have dfs.client.failover.random.order, actually I have used it in the unit test. Zkfc does know who the active NN is right now, but it does not have an rpc interface allowing us to get it, and I think an rpc call is much more expensive than reading/writing local files. # cc [~xkrogen] , I will then tackle the logging issue discussed in (2) in a separate JIRA. was (Author: xudongcao): cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides github PR (they are exactly a same patch). Based on this patch: # The cache directory is configurable by a newly introduced item "dfs.client.failover.cache-active.dir", its default value is ${java.io.tmpdir}, which is /tmp on Linux platform. # Writing/Reading a cache file is under file lock protection, and we use trylock() instead of lock(), so in a high-concurrency scenario, reading/writing cache file will not become the bottleneck. if trylock() failed while reading, it just fall back to what we have today: simply return an index 0. And if trylock() failed while writing, it simply returns and continues. In fact, I think both these situations should be very rare. # All cache files' mode are manually set to "666", meaning every process can read/write them. # This cache mechanism is robust, regardless of whether the cache file was accidentally deleted or the content was maliciously modified, the readActiveCache() always returns a legal index, and writeActiveCache() will automatically rebuild the cache file on next failover. # We surely have dfs.client.failover.random.order, actually I have used it in the unit test, Zkfc does know who is active NN right now, but it does not have an rpc interface allowing us to get it. and I think an rpc call is much more expensive than reading/writing local files. # cc [~xkrogen] , I will then tacle the logging issue discussed in (2) in a separate JIRA. > Add HDFS Client machine caching active namenode index mechanism. > > > Key: HDFS-14963 > URL: https://issues.apache.org/jira/browse/HDFS-14963 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 3.1.3 >Reporter: Xudong Cao >Assignee: Xudong Cao >Priority: Minor > Attachments: HDFS-14963.000.patch > > > In multi-NameNodes scenery, a new hdfs client always begins a rpc call from > the 1st namenode, simply polls, and finally determines the current Active > namenode. 
> This brings at least two problems: > # Extra failover consumption, especially in the case of frequent creation of > clients. > # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and > then a client starts rpc with the 1st NN, it will be silent when failover > from the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd > NN, it prints some unnecessary logs, in some scenarios, these logs will be > very numerous: > {code:java} > 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): > Operation category READ is not supported in state standby. Visit > https://s.apache.org/sbnn-error > at >
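The trylock()-based read path described in the comment can be sketched in plain Java NIO (a hypothetical illustration of the approach, not the actual HDFS-14963 patch): any lock contention, missing file, or corrupt content falls back to index 0, matching the "what we have today" behavior.

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

public class ActiveIndexCache {
    // Read the cached active-NameNode index under a non-blocking file lock.
    // tryLock() returns null (rather than blocking) when another process
    // holds the lock, so a contended read simply falls back to index 0.
    static int readActiveCache(File cacheFile) {
        try (RandomAccessFile raf = new RandomAccessFile(cacheFile, "rw");
             FileChannel channel = raf.getChannel()) {
            FileLock lock = channel.tryLock();
            if (lock == null) {
                return 0; // lock contended: fall back, don't block the client
            }
            try {
                if (raf.length() == 0) {
                    return 0; // empty or freshly created cache file
                }
                int index = Integer.parseInt(raf.readLine().trim());
                return index >= 0 ? index : 0; // reject illegal content
            } finally {
                lock.release();
            }
        } catch (Exception e) {
            // missing file, I/O error, or garbage content: fall back to 0
            return 0;
        }
    }

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("active-nn", ".cache");
        System.out.println(readActiveCache(f)); // 0: empty cache
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.writeBytes("2\n");
        }
        System.out.println(readActiveCache(f)); // 2: cached index
        f.delete();
    }
}
```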
[jira] [Commented] (HDFS-14815) RBF: Update the quota in MountTable when calling setQuota on a MountTable src.
[ https://issues.apache.org/jira/browse/HDFS-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969800#comment-16969800 ] Ayush Saxena commented on HDFS-14815: - Committed to trunk. Thanx [~LiJinglun] for the contribution and [~elgoiri] for the review!!! > RBF: Update the quota in MountTable when calling setQuota on a MountTable src. > -- > > Key: HDFS-14815 > URL: https://issues.apache.org/jira/browse/HDFS-14815 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Attachments: HDFS-14815.001.patch, HDFS-14815.002.patch, > HDFS-14815.003.patch, HDFS-14815.004.patch, HDFS-14815.005.patch > > > The method setQuota() can make the remote quota (the quota on real clusters) > inconsistent with the MountTable. I think we have 3 ways to fix it: > # Reject all setQuota() rpcs that try to change the quota of a mount > table. > # Let setQuota() change the mount table quota: update the quota on zk > first and then update the remote quotas. > # Do nothing. The RouterQuotaUpdateService will eventually make all the remote > quotas right. We can tolerate short-term inconsistencies. > I'd prefer option 1 because I want the RouterAdmin to be the only entry point for > updating the MountTable. > With option 3 we don't need to change anything, but the quota will be inconsistent > for a short time. The remote quota will be effective immediately and > auto-changed back after a while. Users might be confused by this behavior. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.
[ https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969795#comment-16969795 ] Xudong Cao edited comment on HDFS-14963 at 11/8/19 3:28 AM: cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. For the convenience of reading, I have uploaded an additional patch besides the github PR (they are exactly the same patch). Based on this patch: # The cache directory is configurable by a newly introduced item "dfs.client.failover.cache-active.dir", its default value is ${java.io.tmpdir}, which is /tmp on the Linux platform. # Writing/Reading a cache file is under file lock protection, and we use trylock() instead of lock(), so in a high-concurrency scenario, reading/writing the cache file will not become the bottleneck. If trylock() fails while reading, it just falls back to what we have today: simply return an index 0. And if trylock() fails while writing, it simply returns and continues. In fact, I think both these situations should be very rare. # All cache files' modes are manually set to "666", meaning every process can read/write them. # This cache mechanism is robust: regardless of whether the cache file was accidentally deleted or the content was maliciously modified, the readActiveCache() always returns a legal index, and writeActiveCache() will automatically rebuild the cache file on the next failover. # We surely have dfs.client.failover.random.order, actually I have used it in the unit test. Zkfc does know who the active NN is right now, but it does not have an rpc interface allowing us to get it, and I think an rpc call is much more expensive than reading/writing local files. # cc [~xkrogen] , I will then tackle the logging issue discussed in (2) in a separate JIRA. was (Author: xudongcao): cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. For the convenience of reading, I have uploaded an additional patch besides github PR (they are exactly a same patch). 
Based on this patch: # The cache directory is configurable by a newly introduced item "dfs.client.failover.cache-active.dir", its default value is ${java.io.tmpdir}, which is /tmp on Linux platform. # Writing/Reading a cache file is under file lock protection, and we use trylock() instead of lock(), so in a high-concurrency scenario, reading/writing cache file will not become the bottleneck. if trylock() failed while reading, it just fall back to what we have today: simply return a index of 0. And if trylock() failed while writing, it simply returns and continues. In fact, I think both these situations should be very rare. # All cache files' mode are manually set to "666", meaning every process can read/write them. # This cache mechanism is robust, regardless of whether the cache file was accidentally deleted or the content was maliciously modified, the readActiveCache() always returns a legal index, and writeActiveCache() will automatically rebuild the cache file on next failover. # We surely have dfs.client.failover.random.order, actually I have used it in the unit test, Zkfc does know who is active NN right now, but it does not have an rpc interface allowing us to get it. and I think an rpc call is much more expensive than reading/writing local files. # cc [~xkrogen] , I will then tacle the logging issue discussed in (2) in a separate JIRA. > Add HDFS Client machine caching active namenode index mechanism. > > > Key: HDFS-14963 > URL: https://issues.apache.org/jira/browse/HDFS-14963 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 3.1.3 >Reporter: Xudong Cao >Assignee: Xudong Cao >Priority: Minor > Attachments: HDFS-14963.000.patch > > > In multi-NameNodes scenery, a new hdfs client always begins a rpc call from > the 1st namenode, simply polls, and finally determines the current Active > namenode. 
> This brings at least two problems: > # Extra failover consumption, especially in the case of frequent creation of > clients. > # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and > then a client starts rpc with the 1st NN, it will be silent when failover > from the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd > NN, it prints some unnecessary logs, in some scenarios, these logs will be > very numerous: > {code:java} > 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): > Operation category READ is not supported in state standby. Visit > https://s.apache.org/sbnn-error > at >
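The tryLock() behavior described in the comment above (fall back to index 0 when the read lock is busy, skip the write when the write lock is busy, tolerate a deleted or corrupted cache file) can be sketched roughly as follows. This is an illustrative model of the described design, not the actual HDFS-14963 patch code; the class, method, and field names are made up for this sketch.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of a per-nameservice active-NN index cache file.
class ActiveNameNodeCache {
  private final Path cacheFile; // e.g. /tmp/ns1 for hdfs://ns1
  private final int nnCount;

  ActiveNameNodeCache(Path cacheFile, int nnCount) {
    this.cacheFile = cacheFile;
    this.nnCount = nnCount;
  }

  /** Returns the cached active NN index, or 0 on any failure
   *  (lock busy, missing file, corrupted content), mirroring the
   *  "fall back to what we have today" behavior. */
  public int readActiveCache() {
    try (FileChannel ch = FileChannel.open(cacheFile, StandardOpenOption.READ);
         FileLock lock = ch.tryLock(0, Long.MAX_VALUE, true)) { // shared lock
      if (lock == null) {
        return 0; // lock busy: do not block, just use index 0
      }
      ByteBuffer buf = ByteBuffer.allocate(64);
      int n = ch.read(buf);
      int idx = Integer.parseInt(
          new String(buf.array(), 0, Math.max(n, 0), StandardCharsets.UTF_8).trim());
      // A maliciously modified value is still clamped to a legal index.
      return (idx >= 0 && idx < nnCount) ? idx : 0;
    } catch (IOException | NumberFormatException e) {
      return 0; // deleted or corrupted cache file: behave as if uncached
    }
  }

  /** Best-effort write; silently skips if another process holds the lock.
   *  The next failover will simply try again. */
  public void writeActiveCache(int idx) {
    try (FileChannel ch = FileChannel.open(cacheFile,
            StandardOpenOption.CREATE, StandardOpenOption.WRITE);
         FileLock lock = ch.tryLock()) { // exclusive lock
      if (lock != null) {
        ch.truncate(0);
        ch.write(ByteBuffer.wrap(
            Integer.toString(idx).getBytes(StandardCharsets.UTF_8)));
      }
    } catch (IOException e) {
      // best effort: the cache is rebuilt on the next failover anyway
    }
  }
}
```

Note the deliberate asymmetry: reads take a shared lock so concurrent clients never serialize on each other, while the single writer after a failover takes an exclusive lock.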
[jira] [Commented] (HDFS-14590) [SBN Read] Add the document link to the top page
[ https://issues.apache.org/jira/browse/HDFS-14590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969787#comment-16969787 ] Konstantin Shvachko commented on HDFS-14590: Thanks, this is great. > [SBN Read] Add the document link to the top page > > > Key: HDFS-14590 > URL: https://issues.apache.org/jira/browse/HDFS-14590 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: documentation >Reporter: Takanobu Asanuma >Assignee: Takanobu Asanuma >Priority: Major > Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1, 2.11.0 > > Attachments: HDFS-14590.001.patch, HDFS-14590.002.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service
[ https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinglun reassigned HDFS-13811: -- Assignee: Jinglun (was: Dibyendu Karmakar)
> RBF: Race condition between router admin quota update and periodic quota update service
>
> Key: HDFS-13811
> URL: https://issues.apache.org/jira/browse/HDFS-13811
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Dibyendu Karmakar
> Assignee: Jinglun
> Priority: Major
> Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch
>
> If we try to update the quota of an existing mount entry while the periodic quota update service is running on the same mount entry, the mount table is left in an _inconsistent state_.
> The transactions here are:
> A - the quota update service fetches the mount table entries.
> B - the quota update service updates the mount table with current usage.
> A' - the user tries to update the quota via the admin command.
> With the transaction sequence [ A A' B ], the quota update service overwrites the mount table with the old quota value.
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
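The [ A A' B ] interleaving described above is a classic lost update: the service writes back a whole entry it fetched earlier, clobbering the quota the admin set in between. A minimal single-threaded sketch of that sequence (illustrative model only, not actual Router code):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a mount table entry; field names are illustrative.
class QuotaRaceSketch {
  static class MountEntry {
    long quota;
    long usage;
    MountEntry(long quota, long usage) { this.quota = quota; this.usage = usage; }
    MountEntry copy() { return new MountEntry(quota, usage); }
  }

  /** Replays the [ A A' B ] sequence and returns the quota the table
   *  ends up with: the stale 100, not the admin's 500. */
  static long demonstrateLostUpdate() {
    Map<String, MountEntry> table = new HashMap<>();
    table.put("/data", new MountEntry(100, 0));

    // A: the quota update service fetches the mount entry (a snapshot).
    MountEntry snapshot = table.get("/data").copy();

    // A': the admin updates the quota to 500 directly in the table.
    table.get("/data").quota = 500;

    // B: the service writes back usage, but it stores the whole stale
    // snapshot, silently reverting the quota to 100.
    snapshot.usage = 42;
    table.put("/data", snapshot);

    return table.get("/data").quota;
  }
}
```

The usual fixes are either to write back only the fields the service owns (usage, not quota) or to serialize both writers on the entry; the sketch shows why writing back the whole snapshot is unsafe.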
[jira] [Commented] (HDDS-2443) Python client/interface for Ozone
[ https://issues.apache.org/jira/browse/HDDS-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969771#comment-16969771 ] Li Cheng commented on HDDS-2443: Prototyping with the S3 gateway + boto3 now. Reads, writes, and deletes work. Large-object reads may need some tweaking. The only concern is that when uploading files to S3, it shows a read timeout toward the Ozone endpoint: ReadTimeoutError: Read timeout on endpoint URL: "http://localhost:9878/ozone-test/./20191011/plc_1570784946653_2774;
> Python client/interface for Ozone
>
> Key: HDDS-2443
> URL: https://issues.apache.org/jira/browse/HDDS-2443
> Project: Hadoop Distributed Data Store
> Issue Type: New Feature
> Components: Ozone Client
> Reporter: Li Cheng
> Priority: Major
>
> Original ideas: an Ozone client (Python) for data science notebooks such as Jupyter.
> # Size: Large
> # PyArrow: [https://pypi.org/project/pyarrow/]
> # Python -> libhdfs HDFS JNI library (HDFS, S3, ...) -> Java client API; Impala uses libhdfs
> # How Jupyter/IPython works: [https://jupyter.readthedocs.io/en/latest/architecture/how_jupyter_ipython_work.html]
> # Ecosystem/architecture: [https://ipython-books.github.io/chapter-3-mastering-the-jupyter-notebook/]
>
> Paths to try:
> 1. S3 interface: Ozone S3 gateway (already supported) + AWS Python client (boto3)
> 2. Python native RPC
> 3. pyarrow + libhdfs, which uses the Java client under the hood.
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2443) Python client/interface for Ozone
Li Cheng created HDDS-2443: -- Summary: Python client/interface for Ozone
Key: HDDS-2443
URL: https://issues.apache.org/jira/browse/HDDS-2443
Project: Hadoop Distributed Data Store
Issue Type: New Feature
Components: Ozone Client
Reporter: Li Cheng

Original ideas: an Ozone client (Python) for data science notebooks such as Jupyter.
# Size: Large
# PyArrow: [https://pypi.org/project/pyarrow/]
# Python -> libhdfs HDFS JNI library (HDFS, S3, ...) -> Java client API; Impala uses libhdfs
# How Jupyter/IPython works: [https://jupyter.readthedocs.io/en/latest/architecture/how_jupyter_ipython_work.html]
# Ecosystem/architecture: [https://ipython-books.github.io/chapter-3-mastering-the-jupyter-notebook/]

Paths to try:
1. S3 interface: Ozone S3 gateway (already supported) + AWS Python client (boto3)
2. Python native RPC
3. pyarrow + libhdfs, which uses the Java client under the hood.
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.
[ https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969756#comment-16969756 ] Íñigo Goiri commented on HDFS-14908: It would be nice to have somebody else double-check. We can give it a week or so.
> LeaseManager should check parent-child relationship when filter open files.
>
> Key: HDFS-14908
> URL: https://issues.apache.org/jira/browse/HDFS-14908
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.1.0, 3.0.1
> Reporter: Jinglun
> Assignee: Jinglun
> Priority: Minor
> Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch, HDFS-14908.003.patch, HDFS-14908.004.patch, HDFS-14908.005.patch, Test.java, TestV2.java, TestV3.java
>
> Currently, when doing listOpenFiles(), LeaseManager only checks whether the filter path is a string prefix of the open files' paths. It should instead check whether the filter path is a parent/ancestor of the open files.
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
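The prefix-vs-ancestor distinction behind this issue can be sketched in a few lines: a plain string-prefix test treats /a/b as "containing" /a/bc, while a real ancestor check requires a path-separator boundary. The helper names below are illustrative, not the actual LeaseManager code:

```java
// Sketch of the two path-matching semantics discussed in HDFS-14908.
class OpenFileFilter {
  /** The buggy semantics: plain string prefix. Matches /a/bc for filter /a/b. */
  static boolean prefixMatch(String filter, String openFile) {
    return openFile.startsWith(filter);
  }

  /** The intended semantics: the filter must be the file itself or an
   *  ancestor directory, i.e. the match must end on a '/' boundary. */
  static boolean ancestorMatch(String filter, String openFile) {
    if (filter.equals("/")) {
      return true; // the root is an ancestor of everything
    }
    return openFile.equals(filter) || openFile.startsWith(filter + "/");
  }
}
```

The fix is simply to require the separator boundary, so sibling paths that happen to share a name prefix are no longer reported.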
[jira] [Commented] (HDFS-12288) Fix DataNode's xceiver count calculation
[ https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969755#comment-16969755 ] Íñigo Goiri commented on HDFS-12288: This looks good and I think that it covers the initial concerns. A couple of minor comments on the test:
* TestDataNodeMetrics#387 should have expected first.
* Can we check something a little more for the asserts in 410 and 411? If everything is 0 this will still pass; we should have some range.
* What's the deal with TestNamenodeCapacityReport?
I would like for [~shahrs87] and [~lukmajercak] to double-check.
> Fix DataNode's xceiver count calculation
>
> Key: HDFS-12288
> URL: https://issues.apache.org/jira/browse/HDFS-12288
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode, hdfs
> Reporter: Lukas Majercak
> Assignee: Chen Zhang
> Priority: Major
> Attachments: HDFS-12288.001.patch, HDFS-12288.002.patch, HDFS-12288.003.patch, HDFS-12288.004.patch, HDFS-12288.005.patch
>
> The problem with the ThreadGroup.activeCount() method is that it is only a very rough estimate, and in reality returns the total number of threads in the thread group as opposed to the threads actually running.
> In some DNs, we saw this return ~50 for a long time, even though the actual number of DataXceiver threads was next to none.
> This is a big issue, as we use the xceiverCount to make decisions on the NN when choosing a replication source DN or returning DNs to clients for reads/writes.
> The plan is to reuse the DataNodeMetrics.dataNodeActiveXceiversCount value, which only accounts for the number of DataXceiver threads currently running and thus represents the load on the DN much better.
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
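The direction of the fix, an explicit counter maintained around each xceiver's actual work instead of ThreadGroup.activeCount() (which reports every live thread in the group, including idle ones), can be sketched as follows. The names here are illustrative, not the actual DataNode fields:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: count only xceivers that are actually doing work,
// in the spirit of DataNodeMetrics.dataNodeActiveXceiversCount.
class XceiverCounter {
  private final AtomicInteger activeXceivers = new AtomicInteger();

  /** The load figure the NN would use for replication/placement decisions. */
  public int getXceiverCount() {
    return activeXceivers.get();
  }

  /** Wraps one xceiver's work so the counter tracks active work only. */
  public void runXceiver(Runnable work) {
    activeXceivers.incrementAndGet();
    try {
      work.run();
    } finally {
      activeXceivers.decrementAndGet(); // always decrement, even on error
    }
  }
}
```

Unlike ThreadGroup.activeCount(), which its own Javadoc describes as only an estimate of live threads, this counter drops back to zero the moment the work finishes, even if the threads themselves linger in a pool.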
[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.
[ https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969752#comment-16969752 ] Jinglun commented on HDFS-14908: Hi [~elgoiri], would you help commit v05? Do we need another review?
> LeaseManager should check parent-child relationship when filter open files.
>
> Key: HDFS-14908
> URL: https://issues.apache.org/jira/browse/HDFS-14908
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.1.0, 3.0.1
> Reporter: Jinglun
> Assignee: Jinglun
> Priority: Minor
> Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch, HDFS-14908.003.patch, HDFS-14908.004.patch, HDFS-14908.005.patch, Test.java, TestV2.java, TestV3.java
>
> Currently, when doing listOpenFiles(), LeaseManager only checks whether the filter path is a string prefix of the open files' paths. It should instead check whether the filter path is a parent/ancestor of the open files.
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.
[ https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xudong Cao updated HDFS-14963: -- Attachment: HDFS-14963.000.patch
> Add HDFS Client machine caching active namenode index mechanism.
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Affects Versions: 3.1.3
> Reporter: Xudong Cao
> Assignee: Xudong Cao
> Priority: Minor
> Attachments: HDFS-14963.000.patch
>
> In a multi-NameNode scenario, a new HDFS client always begins an RPC call with the 1st NameNode, simply polls, and finally determines the current active NameNode.
> This brings at least two problems:
> # Extra failover cost, especially when clients are created frequently.
> # Unnecessary log printing: suppose there are 3 NNs and the 3rd is the ANN, and a client starts RPC with the 1st NN. It stays silent when failing over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 3rd NN it prints some unnecessary logs; in some scenarios these logs are very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
> Operation category READ is not supported in state standby. Visit
> https://s.apache.org/sbnn-error
> at
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
> ...{code}
> We can introduce a solution for this problem: on the client machine, for every HDFS cluster, cache its current active NameNode index in a separate cache file named by its URI. *Note these cache files are shared by all HDFS client processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client machine's cache file directory is /tmp; then:
> # the ns1 cluster's cache file is /tmp/ns1
> # the ns2 cluster's cache file is /tmp/ns2
> And then:
> # When a client starts, it reads the current active NameNode index from the corresponding cache file (based on the target HDFS URI), and then directly makes an RPC call toward the right ANN.
> # After each failover, the client needs to write the latest active NameNode index to the corresponding cache file (based on the target HDFS URI).
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.
[ https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xudong Cao updated HDFS-14963: -- Attachment: (was: HDFS-14963.000.patch)
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.
[ https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xudong Cao updated HDFS-14963: -- Attachment: HDFS-14963.000.patch > Add HDFS Client machine caching active namenode index mechanism. > > > Key: HDFS-14963 > URL: https://issues.apache.org/jira/browse/HDFS-14963 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 3.1.3 >Reporter: Xudong Cao >Assignee: Xudong Cao >Priority: Minor > Attachments: HDFS-14963.000.patch > > > In multi-NameNodes scenery, a new hdfs client always begins a rpc call from > the 1st namenode, simply polls, and finally determines the current Active > namenode. > This brings at least two problems: > # Extra failover consumption, especially in the case of frequent creation of > clients. > # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and > then a client starts rpc with the 1st NN, it will be silent when failover > from the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd > NN, it prints some unnecessary logs, in some scenarios, these logs will be > very numerous: > {code:java} > 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): > Operation category READ is not supported in state standby. Visit > https://s.apache.org/sbnn-error > at > org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98) > at > org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459) > ...{code} > We can introduce a solution for this problem: in client machine, for every > hdfs cluster, caching its current Active NameNode index in a separate cache > file named by its uri. *Note these cache files are shared by all hdfs client > processes on this machine*. 
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client > machine cache file directory is /tmp, then: > # the ns1 cluster's cache file is /tmp/ns1 > # the ns2 cluster's cache file is /tmp/ns2 > And then: > # When a client starts, it reads the current Active NameNode index from the > corresponding cache file based on the target hdfs uri, and then directly makes > an RPC call to the right ANN. > # After each client failover, it needs to write the latest Active > NameNode index to the corresponding cache file based on the target hdfs uri. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
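The per-machine cache file proposed above could be sketched roughly as follows. All names here are illustrative assumptions (class, methods, and cache-directory handling are not taken from the attached patch):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative sketch only: caches the active NameNode index for one
// nameservice in a file named after the nameservice (e.g. /tmp/ns1).
class ActiveNnIndexCache {
  private final Path cacheFile;

  ActiveNnIndexCache(String cacheDir, String nameservice) {
    this.cacheFile = Paths.get(cacheDir, nameservice);
  }

  // Returns the cached index, or 0 (first configured NN) when no cache exists.
  int read() {
    try {
      String s = new String(Files.readAllBytes(cacheFile), StandardCharsets.UTF_8).trim();
      return Integer.parseInt(s);
    } catch (IOException | NumberFormatException e) {
      return 0;
    }
  }

  // Called after a successful failover; best-effort, so write errors are ignored.
  void write(int index) {
    try {
      Files.write(cacheFile, Integer.toString(index).getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      // ignore: the cache is an optimization, not a correctness requirement
    }
  }
}
```

Since the cache files are shared by every HDFS client process on the machine, a real implementation would also need atomic renames or file locking to avoid torn reads.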
[jira] [Commented] (HDFS-13613) RegionServer log is flooded with "Execution rejected, Executing in current thread"
[ https://issues.apache.org/jira/browse/HDFS-13613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969719#comment-16969719 ] Duo Zhang commented on HDFS-13613: -- +1 on removing the log. I believe if this happens, it will always flood the log file, as this means we are overloaded, and flood the log file with this message will make things even worse... A better way is to implement a counter for this event, and log periodically about this event with some numbers. Can do this in a follow on issue? > RegionServer log is flooded with "Execution rejected, Executing in current > thread" > -- > > Key: HDFS-13613 > URL: https://issues.apache.org/jira/browse/HDFS-13613 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.4.0 > Environment: CDH 5.13, HBase RegionServer, Kerberized, hedged read >Reporter: Wei-Chiu Chuang >Priority: Major > Attachments: > 0001-HDFS-13613-RegionServer-log-is-flooded-with-Executio.patch > > > In the log of a HBase RegionServer with hedged read, we saw the following > message flooding the log file. > {noformat} > 2018-05-19 17:22:55,691 INFO org.apache.hadoop.hdfs.DFSClient: Execution > rejected, Executing in current thread > 2018-05-19 17:22:55,692 INFO org.apache.hadoop.hdfs.DFSClient: Execution > rejected, Executing in current thread > 2018-05-19 17:22:55,695 INFO org.apache.hadoop.hdfs.DFSClient: Execution > rejected, Executing in current thread > 2018-05-19 17:22:55,696 INFO org.apache.hadoop.hdfs.DFSClient: Execution > rejected, Executing in current thread > 2018-05-19 17:22:55,696 INFO org.apache.hadoop.hdfs.DFSClient: Execution > rejected, Executing in current thread > > {noformat} > Sometimes the RS spits tens of thousands of lines of this message in a > minute. We should do something to stop this message flooding the log file. > Also, we should make this message more actionable. Discussed with > [~huaxiang], this message can appear if there are stale DataNodes. 
> I believe this issue existed since HDFS-5776. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
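The counter-plus-periodic-log idea suggested in the comment above could look roughly like this (hypothetical class and method names; the actual DFSClient change would wire this into the hedged-read executor's rejection handler):

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only: count rejected executions and emit at most one
// summary log line per interval instead of one line per rejection.
class RejectedExecutionCounter {
  private final AtomicLong rejectedCount = new AtomicLong();
  private final AtomicLong lastLogTimeMs = new AtomicLong();
  private final long logIntervalMs;

  RejectedExecutionCounter(long logIntervalMs) {
    this.logIntervalMs = logIntervalMs;
  }

  // Returns a summary line when the interval has elapsed, or null to stay quiet.
  String onRejected(long nowMs) {
    long total = rejectedCount.incrementAndGet();
    long last = lastLogTimeMs.get();
    // compareAndSet ensures only one of several concurrent callers logs.
    if (nowMs - last >= logIntervalMs && lastLogTimeMs.compareAndSet(last, nowMs)) {
      return "Execution rejected " + total + " times so far; executing in current thread";
    }
    return null; // suppressed
  }

  long total() {
    return rejectedCount.get();
  }
}
```

With a 60-second interval, the tens of thousands of lines per minute seen above would collapse into one line carrying the running total.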
[jira] [Commented] (HDFS-14854) Create improved decommission monitor implementation
[ https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969682#comment-16969682 ] Hadoop QA commented on HDFS-14854: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 49s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 35s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 14s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 45s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 11 new + 464 unchanged - 5 fixed = 475 total (was 469) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 38s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}100m 44s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}163m 24s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMXBean | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-14854 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985278/HDFS-14854.014.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux 75453069aebf 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 247584e | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28274/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28274/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | |
[jira] [Commented] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.
[ https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969677#comment-16969677 ] Konstantin Shvachko commented on HDFS-14963: We do have some randomization of the proxies on the client via {{dfs.client.failover.random.order}}. So it does not need to always start from the first one. Caching last active NN for the client could be useful, but using /tmp is not good. With ZKFC you should have that already, right? > Add HDFS Client machine caching active namenode index mechanism. > > > Key: HDFS-14963 > URL: https://issues.apache.org/jira/browse/HDFS-14963 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 3.1.3 >Reporter: Xudong Cao >Assignee: Xudong Cao >Priority: Minor > > In multi-NameNodes scenery, a new hdfs client always begins a rpc call from > the 1st namenode, simply polls, and finally determines the current Active > namenode. > This brings at least two problems: > # Extra failover consumption, especially in the case of frequent creation of > clients. > # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and > then a client starts rpc with the 1st NN, it will be silent when failover > from the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd > NN, it prints some unnecessary logs, in some scenarios, these logs will be > very numerous: > {code:java} > 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): > Operation category READ is not supported in state standby. 
Visit > https://s.apache.org/sbnn-error > at > org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98) > at > org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459) > ...{code} > We can introduce a solution for this problem: in client machine, for every > hdfs cluster, caching its current Active NameNode index in a separate cache > file named by its uri. *Note these cache files are shared by all hdfs client > processes on this machine*. > For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client > machine cache file directory is /tmp, then: > # the ns1 cluster related cache file is /tmp/ns1 > # the ns2 cluster related cache file is /tmp/ns2 > And then: > # When a client starts, it reads the current Active NameNode index from the > corresponding cache file based on the target hdfs uri, and then directly make > an rpc call toward the right ANN. > # After each time client failovers, it need to write the latest Active > NameNode index to the corresponding cache file based on the target hdfs uri. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
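For reference, the proxy-order randomization Konstantin mentions is a standard client-side setting; enabling it looks like this in hdfs-site.xml (shown only as an illustration):

```xml
<!-- Client-side hdfs-site.xml: try the configured NameNodes in random
     order, so new clients do not all start from the first one. -->
<property>
  <name>dfs.client.failover.random.order</name>
  <value>true</value>
</property>
```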
[jira] [Commented] (HDFS-14529) NPE while Loading the Editlogs
[ https://issues.apache.org/jira/browse/HDFS-14529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969659#comment-16969659 ] Tsz-wo Sze commented on HDFS-14529: --- Not sure if HDFS-14807 could fix the NPE. > NPE while Loading the Editlogs > -- > > Key: HDFS-14529 > URL: https://issues.apache.org/jira/browse/HDFS-14529 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.1.1 >Reporter: Harshakiran Reddy >Assignee: Ayush Saxena >Priority: Major > > {noformat} > 2019-05-31 15:15:42,397 ERROR namenode.FSEditLogLoader: Encountered exception > on operation TimesOp [length=0, > path=/testLoadSpace/dir0/dir0/dir0/dir2/_file_9096763, mtime=-1, > atime=1559294343288, opCode=OP_TIMES, txid=18927893] > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetTimes(FSDirAttrOp.java:490) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:711) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:286) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:181) > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:924) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:771) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1105) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:726) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.doRecovery(NameNode.java:1558) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1640) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1725){noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: 
hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2404) Add support for Registered id as service identifier for CSR.
[ https://issues.apache.org/jira/browse/HDDS-2404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2404. Fix Version/s: 0.5.0 Resolution: Fixed Committed to the master. > Add support for Registered id as service identifier for CSR. > > > Key: HDDS-2404 > URL: https://issues.apache.org/jira/browse/HDDS-2404 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Anu Engineer >Assignee: Abhishek Purohit >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > The SCM HA needs the ability to represent a group as a single entity. So that > Tokens for each of the OM which is part of an HA group can be honored by the > data nodes. > This patch adds the notion of a service group ID to the Certificate > Infrastructure. In the next JIRAs, we will use this capability when issuing > certificates to OM -- especially when they are in HA mode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2442) Add ServiceName support for Certificate Signing Request.
Anu Engineer created HDDS-2442: -- Summary: Add ServiceName support for Certificate Signing Request. Key: HDDS-2442 URL: https://issues.apache.org/jira/browse/HDDS-2442 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: SCM Reporter: Anu Engineer Assignee: Abhishek Purohit We need to add support for including the service name in the Certificate Signing Request. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2404) Add support for Registered id as service identifier for CSR.
[ https://issues.apache.org/jira/browse/HDDS-2404?focusedWorklogId=340231=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-340231 ] ASF GitHub Bot logged work on HDDS-2404: Author: ASF GitHub Bot Created on: 07/Nov/19 23:32 Start Date: 07/Nov/19 23:32 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #116: HDDS-2404. Added support for Registered id as service identifier for … URL: https://github.com/apache/hadoop-ozone/pull/116 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 340231) Time Spent: 20m (was: 10m) > Add support for Registered id as service identifier for CSR. > > > Key: HDDS-2404 > URL: https://issues.apache.org/jira/browse/HDDS-2404 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Anu Engineer >Assignee: Abhishek Purohit >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > The SCM HA needs the ability to represent a group as a single entity. So that > Tokens for each of the OM which is part of an HA group can be honored by the > data nodes. > This patch adds the notion of a service group ID to the Certificate > Infrastructure. In the next JIRAs, we will use this capability when issuing > certificates to OM -- especially when they are in HA mode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2442) Add ServiceName support for Certificate Signing Request.
[ https://issues.apache.org/jira/browse/HDDS-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDDS-2442: --- Parent: HDDS-505 Issue Type: Sub-task (was: Improvement) > Add ServiceName support for Certificate Signing Request. > > > Key: HDDS-2442 > URL: https://issues.apache.org/jira/browse/HDDS-2442 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Anu Engineer >Assignee: Abhishek Purohit >Priority: Major > > We need to add support for adding Service name into the Certificate Signing > Request. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12288) Fix DataNode's xceiver count calculation
[ https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969641#comment-16969641 ] Chen Zhang commented on HDFS-12288: --- The failed tests are unrelated. [~elgoiri] do you have time to help review? Thanks. > Fix DataNode's xceiver count calculation > > > Key: HDFS-12288 > URL: https://issues.apache.org/jira/browse/HDFS-12288 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, hdfs >Reporter: Lukas Majercak >Assignee: Chen Zhang >Priority: Major > Attachments: HDFS-12288.001.patch, HDFS-12288.002.patch, > HDFS-12288.003.patch, HDFS-12288.004.patch, HDFS-12288.005.patch > > > The problem with the ThreadGroup.activeCount() method is that the method is > only a very rough estimate, and in reality returns the total number of > threads in the thread group as opposed to the threads actually running. > In some DNs, we saw this to return 50~ for a long time, even though the > actual number of DataXceiver threads was next to none. > This is a big issue as we use the xceiverCount to make decisions on the NN > for choosing replication source DN or returning DNs to clients for R/W. > The plan is to reuse the DataNodeMetrics.dataNodeActiveXceiversCount value > which only accounts for actual number of DataXcevier threads currently > running and thus represents the load on the DN much better. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
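The fix described above — counting only DataXceiver threads that are actually running, rather than calling ThreadGroup.activeCount() — boils down to an explicit counter maintained around each xceiver's lifecycle. A minimal sketch with illustrative names (the patch itself reuses DataNodeMetrics.dataNodeActiveXceiversCount):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: an explicit counter of running DataXceiver
// threads, unlike ThreadGroup.activeCount(), which counts every thread in
// the group whether or not it is serving a request.
class XceiverCounter {
  private final AtomicInteger activeXceivers = new AtomicInteger();

  // Called when a DataXceiver begins serving a request.
  void incr() {
    activeXceivers.incrementAndGet();
  }

  // Called in a finally block when the DataXceiver finishes.
  void decr() {
    activeXceivers.decrementAndGet();
  }

  // The value a DataNode would report to the NameNode in heartbeats.
  int getActiveXceiverCount() {
    return activeXceivers.get();
  }
}
```

Because the NameNode uses this count to pick replication sources and to hand DataNodes to clients, an estimate that lingers around 50 idle threads skews those placement decisions.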
[jira] [Assigned] (HDDS-2422) Add robot tests for list-trash command.
[ https://issues.apache.org/jira/browse/HDDS-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Sharp reassigned HDDS-2422: --- Assignee: Matthew Sharp > Add robot tests for list-trash command. > --- > > Key: HDDS-2422 > URL: https://issues.apache.org/jira/browse/HDDS-2422 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: test >Reporter: Anu Engineer >Assignee: Matthew Sharp >Priority: Major > > Add robot tests for list-trash command and add those tests to integration.sh > so these commands are run as part of CI. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2418) Add the list trash command server side handling.
[ https://issues.apache.org/jira/browse/HDDS-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Sharp reassigned HDDS-2418: --- Assignee: Matthew Sharp > Add the list trash command server side handling. > > > Key: HDDS-2418 > URL: https://issues.apache.org/jira/browse/HDDS-2418 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Manager >Reporter: Anu Engineer >Assignee: Matthew Sharp >Priority: Major > > Add the standard code for any command handling in the server side. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2420) Add the Ozone shell support for list-trash command.
[ https://issues.apache.org/jira/browse/HDDS-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Sharp reassigned HDDS-2420: --- Assignee: Matthew Sharp > Add the Ozone shell support for list-trash command. > --- > > Key: HDDS-2420 > URL: https://issues.apache.org/jira/browse/HDDS-2420 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone CLI >Reporter: Anu Engineer >Assignee: Matthew Sharp >Priority: Major > > Add support for list-trash command in Ozone CLI. Please see the attached > design doc. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2419) Add the core logic to process list trash command.
[ https://issues.apache.org/jira/browse/HDDS-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Sharp reassigned HDDS-2419: --- Assignee: Matthew Sharp > Add the core logic to process list trash command. > - > > Key: HDDS-2419 > URL: https://issues.apache.org/jira/browse/HDDS-2419 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Manager >Reporter: Anu Engineer >Assignee: Matthew Sharp >Priority: Major > > Add the core logic of reading from the deleted table, and return the entries > that match the user query. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2382) Consider reducing number of file::exists() calls during write operation
[ https://issues.apache.org/jira/browse/HDDS-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle reassigned HDDS-2382: - Assignee: Aravindan Vijayan (was: Siddharth Wagle) We do need to verify the behavior if the chunksPath is deleted. One way is to fail late and make sure the behavior is consistent. > Consider reducing number of file::exists() calls during write operation > --- > > Key: HDDS-2382 > URL: https://issues.apache.org/jira/browse/HDDS-2382 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Rajesh Balamohan >Assignee: Aravindan Vijayan >Priority: Major > Labels: performance > > When writing 100-200 MB files with multiple threads, observed lots of > {{file::exists()}} checks. > For every 16 MB chunk, it ends up checking whether the {{chunksLoc}} directory > exists or not. (ref: > [https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/ChunkUtils.java#L239]) > Also, this check ({{ChunkUtils.getChunkFile}}) happens from 2 places: > 1. org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$handleWriteChunk > 2. org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$applyTransaction > Note that these are folders and not actual chunk filenames. It would be > helpful to reduce this check if we track create/delete of these folders. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
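Tracking create/delete of chunk folders, as the description suggests, could be sketched like this. Names are hypothetical, and per the comment above a deleted chunksPath must invalidate its cache entry so the write fails late but consistently:

```java
import java.io.File;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only: remembers directories already verified so the
// per-chunk write path skips repeated File.exists() calls.
class VerifiedDirCache {
  private final Set<String> verified = ConcurrentHashMap.newKeySet();

  // Hits the filesystem at most once per directory while the entry is cached.
  boolean ensureExists(File dir) {
    if (verified.contains(dir.getPath())) {
      return true; // cached: no filesystem call
    }
    boolean ok = dir.isDirectory() || dir.mkdirs();
    if (ok) {
      verified.add(dir.getPath());
    }
    return ok;
  }

  // Must be invoked whenever a container's chunk directory is deleted,
  // otherwise a later write would trust a stale cache entry.
  void invalidate(File dir) {
    verified.remove(dir.getPath());
  }
}
```

This turns two filesystem checks per 16 MB chunk into an in-memory set lookup on the hot path.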
[jira] [Assigned] (HDDS-2421) Add documentation for list trash command.
[ https://issues.apache.org/jira/browse/HDDS-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Sharp reassigned HDDS-2421: --- Assignee: Matthew Sharp > Add documentation for list trash command. > - > > Key: HDDS-2421 > URL: https://issues.apache.org/jira/browse/HDDS-2421 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: documentation >Reporter: Anu Engineer >Assignee: Matthew Sharp >Priority: Major > > Add documentation about the list-trash command. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2441) Add documentation for Empty-Trash command.
Anu Engineer created HDDS-2441: -- Summary: Add documentation for Empty-Trash command. Key: HDDS-2441 URL: https://issues.apache.org/jira/browse/HDDS-2441 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: documentation Reporter: Anu Engineer Add documentation for empty-trash command. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2440) Add empty-trash to ozone shell.
Anu Engineer created HDDS-2440: -- Summary: Add empty-trash to ozone shell. Key: HDDS-2440 URL: https://issues.apache.org/jira/browse/HDDS-2440 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone CLI Reporter: Anu Engineer Add the empty-trash command to the Ozone shell. We should decide if we want to add this to the admin shell or the normal shell. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2439) Add robot tests for empty-trash as owner.
Anu Engineer created HDDS-2439: -- Summary: Add robot tests for empty-trash as owner. Key: HDDS-2439 URL: https://issues.apache.org/jira/browse/HDDS-2439 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Manager Reporter: Anu Engineer We need to make sure that only owners or admins can execute the empty-trash command. We need to verify this using end-to-end tests, for example, robot tests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2438) Add the core logic for empty-trash
Anu Engineer created HDDS-2438: -- Summary: Add the core logic for empty-trash Key: HDDS-2438 URL: https://issues.apache.org/jira/browse/HDDS-2438 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Manager Reporter: Anu Engineer -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2437) Restrict empty-trash to admins and owners only
Anu Engineer created HDDS-2437: -- Summary: Restrict empty-trash to admins and owners only Key: HDDS-2437 URL: https://issues.apache.org/jira/browse/HDDS-2437 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Manager Reporter: Anu Engineer Make sure that only the owner of a key or an administrator can empty-trash. The delete ACL is not enough to empty-trash. This is because a shared bucket can have deletes, but the owner should be able to recover them. Once empty-trash is executed, not even the owner will be able to recover the deleted keys. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2436) Add security profile support for empty-trash command
Anu Engineer created HDDS-2436: -- Summary: Add security profile support for empty-trash command Key: HDDS-2436 URL: https://issues.apache.org/jira/browse/HDDS-2436 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Manager Reporter: Anu Engineer Add support for certain groups to have the ability to run empty-trash. It might be the case that we want this command to be run only by admins. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2417) Add the list trash command to the client side
[ https://issues.apache.org/jira/browse/HDDS-2417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Sharp reassigned HDDS-2417: --- Assignee: Matthew Sharp > Add the list trash command to the client side > - > > Key: HDDS-2417 > URL: https://issues.apache.org/jira/browse/HDDS-2417 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Manager >Reporter: Anu Engineer >Assignee: Matthew Sharp >Priority: Major > > Add the list-trash command to the protobuf files and to the client side > translator. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2435) Add the ability to disable empty-trash command.
Anu Engineer created HDDS-2435: -- Summary: Add the ability to disable empty-trash command. Key: HDDS-2435 URL: https://issues.apache.org/jira/browse/HDDS-2435 Project: Hadoop Distributed Data Store Issue Type: Sub-task Reporter: Anu Engineer Add a configuration key to disable empty-trash command. We can discuss if this should be a system-wide setting or per bucket. It is easier to do this system-wide I guess. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2434) Add server side support for empty-trash command.
Anu Engineer created HDDS-2434: -- Summary: Add server side support for empty-trash command. Key: HDDS-2434 URL: https://issues.apache.org/jira/browse/HDDS-2434 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Manager Reporter: Anu Engineer Add server side support for empty-trash command. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2433) Add client side support for the empty-trash command.
Anu Engineer created HDDS-2433: -- Summary: Add client side support for the empty-trash command. Key: HDDS-2433 URL: https://issues.apache.org/jira/browse/HDDS-2433 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Manager Reporter: Anu Engineer Add client side support for the empty-trash command. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2432) Add documentation for the recover-trash
Anu Engineer created HDDS-2432: -- Summary: Add documentation for the recover-trash Key: HDDS-2432 URL: https://issues.apache.org/jira/browse/HDDS-2432 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: documentation Reporter: Anu Engineer Add documentation for the recover-trash command in Ozone Documentation.
[jira] [Created] (HDDS-2431) Add recover-trash command to the ozone shell.
Anu Engineer created HDDS-2431: -- Summary: Add recover-trash command to the ozone shell. Key: HDDS-2431 URL: https://issues.apache.org/jira/browse/HDDS-2431 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone CLI Reporter: Anu Engineer Add recover-trash command to the Ozone CLI.
[jira] [Assigned] (HDDS-2427) Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar
[ https://issues.apache.org/jira/browse/HDDS-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham reassigned HDDS-2427: Assignee: Bharat Viswanadham > Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar > - > > Key: HDDS-2427 > URL: https://issues.apache.org/jira/browse/HDDS-2427 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This has caused an issue with DN UI loading. > hadoop-ozone-filesystem-lib-current-xx.jar is in the classpath, which > accidentally loaded the Ozone datanode web application instead of the Hadoop datanode > application. This leads to the reported error.
[jira] [Updated] (HDDS-2427) Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar
[ https://issues.apache.org/jira/browse/HDDS-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2427: - Labels: pull-request-available (was: ) > Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar > - > > Key: HDDS-2427 > URL: https://issues.apache.org/jira/browse/HDDS-2427 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > > This has caused an issue with DN UI loading. > hadoop-ozone-filesystem-lib-current-xx.jar is in the classpath, which > accidentally loaded the Ozone datanode web application instead of the Hadoop datanode > application. This leads to the reported error.
[jira] [Updated] (HDDS-2427) Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar
[ https://issues.apache.org/jira/browse/HDDS-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HDDS-2427: - Status: Patch Available (was: Open) > Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar > - > > Key: HDDS-2427 > URL: https://issues.apache.org/jira/browse/HDDS-2427 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This has caused an issue with DN UI loading. > hadoop-ozone-filesystem-lib-current-xx.jar is in the classpath, which > accidentally loaded the Ozone datanode web application instead of the Hadoop datanode > application. This leads to the reported error.
[jira] [Work logged] (HDDS-2427) Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar
[ https://issues.apache.org/jira/browse/HDDS-2427?focusedWorklogId=340211=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-340211 ] ASF GitHub Bot logged work on HDDS-2427: Author: ASF GitHub Bot Created on: 07/Nov/19 22:44 Start Date: 07/Nov/19 22:44 Worklog Time Spent: 10m Work Description: bharatviswa504 commented on pull request #129: HDDS-2427. Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar URL: https://github.com/apache/hadoop-ozone/pull/129 ## What changes were proposed in this pull request? Exclude web apps from filesystem uber jar. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2427 ## How was this patch tested? This caused an issue with DN UI loading; with the fixed jar in place, the UI now loads. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 340211) Remaining Estimate: 0h Time Spent: 10m > Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar > - > > Key: HDDS-2427 > URL: https://issues.apache.org/jira/browse/HDDS-2427 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This has caused an issue with DN UI loading. > hadoop-ozone-filesystem-lib-current-xx.jar is in the classpath, which > accidentally loaded the Ozone datanode web application instead of the Hadoop datanode > application. This leads to the reported error.
[jira] [Created] (HDDS-2430) Recover-trash should warn and skip if at-rest encryption is enabled and keys are missing.
Anu Engineer created HDDS-2430: -- Summary: Recover-trash should warn and skip if at-rest encryption is enabled and keys are missing. Key: HDDS-2430 URL: https://issues.apache.org/jira/browse/HDDS-2430 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Manager Reporter: Anu Engineer If TDE is enabled, recovering a key is useful only if the actual keys that are used for encryption are still recoverable. We should warn and fail the recovery if the actual keys are missing.
[jira] [Created] (HDDS-2429) Recover-trash should warn and skip if the key is a GDPR-ed key, since recovery is pointless once the encryption keys are lost.
Anu Engineer created HDDS-2429: -- Summary: Recover-trash should warn and skip if the key is a GDPR-ed key, since recovery is pointless once the encryption keys are lost. Key: HDDS-2429 URL: https://issues.apache.org/jira/browse/HDDS-2429 Project: Hadoop Distributed Data Store Issue Type: Sub-task Reporter: Anu Engineer If a bucket has GDPR enabled, then the keys used to recover the data from the blocks are irrecoverably lost. In that case, a recover from trash is pointless. The recover-trash command should detect this case and let the users know about it.
[jira] [Created] (HDDS-2428) Rename a recovered file as .recovered if the file already exists in the target bucket.
Anu Engineer created HDDS-2428: -- Summary: Rename a recovered file as .recovered if the file already exists in the target bucket. Key: HDDS-2428 URL: https://issues.apache.org/jira/browse/HDDS-2428 Project: Hadoop Distributed Data Store Issue Type: Sub-task Reporter: Anu Engineer During recovery, if the file name already exists in the bucket, the new key being recovered should be automatically renamed. The proposal is to rename it as key.recovered.
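The rename-on-collision rule proposed in HDDS-2428 can be sketched as follows. This is an illustrative sketch only, not code from Ozone; the function name and signature are hypothetical, and the issue does not specify what happens if the `.recovered` name also exists, so this sketch leaves that case alone.

```python
def recovered_key_name(key, existing_keys):
    """Pick the name a recovered key should get in the target bucket.

    Hypothetical helper: if the key name is free, keep it; if it already
    exists in the bucket, recover it as "<key>.recovered", per HDDS-2428.
    """
    return key + ".recovered" if key in existing_keys else key
```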
[jira] [Created] (HDDS-2426) Support recover-trash to an existing bucket.
Anu Engineer created HDDS-2426: -- Summary: Support recover-trash to an existing bucket. Key: HDDS-2426 URL: https://issues.apache.org/jira/browse/HDDS-2426 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Manager Reporter: Anu Engineer Support recovering trash to an existing bucket. We should also add a config key that prevents this mode, so admins can always force recovery to a new bucket.
[jira] [Created] (HDDS-2427) Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar
Bharat Viswanadham created HDDS-2427: Summary: Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar Key: HDDS-2427 URL: https://issues.apache.org/jira/browse/HDDS-2427 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Bharat Viswanadham This has caused an issue with DN UI loading. hadoop-ozone-filesystem-lib-current-xx.jar is in the classpath, which accidentally loaded the Ozone datanode web application instead of the Hadoop datanode application. This leads to the reported error.
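The failure mode in HDDS-2427 is an uber jar that still bundles a `webapps/` tree, which then shadows the Hadoop datanode's own web application on the classpath. A quick way to check whether a given jar bundles such a tree is sketched below; this is a hedged illustration for diagnosis, not part of the HDDS-2427 patch, and the jar path is whatever you point it at.

```python
import zipfile

def jar_bundles_webapps(jar_path):
    """Return True if the jar contains a top-level webapps/ tree.

    A bundled webapps/ tree inside a hadoop-ozone-filesystem-lib-current
    uber jar is what caused the wrong datanode web UI to be served.
    """
    with zipfile.ZipFile(jar_path) as jar:
        return any(name.startswith("webapps/") for name in jar.namelist())
```

Running this against the fixed jar should return False once the webapps are excluded.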
[jira] [Created] (HDDS-2425) Support the ability to recover-trash to a new bucket.
Anu Engineer created HDDS-2425: -- Summary: Support the ability to recover-trash to a new bucket. Key: HDDS-2425 URL: https://issues.apache.org/jira/browse/HDDS-2425 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Manager Reporter: Anu Engineer recover-trash can be run to recover to an existing bucket or to a new bucket. If the bucket does not exist, the recover-trash command should create that bucket automatically.
[jira] [Created] (HDDS-2424) Add the recover-trash command server side handling.
Anu Engineer created HDDS-2424: -- Summary: Add the recover-trash command server side handling. Key: HDDS-2424 URL: https://issues.apache.org/jira/browse/HDDS-2424 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Manager Reporter: Anu Engineer Add the standard server side code for command handling.
[jira] [Updated] (HDDS-2423) Add the recover-trash command client side code
[ https://issues.apache.org/jira/browse/HDDS-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDDS-2423: --- Description: Add protobuf, RpcClient and ClientSideTranslator code for the recover-trash command. (was: Add protobuf, RpcClient and ClientSideTranslator code for the Empty-trash command.) > Add the recover-trash command client side code > -- > > Key: HDDS-2423 > URL: https://issues.apache.org/jira/browse/HDDS-2423 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Manager >Reporter: Anu Engineer >Priority: Major > > Add protobuf, RpcClient and ClientSideTranslator code for the recover-trash > command.
[jira] [Created] (HDDS-2423) Add the recover-trash command client side code
Anu Engineer created HDDS-2423: -- Summary: Add the recover-trash command client side code Key: HDDS-2423 URL: https://issues.apache.org/jira/browse/HDDS-2423 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Manager Reporter: Anu Engineer Add protobuf, RpcClient and ClientSideTranslator code for the Empty-trash command.
[jira] [Created] (HDDS-2422) Add robot tests for list-trash command.
Anu Engineer created HDDS-2422: -- Summary: Add robot tests for list-trash command. Key: HDDS-2422 URL: https://issues.apache.org/jira/browse/HDDS-2422 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: test Reporter: Anu Engineer Add robot tests for list-trash command and add those tests to integration.sh so these commands are run as part of CI.
[jira] [Created] (HDDS-2421) Add documentation for list trash command.
Anu Engineer created HDDS-2421: -- Summary: Add documentation for list trash command. Key: HDDS-2421 URL: https://issues.apache.org/jira/browse/HDDS-2421 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: documentation Reporter: Anu Engineer Add documentation about the list-trash command.
[jira] [Created] (HDDS-2420) Add the Ozone shell support for list-trash command.
Anu Engineer created HDDS-2420: -- Summary: Add the Ozone shell support for list-trash command. Key: HDDS-2420 URL: https://issues.apache.org/jira/browse/HDDS-2420 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone CLI Reporter: Anu Engineer Add support for list-trash command in Ozone CLI. Please see the attached design doc.
[jira] [Created] (HDDS-2418) Add the list trash command server side handling.
Anu Engineer created HDDS-2418: -- Summary: Add the list trash command server side handling. Key: HDDS-2418 URL: https://issues.apache.org/jira/browse/HDDS-2418 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Manager Reporter: Anu Engineer Add the standard code for any command handling in the server side.
[jira] [Created] (HDDS-2419) Add the core logic to process list trash command.
Anu Engineer created HDDS-2419: -- Summary: Add the core logic to process list trash command. Key: HDDS-2419 URL: https://issues.apache.org/jira/browse/HDDS-2419 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Manager Reporter: Anu Engineer Add the core logic of reading from the deleted table and returning the entries that match the user query.
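The core list-trash logic described in HDDS-2419 — scan the deleted table and return the entries matching the user's query — can be sketched roughly as below. All names here are hypothetical; the real implementation would live in the Ozone Manager against its deleted-key table, not an in-memory list.

```python
def list_trash(deleted_table, volume, bucket, key_prefix="", max_keys=1000):
    """Return deleted-table entries matching the user's query.

    deleted_table is an iterable of dicts with "volume", "bucket" and
    "key" fields (a stand-in for the OM's deleted-key table). Results
    are capped at max_keys so the call behaves like a paged listing.
    """
    matches = []
    for entry in deleted_table:
        if len(matches) >= max_keys:
            break  # stop once one page worth of results is collected
        if (entry["volume"] == volume
                and entry["bucket"] == bucket
                and entry["key"].startswith(key_prefix)):
            matches.append(entry)
    return matches
```

The prefix match and the max_keys cap mirror how Ozone's existing key-listing calls are shaped, but that correspondence is an assumption, not taken from the issue.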