[jira] [Assigned] (HDFS-15790) Make ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist
[ https://issues.apache.org/jira/browse/HDFS-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B reassigned HDFS-15790: Assignee: Vinayakumar B (was: David Mollitor) > Make ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist > -- > > Key: HDFS-15790 > URL: https://issues.apache.org/jira/browse/HDFS-15790 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: David Mollitor >Assignee: Vinayakumar B >Priority: Critical > Labels: pull-request-available, release-blocker > Fix For: 3.3.1, 3.4.0 > > Time Spent: 4h 10m > Remaining Estimate: 0h > > Changing from Protobuf 2 to Protobuf 3 broke some stuff in Apache Hive > project. This was not an awesome thing to do between minor versions in > regards to backwards compatibility for downstream projects. > Additionally, these two frameworks are not drop-in replacements, they have > some differences. Also, Protobuf 2 is not deprecated or anything so let us > have both protocols available at the same time. In Hadoop 4.x Protobuf 2 > support can be dropped. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15790) Make ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist
[ https://issues.apache.org/jira/browse/HDFS-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17300087#comment-17300087 ] Vinayakumar B commented on HDFS-15790: -- Created the PR [https://github.com/apache/hadoop/pull/2767] with above changes. Please review. Thanks
[jira] [Commented] (HDFS-15790) Make ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist
[ https://issues.apache.org/jira/browse/HDFS-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17274396#comment-17274396 ] Vinayakumar B commented on HDFS-15790: -- Adding a new RpcKind makes it difficult to maintain multiple implementations of the server-side protocol to support the same functionality, because it is equally important to serve requests from older clients which still send requests with RpcKind.PROTOCOL_BUFFERS. Instead, I have an approach where ProtobufRpcEngine and ProtobufRpcEngine2 can co-exist. ProtobufRpcEngine: supports existing implementations based on protobuf 2.5.0 on both the client side and the server side. No code changes are required for downstreams that use this. ProtobufRpcEngine2: uses shaded protobuf 3.7.1 and supports client-side and server-side implementations based on shaded protobuf 3.7.1. In the change below, ProtobufRpcEngine2 itself will handle both versions of requests for RpcKind.PROTOCOL_BUFFERS. ProtobufRpcEngine2 will hand over the processing to ProtobufRpcEngine if the implementation is found to be using the older version of protobuf (2.5.0), so no conflict arises from co-existence. Please verify this change if possible. [https://github.com/vinayakumarb/hadoop/tree/bugs/HDFS-15790]
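The hand-over described in the comment above can be sketched roughly as follows. This is only a minimal illustration of the dispatch idea; the class names and the boolean probe are hypothetical, not Hadoop's actual RpcEngine API:

```java
// Minimal sketch of the dispatch idea: the new engine answers all
// RpcKind.PROTOCOL_BUFFERS requests, but delegates to the legacy engine
// when the registered implementation was built against protobuf 2.5.0.
// All names here are illustrative, not Hadoop's real classes.
interface RpcInvokerSketch {
    String call(String request);
}

class LegacyProtobufEngineSketch implements RpcInvokerSketch {
    @Override
    public String call(String request) {
        return "handled-by-protobuf-2.5.0";
    }
}

class ProtobufRpcEngine2Sketch implements RpcInvokerSketch {
    private final RpcInvokerSketch legacy = new LegacyProtobufEngineSketch();
    private final boolean implUsesLegacyProtobuf;

    ProtobufRpcEngine2Sketch(boolean implUsesLegacyProtobuf) {
        this.implUsesLegacyProtobuf = implUsesLegacyProtobuf;
    }

    @Override
    public String call(String request) {
        if (implUsesLegacyProtobuf) {
            // Hand over processing to the old engine; no conflict arises.
            return legacy.call(request);
        }
        return "handled-by-shaded-protobuf-3.7.1";
    }
}
```

With this shape, older clients and unmodified downstream implementations keep working through the legacy path while new implementations get the shaded 3.7.1 path.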
[jira] [Commented] (HDFS-15790) Make ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist
[ https://issues.apache.org/jira/browse/HDFS-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17273357#comment-17273357 ] Vinayakumar B commented on HDFS-15790: -- Thanks for reporting this issue [~belugabehr]. Please check the history of HADOOP-13363 for details regarding why and how the upgrade was done. I will try to review the proposed changes this weekend. Thanks.
[jira] [Commented] (HDFS-15660) StorageTypeProto is not compatible between 3.x and 2.6
[ https://issues.apache.org/jira/browse/HDFS-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17238558#comment-17238558 ] Vinayakumar B commented on HDFS-15660: -- Hi [~jianliang.wu], are you working on the fix? If not, could you assign the Jira to me? I have the fix. Thanks. > StorageTypeProto is not compatible between 3.x and 2.6 > --- > > Key: HDFS-15660 > URL: https://issues.apache.org/jira/browse/HDFS-15660 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.2.0, 3.1.3 >Reporter: Ryan Wu >Assignee: Ryan Wu >Priority: Major > > In our case, when the NN had been upgraded to 3.1.3 while the DN's version was > still 2.6, we found that Hive calls to the getContentSummary method failed: > the client and server were not compatible because Hadoop 3 added the new > PROVIDED storage type. > {code:java} > // code placeholder > 20/04/15 14:28:35 INFO retry.RetryInvocationHandler---main: Exception while > invoking getContentSummary of class ClientNamenodeProtocolTranslatorPB over > x/x:8020. Trying to fail over immediately. 
> java.io.IOException: com.google.protobuf.ServiceException: > com.google.protobuf.UninitializedMessageException: Message missing required > fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > org.apache.hadoop.ipc.ProtobufHelper.getRemoteException(ProtobufHelper.java:47) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:819) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > at com.sun.proxy.$Proxy11.getContentSummary(Unknown Source) > at > org.apache.hadoop.hdfs.DFSClient.getContentSummary(DFSClient.java:3144) > at > org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:706) > at > org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:702) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getContentSummary(DistributedFileSystem.java:713) > at org.apache.hadoop.fs.shell.Count.processPath(Count.java:109) > at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317) > at > org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289) > at > org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271) > at > org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255) > at > org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:118) > at org.apache.hadoop.fs.shell.Command.run(Command.java:165) > at 
org.apache.hadoop.fs.FsShell.run(FsShell.java:315) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at org.apache.hadoop.fs.FsShell.main(FsShell.java:372) > Caused by: com.google.protobuf.ServiceException: > com.google.protobuf.UninitializedMessageException: Message missing required > fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:272) > at com.sun.proxy.$Proxy10.getContentSummary(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:816) > ... 23 more > Caused by: com.google.protobuf.UninitializedMessageException: Message missing > required fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:770) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65392) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65331) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:263) > ... 25 more > {code}
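The failure mode in the stack trace above can be modeled without protobuf at all. This is a simplified sketch (the wire numbers mirror typical StorageTypeProto values but are illustrative): proto2 drops unrecognized enum wire values, so a required enum field carrying the new PROVIDED value appears unset to the 2.6 parser, which then throws UninitializedMessageException.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model (not real protobuf) of the incompatibility: a 2.6-era
// parser does not know the PROVIDED storage type added in 3.x. In proto2,
// an unknown enum value is discarded, so a *required* enum field looks
// missing and the message fails its initialization check.
class OldClientEnumModel {
    // Enum values a 2.6 client knows; wire numbers are illustrative.
    private static final Map<Integer, String> KNOWN = new HashMap<>();
    static {
        KNOWN.put(1, "DISK");
        KNOWN.put(2, "SSD");
        KNOWN.put(3, "ARCHIVE");
        KNOWN.put(4, "RAM_DISK");
    }

    /** Returns the decoded type name, or null when the required field is dropped. */
    static String decodeRequiredType(int wireValue) {
        // Unknown value -> null -> "Message missing required fields" on build().
        return KNOWN.get(wireValue);
    }
}
```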
[jira] [Commented] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop
[ https://issues.apache.org/jira/browse/HDFS-15624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223033#comment-17223033 ] Vinayakumar B commented on HDFS-15624: -- No need to hold this Jira, and no need to revert HDFS-15025, as long as all of them (including HDFS-15660) land before the release of 3.4.0. In any case, if we are not able to fix any of these by release time (and I think we still have some time), then we can consider a revert. Right now, the PR handles the backward-compatibility issues (due to the change in StorageType order) and the inclusion of the new storage policy, by bumping the LayoutVersion and adding a check to block NVDIMM-related operations during upgrade. HDFS-15660 will be handled soon enough to solve the issues of both PROVIDED and NVDIMM in a generic way. Also, right now 2.x clients are not able to talk to a 3.x NameNode via {{getContentSummary()}} or {{getQuotaUsage()}} on directories with quota on storage types. > Fix the SetQuotaByStorageTypeOp problem after updating hadoop > --- > > Key: HDFS-15624 > URL: https://issues.apache.org/jira/browse/HDFS-15624 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: YaYun Wang >Priority: Major > Labels: pull-request-available > Time Spent: 5h 40m > Remaining Estimate: 0h > > HDFS-15025 adds a new storage type NVDIMM, changing the ordinal() of the > StorageType enum. Setting quota by storage type depends on the ordinal(); > therefore, the quota setting may become invalid after an upgrade.
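The ordinal() hazard behind HDFS-15624 can be shown with a toy pair of enums. The orderings below are illustrative, not the actual HDFS StorageType layout: persisting a quota entry by enum position silently re-maps it once a new constant is inserted.

```java
// Toy illustration (hypothetical orderings) of why persisting quota by
// StorageType.ordinal() breaks across an upgrade that inserts a constant.
enum OldStorageTypeSketch { RAM_DISK, SSD, DISK, ARCHIVE }

enum NewStorageTypeSketch { RAM_DISK, NVDIMM, SSD, DISK, ARCHIVE } // NVDIMM inserted

class QuotaByOrdinalSketch {
    /** Replays an old-version edit-log entry (stored as an ordinal) on the new version. */
    static String reinterpret(OldStorageTypeSketch persisted) {
        return NewStorageTypeSketch.values()[persisted.ordinal()].name();
    }
}
```

In this toy model, an SSD quota written before the upgrade would be read back as an NVDIMM quota; this is the class of corruption the LayoutVersion bump and upgrade-time checks guard against.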
[jira] [Updated] (HDFS-15660) StorageTypeProto is not compatible between 3.x and 2.6
[ https://issues.apache.org/jira/browse/HDFS-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-15660: - Fix Version/s: (was: 3.4.0) Target Version/s: 2.9.3, 3.4.0, 3.1.5, 2.10.2, 3.2.3
[jira] [Updated] (HDFS-15660) StorageTypeProto is not compatible between 3.x and 2.6
[ https://issues.apache.org/jira/browse/HDFS-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-15660: - Target Version/s: 2.9.3, 3.3.1, 3.4.0, 3.1.5, 2.10.2, 3.2.3 (was: 2.9.3, 3.4.0, 3.1.5, 2.10.2, 3.2.3)
[jira] [Updated] (HDFS-15098) Add SM4 encryption method for HDFS
[ https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-15098: - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Release Note: New encryption codec "SM4/CTR/NoPadding" is added. Requires openssl version >=1.1.1 for native implementation. Resolution: Fixed Status: Resolved (was: Patch Available) Merged to trunk. Thanks everyone. > Add SM4 encryption method for HDFS > -- > > Key: HDFS-15098 > URL: https://issues.apache.org/jira/browse/HDFS-15098 > Project: Hadoop HDFS > Issue Type: New Feature >Affects Versions: 3.4.0 >Reporter: liusheng >Assignee: liusheng >Priority: Major > Labels: sm4 > Fix For: 3.4.0 > > Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, > HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, > HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch, > HDFS-15098.009.patch, image-2020-08-19-16-54-41-341.png > > Time Spent: 20m > Remaining Estimate: 0h > > SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard > for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure). > SM4 was a cipher proposed for the IEEE 802.11i standard, but has so far > been rejected by ISO. One of the reasons for the rejection has been > opposition to the WAPI fast-track proposal by the IEEE. Please see: > [https://en.wikipedia.org/wiki/SM4_(cipher)] > > *Use sm4 on hdfs as follows:* > 1. Configure Hadoop KMS > 2. Test HDFS sm4 > hadoop key create key1 -cipher 'SM4/CTR/NoPadding' > hdfs dfs -mkdir /benchmarks > hdfs crypto -createZone -keyName key1 -path /benchmarks > *requires:* > 1. openssl version >=1.1.1
[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS
[ https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200295#comment-17200295 ] Vinayakumar B commented on HDFS-15098: -- +1, the test failures and other things are unrelated. Will wait for one or two days before committing, in case anyone wants to take a look. Thanks [~seanlau] for the update on the patch.
[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS
[ https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197405#comment-17197405 ] Vinayakumar B commented on HDFS-15098: -- Thanks [~seanlau] for the update. Only one nit from my previous comments was missed. Remove the following unused method: {code} public void log(GeneralSecurityException e) { } {code} Also, please check the checkstyle, javac and cc warnings. Since the previous Yetus run reports are not available, I will re-trigger Jenkins.
[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS
[ https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17185104#comment-17185104 ] Vinayakumar B commented on HDFS-15098: -- Hi [~seanlau], Please check comments 1. Change the name {{JceCtrcryptoCodec}} to {{JceCtrCryptoCodec}} 2. No need to have following redundant constants. Instead directly use the original constants itself. For example, instead of using {{AES_BLOCK_SIZE}} use {{CipherSuite.AES_CTR_NOPADDING.getAlgorithmBlockSize()}} {code:java} protected static final CipherSuite AES_SUITE = CipherSuite.AES_CTR_NOPADDING; protected static final CipherSuite SM4_SUITE = CipherSuite.SM4_CTR_NOPADDING; protected static final int AES_BLOCK_SIZE = AES_SUITE.getAlgorithmBlockSize(); protected static final int SM4_BLOCK_SIZE = SM4_SUITE.getAlgorithmBlockSize(); {code} Example in JceAesCtrCryptoCodec {code:java} @Override public CipherSuite getCipherSuite() { return CipherSuite.AES_CTR_NOPADDING; } @Override public void calculateIV(byte[] initIV, long counter, byte[] iv) { super.calculateIV(initIV, counter, iv, getCipherSuite().getAlgorithmBlockSize()); } @Override public Encryptor createEncryptor() throws GeneralSecurityException { return new JceCtrCipher(Cipher.ENCRYPT_MODE, getProvider(), getCipherSuite(), "AES"); } @Override public Decryptor createDecryptor() throws GeneralSecurityException { return new JceCtrCipher(Cipher.DECRYPT_MODE, getProvider(), getCipherSuite(), "AES"); } {code} 3. In JceCtrCryptoCodec, Instead of following method, {code:java} public void log(GeneralSecurityException e) { } {code} make an abstract method to get Logger and use that logger to log message in JceCtrCryptoCodec.java {code:java} protected abstract Logger getLogger(); {code} and {code:java} } catch(GeneralSecurityException e) { getLogger().warn(e.getMessage()); random = new SecureRandom(); } {code} In JceAesCtrCryptoCodec and JceSm4CtrCryptoCodec {code:java} @Override public Logger getLogger() { return LOG; } {code} 4. 
Similar to #2 and #3, make the changes in OpensslCtrCryptoCodec, OpensslAesCtrCryptoCodec and OpensslSm4CtrCryptoCodec. Instead of {{useLog}} use {{getLogger()}} method to get the corresponding logger and move following to corresponding classes, make them private and return them in {{getLogger()}}. {code:java} protected static final Logger SM4_LOG = LoggerFactory.getLogger(OpensslSm4CtrCryptoCodec.class.getName()); protected static final Logger AES_LOG = LoggerFactory.getLogger(OpensslAesCtrCryptoCodec.class.getName()); {code} 5. In OpensslCtrCryptoCodec, initialization of engineId is done only in case of SM4. So this part of code can be moved to {{OpensslSm4CtrCryptoCodec#setConf()}} as below. {code:java} @Override public void setConf(Configuration conf) { super.setConf(conf); setEngineId(conf.get(HADOOP_SECURITY_OPENSSL_ENGINE_ID_KEY)); } {code} 6. {{OpensslCtrCryptoCodec#close()}} can be changed as below {code:java} public void close() throws IOException { if (this.random instanceof Closeable) { Closeable r = (Closeable) this.random; IOUtils.cleanupWithLogger(getLogger(), r); } } {code} 7. Revert unnecessary changes in HdfsKMSUtil.java 8. in KeyProvider constructor move addition of BouncyCastleProvider out of if block as following {code:java} public KeyProvider(Configuration conf) { this.conf = new Configuration(conf); // Added for HADOOP-15473. Configured serialFilter property fixes // java.security.UnrecoverableKeyException in JDK 8u171. if(System.getProperty(JCEKS_KEY_SERIAL_FILTER) == null) { String serialFilter = conf.get(HADOOP_SECURITY_CRYPTO_JCEKS_KEY_SERIALFILTER, JCEKS_KEY_SERIALFILTER_DEFAULT); System.setProperty(JCEKS_KEY_SERIAL_FILTER, serialFilter); } String jceProvider = conf.get(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_KEY); if (BouncyCastleProvider.PROVIDER_NAME.equals(jceProvider)) { Security.addProvider(new BouncyCastleProvider()); } } {code} 9. Right now all tests of {{TestCryptoStreamsWithJceSm4CtrCryptoCodec}} are getting skipped. 
Do the following changes to fix it. Remove the below code in {{TestCryptoStreamsWithJceSm4CtrCryptoCodec#init()}} {code:java} try { KeyGenerator keyGenerator = KeyGenerator.getInstance("SM4"); } catch (Exception e) { Assume.assumeTrue(false); } {code} Add the following config in {{TestCryptoStreamsWithJceSm4CtrCryptoCodec#init()}} {code:java} conf.set(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_KEY, BouncyCastleProvider.PROVIDER_NAME); {code} 10. To avoid javac warnings in {{TestCryptoStreamsWithOpensslSm4CtrCryptoCodec}} and {{TestCryptoStreamsWithOpensslAesCtrCryptoCodec}}
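The getLogger() suggestion in the review above is a small template-method refactor; a compilable distillation (with simplified, hypothetical names standing in for the real codec classes and slf4j loggers) looks like:

```java
// Distilled sketch of the review's suggestion: shared CTR fallback logic
// lives in the abstract base, while each concrete codec supplies its own
// logger identity through an abstract hook. Names are hypothetical.
abstract class CtrCryptoCodecSketch {
    /** Subclasses return their own logger name (stands in for getLogger()). */
    protected abstract String getLoggerName();

    /** Shared path: on a security-exception fallback, warn via the subclass's logger. */
    String warnOnFallback(Exception e) {
        return getLoggerName() + ": " + e.getMessage();
    }
}

class AesCtrCodecSketch extends CtrCryptoCodecSketch {
    @Override
    protected String getLoggerName() { return "JceAesCtrCryptoCodec"; }
}

class Sm4CtrCodecSketch extends CtrCryptoCodecSketch {
    @Override
    protected String getLoggerName() { return "JceSm4CtrCryptoCodec"; }
}
```

This removes the duplicated logging fields and the empty log(...) method from each subclass, which is the deduplication the review asks for.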
[jira] [Comment Edited] (HDFS-15098) Add SM4 encryption method for HDFS
[ https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154413#comment-17154413 ] Vinayakumar B edited comment on HDFS-15098 at 7/9/20, 11:19 AM: Thanks [~zZtai] for the contribution. Overall the changes look good. Following are my comments; please check. 1. Adding this provider should be configurable, and the documentation should be updated as required. As already mentioned by [~lindongdong], there is no need to add it to the JDK dirs. Maybe the issue description can be updated. So the following addition of the Provider needs to be done only if it is configured, because directly adding {{BouncyCastleProvider}} seems to change the existing default behavior in some cases. Ex: {{TestKeyShell#createInvalidKeySize()}} is supposed to fail with keysize 56, but it passes when the provider is BC. So it should be used only on the user's demand, and making it configurable would be a wise choice. {code:java} + Security.addProvider(new BouncyCastleProvider()); {code} In KeyProvider.java it can be added as below. {code:java} String jceProvider = conf.get(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_KEY); if (BouncyCastleProvider.PROVIDER_NAME.equals(jceProvider)) { Security.addProvider(new BouncyCastleProvider()); } {code} In JceSm4CtrCryptoCodec.java it should be added in setConf() instead of the constructor. {code:java} provider = conf.get(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_KEY, BouncyCastleProvider.PROVIDER_NAME); final String secureRandomAlg = conf.get( HADOOP_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_KEY, HADOOP_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_DEFAULT); if (BouncyCastleProvider.PROVIDER_NAME.equals(provider)) { Security.addProvider(new BouncyCastleProvider()); } {code} 2. With the above change, {{TestKeyShell#testInvalidKeySize()}} will not fail anymore, as the BC provider will not be added by default. So the changes in {{TestKeyShell}} can be reverted. 3. In {{TestCryptoCodec.java}} remove these lines from every test. 
{code:java} try { KeyGenerator keyGenerator = KeyGenerator.getInstance("SM4"); } catch (Exception e) { Assume.assumeTrue(false); } {code} 4. In {{TestCryptoCodec#testJceSm4CtrCryptoCodec}} change this config as below. {code:java} conf.set(HADOOP_SECURITY_CRYPTO_CODEC_CLASSES_SM4_CTR_NOPADDING_KEY, JceSm4CtrCryptoCodec.class.getName());{code} Uncomment the following lines {code:java} //cryptoCodecTest(conf, seed, count, //jceSm4CodecClass, opensslSm4CodecClass, iv); {code} {code:java} //cryptoCodecTest(conf, seed, count, //jceSm4CodecClass, opensslSm4CodecClass, iv); {code} 5. Avoid import statements with * in all classes; import only the required classes directly. 6. {{HdfsKMSUtil.getCryptoCodec()}} is not logging {{JceSm4CTRCodec}}. Maybe it can log all class names, when the codec is not null, without checking instanceof? 7. I can see that a lot of code is the same between the AES and SM4 codecs, except the class names and algorithm names. Maybe refactoring would help to reduce the duplicate code. 8. I think in {{hdfs.proto}} the SM4 enum value can be changed to 3 directly. {code}enum CipherSuiteProto { UNKNOWN = 1; AES_CTR_NOPADDING = 2; SM4_CTR_NOPADDING = 3; }{code} 9. In {{OpenSecureRandom.c}} the following functions' declarations and definitions can be kept within the {{OPENSSL_VERSION_NUMBER < 0x1010L}} block, i.e. the following functions should be used only when {{OPENSSL_VERSION_NUMBER < 0x1010L}} is true: {code} static void locks_setup(void) static void locks_cleanup(void) static void pthreads_locking_callback(int mode, int type, char *file, int line) static unsigned long pthreads_thread_id(void) {code}
[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS
[ https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154413#comment-17154413 ] Vinayakumar B commented on HDFS-15098: -- Thanks [~zZtai] for the contribution. Overall the changes look good. Following are my comments; please check. 1. Adding this provider should be configurable, and the documentation should be updated as required. As already mentioned by [~lindongdong], there is no need to add it to the JDK dirs. Maybe the issue description can be updated. So the following addition of the Provider needs to be done only if it is configured, because directly adding {{BouncyCastleProvider}} seems to change the existing default behavior in some cases. Ex: {{TestKeyShell#createInvalidKeySize()}} is supposed to fail with keysize 56, but it passes when the provider is BC. So it should be used only on the user's demand, and making it configurable would be a wise choice. {code:java} + Security.addProvider(new BouncyCastleProvider()); {code} In KeyProvider.java it can be added as below. {code:java} String jceProvider = conf.get(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_KEY); if (BouncyCastleProvider.PROVIDER_NAME.equals(jceProvider)) { Security.addProvider(new BouncyCastleProvider()); } {code} In JceSm4CtrCryptoCodec.java it should be added in setConf() instead of the constructor. {code:java} provider = conf.get(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_KEY, BouncyCastleProvider.PROVIDER_NAME); final String secureRandomAlg = conf.get( HADOOP_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_KEY, HADOOP_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_DEFAULT); if (BouncyCastleProvider.PROVIDER_NAME.equals(provider)) { Security.addProvider(new BouncyCastleProvider()); } {code} 2. With the above change, {{TestKeyShell#testInvalidKeySize()}} will not fail anymore, as the BC provider will not be added by default. So the changes in {{TestKeyShell}} can be reverted. 3. In {{TestCryptoCodec.java}} remove these lines from every test. 
{code:java} try { KeyGenerator keyGenerator = KeyGenerator.getInstance("SM4"); } catch (Exception e) { Assume.assumeTrue(false); } {code} 4. In {{TestCryptoCodec#testJceSm4CtrCryptoCodec}} change this config as below. {code:java} conf.set(HADOOP_SECURITY_CRYPTO_CODEC_CLASSES_SM4_CTR_NOPADDING_KEY, JceSm4CtrCryptoCodec.class.getName());{code} Uncomment the following lines {code:java} //cryptoCodecTest(conf, seed, count, //jceSm4CodecClass, opensslSm4CodecClass, iv); {code} {code:java} //cryptoCodecTest(conf, seed, count, //jceSm4CodecClass, opensslSm4CodecClass, iv); {code} 5. Avoid import statements with * in all classes; import only the required classes directly. 6. {{HdfsKMSUtil.getCryptoCodec()}} is not logging {{JceSm4CTRCodec}}. Maybe it can log all class names, when the codec is not null, without checking instanceof? 7. I can see that a lot of code is the same between the AES and SM4 codecs, except the class names and algorithm names. Maybe refactoring would help to reduce the duplicate code. > Add SM4 encryption method for HDFS > -- > > Key: HDFS-15098 > URL: https://issues.apache.org/jira/browse/HDFS-15098 > Project: Hadoop HDFS > Issue Type: New Feature >Affects Versions: 3.4.0 >Reporter: liusheng >Assignee: zZtai >Priority: Major > Labels: sm4 > Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, > HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, > HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch > > > SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard > for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure). > SM4 was a cipher proposed for the IEEE 802.11i standard, but has so far > been rejected by ISO. One of the reasons for the rejection has been > opposition to the WAPI fast-track proposal by the IEEE. 
please see: > [https://en.wikipedia.org/wiki/SM4_(cipher)] > > *Use sm4 on hdfs as follows:* > 1.download Bouncy Castle Crypto APIs from bouncycastle.org > [https://bouncycastle.org/download/bcprov-ext-jdk15on-165.jar] > 2.Configure JDK > Place bcprov-ext-jdk15on-165.jar in $JAVA_HOME/jre/lib/ext directory, > add "security.provider.10=org.bouncycastle.jce.provider.BouncyCastleProvider" > to $JAVA_HOME/jre/lib/security/java.security file > 3.Configure Hadoop KMS > 4.test HDFS sm4 > hadoop key create key1 -cipher 'SM4/CTR/NoPadding' > hdfs dfs -mkdir /benchmarks > hdfs crypto -createZone -keyName key1 -path /benchmarks > *requires:* > 1.openssl version >=1.1.1 > 2.configure Bouncy Castle Crypto on JDK -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail:
[jira] [Commented] (HDFS-15359) EC: Allow closing a file with committed blocks
[ https://issues.apache.org/jira/browse/HDFS-15359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17125205#comment-17125205 ] Vinayakumar B commented on HDFS-15359: -- +1 thanks [~ayushtkn] for contribution > EC: Allow closing a file with committed blocks > -- > > Key: HDFS-15359 > URL: https://issues.apache.org/jira/browse/HDFS-15359 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-15359-01.patch, HDFS-15359-02.patch, > HDFS-15359-03.patch, HDFS-15359-04.patch, HDFS-15359-05.patch > > > Presently, {{dfs.namenode.file.close.num-committed-allowed}} is ignored in > case of EC blocks. But in case of heavy loads, IBR's from Datanode may get > delayed and cause the file write to fail. So, can allow EC files to close > with blocks in committed state as REP files -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15359) EC: Allow closing a file with committed blocks
[ https://issues.apache.org/jira/browse/HDFS-15359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17120484#comment-17120484 ] Vinayakumar B commented on HDFS-15359: -- Thanks [~ayushtkn] for the patch. I think the approach of allowing committed blocks only when the write happened to all nodes is very reasonable to prevent unexpected data loss. Two minor comments: {code} if (b.isStriped()) { BlockInfoStriped blkStriped = (BlockInfoStriped) b; if (b.getUnderConstructionFeature().getExpectedStorageLocations().length != blkStriped.getRealTotalBlockNum()) { return b + " is a striped block in " + state + " with less than " + "required number of blocks."; } } {code} Move this check after the `if (state != BlockUCState.COMMITTED)` check; it makes more sense there. In the test, {code} // Check if the blockgroup isn't complete then file close shouldn't be // success with block in committed state. cluster.getDataNodes().get(0).shutdown(); FSDataOutputStream str = dfs.create(new Path("/dir/file1")); for (int i = 0; i < 1024 * 1024 * 4; i++) { str.write(i); } DataNodeTestUtils.pauseIBR(cluster.getDataNodes().get(0)); DataNodeTestUtils.pauseIBR(cluster.getDataNodes().get(1)); LambdaTestUtils.intercept(IOException.class, "", () -> str.close()); {code} You should `pauseIBR` datanodes 1 and 2; 0 is already shut down. +1 once addressed. > EC: Allow closing a file with committed blocks > -- > > Key: HDFS-15359 > URL: https://issues.apache.org/jira/browse/HDFS-15359 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-15359-01.patch, HDFS-15359-02.patch, > HDFS-15359-03.patch, HDFS-15359-04.patch > > > Presently, {{dfs.namenode.file.close.num-committed-allowed}} is ignored in > case of EC blocks. But in case of heavy loads, IBR's from Datanode may get > delayed and cause the file write to fail. 
So, can allow EC files to close > with blocks in committed state as REP files -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14999) Avoid Potential Infinite Loop in DFSNetworkTopology
[ https://issues.apache.org/jira/browse/HDFS-14999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108986#comment-17108986 ] Vinayakumar B commented on HDFS-14999: -- +1 > Avoid Potential Infinite Loop in DFSNetworkTopology > --- > > Key: HDFS-14999 > URL: https://issues.apache.org/jira/browse/HDFS-14999 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14999-01.patch > > > {code:java} > do { > chosen = chooseRandomWithStorageTypeAndExcludeRoot(root, excludeRoot, > type); > if (excludedNodes == null || !excludedNodes.contains(chosen)) { > break; > } else { > LOG.debug("Node {} is excluded, continuing.", chosen); > } > } while (true); > {code} > Observed this loop getting stuck as part of testing HDFS-14913. > There should be some exit condition or max retries here -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
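The "exit condition or max retries" suggested for the loop in the issue description above can be sketched as a bounded retry. This is a self-contained toy, not the actual Hadoop patch: the chooser is abstracted to a `Supplier`, and the method name and retry limit are assumptions.

```java
import java.util.Set;
import java.util.function.Supplier;

// Toy sketch of bounding the choose-random loop: instead of spinning
// forever when every candidate node is excluded, give up after a fixed
// number of attempts and signal failure to the caller.
final class BoundedChooser {
  static String chooseWithRetry(Supplier<String> chooser,
      Set<String> excludedNodes, int maxRetries) {
    for (int attempt = 0; attempt < maxRetries; attempt++) {
      String chosen = chooser.get();
      if (excludedNodes == null || !excludedNodes.contains(chosen)) {
        return chosen;  // acceptable node found
      }
      // chosen node is excluded: retry, but only up to maxRetries
    }
    return null;  // no acceptable node within the bound; caller handles it
  }
}
```

The key change from the quoted `do { ... } while (true)` is that the loop variable bounds the number of iterations, so a topology where all candidates are excluded can no longer hang the caller.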
[jira] [Commented] (HDFS-14999) Avoid Potential Infinite Loop in DFSNetworkTopology
[ https://issues.apache.org/jira/browse/HDFS-14999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102653#comment-17102653 ] Vinayakumar B commented on HDFS-14999: -- The changes look fine to me. It would be better to have a benchmark done for this. Perhaps you can check the {{TestDFSNetworkTopologyPerformance}} class for changes with excluded nodes. Results before and after the patch would help. > Avoid Potential Infinite Loop in DFSNetworkTopology > --- > > Key: HDFS-14999 > URL: https://issues.apache.org/jira/browse/HDFS-14999 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14999-01.patch > > > {code:java} > do { > chosen = chooseRandomWithStorageTypeAndExcludeRoot(root, excludeRoot, > type); > if (excludedNodes == null || !excludedNodes.contains(chosen)) { > break; > } else { > LOG.debug("Node {} is excluded, continuing.", chosen); > } > } while (true); > {code} > Observed this loop getting stuck as part of testing HDFS-14913. > There should be some exit condition or max retries here -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-15098) Add SM4 encryption method for HDFS
[ https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B reassigned HDFS-15098: Assignee: zZtai > Add SM4 encryption method for HDFS > -- > > Key: HDFS-15098 > URL: https://issues.apache.org/jira/browse/HDFS-15098 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: liusheng >Assignee: zZtai >Priority: Major > Attachments: HDFS-15098.001.patch > > > SM4 (formerly SMS4)is a block cipher used in the Chinese National Standard > for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure). > SM4 was a cipher proposed to for the IEEE 802.11i standard, but has so far > been rejected by ISO. One of the reasons for the rejection has been > opposition to the WAPI fast-track proposal by the IEEE. please see: > [https://en.wikipedia.org/wiki/SM4_(cipher)] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15187) CORRUPT replica mismatch between namenodes after failover
[ https://issues.apache.org/jira/browse/HDFS-15187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041561#comment-17041561 ] Vinayakumar B commented on HDFS-15187: -- Thanks for the nice catch [~ayushtkn]. The changes look fine to me. +1, pending the typo change. Please confirm about the test failures as well. > CORRUPT replica mismatch between namenodes after failover > - > > Key: HDFS-15187 > URL: https://issues.apache.org/jira/browse/HDFS-15187 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Critical > Attachments: HDFS-15187-01.patch, HDFS-15187-02.patch > > > The corrupt replica identified by the Active Namenode isn't identified by the > other Namenode when it is failed over to Active, in case the replica is being > marked corrupt due to updatePipeline. > Scenario to repro: > 1. Create a file; while writing, turn one datanode down to trigger update > pipeline. > 2. Write some more data. > 3. Close the file. > 4. Turn on the shutdown datanode. > 5. The replica in the datanode will be identified as CORRUPT and the corrupt > count will be 1. > 6. Failover to the other Namenode. > 7. Wait for all pending IBR processing. > 8. The corrupt count will not be the same, and FSCK won't show the corrupt > replica. > 9. Failover back to the first namenode. > 10. The corrupt count and corrupt replica will be there. > The two Namenodes show different states. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13739) Option to disable Rack Local Write Preference to avoid 2 issues - 1. Rack-by-Rack Maintenance leaves last data replica at risk, 2. avoid Major Storage Imbalance across
[ https://issues.apache.org/jira/browse/HDFS-13739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038990#comment-17038990 ] Vinayakumar B commented on HDFS-13739: -- +1 > Option to disable Rack Local Write Preference to avoid 2 issues - 1. > Rack-by-Rack Maintenance leaves last data replica at risk, 2. avoid Major > Storage Imbalance across DataNodes caused by uneven spread of Datanodes > across Racks > --- > > Key: HDFS-13739 > URL: https://issues.apache.org/jira/browse/HDFS-13739 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer mover, block placement, datanode, fs, > hdfs, hdfs-client, namenode, nn, performance >Affects Versions: 2.7.3 > Environment: Hortonworks HDP 2.6 >Reporter: Hari Sekhon >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-13739-01.patch > > > Request to be able to disable Rack Local Write preference / Write All > Replicas to different Racks. > Current HDFS write pattern of "local node, rack local node, other rack node" > is good for most purposes but there are at least 2 scenarios where this is > not ideal: > # Rack-by-Rack Maintenance leaves data at risk of losing last remaining > replica. If a single datanode failed it would likely cause some data outage > or even data loss if the rack is lost or an upgrade fails (or perhaps it's a > rack rebuild). Setting replicas to 4 would reduce write performance and waste > storage which is currently the only workaround to that issue. > # Major Storage Imbalance across datanodes when there is an uneven layout of > datanodes across racks - some nodes fill up while others are half empty. > I have observed this storage imbalance on a cluster where half the nodes were > 85% full and the other half were only 50% full. 
> Rack layouts like the following illustrate this - the nodes in the same rack > will only choose to send half their block replicas to each other, so they > will fill up first, while other nodes will receive far fewer replica blocks: > {code:java} > NumNodes - Rack > 2 - rack 1 > 2 - rack 2 > 1 - rack 3 > 1 - rack 4 > 1 - rack 5 > 1 - rack 6{code} > In this case if I reduce the number of replicas to 2 then I get an almost > perfect spread of blocks across all datanodes because HDFS has no choice but > to maintain the only 2nd replica on a different rack. If I increase the > replicas back to 3 it goes back to 85% on half the nodes and 50% on the other > half, because the extra replicas choose to replicate only to rack local nodes. > Why not just run the HDFS balancer to fix it you might say? This is a heavily > loaded HBase cluster - aside from destroying HBase's data locality and > performance by moving blocks out from underneath RegionServers - as soon as > an HBase major compaction occurs (at least weekly), all blocks will get > re-written by HBase and the HDFS client will again write to local node, rack > local node, other rack node - resulting in the same storage imbalance again. > Hence this cannot be solved by running HDFS balancer on HBase clusters - or > for any application sitting on top of HDFS that has any HDFS block churn. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15127) RBF: Do not allow writes when a subcluster is unavailable for HASH_ALL mount points.
[ https://issues.apache.org/jira/browse/HDFS-15127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035160#comment-17035160 ] Vinayakumar B commented on HDFS-15127: -- +1 > RBF: Do not allow writes when a subcluster is unavailable for HASH_ALL mount > points. > > > Key: HDFS-15127 > URL: https://issues.apache.org/jira/browse/HDFS-15127 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-15127.000.patch, HDFS-15127.001.patch, > HDFS-15127.002.patch, HDFS-15127.003.patch > > > A HASH_ALL mount point should not allow creating new files if one subcluster > is down. > If the file already existed in the past, this could lead to inconsistencies. > We should return an unavailable exception. > {{TestRouterFaultTolerant#testWriteWithFailedSubcluster()}} needs to be > changed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15143) LocatedStripedBlock returns wrong block type
[ https://issues.apache.org/jira/browse/HDFS-15143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024972#comment-17024972 ] Vinayakumar B commented on HDFS-15143: -- +1 > LocatedStripedBlock returns wrong block type > > > Key: HDFS-15143 > URL: https://issues.apache.org/jira/browse/HDFS-15143 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-15143-01.patch, HDFS-15143-02.patch > > > LocatedStripedBlock returns block type as {{CONTIGUOUS}} which actually > should be {{STRIPED}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15117) EC: Add getECTopologyResultForPolicies to DistributedFileSystem
[ https://issues.apache.org/jira/browse/HDFS-15117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021786#comment-17021786 ] Vinayakumar B commented on HDFS-15117: -- Looks good to me. +1 Thanks [~ayushtkn] for the update > EC: Add getECTopologyResultForPolicies to DistributedFileSystem > --- > > Key: HDFS-15117 > URL: https://issues.apache.org/jira/browse/HDFS-15117 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-15117-01.patch, HDFS-15117-02.patch, > HDFS-15117-03.patch, HDFS-15117-04.patch, HDFS-15117-05.patch, > HDFS-15117-06.patch, HDFS-15117-07.patch, HDFS-15117-08.patch > > > Add getECTopologyResultForPolicies API to distributed filesystem. > It is as of now only present as part of ECAdmin. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15117) EC: Add getECTopologyResultForPolicies to DistributedFileSystem
[ https://issues.apache.org/jira/browse/HDFS-15117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019464#comment-17019464 ] Vinayakumar B commented on HDFS-15117: -- Thanks [~ayushtkn] for the update. Following are some nits to be fixed before getting this in. # {{ECTopologyVerifier.java}} Need not be moved. # rename {{ECTopologyVerifierResultResponseProto}} to {{ECTopologyVerifierResultProto}} # Checkstyles need to be fixed. # {{BlockReportLeaseManager.java}} change is unnecessary. # can change 'src' to 'policies' {quote}message GetECTopologyResultForPoliciesRequestProto { repeated string src = 1; {quote} > EC: Add getECTopologyResultForPolicies to DistributedFileSystem > --- > > Key: HDFS-15117 > URL: https://issues.apache.org/jira/browse/HDFS-15117 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-15117-01.patch, HDFS-15117-02.patch, > HDFS-15117-03.patch, HDFS-15117-04.patch, HDFS-15117-05.patch, > HDFS-15117-06.patch > > > Add getECTopologyResultForPolicies API to distributed filesystem. > It is as of now only present as part of ECAdmin. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15117) EC: Add getECTopologyResultForPolicies to DistributedFileSystem
[ https://issues.apache.org/jira/browse/HDFS-15117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014861#comment-17014861 ] Vinayakumar B commented on HDFS-15117: -- The patch almost looks good to me. [~weichiu], I too agree that DistributedFileSystem is a thin client. Another fact is that the NameNode side already has this check, exposed as a metric; see {{FSNamesystem#getVerifyECWithTopologyResult()}}. So it would be better to combine all of these (DistributedFileSystem and ECAdmin) to use the server-side check. In the current implementation, {{getDatanodeReport()}} is used only to find the number of racks, and {{getDatanodeReport()}} is a fairly costly RPC in a large cluster. So, to summarize, I feel we can add a separate RPC to ClientProtocol, combine it with the existing server-side {{FSNamesystem#getVerifyECWithTopologyResult()}} to return an ECTopologyVerifierResult, and call this RPC in both DistributedFileSystem and ECAdmin. > EC: Add getECTopologyResultForPolicies to DistributedFileSystem > --- > > Key: HDFS-15117 > URL: https://issues.apache.org/jira/browse/HDFS-15117 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-15117-01.patch, HDFS-15117-02.patch > > > Add getECTopologyResultForPolicies API to distributed filesystem. > It is as of now only present as part of ECAdmin. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14578) AvailableSpaceBlockPlacementPolicy always prefers local node
[ https://issues.apache.org/jira/browse/HDFS-14578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17012530#comment-17012530 ] Vinayakumar B commented on HDFS-14578: -- [HDFS-14578-07.patch|https://issues.apache.org/jira/secure/attachment/12990467/HDFS-14578-07.patch] Looks good. +1 Checkstyles can be ignored. > AvailableSpaceBlockPlacementPolicy always prefers local node > > > Key: HDFS-14578 > URL: https://issues.apache.org/jira/browse/HDFS-14578 > Project: Hadoop HDFS > Issue Type: Bug > Components: block placement >Affects Versions: 2.8.0, 2.7.4, 3.0.0-alpha1 >Reporter: Wei-Chiu Chuang >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14578-02.patch, HDFS-14578-03.patch, > HDFS-14578-04.patch, HDFS-14578-05.patch, HDFS-14578-06.patch, > HDFS-14578-07.patch, HDFS-14578-WIP-01.patch, HDFS-14758-01.patch > > > It looks like AvailableSpaceBlockPlacementPolicy prefers local disk just like > in the BlockPlacementPolicyDefault > > As Yongjun mentioned in > [HDFS-8131|https://issues.apache.org/jira/browse/HDFS-8131?focusedCommentId=16558739=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16558739], > > {quote}Class AvailableSpaceBlockPlacementPolicy extends > BlockPlacementPolicyDefault. But it doesn't change the behavior of choosing > the first node in BlockPlacementPolicyDefault, so even with this new feature, > the local DN is always chosen as the first DN (of course when it is not > excluded), and the new feature only changes the selection of the rest of the > two DNs. > {quote} > I'm file this Jira as I groom Cloudera's internal Jira and found this > unreported issue. We do have a customer hitting this problem. I don't have a > fix, but thought it would be beneficial to report it to Apache Jira. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14578) AvailableSpaceBlockPlacementPolicy always prefers local node
[ https://issues.apache.org/jira/browse/HDFS-14578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17011549#comment-17011549 ] Vinayakumar B edited comment on HDFS-14578 at 1/9/20 8:36 AM: -- The code change looks good; some improvements are required in the tests. Right now, the distribution of node capacities makes one entire rack 75% full and another rack empty. This will result in choosing a rack-local node with the same usage (75%) as the local node in {{testChooseLocalNodeWihLocalNodeLoaded()}}. Distribute the datanodes evenly in both racks (i.e. some are full and some are empty in both racks, so that there would be a better rack-local node to choose when the local node is full). The tests can be made simpler by asserting the expected node to be chosen instead of calculating the probability: 1. {{testChooseLocalNode()}}: assert for the local node. 2. {{testChooseLocalNodeWihLocalNodeLoaded()}}: assert for a non-local, but rack-local node with higher space availability than the local node. 
> AvailableSpaceBlockPlacementPolicy always prefers local node > > > Key: HDFS-14578 > URL: https://issues.apache.org/jira/browse/HDFS-14578 > Project: Hadoop HDFS > Issue Type: Bug > Components: block placement >Affects Versions: 2.8.0, 2.7.4, 3.0.0-alpha1 >Reporter: Wei-Chiu Chuang >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14578-02.patch, HDFS-14578-03.patch, > HDFS-14578-04.patch, HDFS-14578-WIP-01.patch, HDFS-14758-01.patch > > > It looks like AvailableSpaceBlockPlacementPolicy prefers local disk just like > in the BlockPlacementPolicyDefault > > As Yongjun mentioned in > [HDFS-8131|https://issues.apache.org/jira/browse/HDFS-8131?focusedCommentId=16558739=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16558739], > > {quote}Class AvailableSpaceBlockPlacementPolicy extends > BlockPlacementPolicyDefault. But it doesn't change the behavior of choosing > the first node in BlockPlacementPolicyDefault, so even with this new feature, > the local DN is always chosen as the first DN (of course when it is not > excluded), and the new feature only changes the selection of the rest of the > two DNs. > {quote} > I'm file this Jira as I groom Cloudera's internal Jira and found this > unreported issue. We do have a customer hitting this problem. I don't have a > fix, but thought it would be beneficial to report it to Apache Jira. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14578) AvailableSpaceBlockPlacementPolicy always prefers local node
[ https://issues.apache.org/jira/browse/HDFS-14578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17011549#comment-17011549 ] Vinayakumar B commented on HDFS-14578: -- Code change looks good. Some improvements are required in tests. Right now, the distribution of node capacity makes one entire rack 75% full and the other rack empty. This will result in choosing a rack-local node with the same usage (75%) as the local node in {{testChooseLocalNodeWihLocalNodeLoaded()}}. Distribute the datanodes evenly in both racks (i.e. some are full and some are empty in both racks, so that there would be a better rack-local node to choose when the local node is full). Tests can be made simple by asserting the expected node to be chosen instead of calculating the probability. 1. {{testChooseLocalNode()}}: assert for the local node. 2. {{testChooseLocalNodeWihLocalNodeLoaded()}}: assert for a non-local, but rack-local node with higher space availability than the local node.
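The review suggestion above (assert the expected node deterministically rather than measure a selection probability) can be sketched with a toy stand-in; the `Node` class and `choose()` method here are hypothetical simplifications, not the actual `AvailableSpaceBlockPlacementPolicy` API, which would be exercised via MiniDFSCluster in the real tests.

```java
// Hedged sketch of the suggested assertion style: with nodes distributed so
// that a clearly better rack-local node exists, the expected pick is unique
// and the test can assert it directly. All names here are hypothetical.
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PlacementAssertionSketch {
  static final class Node {
    final String name; final String rack; final double used; // fraction of capacity used
    Node(String name, String rack, double used) {
      this.name = name; this.rack = rack; this.used = used;
    }
  }

  /** Pick the local node unless it is heavily loaded; otherwise pick the
   *  rack-local node with the most free space. */
  static Node choose(Node local, List<Node> all, double loadThreshold) {
    if (local.used < loadThreshold) {
      return local; // testChooseLocalNode(): local node is the expected pick
    }
    return all.stream()
        .filter(n -> n.rack.equals(local.rack) && n != local)
        .min(Comparator.comparingDouble(n -> n.used))
        .orElse(local);
  }

  public static void main(String[] args) {
    // Racks are mixed, as the review suggests: some full and some empty
    // nodes in BOTH racks, so a better rack-local node always exists.
    Node local = new Node("dn1", "rackA", 0.75);
    Node rackLocalEmpty = new Node("dn2", "rackA", 0.05);
    Node remote = new Node("dn3", "rackB", 0.05);
    List<Node> all = Arrays.asList(local, rackLocalEmpty, remote);

    // Loaded local node: expect the rack-local node with more free space.
    System.out.println(choose(local, all, 0.5).name);

    // Lightly loaded local node: expect the local node deterministically.
    Node lightLocal = new Node("dn1", "rackA", 0.10);
    System.out.println(choose(lightLocal, all, 0.5).name);
  }
}
```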
[jira] [Commented] (HDFS-14578) AvailableSpaceBlockPlacementPolicy always prefers local node
[ https://issues.apache.org/jira/browse/HDFS-14578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009576#comment-17009576 ] Vinayakumar B commented on HDFS-14578: -- Idea looks good. You can now separate the test and keep the default config as false to keep the existing behavior.
[jira] [Commented] (HDFS-15091) Cache Admin and Quota Commands Should Check SuperUser Before Taking Lock
[ https://issues.apache.org/jira/browse/HDFS-15091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17008020#comment-17008020 ] Vinayakumar B commented on HDFS-15091: -- +1 LGTM, thanks [~ayushtkn] for spotting this. > Cache Admin and Quota Commands Should Check SuperUser Before Taking Lock > > > Key: HDFS-15091 > URL: https://issues.apache.org/jira/browse/HDFS-15091 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-15091-01.patch, HDFS-15091-02.patch > > > As of now, all APIs check superuser before taking the lock; the same can be done > for the cache commands and setQuota.
[jira] [Commented] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election
[ https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987759#comment-16987759 ] Vinayakumar B commented on HDFS-15023: -- Thanks [~ferhui] for the contribution and [~ayushtkn] for the ping. Src changes are fine to me. In the test, catching the exception and failing is unnecessary; the finally block alone is sufficient. > [SBN read] ZKFC should check the state before joining the election > -- > > Key: HDFS-15023 > URL: https://issues.apache.org/jira/browse/HDFS-15023 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-15023.001.patch, HDFS-15023.002.patch, > HDFS-15023.003.patch > > > As discussed in HDFS-14961, ZKFC should not join the election when its state is > observer. > Right now, when the namenode was an observer, it joined the election and would > become a standby. > The MonitorDaemon thread callchain is: > doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() > -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> > createLockNodeAsync > callback for zookeeper: > processResult -> becomeStandby
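The test-cleanup point in the comment above (catch-and-fail is redundant; a finally block alone suffices, since a thrown exception already fails a JUnit test) can be illustrated with a minimal framework-free sketch; `runTest` and the `LOG` buffer are hypothetical, not part of the actual HDFS-15023 test.

```java
// Hedged sketch: in JUnit, a propagating exception already fails the test,
// so try { ... } catch (Exception e) { fail(); } adds nothing. A finally
// block alone guarantees cleanup. All names here are hypothetical.
public class FinallyOnlySketch {
  static final StringBuilder LOG = new StringBuilder();

  static void cleanup() { LOG.append("cleanup;"); }

  /** Preferred shape: let exceptions propagate, clean up in finally. */
  static void runTest(boolean fail) throws Exception {
    try {
      LOG.append("body;");
      if (fail) {
        throw new Exception("test body failed"); // JUnit would report this
      }
    } finally {
      cleanup(); // runs on both success and failure
    }
  }

  public static void main(String[] args) {
    try {
      runTest(true);
    } catch (Exception expected) {
      // The test framework would mark the test as failed here.
    }
    System.out.println(LOG); // cleanup ran despite the failure
  }
}
```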
[jira] [Commented] (HDFS-14961) [SBN read] Prevent ZKFC changing Observer Namenode state
[ https://issues.apache.org/jira/browse/HDFS-14961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984338#comment-16984338 ] Vinayakumar B commented on HDFS-14961: -- Thanks [~ayushtkn] for the analysis and the fix. Fix looks good to me. +1. There is already a check present in the HealthMonitor thread to quitElection when the namenode state is found to be OBSERVER. {code:java} if (changedState == HAServiceState.OBSERVER) { elector.quitElection(true); serviceState = HAServiceState.OBSERVER; return; }{code} But this is asynchronous monitoring happening every second. In case of a manual transition, the state can change directly in the NameNode, so ZKFC syncs during monitoring and quits the election. As [~ferhui] suggested, checking the state before joining the election also doesn't hurt. It can be added as a separate improvement Jira, as [~ayushtkn] already said. {code:java} if(serviceState != HAServiceState.OBSERVER) { elector.joinElection(targetToData(localTarget)); }{code} > [SBN read] Prevent ZKFC changing Observer Namenode state > > > Key: HDFS-14961 > URL: https://issues.apache.org/jira/browse/HDFS-14961 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Íñigo Goiri >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14961-01.patch, HDFS-14961-02.patch, > HDFS-14961-03.patch, HDFS-14961-04.patch, ZKFC-TEST-14961.patch > > > HDFS-14130 made ZKFC aware of the Observer Namenode and hence allows ZKFC to > run along with the Observer Node. > The Observer namenode isn't supposed to be part of the ZKFC election process. > But if the Namenode was part of the election before turning into Observer via the > transitionToObserver command, the ZKFC still sends instructions to the > Namenode as a result of previous participation and sometimes tends to change > the state of the Observer to Standby. > This is also the reason for the failure in TestDFSZKFailoverController. 
> TestDFSZKFailoverController has been consistently failing with a time out > waiting in testManualFailoverWithDFSHAAdmin(). In particular > {{waitForHAState(1, HAServiceState.OBSERVER);}}.
[jira] [Updated] (HDFS-14921) Remove SuperUser Check in Setting Storage Policy in FileStatus During Listing
[ https://issues.apache.org/jira/browse/HDFS-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-14921: - Fix Version/s: 3.2.2 3.1.4 3.3.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > Remove SuperUser Check in Setting Storage Policy in FileStatus During Listing > - > > Key: HDFS-14921 > URL: https://issues.apache.org/jira/browse/HDFS-14921 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Fix For: 3.3.0, 3.1.4, 3.2.2 > > Attachments: HDFS-14921-01.patch > > > Earlier, StoragePolicy operations were part of DFSAdmin and required a SuperUser > check. That got removed long back, but the check in getListing was left.
[jira] [Commented] (HDFS-14921) Remove SuperUser Check in Setting Storage Policy in FileStatus During Listing
[ https://issues.apache.org/jira/browse/HDFS-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16958606#comment-16958606 ] Vinayakumar B commented on HDFS-14921: -- Committed to trunk, branch-3.2 and branch-3.1. Thanks [~ayushtkn] for the find and the patch.
[jira] [Commented] (HDFS-14921) Remove SuperUser Check in Setting Storage Policy in FileStatus During Listing
[ https://issues.apache.org/jira/browse/HDFS-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16958068#comment-16958068 ] Vinayakumar B commented on HDFS-14921: -- +1
[jira] [Commented] (HDFS-14880) Balancer sequence of statistics & exit message is not correct
[ https://issues.apache.org/jira/browse/HDFS-14880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957665#comment-16957665 ] Vinayakumar B commented on HDFS-14880: -- I don't think this will be an incompatible change. The line itself is not changed; only when it appears is changed. *The cluster is balanced. Exiting...* Before this message comes, there might be multiple messages about stats (with non-zero values), so printing the same message with values of 0 should not be a problem. Coming to the patch, I think you should print the message inside the for loop, based on the return status. > Balancer sequence of statistics & exit message is not correct > - > > Key: HDFS-14880 > URL: https://issues.apache.org/jira/browse/HDFS-14880 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer mover >Affects Versions: 3.1.1, 3.2.1 > Environment: Run the balancer tool in cluster. >Reporter: Renukaprasad C >Assignee: Renukaprasad C >Priority: Major > Attachments: HDFS-14880.0001.patch > > > Actual: > Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved > The cluster is balanced. Exiting... > Sep 27, 2019 5:13:15 PM 0 0 B 0 B > 0 B > Sep 27, 2019 5:13:15 PM Balancing took 1.726 seconds > Done! > Expected: The exit message should come after logging all the balancer movement > statistics data. > Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved > Sep 27, 2019 5:13:15 PM 0 0 B 0 B > 0 B > The cluster is balanced. Exiting... > Sep 27, 2019 5:13:15 PM Balancing took 1.726 seconds > Done!
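The ordering fix suggested in the comment above (emit each iteration's statistics line first, and print the exit message inside the loop based on the return status) can be sketched as follows; the `Result` type and `run()` method are simplified hypothetical stand-ins, not the actual Balancer code.

```java
// Hedged sketch of the suggested fix: the per-iteration statistics line is
// produced inside the loop BEFORE any exit message, so "The cluster is
// balanced. Exiting..." always follows the stats it refers to.
// Result and run() are hypothetical simplifications of the Balancer loop.
import java.util.ArrayList;
import java.util.List;

public class BalancerOrderSketch {
  enum Status { BALANCED, IN_PROGRESS }

  static final class Result {
    final Status status; final long bytesMoved, bytesLeft, bytesBeingMoved;
    Result(Status s, long moved, long left, long moving) {
      status = s; bytesMoved = moved; bytesLeft = left; bytesBeingMoved = moving;
    }
  }

  /** Returns the output lines in the corrected order. */
  static List<String> run(List<Result> iterations) {
    List<String> out = new ArrayList<>();
    out.add("Time Stamp  Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved");
    int i = 0;
    for (Result r : iterations) {
      // Statistics line first, inside the loop.
      out.add(String.format("%d  %d B  %d B  %d B",
          i++, r.bytesMoved, r.bytesLeft, r.bytesBeingMoved));
      if (r.status == Status.BALANCED) {
        // Exit message emitted based on the iteration's return status.
        out.add("The cluster is balanced. Exiting...");
        break;
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Result> iters = List.of(new Result(Status.BALANCED, 0, 0, 0));
    run(iters).forEach(System.out::println);
  }
}
```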
[jira] [Commented] (HDFS-14915) Move Superuser Check Before Taking Lock For Encryption API
[ https://issues.apache.org/jira/browse/HDFS-14915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956931#comment-16956931 ] Vinayakumar B commented on HDFS-14915: -- +1 > Move Superuser Check Before Taking Lock For Encryption API > -- > > Key: HDFS-14915 > URL: https://issues.apache.org/jira/browse/HDFS-14915 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14915-01.patch, HDFS-14915-02.patch
[jira] [Commented] (HDFS-14384) When lastLocatedBlock token expire, it will take 1~3s second to refetch it.
[ https://issues.apache.org/jira/browse/HDFS-14384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16950724#comment-16950724 ] Vinayakumar B commented on HDFS-14384: -- Thanks [~surendrasingh] for the finding and the patch. Changes look good to me. +1. Before committing, please wait for [~daryn] to have a look at the latest patch. > When lastLocatedBlock token expire, it will take 1~3s second to refetch it. > --- > > Key: HDFS-14384 > URL: https://issues.apache.org/jira/browse/HDFS-14384 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.7.2 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Major > Attachments: HDFS-14384.001.patch, HDFS-14384.002.patch > > > Scenario: > 1. Write a file with one block which is in progress. > 2. Open the input stream and close the output stream. > 3. Wait for block token expiration and read the data. > 4. The last block read takes 1~3 sec.
[jira] [Commented] (HDFS-14900) Fix requirements of BUILDING.txt for libhdfspp which depends on libprotobuf
[ https://issues.apache.org/jira/browse/HDFS-14900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948471#comment-16948471 ] Vinayakumar B commented on HDFS-14900: -- Let's not make it skip the libhdfspp build silently when "-Pnative" is specifically given; that would be a behavior change in the build. Adding back the protobuf library to the build environment does solve the issue. Please keep only the build environment change. > Fix requirements of BUILDING.txt for libhdfspp which depends on libprotobuf > --- > > Key: HDFS-14900 > URL: https://issues.apache.org/jira/browse/HDFS-14900 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > Attachments: HDFS-14900.001.patch > > > HADOOP-16558 removed protocol buffers from build requirements but libhdfspp > requires libprotobuf and libprotoc. {{-Pnative}} build fails if protocol > buffers is not installed.
[jira] [Commented] (HDFS-14655) [SBN Read] Namenode crashes if one of The JN is down
[ https://issues.apache.org/jira/browse/HDFS-14655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937134#comment-16937134 ] Vinayakumar B commented on HDFS-14655: -- +1 > [SBN Read] Namenode crashes if one of The JN is down > > > Key: HDFS-14655 > URL: https://issues.apache.org/jira/browse/HDFS-14655 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: Harshakiran Reddy >Assignee: Ayush Saxena >Priority: Critical > Attachments: HDFS-14655-01.patch, HDFS-14655-02.patch, > HDFS-14655-03.patch, HDFS-14655-04.patch, HDFS-14655-05.patch, > HDFS-14655-06.patch, HDFS-14655-07.patch, HDFS-14655-08.patch, > HDFS-14655.poc.patch > > > {noformat} > 2019-07-04 17:35:54,064 | INFO | Logger channel (from parallel executor) to > XXX/XXX | Retrying connect to server: XXX/XXX. Already tried > 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, > sleepTime=1000 MILLISECONDS) | Client.java:975 > 2019-07-04 17:35:54,087 | FATAL | Edit log tailer | Unknown error encountered > while tailing edits. Shutting down standby NN. 
| EditLogTailer.java:474 > java.lang.OutOfMemoryError: unable to create new native thread > at java.lang.Thread.start0(Native Method) > at java.lang.Thread.start(Thread.java:717) > at > java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957) > at > java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1378) > at > com.google.common.util.concurrent.MoreExecutors$ListeningDecorator.execute(MoreExecutors.java:440) > at > com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:56) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.getJournaledEdits(IPCLoggerChannel.java:565) > at > org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.getJournaledEdits(AsyncLoggerSet.java:272) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectRpcInputStreams(QuorumJournalManager.java:533) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectInputStreams(QuorumJournalManager.java:508) > at > org.apache.hadoop.hdfs.server.namenode.JournalSet.selectInputStreams(JournalSet.java:275) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1681) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1714) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:307) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:410) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:360) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709) > at > 
org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:483) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:423) > 2019-07-04 17:35:54,112 | INFO | Edit log tailer | Exiting with status 1: > java.lang.OutOfMemoryError: unable to create new native thread | > ExitUtil.java:210 > {noformat}
[jira] [Commented] (HDFS-14808) EC: Improper size values for corrupt ec block in LOG
[ https://issues.apache.org/jira/browse/HDFS-14808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937129#comment-16937129 ] Vinayakumar B commented on HDFS-14808: -- +1 > EC: Improper size values for corrupt ec block in LOG > - > > Key: HDFS-14808 > URL: https://issues.apache.org/jira/browse/HDFS-14808 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Harshakiran Reddy >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14808-01.patch > > > If the block corruption reason is a size mismatch, the values shown and > compared in the log are ambiguous.
[jira] [Commented] (HDFS-14655) [SBN Read] Namenode crashes if one of The JN is down
[ https://issues.apache.org/jira/browse/HDFS-14655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935114#comment-16935114 ] Vinayakumar B commented on HDFS-14655: -- Thanks [~ayushtkn] for the detailed investigation and patch. Initially I too wasn't able to make the test fail without the src changes; it was always passing. {quote}conf.setInt(CommonConfigurationKeysPublic.IPC_CLIENT_CONNECT_MAX_RETRIES_KEY, 0); {quote} Commenting this line made the test fail without the src changes, and with the src changes it passed as expected. So [~ayushtkn], can you update the test to enable retries only for this test, not for the others? As mentioned, it slows down the tests if enabled for all of them. The configuration name *{{dfs.ha.tail-edits.num-threads}}* does not seem to be correct. The problem area is QJournal, not edit tailing; edit tailing is just the feature which uses these APIs. It could be named something like *{{dfs.qjournal.parallel-read.num-threads}}*?
[jira] [Commented] (HDFS-14807) SetTimes updates all negative values apart from -1
[ https://issues.apache.org/jira/browse/HDFS-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921184#comment-16921184 ] Vinayakumar B commented on HDFS-14807: -- +1 > SetTimes updates all negative values apart from -1 > -- > > Key: HDFS-14807 > URL: https://issues.apache.org/jira/browse/HDFS-14807 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Harshakiran Reddy >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14807-01.patch, HDFS-14807-02.patch > > > The setTimes API updates negative times for all negative values apart from -1.
[jira] [Commented] (HDFS-12212) Options.Rename.To_TRASH is considered even when Options.Rename.NONE is specified
[ https://issues.apache.org/jira/browse/HDFS-12212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921176#comment-16921176 ] Vinayakumar B commented on HDFS-12212: -- Thanks [~ayushtkn] for the review and commit, and [~hanishakoneru] and [~jojochuang] for the reviews. > Options.Rename.To_TRASH is considered even when Options.Rename.NONE is > specified > > > Key: HDFS-12212 > URL: https://issues.apache.org/jira/browse/HDFS-12212 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.9.0, 2.7.4, 3.0.0-alpha1, 2.8.2 >Reporter: Vinayakumar B >Assignee: Vinayakumar B >Priority: Major > Fix For: 3.3.0, 3.2.1, 3.1.4 > > Attachments: HDFS-12212-01.patch > > > HDFS-8312 introduced {{Options.Rename.TO_TRASH}} to differentiate the > movement to trash from other renames for permission checks. > When Options.Rename.NONE is passed, TO_TRASH is also considered for the rename > and wrong permissions are checked.
[jira] [Commented] (HDFS-14581) Appending to EC files crashes NameNode
[ https://issues.apache.org/jira/browse/HDFS-14581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868262#comment-16868262 ] Vinayakumar B commented on HDFS-14581: -- Fix looks good to me. Once the following message change is done, it will be good to go. {quote} Regarding the exception message, may be an added line of suggestion to use the flag as of now can be done. {quote} > Appending to EC files crashes NameNode > -- > > Key: HDFS-14581 > URL: https://issues.apache.org/jira/browse/HDFS-14581 > Project: Hadoop HDFS > Issue Type: Bug > Components: erasure-coding >Affects Versions: 3.3.0 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Critical > Attachments: HDFS-14581.001.patch > > > *org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.prepareFileForAppend(..)@189* > {noformat} > file.recordModification(iip.getLatestSnapshotId()); > file.toUnderConstruction(leaseHolder, clientMachine); > fsn.getLeaseManager().addLease( > file.getFileUnderConstructionFeature().getClientName(), file.getId()); > LocatedBlock ret = null; > if (!newBlock) { > if (file.isStriped()) { > throw new UnsupportedOperationException( > "Append on EC file without new block is not supported."); > }{noformat} > In this code, the "UnsupportedOperationException" is thrown after the file is > marked underConstruction. In this case the file is opened without any "Open" > editlogs; after some time the lease manager closes this file and adds a close edit > log. > When the SBN tails this edit log, it will fail with this exception. 
> {noformat} > 2019-06-13 19:17:51,513 ERROR > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception > on operation CloseOp [length=0, inodeId=0, path=/ECtest/, > replication=1, mtime=1560261947480, atime=1560258249117, blockSize=134217728, > blocks=[blk_-9223372036854775792_1005], permissions=root:hadoop:rw-r--r--, > aclEntries=null, clientName=, clientMachine=, overwrite=false, > storagePolicyId=0, erasureCodingPolicyId=0, opCode=OP_CLOSE, txid=1363] > java.io.IOException: File is not under construction: > /ECtest/container-executor > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:504) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:286) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:181) > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:924) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:329) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:410) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427) > at > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:485) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:423) > {noformat}
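The HDFS-14581 bug above follows a general pattern: a precondition (striped file plus append-without-new-block) is checked only after the file has already been marked under construction and leased, so the rejected operation leaves mutated state behind for the lease manager to "close" later. A minimal sketch of the validate-before-mutate ordering, using hypothetical names rather than the real INodeFile/lease APIs:

```java
// Hedged sketch: validate all preconditions BEFORE mutating state, so a
// rejected append cannot leave the file marked under-construction (and thus
// cannot produce a bogus CloseOp for the SBN to choke on later).
// FileState and prepareAppend() are hypothetical stand-ins.
public class ValidateFirstSketch {
  static final class FileState {
    boolean striped;
    boolean underConstruction;
  }

  /** Corrected ordering: reject unsupported cases up front. */
  static void prepareAppend(FileState f, boolean newBlock) {
    if (f.striped && !newBlock) {
      // Thrown before any mutation: no stray lease, no inconsistent state.
      throw new UnsupportedOperationException(
          "Append on EC file without new block is not supported.");
    }
    f.underConstruction = true; // mutate only after validation passes
  }

  public static void main(String[] args) {
    FileState ec = new FileState();
    ec.striped = true;
    try {
      prepareAppend(ec, false);
    } catch (UnsupportedOperationException expected) {
      // The file's state is untouched because validation ran first.
      System.out.println("rejected, underConstruction=" + ec.underConstruction);
    }
  }
}
```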
[jira] [Commented] (HDFS-14440) RBF: Optimize the file write process in case of multiple destinations.
[ https://issues.apache.org/jira/browse/HDFS-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845563#comment-16845563 ] Vinayakumar B commented on HDFS-14440: -- {quote}My only concern was the replacement of invokeSequential with invokeConcurrent. The functionality will be the same but the now we have a trade-off between latency and number of rpc calls. {quote} Yes, it's a trade-off. But we should lean more towards the overall latency of the client's operation than the number of RPC calls from the routers (which occur in parallel and result in no/negligible overhead). As explained by [~ayushtkn] earlier, in the case where the file does not already exist, the total number of RPCs remains the same, but latency improves a lot. This is usually the most frequent case. In the case where the file already exists, the total number of RPCs will be the same as the number of remote locations. Since these are executed in parallel, the overall latency overhead will be negligible. This case is less frequent compared to the former. I guess this should be a fair enough trade-off for the client's RPC performance improvement. > RBF: Optimize the file write process in case of multiple destinations. > -- > > Key: HDFS-14440 > URL: https://issues.apache.org/jira/browse/HDFS-14440 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14440-HDFS-13891-01.patch, > HDFS-14440-HDFS-13891-02.patch, HDFS-14440-HDFS-13891-03.patch, > HDFS-14440-HDFS-13891-04.patch, HDFS-14440-HDFS-13891-05.patch, > HDFS-14440-HDFS-13891-06.patch > > > In case of multiple destinations, We need to check if the file already exists > in one of the subclusters for which we use the existing getBlockLocation() > API which is by default a sequential Call, > In an ideal scenario where the file needs to be created each subcluster shall > be checked sequentially, this can be done concurrently to save time. 
> In another case where the file is found and if the last block is null, we > need to do getFileInfo to all the locations to get the location where the > file exists. This also can be prevented by use of ConcurrentCall since we > shall be having the remoteLocation to where the getBlockLocation returned a > non null entry. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
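The trade-off discussed in this thread (one sequential probe per subcluster versus a parallel fan-out whose results are still scanned in the original location order) can be sketched roughly as follows. This is an illustrative model only: `SubclusterCheck` and the two `find` methods are hypothetical stand-ins for the router's invokeSequential/invokeConcurrent calls, not the actual RBF API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConcurrentCheckDemo {
    // Hypothetical stand-in for a per-subcluster RPC (e.g. a getFileInfo probe).
    interface SubclusterCheck {
        boolean exists(String location) throws Exception;
    }

    // Sequential strategy: probe one subcluster at a time, stop at the first
    // hit. Total latency is the sum of the probes performed.
    static String findSequential(List<String> locations, SubclusterCheck check)
            throws Exception {
        for (String loc : locations) {
            if (check.exists(loc)) {
                return loc;
            }
        }
        return null;
    }

    // Concurrent strategy: fan out all probes at once, then scan the results
    // in the ORIGINAL location order, so order-sensitive policies (HASH,
    // HASH_ALL) still resolve to the same subcluster as the sequential path.
    static String findConcurrent(List<String> locations, SubclusterCheck check)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(locations.size());
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (String loc : locations) {
                results.add(pool.submit(() -> check.exists(loc)));
            }
            for (int i = 0; i < locations.size(); i++) {
                if (results.get(i).get()) {
                    return locations.get(i);
                }
            }
            return null;
        } finally {
            pool.shutdown();
        }
    }
}
```

Both strategies return the same location; the concurrent one trades extra (parallel, cheap) probes for latency bounded by the slowest single probe rather than the sum.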
[jira] [Commented] (HDFS-14440) RBF: Optimize the file write process in case of multiple destinations.
[ https://issues.apache.org/jira/browse/HDFS-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844738#comment-16844738 ] Vinayakumar B commented on HDFS-14440: -- {quote}I'm still wrapping my head around using invokeConcurrent() and invokeSequential()... What about using sequential for HASH and HASH_ALL and concurrent for the others? {quote} I can see this would not be a problem at all in the latest patch. Though {{invokeConcurrent()}} is used, the result check on the returned map happens according to the original list of remote locations, which preserves the order. Also, {{getFileInfo()}} is much lighter compared to {{getBlockLocations()}}. So I would say the changes are direct and straightforward, instead of the earlier complex check with block locations. +1 > RBF: Optimize the file write process in case of multiple destinations. > -- > > Key: HDFS-14440 > URL: https://issues.apache.org/jira/browse/HDFS-14440 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14440-HDFS-13891-01.patch, > HDFS-14440-HDFS-13891-02.patch, HDFS-14440-HDFS-13891-03.patch, > HDFS-14440-HDFS-13891-04.patch, HDFS-14440-HDFS-13891-05.patch, > HDFS-14440-HDFS-13891-06.patch > > > In case of multiple destinations, We need to check if the file already exists > in one of the subclusters for which we use the existing getBlockLocation() > API which is by default a sequential Call, > In an ideal scenario where the file needs to be created each subcluster shall > be checked sequentially, this can be done concurrently to save time. > In another case where the file is found and if the last block is null, we > need to do getFileInfo to all the locations to get the location where the > file exists. This also can be prevented by use of ConcurrentCall since we > shall be having the remoteLocation to where the getBlockLocation returned a > non null entry. 
> -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-7663) Erasure Coding: Append on striped file
[ https://issues.apache.org/jira/browse/HDFS-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-7663: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.3.0 Target Version/s: (was: ) Status: Resolved (was: Patch Available) Committed to trunk. Thanks all. > Erasure Coding: Append on striped file > -- > > Key: HDFS-7663 > URL: https://issues.apache.org/jira/browse/HDFS-7663 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.0.0-alpha1 >Reporter: Jing Zhao >Assignee: Ayush Saxena >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-7663-02.patch, HDFS-7663-03.patch, > HDFS-7663-04.patch, HDFS-7663-05.patch, HDFS-7663-06.patch, HDFS-7663.00.txt, > HDFS-7663.01.patch > > > Append should be easy if we have variable length block support from > HDFS-3689, i.e., the new data will be appended to a new block. We need to > revisit whether and how to support appending data to the original last block. > 1. Append to a closed striped file, with NEW_BLOCK flag enabled (this) > 2. Append to a under-construction striped file, with NEW_BLOCK flag enabled > (HDFS-9173) > 3. Append to a striped file, by appending to last block group (follow-on) > This jira attempts to implement the #1, and also track #2, #3. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-7663) Erasure Coding: Append on striped file
[ https://issues.apache.org/jira/browse/HDFS-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784458#comment-16784458 ] Vinayakumar B commented on HDFS-7663: - +1 on latest patch. > Erasure Coding: Append on striped file > -- > > Key: HDFS-7663 > URL: https://issues.apache.org/jira/browse/HDFS-7663 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.0.0-alpha1 >Reporter: Jing Zhao >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-7663-02.patch, HDFS-7663-03.patch, > HDFS-7663-04.patch, HDFS-7663-05.patch, HDFS-7663-06.patch, HDFS-7663.00.txt, > HDFS-7663.01.patch > > > Append should be easy if we have variable length block support from > HDFS-3689, i.e., the new data will be appended to a new block. We need to > revisit whether and how to support appending data to the original last block. > 1. Append to a closed striped file, with NEW_BLOCK flag enabled (this) > 2. Append to a under-construction striped file, with NEW_BLOCK flag enabled > (HDFS-9173) > 3. Append to a striped file, by appending to last block group (follow-on) > This jira attempts to implement the #1, and also track #2, #3. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-7663) Erasure Coding: Append on striped file
[ https://issues.apache.org/jira/browse/HDFS-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779586#comment-16779586 ] Vinayakumar B commented on HDFS-7663: - Thanks [~ayushtkn] for the update on the patch. One more minor change. {code:java} LocatedBlock ret = null; if (!newBlock) { + if (file.isStriped()) { +throw new UnsupportedActionException( +"Append on EC file without new block is not supported."); + } {code} Instead of {{UnsupportedActionException}}, throw {{UnsupportedOperationException}} as before. +1, once this change is done. > Erasure Coding: Append on striped file > -- > > Key: HDFS-7663 > URL: https://issues.apache.org/jira/browse/HDFS-7663 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.0.0-alpha1 >Reporter: Jing Zhao >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-7663-02.patch, HDFS-7663-03.patch, > HDFS-7663-04.patch, HDFS-7663-05.patch, HDFS-7663.00.txt, HDFS-7663.01.patch > > > Append should be easy if we have variable length block support from > HDFS-3689, i.e., the new data will be appended to a new block. We need to > revisit whether and how to support appending data to the original last block. > 1. Append to a closed striped file, with NEW_BLOCK flag enabled (this) > 2. Append to a under-construction striped file, with NEW_BLOCK flag enabled > (HDFS-9173) > 3. Append to a striped file, by appending to last block group (follow-on) > This jira attempts to implement the #1, and also track #2, #3. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-7133) Support clearing namespace quota on "/"
[ https://issues.apache.org/jira/browse/HDFS-7133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-7133: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.3.0 Release Note: Namespace Quota on root can be cleared now. Status: Resolved (was: Patch Available) Committed to Trunk. Thanks [~ayushtkn] for the patch. Thanks [~rguo] for reporting. > Support clearing namespace quota on "/" > --- > > Key: HDFS-7133 > URL: https://issues.apache.org/jira/browse/HDFS-7133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Guo Ruijing >Assignee: Ayush Saxena >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-7133-01.patch > > > existing implementation: > 1. support set namespace quota on "/" > 2. doesn't support clear namespace quota on "/" due to HDFS-1258 > expected implementation: > support clearing namespace quota on "/" -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-7133) Support clearing namespace quota on "/"
[ https://issues.apache.org/jira/browse/HDFS-7133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777216#comment-16777216 ] Vinayakumar B commented on HDFS-7133: - Thanks for the confirmation. +1 > Support clearing namespace quota on "/" > --- > > Key: HDFS-7133 > URL: https://issues.apache.org/jira/browse/HDFS-7133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Guo Ruijing >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-7133-01.patch > > > existing implementation: > 1. support set namespace quota on "/" > 2. doesn't support clear namespace quota on "/" due to HDFS-1258 > expected implementation: > support clearing namespace quota on "/" -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-7663) Erasure Coding: Append on striped file
[ https://issues.apache.org/jira/browse/HDFS-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776655#comment-16776655 ] Vinayakumar B commented on HDFS-7663: - Thanks [~ayushtkn] for taking this up. Following are the comments on the patch. 1. {code} protected final AtomicReference<CachingStrategy> cachingStrategy; - private FileEncryptionInfo fileEncryptionInfo; + protected FileEncryptionInfo fileEncryptionInfo; private int writePacketSize; {code} {code} - private DFSOutputStream(DFSClient dfsClient, String src, + protected DFSOutputStream(DFSClient dfsClient, String src, EnumSet<CreateFlag> flag, {code} These changes are unnecessary. 2. {code} + /** Construct a new output stream for appending to a file. */ + DFSStripedOutputStream(DFSClient dfsClient, String src, + EnumSet<CreateFlag> flags, Progressable progress, LocatedBlock lastBlock, + HdfsFileStatus stat, DataChecksum checksum, String[] favoredNodes) + throws IOException { +super(dfsClient, src, stat, flags, progress, checksum, favoredNodes, false); +if (LOG.isDebugEnabled()) { + LOG.debug("Creating DFSStripedOutputStream for " + src); +} . . . {code} You can re-use the existing constructor itself. Just add the remaining statements. No need to validate the NEW_BLOCK flag here. It will already be validated during the namenode call itself. 3. {code} // not support appending file with striped blocks - if (file.isStriped()) { + if (file.isStriped() && file.isUnderConstruction()) { throw new UnsupportedOperationException( "Cannot append to files with striped block " + path); } {code} Remove the entire check itself. For under-construction files, AlreadyBeingCreatedException will be thrown later. 4. Also, add validation of the NEW_BLOCK flag on the namenode side, until append to existing blocks is supported. 
> Erasure Coding: Append on striped file > -- > > Key: HDFS-7663 > URL: https://issues.apache.org/jira/browse/HDFS-7663 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.0.0-alpha1 >Reporter: Jing Zhao >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-7663-02.patch, HDFS-7663-03.patch, > HDFS-7663-04.patch, HDFS-7663.00.txt, HDFS-7663.01.patch > > > Append should be easy if we have variable length block support from > HDFS-3689, i.e., the new data will be appended to a new block. We need to > revisit whether and how to support appending data to the original last block. > 1. Append to a closed striped file, with NEW_BLOCK flag enabled (this) > 2. Append to a under-construction striped file, with NEW_BLOCK flag enabled > (HDFS-9173) > 3. Append to a striped file, by appending to last block group (follow-on) > This jira attempts to implement the #1, and also track #2, #3. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-7133) Support clearing namespace quota on "/"
[ https://issues.apache.org/jira/browse/HDFS-7133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776657#comment-16776657 ] Vinayakumar B commented on HDFS-7133: - Changes look good. Please confirm whether the issue mentioned in HDFS-1258 is not a problem after the fix. And please validate the test failures. +1, once the above things are confirmed. > Support clearing namespace quota on "/" > --- > > Key: HDFS-7133 > URL: https://issues.apache.org/jira/browse/HDFS-7133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Guo Ruijing >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-7133-01.patch > > > existing implementation: > 1. support set namespace quota on "/" > 2. doesn't support clear namespace quota on "/" due to HDFS-1258 > expected implementation: > support clearing namespace quota on "/" -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13209) DistributedFileSystem.create should allow an option to provide StoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-13209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16768263#comment-16768263 ] Vinayakumar B commented on HDFS-13209: -- Checkstyle can be ignored for now. Test failures seem unrelated. +1 > DistributedFileSystem.create should allow an option to provide StoragePolicy > > > Key: HDFS-13209 > URL: https://issues.apache.org/jira/browse/HDFS-13209 > Project: Hadoop HDFS > Issue Type: New Feature > Components: hdfs >Affects Versions: 3.0.0 >Reporter: Jean-Marc Spaggiari >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-13209-01.patch, HDFS-13209-02.patch, > HDFS-13209-03.patch, HDFS-13209-04.patch, HDFS-13209-05.patch, > HDFS-13209-06.patch > > > DistributedFileSystem.create allows to get a FSDataOutputStream. The stored > file and related blocks will use the directory-based StoragePolicy. > > However, sometimes we might need to keep all files in the same directory > (consistency constraint) but might want some of them on SSD (small, in my > case) until they are processed and merged/removed. Then they will go on the > default policy. > > When creating a file, it will be useful to have an option to specify a > different StoragePolicy... -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14266) EC : Fsck -blockId shows null for EC Blocks if One Block Is Not Available.
[ https://issues.apache.org/jira/browse/HDFS-14266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-14266: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.1.3 3.2.1 3.3.0 Status: Resolved (was: Patch Available) Committed to trunk, branch-3.2 and branch-3.1 Thanks [~ayushtkn] for the fix. Thanks [~Harsha1206] for reporting. > EC : Fsck -blockId shows null for EC Blocks if One Block Is Not Available. > -- > > Key: HDFS-14266 > URL: https://issues.apache.org/jira/browse/HDFS-14266 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.1.1 >Reporter: Harshakiran Reddy >Assignee: Ayush Saxena >Priority: Major > Labels: EC > Fix For: 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-14266-01.patch, HDFS-14266-02.patch > > > If one block gets removed from the block group, then the datanode information > for the block group shows null. > > {noformat} > Block Id: blk_-9223372036854775792 > Block belongs to: /ec/file1 > No. of Expected Replica: 2 > No. of live Replica: 2 > No. of excess Replica: 0 > No. of stale Replica: 0 > No. of decommissioned Replica: 0 > No. of decommissioning Replica: 0 > No. of corrupted Replica: 0 > null > Fsck on blockId 'blk_-9223372036854775792 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14266) EC : Fsck -blockId shows null for EC Blocks if One Block Is Not Available.
[ https://issues.apache.org/jira/browse/HDFS-14266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-14266: - Summary: EC : Fsck -blockId shows null for EC Blocks if One Block Is Not Available. (was: EC : Unable To Get Datanode Info for EC Blocks if One Block Is Not Available.) > EC : Fsck -blockId shows null for EC Blocks if One Block Is Not Available. > -- > > Key: HDFS-14266 > URL: https://issues.apache.org/jira/browse/HDFS-14266 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.1.1 >Reporter: Harshakiran Reddy >Assignee: Ayush Saxena >Priority: Major > Labels: EC > Attachments: HDFS-14266-01.patch, HDFS-14266-02.patch > > > If one block gets removed from the block group, then the datanode information > for the block group shows null. > > {noformat} > Block Id: blk_-9223372036854775792 > Block belongs to: /ec/file1 > No. of Expected Replica: 2 > No. of live Replica: 2 > No. of excess Replica: 0 > No. of stale Replica: 0 > No. of decommissioned Replica: 0 > No. of decommissioning Replica: 0 > No. of corrupted Replica: 0 > null > Fsck on blockId 'blk_-9223372036854775792 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13209) DistributedFileSystem.create should allow an option to provide StoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-13209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766067#comment-16766067 ] Vinayakumar B commented on HDFS-13209: -- Overall change looks fine. Instead of exposing one more public API in {{DistributedFileSystem}}, follow the builder pattern used for ecPolicy. Add a setter method for storagePolicy in {{HdfsDataOutputStreamBuilder}} and use the same during create. > DistributedFileSystem.create should allow an option to provide StoragePolicy > > > Key: HDFS-13209 > URL: https://issues.apache.org/jira/browse/HDFS-13209 > Project: Hadoop HDFS > Issue Type: New Feature > Components: hdfs >Affects Versions: 3.0.0 >Reporter: Jean-Marc Spaggiari >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-13209-01.patch, HDFS-13209-02.patch, > HDFS-13209-03.patch, HDFS-13209-04.patch > > > DistributedFileSystem.create allows to get a FSDataOutputStream. The stored > file and related blocks will use the directory-based StoragePolicy. > > However, sometimes we might need to keep all files in the same directory > (consistency constraint) but might want some of them on SSD (small, in my > case) until they are processed and merged/removed. Then they will go on the > default policy. > > When creating a file, it will be useful to have an option to specify a > different StoragePolicy... -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
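The builder pattern suggested in this review (mirroring how the EC policy is set on the create builder) can be sketched with a minimal fluent builder. The class and method names below are illustrative stand-ins, not the actual HDFS API, and the build() step just reports the accumulated options instead of issuing a create RPC.

```java
public class CreateBuilderDemo {
    // Minimal fluent builder: options accumulate on the builder and are
    // applied once at build(), instead of adding one public create()
    // overload per new option.
    static class FileCreateBuilder {
        private final String path;
        private String storagePolicy;  // hypothetical option, set like ecPolicyName()

        FileCreateBuilder(String path) {
            this.path = path;
        }

        FileCreateBuilder storagePolicyName(String policy) {
            this.storagePolicy = policy;
            return this;  // return self for chaining
        }

        String build() {
            // A real builder would issue the create call here; we just
            // render the accumulated state for illustration.
            return path + " [policy="
                + (storagePolicy == null ? "default" : storagePolicy) + "]";
        }
    }
}
```

The design point is that a caller who does not care about storage policy is unaffected, while one who does simply chains one more setter before build().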
[jira] [Commented] (HDFS-14255) Tail Follow Interval Should Allow To Specify The Sleep Interval To Save Unnecessary RPC's
[ https://issues.apache.org/jira/browse/HDFS-14255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765755#comment-16765755 ] Vinayakumar B commented on HDFS-14255: -- +1, committing later today. > Tail Follow Interval Should Allow To Specify The Sleep Interval To Save > Unnecessary RPC's > -- > > Key: HDFS-14255 > URL: https://issues.apache.org/jira/browse/HDFS-14255 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Harshakiran Reddy >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14255-01.patch, HDFS-14255-02.patch > > > As of now, tail -f follows every 5 seconds. We should allow a parameter to > specify this sleep interval. Linux has this configurable in the form of the -s > parameter. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14266) EC : Unable To Get Datanode Info for EC Blocks if One Block Is Not Available.
[ https://issues.apache.org/jira/browse/HDFS-14266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765751#comment-16765751 ] Vinayakumar B commented on HDFS-14266: -- +1. Committing later today. > EC : Unable To Get Datanode Info for EC Blocks if One Block Is Not Available. > - > > Key: HDFS-14266 > URL: https://issues.apache.org/jira/browse/HDFS-14266 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.1.1 >Reporter: Harshakiran Reddy >Assignee: Ayush Saxena >Priority: Major > Labels: EC > Attachments: HDFS-14266-01.patch, HDFS-14266-02.patch > > > If one block gets removed from the block group, then the datanode information > for the block group shows null. > > {noformat} > Block Id: blk_-9223372036854775792 > Block belongs to: /ec/file1 > No. of Expected Replica: 2 > No. of live Replica: 2 > No. of excess Replica: 0 > No. of stale Replica: 0 > No. of decommissioned Replica: 0 > No. of decommissioning Replica: 0 > No. of corrupted Replica: 0 > null > Fsck on blockId 'blk_-9223372036854775792 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14139) FsShell ls and stat command return different Modification Time on display.
[ https://issues.apache.org/jira/browse/HDFS-14139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16762462#comment-16762462 ] Vinayakumar B commented on HDFS-14139: -- It's not a good idea to change the output of a shell command. It might break a lot of scripts created based on shell command output. Unfortunately, we don't know the reason behind the different timezones. But it's clear in the 'stat' usage that it shows UTC date/time. > FsShell ls and stat command return different Modification Time on display. > -- > > Key: HDFS-14139 > URL: https://issues.apache.org/jira/browse/HDFS-14139 > Project: Hadoop HDFS > Issue Type: Improvement > Components: fs, shell >Reporter: Fred Peng >Assignee: Ayush Saxena >Priority: Major > Labels: easyfix > Attachments: HDFS-14139-01.patch, HDFS-14139-02.patch > > > When we run "hdfs dfs -ls" or "hdfs dfs -stat" on the same file/directory, > the times in the results are different. > Like this: > >> $ ./hdfs dfs -stat /user/xxx/collie-pt-canary > >> 2018-12-10 10:04:57 > >> ./hdfs dfs -ls /user/xxx/collie-pt-canary > >> -rw-r--r-- 3 xxx supergroup 0 2018-12-10 18:04 > Strangely, we found the time is different (8 hours). The stat command uses UTC > timezone, but the ls command uses the system local timezone. > Why does the stat command use UTC timezone, but ls not? > {code:java} > // in Stat.java > timeFmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"); > timeFmt.setTimeZone(TimeZone.getTimeZone("UTC"));{code} > By the way, in Unix/Linux the ls and stat return the same time on display. > Should we unify the timezone? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14139) FsShell ls and stat command return different Modification Time on display.
[ https://issues.apache.org/jira/browse/HDFS-14139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16762462#comment-16762462 ] Vinayakumar B edited comment on HDFS-14139 at 2/7/19 8:06 AM: -- It's not a good idea to change the output of a shell command. It might break a lot of scripts created based on shell command output. Unfortunately, we don't know the reason behind the different timezones. But it's clear in the 'stat' usage that it shows UTC date/time. was (Author: vinayrpet): Its not good idea change the output of a shell command. It might break lot of scripts created based on Shell Command output. Unfortunately, we dont know the reason behind the different timezones. But its clear in the 'stat' usage that it shows UTC date/time. > FsShell ls and stat command return different Modification Time on display. > -- > > Key: HDFS-14139 > URL: https://issues.apache.org/jira/browse/HDFS-14139 > Project: Hadoop HDFS > Issue Type: Improvement > Components: fs, shell >Reporter: Fred Peng >Assignee: Ayush Saxena >Priority: Major > Labels: easyfix > Attachments: HDFS-14139-01.patch, HDFS-14139-02.patch > > > When we run "hdfs dfs -ls" or "hdfs dfs -stat" on the same file/directory, > the times in the results are different. > Like this: > >> $ ./hdfs dfs -stat /user/xxx/collie-pt-canary > >> 2018-12-10 10:04:57 > >> ./hdfs dfs -ls /user/xxx/collie-pt-canary > >> -rw-r--r-- 3 xxx supergroup 0 2018-12-10 18:04 > Strangely, we found the time is different (8 hours). The stat command uses UTC > timezone, but the ls command uses the system local timezone. > Why does the stat command use UTC timezone, but ls not? > {code:java} > // in Stat.java > timeFmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"); > timeFmt.setTimeZone(TimeZone.getTimeZone("UTC"));{code} > By the way, in Unix/Linux the ls and stat return the same time on display. > Should we unify the timezone? 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
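The 8-hour discrepancy discussed in this thread comes purely from the formatter's timezone, which a short standalone example can reproduce. This is a hedged sketch: the helper names are illustrative, and it assumes an Asia/Shanghai (UTC+8) zone for the ls side to match the reported output.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class TimeFmtDemo {
    // Format the way Stat does: fixed UTC zone, seconds included.
    static String statStyle(long millis) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(millis));
    }

    // Format the way ls does: the given (normally JVM-default) zone, no seconds.
    static String lsStyle(long millis, String tz) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm");
        fmt.setTimeZone(TimeZone.getTimeZone(tz));
        return fmt.format(new Date(millis));
    }

    public static void main(String[] args) {
        long mtime = 1544436297000L; // epoch millis for 2018-12-10 10:04:57 UTC
        System.out.println("stat: " + statStyle(mtime));
        System.out.println("ls:   " + lsStyle(mtime, "Asia/Shanghai"));
    }
}
```

The same modification time renders as 10:04:57 (stat, UTC) and 18:04 (ls, UTC+8), matching the example in the issue description.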
[jira] [Commented] (HDFS-14193) RBF: Inconsistency with the Default Namespace
[ https://issues.apache.org/jira/browse/HDFS-14193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743990#comment-16743990 ] Vinayakumar B commented on HDFS-14193: -- Committed to HDFS-13891. Thanks [~ayushtkn] and [~elgoiri]. > RBF: Inconsistency with the Default Namespace > - > > Key: HDFS-14193 > URL: https://issues.apache.org/jira/browse/HDFS-14193 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Fix For: HDFS-13891 > > Attachments: HDFS-14193-HDFS-13891-01.patch, > HDFS-14193-HDFS-13891-02.patch > > > In the present scenario, if the default nameservice is not explicitly > mentioned, each router falls back to its local namespace as the default. Thereby each router has a different default namespace, which leads to > inconsistencies in operations and even blocks maintaining a global uniform > state. The output becomes specific to which router is serving the request > and differs between routers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14124) EC : Support EC Commands (set/get/unset EcPolicy) via WebHdfs
[ https://issues.apache.org/jira/browse/HDFS-14124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-14124: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.2.1 3.3.0 Status: Resolved (was: Patch Available) Committed to trunk and branch-3.2. Thanks [~ayushtkn] for the contribution and [~SouryakantaDwivedy] for reporting. Added the missing comma (,) in the JSON response in the documentation. > EC : Support EC Commands (set/get/unset EcPolicy) via WebHdfs > - > > Key: HDFS-14124 > URL: https://issues.apache.org/jira/browse/HDFS-14124 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, httpfs, webhdfs >Reporter: Souryakanta Dwivedy >Assignee: Ayush Saxena >Priority: Major > Fix For: 3.3.0, 3.2.1 > > Attachments: HDFS-14124-01.patch, HDFS-14124-02.patch, > HDFS-14124-03.patch, HDFS-14124-04.patch, HDFS-14124-04.patch > > > EC : Support EC Commands (set/get/unset EcPolicy) via WebHdfs -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14124) EC : Support EC Commands (set/get/unset EcPolicy) via WebHdfs
[ https://issues.apache.org/jira/browse/HDFS-14124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-14124: - Description: EC : Support EC Commands (set/get/unset EcPolicy) via WebHdfs > EC : Support EC Commands (set/get/unset EcPolicy) via WebHdfs > - > > Key: HDFS-14124 > URL: https://issues.apache.org/jira/browse/HDFS-14124 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, httpfs, webhdfs >Reporter: Souryakanta Dwivedy >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14124-01.patch, HDFS-14124-02.patch, > HDFS-14124-03.patch, HDFS-14124-04.patch, HDFS-14124-04.patch > > > EC : Support EC Commands (set/get/unset EcPolicy) via WebHdfs -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14124) EC : Support EC Commands (set/get/unset EcPolicy) via WebHdfs
[ https://issues.apache.org/jira/browse/HDFS-14124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-14124: - Summary: EC : Support EC Commands (set/get/unset EcPolicy) via WebHdfs (was: EC : Support Directory Level EC Command (set/get/unset EC policy) through REST API) > EC : Support EC Commands (set/get/unset EcPolicy) via WebHdfs > - > > Key: HDFS-14124 > URL: https://issues.apache.org/jira/browse/HDFS-14124 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, httpfs, webhdfs >Reporter: Souryakanta Dwivedy >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14124-01.patch, HDFS-14124-02.patch, > HDFS-14124-03.patch, HDFS-14124-04.patch, HDFS-14124-04.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14124) EC : Support Directory Level EC Command (set/get/unset EC policy) through REST API
[ https://issues.apache.org/jira/browse/HDFS-14124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717012#comment-16717012 ] Vinayakumar B commented on HDFS-14124: -- +1, Committing shortly > EC : Support Directory Level EC Command (set/get/unset EC policy) through > REST API > -- > > Key: HDFS-14124 > URL: https://issues.apache.org/jira/browse/HDFS-14124 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, httpfs, webhdfs >Reporter: Souryakanta Dwivedy >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14124-01.patch, HDFS-14124-02.patch, > HDFS-14124-03.patch, HDFS-14124-04.patch, HDFS-14124-04.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14113) EC : Add Configuration to restrict UserDefined Policies
[ https://issues.apache.org/jira/browse/HDFS-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-14113: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.3.0 Status: Resolved (was: Patch Available) Thanks [~ayushtkn] for the contribution. Thanks [~knanasi] for review. Committed to trunk. > EC : Add Configuration to restrict UserDefined Policies > --- > > Key: HDFS-14113 > URL: https://issues.apache.org/jira/browse/HDFS-14113 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14113-01.patch, HDFS-14113-02.patch, > HDFS-14113-03.patch > > > By default addition of erasure coding policies is enabled for users.We need > to add configuration whether to allow addition of new User Defined policies > or not.Which can be configured in for of a Boolean value at the server side. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14113) EC : Add Configuration to restrict UserDefined Policies
[ https://issues.apache.org/jira/browse/HDFS-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16709703#comment-16709703 ] Vinayakumar B commented on HDFS-14113: -- +1 > EC : Add Configuration to restrict UserDefined Policies > --- > > Key: HDFS-14113 > URL: https://issues.apache.org/jira/browse/HDFS-14113 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14113-01.patch, HDFS-14113-02.patch, > HDFS-14113-03.patch > > > By default addition of erasure coding policies is enabled for users.We need > to add configuration whether to allow addition of new User Defined policies > or not.Which can be configured in for of a Boolean value at the server side. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14113) EC : Add Configuration to restrict UserDefined Policies
[ https://issues.apache.org/jira/browse/HDFS-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708483#comment-16708483 ] Vinayakumar B commented on HDFS-14113: -- Thanks [~ayushtkn] for the patch. Please find the comments below. {code} + +userDefinedAllowed = conf.getBoolean( +DFSConfigKeys.DFS_NAMENODE_EC_POLICIES_USSERPOLICIES_ALLOWED_KEY, +DFSConfigKeys. +DFS_NAMENODE_EC_POLICIES_USSERPOLICIES_ALLOWED_KEY_DEFAULT); } {code} Please correct the typo. {code} public synchronized ErasureCodingPolicy addPolicy( ErasureCodingPolicy policy) { +if (!userDefinedAllowed) { + throw new HadoopIllegalArgumentException( + "Addition of user defined erasure coding policy is disabled."); +} + {code} Instead of throwing {{HadoopIllegalArgumentException}}, you can throw {{UnsupportedOperationException}} with a proper message. > EC : Add Configuration to restrict UserDefined Policies > --- > > Key: HDFS-14113 > URL: https://issues.apache.org/jira/browse/HDFS-14113 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14113-01.patch, HDFS-14113-02.patch > > > By default addition of erasure coding policies is enabled for users.We need > to add configuration whether to allow addition of new User Defined policies > or not.Which can be configured in for of a Boolean value at the server side. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
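[Editor's note] For illustration, the guard discussed in the review above — a server-side boolean that rejects user-defined EC policy additions, throwing {{UnsupportedOperationException}} as suggested — could be sketched as below. The class, constructor wiring, and String-based policy type are hypothetical simplifications, not Hadoop's actual ErasureCodingPolicyManager.

```java
// Hypothetical sketch: gate user-defined erasure-coding policy additions
// behind a server-side boolean. In Hadoop the flag would come from
// conf.getBoolean(<config key>, <default>); here it is a constructor argument.
class EcPolicyGateSketch {
    private final boolean userDefinedAllowed;

    EcPolicyGateSketch(boolean userDefinedAllowed) {
        this.userDefinedAllowed = userDefinedAllowed;
    }

    synchronized String addPolicy(String policyName) {
        if (!userDefinedAllowed) {
            // UnsupportedOperationException, per the review suggestion
            throw new UnsupportedOperationException(
                "Addition of user defined erasure coding policy is disabled.");
        }
        return policyName; // the real manager registers and returns the policy
    }
}
```

When the flag is off, callers get a clear failure instead of a silently ignored policy.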
[jira] [Commented] (HDFS-14075) NPE while Edit Logging
[ https://issues.apache.org/jira/browse/HDFS-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704271#comment-16704271 ] Vinayakumar B commented on HDFS-14075: -- +1 > NPE while Edit Logging > -- > > Key: HDFS-14075 > URL: https://issues.apache.org/jira/browse/HDFS-14075 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Critical > Attachments: HDFS-14075-01.patch, HDFS-14075-02.patch, > HDFS-14075-03.patch, HDFS-14075-04.patch, HDFS-14075-04.patch, > HDFS-14075-04.patch, HDFS-14075-05.patch, HDFS-14075-06.patch, > HDFS-14075-07.patch > > > {noformat} > 2018-11-10 18:59:38,427 FATAL > org.apache.hadoop.hdfs.server.namenode.FSEditLog: Exception while edit > logging: null > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.doEditTransaction(FSEditLog.java:481) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync$Edit.logEdit(FSEditLogAsync.java:288) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.run(FSEditLogAsync.java:232) > at java.lang.Thread.run(Thread.java:745) > 2018-11-10 18:59:38,532 INFO org.apache.hadoop.util.ExitUtil: Exiting with > status 1: Exception while edit logging: null > 2018-11-10 18:59:38,552 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > SHUTDOWN_MSG: > {noformat} > Before NPE Received the following Exception > {noformat} > INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 65110, call > Call#23241 Retry#0 > org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol.rollEditLog from > > java.io.IOException: Unable to start log segment 7964819: too few journals > successfully started. 
> at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.startLogSegment(FSEditLog.java:1385) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.startLogSegmentAndWriteHeaderTxn(FSEditLog.java:1395) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.rollEditLog(FSEditLog.java:1319) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.rollEditLog(FSImage.java:1352) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4669) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1293) > at > org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:146) > at > org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService$2.callBlockingMethod(NamenodeProtocolProtos.java:12974) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:878) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:824) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2684) > Caused by: java.io.IOException: starting log segment 7964819 failed for too > many journals > at > org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:412) > at > org.apache.hadoop.hdfs.server.namenode.JournalSet.startLogSegment(JournalSet.java:207) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.startLogSegment(FSEditLog.java:1383) > ... 
15 more > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13816) dfs.getQuotaUsage() throws NPE on non-existent dir instead of FileNotFoundException
[ https://issues.apache.org/jira/browse/HDFS-13816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-13816: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.1.3 3.2.1 3.3.0 3.0.4 Status: Resolved (was: Patch Available) > dfs.getQuotaUsage() throws NPE on non-existent dir instead of > FileNotFoundException > --- > > Key: HDFS-13816 > URL: https://issues.apache.org/jira/browse/HDFS-13816 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Vinayakumar B >Assignee: Vinayakumar B >Priority: Major > Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-13816-01.patch, HDFS-13816-02.patch > > > {{dfs.getQuotaUsage()}} on non-existent path should throw > FileNotFoundException. > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getQuotaUsageInt(FSDirStatAndListingOp.java:573) > at > org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getQuotaUsage(FSDirStatAndListingOp.java:554) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getQuotaUsage(FSNamesystem.java:3221) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getQuotaUsage(NameNodeRpcServer.java:1404) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getQuotaUsage(ClientNamenodeProtocolServerSideTranslatorPB.java:1861) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) 
> at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13816) dfs.getQuotaUsage() throws NPE on non-existent dir instead of FileNotFoundException
[ https://issues.apache.org/jira/browse/HDFS-13816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698890#comment-16698890 ] Vinayakumar B commented on HDFS-13816: -- Test failure is unrelated. Committed. Thanks [~brahmareddy], [~knanasi], [~shashikant] for reviews. > dfs.getQuotaUsage() throws NPE on non-existent dir instead of > FileNotFoundException > --- > > Key: HDFS-13816 > URL: https://issues.apache.org/jira/browse/HDFS-13816 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Vinayakumar B >Assignee: Vinayakumar B >Priority: Major > Attachments: HDFS-13816-01.patch, HDFS-13816-02.patch > > > {{dfs.getQuotaUsage()}} on non-existent path should throw > FileNotFoundException. > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getQuotaUsageInt(FSDirStatAndListingOp.java:573) > at > org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getQuotaUsage(FSDirStatAndListingOp.java:554) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getQuotaUsage(FSNamesystem.java:3221) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getQuotaUsage(NameNodeRpcServer.java:1404) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getQuotaUsage(ClientNamenodeProtocolServerSideTranslatorPB.java:1861) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13911) HDFS - Inconsistency in get and put syntax if filename/dirname contains space
[ https://issues.apache.org/jira/browse/HDFS-13911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698845#comment-16698845 ] Vinayakumar B commented on HDFS-13911: -- +1 > HDFS - Inconsistency in get and put syntax if filename/dirname contains space > - > > Key: HDFS-13911 > URL: https://issues.apache.org/jira/browse/HDFS-13911 > Project: Hadoop HDFS > Issue Type: Bug > Components: fs >Affects Versions: 3.1.1 >Reporter: vivek kumar >Assignee: Ayush Saxena >Priority: Minor > Attachments: HDFS-13911-01.patch, HDFS-13911-02.patch > > > Inconsistency in get and put syntax if file/fdir name contains space. > While copying file/dir from local to HDFS, space needs to be represented with > %20. However, the same representation does not work for copying file to > Local. Expectaion is to have same syntax for both get and put. > test:/ # mkdir /opt/ > test:/ # mkdir /opt/test\ space > test:/ # vi /opt/test\ space/test\ file.txt > test:/ # ll /opt/test\ space/ > total 4 > -rw-r--r-- 1 root root 7 Sep 12 18:37 test file.txt > test:/ # > *test:/ # hadoop fs -put /opt/test\ space/ /tmp/* > *put: unexpected URISyntaxException* > test:/ # > *test:/ # hadoop fs -put /opt/test%20space/ /tmp/* > test:/ # > test:/ # hadoop fs -ls /tmp > drwxr-xr-x - user1 hadoop 0 2018-09-12 18:38 /tmp/test space > test:/ # > *test:/ # hadoop fs -get /tmp/test%20space /srv/* > *get: `/tmp/test%20space': No such file or directory* > test:/ # > *test:/ # hadoop fs -get /tmp/test\ space /srv/* > test:/ # ll /srv/test\ space/ > total 4 > -rw-r--r-- 1 root root 7 Sep 12 18:39 test file.txt -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14075) NPE while Edit Logging
[ https://issues.apache.org/jira/browse/HDFS-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698582#comment-16698582 ] Vinayakumar B commented on HDFS-14075: -- {code:java} - throw new IOException("Unable to start log segment " + - segmentTxId + ": too few journals successfully started.", ex); + final String msg = "Unable to start log segment " + segmentTxId + + ": too few journals successfully started."; + LOG.error(msg, new Exception()); + synchronized (journalSetLock) { +IOUtils.cleanupWithLogger(LOG, journalSet); + } + terminate(1, msg); } {code} Here, original exception details are missed. So instead of {{new Exception()}} can log the same original {{ex}} for trace. bq. I think the question is when it fails to start a new segment in JNs, should the active NN terminates? or should it continues without syncing until some time the JNs are back? I think, current way of handling Journal failures in case of {{required}} journals is to provide "fail fast" mechanism. So in this case, where no operation is possible, its better to terminate instead of waiting to recover an *unknown* failure. So, I feel we stick with "terminate" mechanism. Coming back to patch, Other changes looks good. 
+1 once the above log changes are adopted > NPE while Edit Logging > -- > > Key: HDFS-14075 > URL: https://issues.apache.org/jira/browse/HDFS-14075 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Critical > Attachments: HDFS-14075-01.patch, HDFS-14075-02.patch, > HDFS-14075-03.patch, HDFS-14075-04.patch, HDFS-14075-04.patch, > HDFS-14075-04.patch, HDFS-14075-05.patch, HDFS-14075-06.patch > > > {noformat} > 2018-11-10 18:59:38,427 FATAL > org.apache.hadoop.hdfs.server.namenode.FSEditLog: Exception while edit > logging: null > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.doEditTransaction(FSEditLog.java:481) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync$Edit.logEdit(FSEditLogAsync.java:288) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.run(FSEditLogAsync.java:232) > at java.lang.Thread.run(Thread.java:745) > 2018-11-10 18:59:38,532 INFO org.apache.hadoop.util.ExitUtil: Exiting with > status 1: Exception while edit logging: null > 2018-11-10 18:59:38,552 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > SHUTDOWN_MSG: > {noformat} > Before NPE Received the following Exception > {noformat} > INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 65110, call > Call#23241 Retry#0 > org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol.rollEditLog from > > java.io.IOException: Unable to start log segment 7964819: too few journals > successfully started. 
> at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.startLogSegment(FSEditLog.java:1385) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.startLogSegmentAndWriteHeaderTxn(FSEditLog.java:1395) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.rollEditLog(FSEditLog.java:1319) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.rollEditLog(FSImage.java:1352) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4669) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1293) > at > org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:146) > at > org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService$2.callBlockingMethod(NamenodeProtocolProtos.java:12974) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:878) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:824) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2684) > Caused by: java.io.IOException: starting log segment 7964819 failed for too > many journals > at > org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:412) > at > org.apache.hadoop.hdfs.server.namenode.JournalSet.startLogSegment(JournalSet.java:207) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.startLogSegment(FSEditLog.java:1383) > ... 15 more > {noformat} -- This message was
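[Editor's note] The review point in the comment above — log the original exception rather than a fresh {{new Exception()}} so the root cause survives in the trace — can be sketched as follows. The logger field and the returned message are stand-ins for Hadoop's LOG.error and terminate(1, msg); this is an illustration, not FSEditLog itself.

```java
// Sketch: when starting an edit-log segment fails, preserve the ORIGINAL
// exception in the error log instead of constructing a new one, then
// terminate. lastLogged stands in for what LOG.error(msg, ex) would record.
class EditLogFailureSketch {
    static String lastLogged;

    static void logError(String msg, Throwable cause) {
        // stand-in for LOG.error(msg, cause): the cause's message is retained
        lastLogged = msg + " cause: " + cause.getMessage();
    }

    static String failStartSegment(long segmentTxId, Exception ex) {
        String msg = "Unable to start log segment " + segmentTxId
            + ": too few journals successfully started.";
        logError(msg, ex); // the original ex, not `new Exception()`
        return msg;        // the real code would call terminate(1, msg) here
    }
}
```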
[jira] [Updated] (HDFS-14056) Fix error messages in HDFS-12716
[ https://issues.apache.org/jira/browse/HDFS-14056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-14056: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.2.1 3.3.0 3.1.2 3.0.4 2.10.0 Status: Resolved (was: Patch Available) Committed. Thanks [~adam.antal] and [~ayushtkn]. > Fix error messages in HDFS-12716 > > > Key: HDFS-14056 > URL: https://issues.apache.org/jira/browse/HDFS-14056 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.10.0, 3.2.0, 3.0.4, 3.1.2 >Reporter: Adam Antal >Assignee: Ayush Saxena >Priority: Minor > Fix For: 2.10.0, 3.0.4, 3.1.2, 3.3.0, 3.2.1 > > Attachments: HDFS-14056-01.patch, HDFS-14056-02.patch > > > There are misleading error messages in the committed HDFS-12716 patch. > As I saw in the code in DataNode.java:startDataNode > {code:java} > throw new DiskErrorException("Invalid value configured for " > + "dfs.datanode.failed.volumes.tolerated - " + volFailuresTolerated > + ". Value configured is either greater than -1 or >= " > + "to the number of configured volumes (" + volsConfigured + ")."); > } > {code} > Here the error message seems a bit misleading. The error comes up when the > given quantity in the configuration set to volsConfigured is set lower than > -1 but in that case the error should say something like "Value configured is > either _less_ than -1 or >= ...". > Also the general error message in DataNode.java > {code:java} > public static final String MAX_VOLUME_FAILURES_TOLERATED_MSG = "should be > greater than -1"; > {code} > May be better changed to "should be greater than _or equal to_ -1" to be > precise, as -1 is a valid choice. > In hdfs-default.xml I couldn't understand the phrase "The range of the value > is -1 now, -1 represents the minimum of volume valids is 1." It might be > better to write something clearer like "The minimum is -1 representing 1 > valid remaining volume". 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
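[Editor's note] The corrected validation described in HDFS-14056 above — the value is invalid when it is *less* than -1 (not "greater than -1") or >= the number of configured volumes, with -1 itself a valid choice — can be sketched like this. It is an illustration of the intended message, not DataNode.startDataNode() itself.

```java
// Sketch of the corrected dfs.datanode.failed.volumes.tolerated check:
// returns the error message for an invalid value, or null when valid.
// -1 is valid (it represents a minimum of one remaining valid volume).
class VolumeTolerationSketch {
    static String validate(int volFailuresTolerated, int volsConfigured) {
        if (volFailuresTolerated < -1 || volFailuresTolerated >= volsConfigured) {
            return "Invalid value configured for "
                + "dfs.datanode.failed.volumes.tolerated - " + volFailuresTolerated
                + ". Value configured is either less than -1 or >= "
                + "to the number of configured volumes (" + volsConfigured + ").";
        }
        return null; // valid configuration
    }
}
```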
[jira] [Updated] (HDFS-13963) NN UI is broken with IE11
[ https://issues.apache.org/jira/browse/HDFS-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-13963: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.2.1 3.3.0 3.1.2 3.0.4 Status: Resolved (was: Patch Available) Committed to all branch-3.* Thanks all. > NN UI is broken with IE11 > - > > Key: HDFS-13963 > URL: https://issues.apache.org/jira/browse/HDFS-13963 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, ui >Affects Versions: 3.1.1 >Reporter: Daisuke Kobayashi >Assignee: Ayush Saxena >Priority: Minor > Labels: newbie > Fix For: 3.0.4, 3.1.2, 3.3.0, 3.2.1 > > Attachments: Document-mode-IE9.png, HDFS-13963-01.patch, > HDFS-13963-02.patch, HDFS-13963-03.patch, Screen Shot 2018-10-05 at > 20.22.20.png, test-with-edge-mode.png > > > Internet Explorer 11 cannot correctly display Namenode Web UI while the NN > itself starts successfully. I have confirmed this over 3.1.1 (latest release) > and 3.3.0-SNAPSHOT (current trunk) that the following message is shown. > {code} > Failed to retrieve data from /jmx?qry=java.lang:type=Memory, cause: > SyntaxError: Invalid character > {code} > Apparently, this is because {{dfshealth.html}} runs as IE9 mode by default. > {code} > > {code} > Once the compatible mode is changed to IE11 through developer tool, it's > rendered correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13963) NN UI is broken with IE11
[ https://issues.apache.org/jira/browse/HDFS-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687678#comment-16687678 ] Vinayakumar B commented on HDFS-13963: -- +1, will commit later today > NN UI is broken with IE11 > - > > Key: HDFS-13963 > URL: https://issues.apache.org/jira/browse/HDFS-13963 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, ui >Affects Versions: 3.1.1 >Reporter: Daisuke Kobayashi >Assignee: Ayush Saxena >Priority: Minor > Labels: newbie > Attachments: Document-mode-IE9.png, HDFS-13963-01.patch, > HDFS-13963-02.patch, HDFS-13963-03.patch, Screen Shot 2018-10-05 at > 20.22.20.png, test-with-edge-mode.png > > > Internet Explorer 11 cannot correctly display Namenode Web UI while the NN > itself starts successfully. I have confirmed this over 3.1.1 (latest release) > and 3.3.0-SNAPSHOT (current trunk) that the following message is shown. > {code} > Failed to retrieve data from /jmx?qry=java.lang:type=Memory, cause: > SyntaxError: Invalid character > {code} > Apparently, this is because {{dfshealth.html}} runs as IE9 mode by default. > {code} > > {code} > Once the compatible mode is changed to IE11 through developer tool, it's > rendered correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13911) HDFS - Inconsistency in get and put syntax if filename/dirname contains space
[ https://issues.apache.org/jira/browse/HDFS-13911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686432#comment-16686432 ] Vinayakumar B commented on HDFS-13911: -- [~ayushtkn], Thanks for the patch. I think the change should fix the issue described above. Please include a test verifying the fix. > HDFS - Inconsistency in get and put syntax if filename/dirname contains space > - > > Key: HDFS-13911 > URL: https://issues.apache.org/jira/browse/HDFS-13911 > Project: Hadoop HDFS > Issue Type: Bug > Components: fs >Affects Versions: 3.1.1 >Reporter: vivek kumar >Assignee: Ayush Saxena >Priority: Minor > Attachments: HDFS-13911-01.patch > > > Inconsistency in get and put syntax if file/fdir name contains space. > While copying file/dir from local to HDFS, space needs to be represented with > %20. However, the same representation does not work for copying file to > Local. Expectaion is to have same syntax for both get and put. > test:/ # mkdir /opt/ > test:/ # mkdir /opt/test\ space > test:/ # vi /opt/test\ space/test\ file.txt > test:/ # ll /opt/test\ space/ > total 4 > -rw-r--r-- 1 root root 7 Sep 12 18:37 test file.txt > test:/ # > *test:/ # hadoop fs -put /opt/test\ space/ /tmp/* > *put: unexpected URISyntaxException* > test:/ # > *test:/ # hadoop fs -put /opt/test%20space/ /tmp/* > test:/ # > test:/ # hadoop fs -ls /tmp > drwxr-xr-x - user1 hadoop 0 2018-09-12 18:38 /tmp/test space > test:/ # > *test:/ # hadoop fs -get /tmp/test%20space /srv/* > *get: `/tmp/test%20space': No such file or directory* > test:/ # > *test:/ # hadoop fs -get /tmp/test\ space /srv/* > test:/ # ll /srv/test\ space/ > total 4 > -rw-r--r-- 1 root root 7 Sep 12 18:39 test file.txt -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13963) NN UI is broken with IE11
[ https://issues.apache.org/jira/browse/HDFS-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686427#comment-16686427 ] Vinayakumar B commented on HDFS-13963: -- [~daisuke.kobayashi] bq. Sorry for late here. I have checked if it works on my Windows env. in VM and found that the page is still broken with IE9 mode. Yes, this test is not the same as verifying with "ie=edge". You have used IE-11 and changed the mode to IE-9 in developer tools. Changing to 'ie=edge' does not change anything (it neither breaks more nor fixes the existing issue) on IE 9, but *it fixes the issue on later versions*. So I feel this change in [^HDFS-13963-02.patch] is not an incompatible change, and we can go ahead and commit it. [~elek], do you agree with this? [~ayushtkn], please make a similar change in the other HTML files of HDFS as well. > NN UI is broken with IE11 > - > > Key: HDFS-13963 > URL: https://issues.apache.org/jira/browse/HDFS-13963 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, ui >Affects Versions: 3.1.1 >Reporter: Daisuke Kobayashi >Assignee: Ayush Saxena >Priority: Minor > Labels: newbie > Attachments: Document-mode-IE9.png, HDFS-13963-01.patch, > HDFS-13963-02.patch, Screen Shot 2018-10-05 at 20.22.20.png, > test-with-edge-mode.png > > > Internet Explorer 11 cannot correctly display Namenode Web UI while the NN > itself starts successfully. I have confirmed this over 3.1.1 (latest release) > and 3.3.0-SNAPSHOT (current trunk) that the following message is shown. > {code} > Failed to retrieve data from /jmx?qry=java.lang:type=Memory, cause: > SyntaxError: Invalid character > {code} > Apparently, this is because {{dfshealth.html}} runs as IE9 mode by default. > {code} > > {code} > Once the compatible mode is changed to IE11 through developer tool, it's > rendered correctly. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14056) Fix error messages in HDFS-12716
[ https://issues.apache.org/jira/browse/HDFS-14056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686410#comment-16686410 ] Vinayakumar B commented on HDFS-14056: -- LGTM +1 > Fix error messages in HDFS-12716 > > > Key: HDFS-14056 > URL: https://issues.apache.org/jira/browse/HDFS-14056 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.10.0, 3.2.0, 3.0.4, 3.1.2 >Reporter: Adam Antal >Assignee: Ayush Saxena >Priority: Minor > Attachments: HDFS-14056-01.patch, HDFS-14056-02.patch > > > There are misleading error messages in the committed HDFS-12716 patch. > As I saw in the code in DataNode.java:startDataNode > {code:java} > throw new DiskErrorException("Invalid value configured for " > + "dfs.datanode.failed.volumes.tolerated - " + volFailuresTolerated > + ". Value configured is either greater than -1 or >= " > + "to the number of configured volumes (" + volsConfigured + ")."); > } > {code} > Here the error message seems a bit misleading. The error comes up when the > given quantity in the configuration set to volsConfigured is set lower than > -1 but in that case the error should say something like "Value configured is > either _less_ than -1 or >= ...". > Also the general error message in DataNode.java > {code:java} > public static final String MAX_VOLUME_FAILURES_TOLERATED_MSG = "should be > greater than -1"; > {code} > May be better changed to "should be greater than _or equal to_ -1" to be > precise, as -1 is a valid choice. > In hdfs-default.xml I couldn't understand the phrase "The range of the value > is -1 now, -1 represents the minimum of volume valids is 1." It might be > better to write something clearer like "The minimum is -1 representing 1 > valid remaining volume". -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13998) ECAdmin NPE with -setPolicy -replicate
[ https://issues.apache.org/jira/browse/HDFS-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683668#comment-16683668 ] Vinayakumar B commented on HDFS-13998: -- [~xiaochen] Isn't HDFS-13732 an incompatible change? It has changed the output of the command line. > ECAdmin NPE with -setPolicy -replicate > -- > > Key: HDFS-13998 > URL: https://issues.apache.org/jira/browse/HDFS-13998 > Project: Hadoop HDFS > Issue Type: Bug > Components: erasure-coding >Affects Versions: 3.2.0, 3.1.2 >Reporter: Xiao Chen >Assignee: Zsolt Venczel >Priority: Major > Attachments: HDFS-13998.01.patch, HDFS-13998.02.patch, > HDFS-13998.03.patch > > > HDFS-13732 tried to improve the output of the console tool. But we missed the > fact that for replication, {{getErasureCodingPolicy}} would return null. > This jira is to fix it in ECAdmin, and add a unit test. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
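The NPE pattern described above can be sketched as a null guard. This is a hypothetical illustration of the fix's shape, not the actual ECAdmin code: with {{-setPolicy -replicate}} the path carries no erasure coding policy, so the lookup yields null and must be handled before the policy name is used.

```java
public class EcPolicyDisplay {

    /** Illustrative: ecPolicyName is null for the replication case. */
    static String describePolicy(String ecPolicyName) {
        if (ecPolicyName == null) {
            // Guard before dereferencing: replication means no EC policy is set.
            return "replicated (no erasure coding policy)";
        }
        return "erasure coded with policy " + ecPolicyName;
    }

    public static void main(String[] args) {
        System.out.println(describePolicy(null));
        System.out.println(describePolicy("RS-6-3-1024k"));
    }
}
```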
[jira] [Commented] (HDFS-13963) NN UI is broken with IE11
[ https://issues.apache.org/jira/browse/HDFS-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662085#comment-16662085 ] Vinayakumar B commented on HDFS-13963: -- {quote}From technical point of view this seems to be an incompatible change (don't know what is the target version) {quote} using {{IE-edge}} is not an incompatible change. It will just keep the same broken code as is in IE-9 :-) > NN UI is broken with IE11 > - > > Key: HDFS-13963 > URL: https://issues.apache.org/jira/browse/HDFS-13963 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, ui >Affects Versions: 3.1.1 >Reporter: Daisuke Kobayashi >Assignee: Ayush Saxena >Priority: Minor > Labels: newbie > Attachments: Document-mode-IE9.png, HDFS-13963-01.patch, > HDFS-13963-02.patch, Screen Shot 2018-10-05 at 20.22.20.png, > test-with-edge-mode.png > > > Internet Explorer 11 cannot correctly display Namenode Web UI while the NN > itself starts successfully. I have confirmed this over 3.1.1 (latest release) > and 3.3.0-SNAPSHOT (current trunk) that the following message is shown. > {code} > Failed to retrieve data from /jmx?qry=java.lang:type=Memory, cause: > SyntaxError: Invalid character > {code} > Apparently, this is because {{dfshealth.html}} runs as IE9 mode by default. > {code} > > {code} > Once the compatible mode is changed to IE11 through developer tool, it's > rendered correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
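For context, the document-mode switch being discussed is the X-UA-Compatible meta tag; the quoted code block from dfshealth.html is elided above. Its general shape is presumably as follows (an assumption reconstructed from the IE9/IE-edge discussion, not a quote of the file):

```html
<!-- "IE=9" pins IE11 to the IE9 engine, which breaks the JMX fetch;
     "IE=edge" asks for the newest engine the browser has. -->
<meta http-equiv="X-UA-Compatible" content="IE=edge">
```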
[jira] [Commented] (HDFS-13983) TestOfflineImageViewer crashes in windows
[ https://issues.apache.org/jira/browse/HDFS-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657098#comment-16657098 ] Vinayakumar B commented on HDFS-13983: -- Updated the patch again. Thanks [~elgoiri] for reviews. > TestOfflineImageViewer crashes in windows > - > > Key: HDFS-13983 > URL: https://issues.apache.org/jira/browse/HDFS-13983 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Vinayakumar B >Assignee: Vinayakumar B >Priority: Major > Attachments: HDFS-13893-with-patch-intellij-idea.JPG, > HDFS-13893-with-patch-mvn.JPG, > HDFS-13893-with-patch-without-sysout-close-intellij-idea.JPG, > HDFS-13893-without-patch-intellij-idea.JPG, HDFS-13893-without-patch-mvn.JPG, > HDFS-13983-01.patch, HDFS-13983-02.patch, HDFS-13983-03.patch > > > TestOfflineImageViewer crashes in windows because, OfflineImageViewer > REVERSEXML tries to delete the outputfile and re-create the same stream which > is already created. > Also there are unclosed RAF for input files which blocks from files being > deleted. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
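The unclosed-RAF failure mode above is Windows-specific: an open file handle prevents File.delete() from succeeding. A minimal sketch of the remedy (illustrative, not the OfflineImageViewer code) is to scope the RandomAccessFile in try-with-resources so the handle is guaranteed closed before deletion:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class RafClose {

    /**
     * Reads from the file, then deletes it. try-with-resources releases the
     * handle before delete() runs, which is required on Windows.
     */
    public static boolean readThenDelete(File f) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            raf.length(); // use the file
        } // handle released here
        return f.delete(); // safe on Windows only after the close above
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("oiv", ".tmp");
        System.out.println(readThenDelete(f)); // prints true
    }
}
```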
[jira] [Updated] (HDFS-13983) TestOfflineImageViewer crashes in windows
[ https://issues.apache.org/jira/browse/HDFS-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-13983: - Attachment: HDFS-13983-03.patch > TestOfflineImageViewer crashes in windows > - > > Key: HDFS-13983 > URL: https://issues.apache.org/jira/browse/HDFS-13983 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Vinayakumar B >Assignee: Vinayakumar B >Priority: Major > Attachments: HDFS-13893-with-patch-intellij-idea.JPG, > HDFS-13893-with-patch-mvn.JPG, > HDFS-13893-with-patch-without-sysout-close-intellij-idea.JPG, > HDFS-13893-without-patch-intellij-idea.JPG, HDFS-13893-without-patch-mvn.JPG, > HDFS-13983-01.patch, HDFS-13983-02.patch, HDFS-13983-03.patch > > > TestOfflineImageViewer crashes in windows because, OfflineImageViewer > REVERSEXML tries to delete the outputfile and re-create the same stream which > is already created. > Also there are unclosed RAF for input files which blocks from files being > deleted. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14003) Fix findbugs warning in trunk
[ https://issues.apache.org/jira/browse/HDFS-14003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655639#comment-16655639 ] Vinayakumar B commented on HDFS-14003: -- +1 > Fix findbugs warning in trunk > - > > Key: HDFS-14003 > URL: https://issues.apache.org/jira/browse/HDFS-14003 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Yiqun Lin >Assignee: Yiqun Lin >Priority: Major > Attachments: HDFS-14003.001.patch > > > There is a findbugs warning generated in trunk recently. > > [https://builds.apache.org/job/PreCommit-HDFS-Build/25298/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html] > Looks like this is generated after this > commit:[https://github.com/apache/hadoop/commit/b60ca37914b22550e3630fa02742d40697decb31#diff-116c9c55048a5e9df753f219c4b3f233] > We can make a clean for this. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14002) TestLayoutVersion#testNameNodeFeatureMinimumCompatibleLayoutVersions fails
[ https://issues.apache.org/jira/browse/HDFS-14002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654655#comment-16654655 ] Vinayakumar B commented on HDFS-14002: -- +1, pending jenkins > TestLayoutVersion#testNameNodeFeatureMinimumCompatibleLayoutVersions fails > -- > > Key: HDFS-14002 > URL: https://issues.apache.org/jira/browse/HDFS-14002 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Takanobu Asanuma >Assignee: Takanobu Asanuma >Priority: Critical > Attachments: HDFS-14002.1.patch > > > This is the error log. > {noformat} > java.lang.AssertionError: Expected feature EXPANDED_STRING_TABLE to have > minimum compatible layout version set to itself. expected:<-65> but was:<-61> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.hdfs.protocol.TestLayoutVersion.testNameNodeFeatureMinimumCompatibleLayoutVersions(TestLayoutVersion.java:141) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13983) TestOfflineImageViewer crashes in windows
[ https://issues.apache.org/jira/browse/HDFS-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650006#comment-16650006 ] Vinayakumar B commented on HDFS-13983: -- Updated the patch with the checkstyle issues fixed. > TestOfflineImageViewer crashes in windows > - > > Key: HDFS-13983 > URL: https://issues.apache.org/jira/browse/HDFS-13983 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Vinayakumar B >Assignee: Vinayakumar B >Priority: Major > Attachments: HDFS-13893-with-patch-intellij-idea.JPG, > HDFS-13893-with-patch-mvn.JPG, > HDFS-13893-with-patch-without-sysout-close-intellij-idea.JPG, > HDFS-13893-without-patch-intellij-idea.JPG, HDFS-13893-without-patch-mvn.JPG, > HDFS-13983-01.patch, HDFS-13983-02.patch > > > TestOfflineImageViewer crashes in windows because, OfflineImageViewer > REVERSEXML tries to delete the outputfile and re-create the same stream which > is already created. > Also there are unclosed RAF for input files which blocks from files being > deleted. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13983) TestOfflineImageViewer crashes in windows
[ https://issues.apache.org/jira/browse/HDFS-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-13983: - Attachment: HDFS-13983-02.patch > TestOfflineImageViewer crashes in windows > - > > Key: HDFS-13983 > URL: https://issues.apache.org/jira/browse/HDFS-13983 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Vinayakumar B >Assignee: Vinayakumar B >Priority: Major > Attachments: HDFS-13893-with-patch-intellij-idea.JPG, > HDFS-13893-with-patch-mvn.JPG, > HDFS-13893-with-patch-without-sysout-close-intellij-idea.JPG, > HDFS-13893-without-patch-intellij-idea.JPG, HDFS-13893-without-patch-mvn.JPG, > HDFS-13983-01.patch, HDFS-13983-02.patch > > > TestOfflineImageViewer crashes in windows because, OfflineImageViewer > REVERSEXML tries to delete the outputfile and re-create the same stream which > is already created. > Also there are unclosed RAF for input files which blocks from files being > deleted. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13945) TestDataNodeVolumeFailure is Flaky
[ https://issues.apache.org/jira/browse/HDFS-13945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-13945: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.3.0 3.1.2 3.2.0 Status: Resolved (was: Patch Available) Committed to trunk, branch-3.2 and branch-3.1 > TestDataNodeVolumeFailure is Flaky > -- > > Key: HDFS-13945 > URL: https://issues.apache.org/jira/browse/HDFS-13945 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Fix For: 3.2.0, 3.1.2, 3.3.0 > > Attachments: HDFS-13945-01.patch, HDFS-13945-02.patch, > HDFS-13945-03.patch > > > The test is failing in trunk since long. > Reference - > [https://builds.apache.org/job/PreCommit-HDFS-Build/25140/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/] > > [https://builds.apache.org/job/PreCommit-HDFS-Build/25135/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/] > > [https://builds.apache.org/job/PreCommit-HDFS-Build/25133/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/] > > [https://builds.apache.org/job/PreCommit-HDFS-Build/25104/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/] > > > Stack Trace - > > Timed out waiting for condition. 
Thread diagnostics: Timestamp: 2018-09-26 > 03:32:07,162 "IPC Server handler 2 on 33471" daemon prio=5 tid=2931 > timed_waiting java.lang.Thread.State: TIMED_WAITING at > sun.misc.Unsafe.park(Native Method) at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467) > at org.apache.hadoop.ipc.CallQueueManager.take(CallQueueManager.java:288) at > org.apache.hadoop.ipc.Server$Handler.run(Server.java:2668) "IPC Server > handler 3 on 34285" daemon prio=5 tid=2646 timed_waiting > java.lang.Thread.State: TIMED_WAITING at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467) > at org.apache.hadoop.ipc.CallQueueManager.take(CallQueueManager.java:288) at > org.apache.hadoop.ipc.Server$Handler.run(Server.java:2668) > "org.apache.hadoop.util.JvmPauseMonitor$Monitor@1d2ee4cd" daemon prio=5 > tid=2633 timed_waiting java.lang.Thread.State: TIMED_WAITING at > java.lang.Thread.sleep(Native Method) at > org.apache.hadoop.util.JvmPauseMonitor$Monitor.run(JvmPauseMonitor.java:192) > at java.lang.Thread.run(Thread.java:748) "IPC Server Responder" daemon prio=5 > tid=2766 runnable java.lang.Thread.State: RUNNABLE at > sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) at > sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269) at > sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93) at > sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at > sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at > org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1334) at > 
org.apache.hadoop.ipc.Server$Responder.run(Server.java:1317) > "org.eclipse.jetty.server.session.HashSessionManager@1287fc65Timer" daemon > prio=5 tid=2492 timed_waiting java.lang.Thread.State: TIMED_WAITING at > sun.misc.Unsafe.park(Native Method) at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) "qtp548667392-2533" daemon prio=5 > tid=2533 timed_waiting java.lang.Thread.State: TIMED_WAITING at > sun.misc.Unsafe.park(Native Method) at >
[jira] [Updated] (HDFS-13156) HDFS Block Placement Policy - Client Local Rack
[ https://issues.apache.org/jira/browse/HDFS-13156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-13156: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.3.0 3.2.0 Status: Resolved (was: Patch Available) Committed to trunk and branch-3.2 > HDFS Block Placement Policy - Client Local Rack > --- > > Key: HDFS-13156 > URL: https://issues.apache.org/jira/browse/HDFS-13156 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 2.9.0, 3.2.0, 3.1.1 >Reporter: BELUGA BEHR >Assignee: Ayush Saxena >Priority: Minor > Fix For: 3.2.0, 3.3.0 > > Attachments: HDFS-13156-01.patch > > > {quote}For the common case, when the replication factor is three, HDFS’s > placement policy is to put one replica on the local machine if the writer is > on a datanode, otherwise on a random datanode, another replica on a node in a > different (remote) rack, and the last on a different node in the same remote > rack. > {quote} > [https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Replica_Placement:_The_First_Baby_Steps] > Having just looked over the Default Block Placement code, the way I > understand this, is that, there are three basic scenarios: > # HDFS client is running on a datanode inside the cluster > # HDFS client is running on a node outside the cluster > # HDFS client is running on a non-datanode inside the cluster > The documentation is ambiguous concerning the third scenario. Please correct > me if I'm wrong, but the way I understand the code, if there is an HDFS > client inside the cluster, but it is not on a datanode, the first block will > be placed on a datanode within the set of datanodes available on the local > rack and not simply on any _random datanode_ from the set of all datanodes in > the cluster. 
> That is to say, if one rack has an HDFS Sink Flume Agent on a dedicated node, > I should expect that every first block will be written to a _random datanode_ > on the same rack as the HDFS Flume agent, assuming the network topology > script is written to include this Flume node. > If that is correct, can the documentation be updated to include this third > common scenario? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
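The three writer scenarios in the description can be summarized as a small decision sketch. This is illustrative only, not the actual BlockPlacementPolicyDefault code; the method name and returned strings are invented for clarity:

```java
public class FirstReplicaPlacement {

    /** Where the first replica of a new block lands, per the three scenarios. */
    static String firstReplicaTarget(boolean writerIsDatanode,
                                     boolean writerKnownToTopology) {
        if (writerIsDatanode) {
            // Scenario 1: client running on a datanode inside the cluster.
            return "the writer's local datanode";
        }
        if (writerKnownToTopology) {
            // Scenario 3: non-datanode inside the cluster (e.g. the Flume node),
            // provided the topology script includes it.
            return "a random datanode on the writer's local rack";
        }
        // Scenario 2: client outside the cluster.
        return "a random datanode anywhere in the cluster";
    }

    public static void main(String[] args) {
        System.out.println(firstReplicaTarget(false, true));
    }
}
```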
[jira] [Updated] (HDFS-13906) RBF: Add multiple paths for dfsrouteradmin "rm" and "clrquota" commands
[ https://issues.apache.org/jira/browse/HDFS-13906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-13906: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: HDFS-13891 Status: Resolved (was: Patch Available) Created the version HDFS-13891 in Jira as well. > RBF: Add multiple paths for dfsrouteradmin "rm" and "clrquota" commands > --- > > Key: HDFS-13906 > URL: https://issues.apache.org/jira/browse/HDFS-13906 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: federation >Reporter: Soumyapn >Assignee: Ayush Saxena >Priority: Major > Labels: RBF > Fix For: HDFS-13891 > > Attachments: HDFS-13906-01.patch, HDFS-13906-02.patch, > HDFS-13906-03.patch, HDFS-13906-04.patch > > > Currently we have the option to delete only one mount entry at a time. > If we have multiple mount entries, then it would be difficult for the user to > execute the command N times. > Better if the "rm" and "clrQuota" commands support multiple entries, then it > would be easy for the user to provide all the required entries in one single > command. > Namenode is already supporting "rm" and "clrQuota" with multiple destinations. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
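The usability gain described above can be illustrated as follows. The mount paths are hypothetical, and these invocations require a running Router; the single-path form is the pre-existing dfsrouteradmin syntax, the multi-path form is what this issue adds:

```shell
# Before: one mount entry per invocation
hdfs dfsrouteradmin -rm /mount/path1
hdfs dfsrouteradmin -rm /mount/path2

# After this change: multiple entries in one command
hdfs dfsrouteradmin -rm /mount/path1 /mount/path2
hdfs dfsrouteradmin -clrQuota /mount/path1 /mount/path2
```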
[jira] [Commented] (HDFS-13906) RBF: Add multiple paths for dfsrouteradmin "rm" and "clrquota" commands
[ https://issues.apache.org/jira/browse/HDFS-13906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16647811#comment-16647811 ] Vinayakumar B commented on HDFS-13906: -- +1, LGTM, Committed to branch HDFS-13891, handled the checkstyle comment during commit. > RBF: Add multiple paths for dfsrouteradmin "rm" and "clrquota" commands > --- > > Key: HDFS-13906 > URL: https://issues.apache.org/jira/browse/HDFS-13906 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: federation >Reporter: Soumyapn >Assignee: Ayush Saxena >Priority: Major > Labels: RBF > Attachments: HDFS-13906-01.patch, HDFS-13906-02.patch, > HDFS-13906-03.patch, HDFS-13906-04.patch > > > Currently we have the option to delete only one mount entry at a time. > If we have multiple mount entries, then it would be difficult for the user to > execute the command N times. > Better if the "rm" and "clrQuota" commands support multiple entries, then it > would be easy for the user to provide all the required entries in one single > command. > Namenode is already supporting "rm" and "clrQuota" with multiple destinations. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13156) HDFS Block Placement Policy - Client Local Rack
[ https://issues.apache.org/jira/browse/HDFS-13156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16647463#comment-16647463 ] Vinayakumar B commented on HDFS-13156: -- +1 > HDFS Block Placement Policy - Client Local Rack > --- > > Key: HDFS-13156 > URL: https://issues.apache.org/jira/browse/HDFS-13156 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 2.9.0, 3.2.0, 3.1.1 >Reporter: BELUGA BEHR >Assignee: Ayush Saxena >Priority: Minor > Attachments: HDFS-13156-01.patch > > > {quote}For the common case, when the replication factor is three, HDFS’s > placement policy is to put one replica on the local machine if the writer is > on a datanode, otherwise on a random datanode, another replica on a node in a > different (remote) rack, and the last on a different node in the same remote > rack. > {quote} > [https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Replica_Placement:_The_First_Baby_Steps] > Having just looked over the Default Block Placement code, the way I > understand this, is that, there are three basic scenarios: > # HDFS client is running on a datanode inside the cluster > # HDFS client is running on a node outside the cluster > # HDFS client is running on a non-datanode inside the cluster > The documentation is ambiguous concerning the third scenario. Please correct > me if I'm wrong, but the way I understand the code, if there is an HDFS > client inside the cluster, but it is not on a datanode, the first block will > be placed on a datanode within the set of datanodes available on the local > rack and not simply on any _random datanode_ from the set of all datanodes in > the cluster. 
> That is to say, if one rack has an HDFS Sink Flume Agent on a dedicated node, > I should expect that every first block will be written to a _random datanode_ > on the same rack as the HDFS Flume agent, assuming the network topology > script is written to include this Flume node. > If that is correct, can the documentation be updated to include this third > common scenario? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13945) TestDataNodeVolumeFailure is Flaky
[ https://issues.apache.org/jira/browse/HDFS-13945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16647456#comment-16647456 ] Vinayakumar B commented on HDFS-13945: -- bq. As far as I think the reason of failure was this only here the conversion to pendingReconstruction in 3 seconds by the RedundancyMonitor thread. Ensuring the volume I guess is not required because there are two blocks being written so one will go in the first volume for sure. I concluded that it was not there as it was not underReplicated and even not the replica for that in that vol was present so I thought it killed nothing but it was pendingReconstruction Thanks [~ayushtkn] for the deep dive. Analysis looks good. +1, fix looks good to me. Will commit later today. > TestDataNodeVolumeFailure is Flaky > -- > > Key: HDFS-13945 > URL: https://issues.apache.org/jira/browse/HDFS-13945 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-13945-01.patch, HDFS-13945-02.patch, > HDFS-13945-03.patch > > > The test is failing in trunk since long. > Reference - > [https://builds.apache.org/job/PreCommit-HDFS-Build/25140/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/] > > [https://builds.apache.org/job/PreCommit-HDFS-Build/25135/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/] > > [https://builds.apache.org/job/PreCommit-HDFS-Build/25133/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/] > > [https://builds.apache.org/job/PreCommit-HDFS-Build/25104/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/] > > > Stack Trace - > > Timed out waiting for condition. 
Thread diagnostics: Timestamp: 2018-09-26 > 03:32:07,162 "IPC Server handler 2 on 33471" daemon prio=5 tid=2931 > timed_waiting java.lang.Thread.State: TIMED_WAITING at > sun.misc.Unsafe.park(Native Method) at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467) > at org.apache.hadoop.ipc.CallQueueManager.take(CallQueueManager.java:288) at > org.apache.hadoop.ipc.Server$Handler.run(Server.java:2668) "IPC Server > handler 3 on 34285" daemon prio=5 tid=2646 timed_waiting > java.lang.Thread.State: TIMED_WAITING at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467) > at org.apache.hadoop.ipc.CallQueueManager.take(CallQueueManager.java:288) at > org.apache.hadoop.ipc.Server$Handler.run(Server.java:2668) > "org.apache.hadoop.util.JvmPauseMonitor$Monitor@1d2ee4cd" daemon prio=5 > tid=2633 timed_waiting java.lang.Thread.State: TIMED_WAITING at > java.lang.Thread.sleep(Native Method) at > org.apache.hadoop.util.JvmPauseMonitor$Monitor.run(JvmPauseMonitor.java:192) > at java.lang.Thread.run(Thread.java:748) "IPC Server Responder" daemon prio=5 > tid=2766 runnable java.lang.Thread.State: RUNNABLE at > sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) at > sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269) at > sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93) at > sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at > sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at > org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1334) at > 
org.apache.hadoop.ipc.Server$Responder.run(Server.java:1317) > "org.eclipse.jetty.server.session.HashSessionManager@1287fc65Timer" daemon > prio=5 tid=2492 timed_waiting java.lang.Thread.State: TIMED_WAITING at > sun.misc.Unsafe.park(Native Method) at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) > at >
[jira] [Commented] (HDFS-13983) TestOfflineImageViewer crashes in windows
[ https://issues.apache.org/jira/browse/HDFS-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16646046#comment-16646046 ] Vinayakumar B commented on HDFS-13983: -- Attached the screenshots with and without the patch. The reason for avoiding the {{System.out}} close() is that IntelliJ was unable to run the tests with it, even with the other fixes applied; the mvn run, though, succeeded with the sysout close in place. !HDFS-13893-with-patch-without-sysout-close-intellij-idea.JPG! > TestOfflineImageViewer crashes in windows > - > > Key: HDFS-13983 > URL: https://issues.apache.org/jira/browse/HDFS-13983 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Vinayakumar B >Assignee: Vinayakumar B >Priority: Major > Attachments: HDFS-13893-with-patch-intellij-idea.JPG, > HDFS-13893-with-patch-mvn.JPG, > HDFS-13893-with-patch-without-sysout-close-intellij-idea.JPG, > HDFS-13893-without-patch-intellij-idea.JPG, HDFS-13893-without-patch-mvn.JPG, > HDFS-13983-01.patch > > > TestOfflineImageViewer crashes in windows because, OfflineImageViewer > REVERSEXML tries to delete the outputfile and re-create the same stream which > is already created. > Also there are unclosed RAF for input files which blocks from files being > deleted. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13983) TestOfflineImageViewer crashes in windows
[ https://issues.apache.org/jira/browse/HDFS-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-13983: - Attachment: HDFS-13893-without-patch-mvn.JPG HDFS-13893-without-patch-intellij-idea.JPG HDFS-13893-with-patch-without-sysout-close-intellij-idea.JPG HDFS-13893-with-patch-mvn.JPG HDFS-13893-with-patch-intellij-idea.JPG > TestOfflineImageViewer crashes in windows > - > > Key: HDFS-13983 > URL: https://issues.apache.org/jira/browse/HDFS-13983 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Vinayakumar B >Assignee: Vinayakumar B >Priority: Major > Attachments: HDFS-13893-with-patch-intellij-idea.JPG, > HDFS-13893-with-patch-mvn.JPG, > HDFS-13893-with-patch-without-sysout-close-intellij-idea.JPG, > HDFS-13893-without-patch-intellij-idea.JPG, HDFS-13893-without-patch-mvn.JPG, > HDFS-13983-01.patch > > > TestOfflineImageViewer crashes in windows because, OfflineImageViewer > REVERSEXML tries to delete the outputfile and re-create the same stream which > is already created. > Also there are unclosed RAF for input files which blocks from files being > deleted. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13983) TestOfflineImageViewer crashes in windows
[ https://issues.apache.org/jira/browse/HDFS-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-13983: - Status: Patch Available (was: Open) > TestOfflineImageViewer crashes in windows > - > > Key: HDFS-13983 > URL: https://issues.apache.org/jira/browse/HDFS-13983 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Vinayakumar B >Assignee: Vinayakumar B >Priority: Major > Attachments: HDFS-13983-01.patch > > > TestOfflineImageViewer crashes in windows because, OfflineImageViewer > REVERSEXML tries to delete the outputfile and re-create the same stream which > is already created. > Also there are unclosed RAF for input files which blocks from files being > deleted. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13983) TestOfflineImageViewer crashes in windows
[ https://issues.apache.org/jira/browse/HDFS-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-13983: - Attachment: HDFS-13983-01.patch > TestOfflineImageViewer crashes in windows > - > > Key: HDFS-13983 > URL: https://issues.apache.org/jira/browse/HDFS-13983 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Vinayakumar B >Assignee: Vinayakumar B >Priority: Major > Attachments: HDFS-13983-01.patch > > > TestOfflineImageViewer crashes in windows because, OfflineImageViewer > REVERSEXML tries to delete the outputfile and re-create the same stream which > is already created. > Also there are unclosed RAF for input files which blocks from files being > deleted. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org