[jira] [Assigned] (HDFS-2932) Under replicated block after the pipeline recovery.
[ https://issues.apache.org/jira/browse/HDFS-2932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Srikanth Upputuri reassigned HDFS-2932: --- Assignee: Srikanth Upputuri Under replicated block after the pipeline recovery. --- Key: HDFS-2932 URL: https://issues.apache.org/jira/browse/HDFS-2932 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 0.24.0 Reporter: J.Andreina Assignee: Srikanth Upputuri Fix For: 0.24.0 Started 1 NN and DN1, DN2, DN3 on the same machine. Wrote a huge file of size 2 GB; while the write of block-id-1005 was in progress, DN3 was brought down. After pipeline recovery, the generation stamp changed to block_id_1006 on DN1 and DN2. After the write finished, DN3 was brought back up and the fsck command was issued, which displayed the following message: block-id_1006 is under-replicated. Target replicas is 3 but found 2 replicas. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7088) Archival Storage: fix TestBalancer and TestBalancerWithMultipleNameNodes
Tsz Wo Nicholas Sze created HDFS-7088: - Summary: Archival Storage: fix TestBalancer and TestBalancerWithMultipleNameNodes Key: HDFS-7088 URL: https://issues.apache.org/jira/browse/HDFS-7088 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer, test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor {noformat} java.lang.AssertionError: expected:0 but was:-3 at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.runBalancer(TestBalancerWithMultipleNameNodes.java:163) at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.runTest(TestBalancerWithMultipleNameNodes.java:365) at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testBalancer(TestBalancerWithMultipleNameNodes.java:379) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-2932) Under replicated block after the pipeline recovery.
[ https://issues.apache.org/jira/browse/HDFS-2932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Srikanth Upputuri resolved HDFS-2932. - Resolution: Duplicate Fix Version/s: (was: 0.24.0) Closed as duplicate of HDFS-3493. Under replicated block after the pipeline recovery. --- Key: HDFS-2932 URL: https://issues.apache.org/jira/browse/HDFS-2932 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 0.24.0 Reporter: J.Andreina Assignee: Srikanth Upputuri Started 1 NN and DN1, DN2, DN3 on the same machine. Wrote a huge file of size 2 GB; while the write of block-id-1005 was in progress, DN3 was brought down. After pipeline recovery, the generation stamp changed to block_id_1006 on DN1 and DN2. After the write finished, DN3 was brought back up and the fsck command was issued, which displayed the following message: block-id_1006 is under-replicated. Target replicas is 3 but found 2 replicas. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6824) Additional user documentation for HDFS encryption.
[ https://issues.apache.org/jira/browse/HDFS-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138572#comment-14138572 ] Andrew Wang commented on HDFS-6824: --- Also need to fix [~yoderme]'s comment from HDFS-6394: https://issues.apache.org/jira/browse/HDFS-6394?focusedCommentId=14087313page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14087313 Someone else also mentioned that we should emphasize that data isn't transparently encrypted on HDFS upgrade, and needs to be copied in to an EZ. I'll do this too. Additional user documentation for HDFS encryption. -- Key: HDFS-6824 URL: https://issues.apache.org/jira/browse/HDFS-6824 Project: Hadoop HDFS Issue Type: Sub-task Components: documentation Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Andrew Wang Assignee: Andrew Wang Priority: Minor We'd like to better document additional things about HDFS encryption: setup and configuration, using alternate access methods (namely WebHDFS and HttpFS), other misc improvements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-7077) Separate CipherSuite from crypto protocol version
[ https://issues.apache.org/jira/browse/HDFS-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang reassigned HDFS-7077: - Assignee: Andrew Wang Separate CipherSuite from crypto protocol version - Key: HDFS-7077 URL: https://issues.apache.org/jira/browse/HDFS-7077 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Andrew Wang Right now the CipherSuite is used for protocol version negotiation, which is wrong. We need to separate it out. An EZ should be locked to a certain CipherSuite and protocol version. A client reading and writing to the EZ then needs to negotiate based on both of these parameters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6727) Refresh data volumes on DataNode based on configuration changes
[ https://issues.apache.org/jira/browse/HDFS-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6727: Attachment: HDFS-6727.007.patch Update patch to fix findbugs reports. Refresh data volumes on DataNode based on configuration changes --- Key: HDFS-6727 URL: https://issues.apache.org/jira/browse/HDFS-6727 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0, 2.4.1 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Labels: datanode Attachments: HDFS-6727.000.delta-HDFS-6775.txt, HDFS-6727.001.patch, HDFS-6727.002.patch, HDFS-6727.003.patch, HDFS-6727.004.patch, HDFS-6727.005.patch, HDFS-6727.006.patch, HDFS-6727.006.patch, HDFS-6727.007.patch, HDFS-6727.combo.patch, patchFindBugsOutputhadoop-hdfs.txt HDFS-1362 requires DataNode to reload configuration file during the runtime, so that DN can change the data volumes dynamically. This JIRA reuses the reconfiguration framework introduced by HADOOP-7001 to enable DN to reconfigure at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7073) Allow falling back to a non-SASL connection on DataTransferProtocol in several edge cases.
[ https://issues.apache.org/jira/browse/HDFS-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138634#comment-14138634 ] Yi Liu commented on HDFS-7073: -- Hi [~cnauroth], nice work. {quote} DataNode: There had been some mishandling in checkSecureConfig around checking the dfs.data.transfer.protection property. It's defined in hdfs-default.xml, so it always comes in with empty string as the default (not null). I changed some of this logic to check for empty string instead of null. {quote} That's great as a fix too; otherwise, with security enabled, we could still start a DN listening on an unprivileged port (>= 1024) even when {{dfs.data.transfer.protection}} is empty. {quote} Cluster is unsecured, but has block access tokens enabled. This is not something I've seen done in practice, but I've heard historically it has been allowed. The HDFS-2856 code relied on seeing an empty block access token to trigger fallback, and this doesn't work if the unsecured cluster actually is using block access tokens. {quote} In the patch, fallback for writeBlock is handled, but fallback for readBlock is not. A test case for this scenario is hard to write because {{UserGroupInformation#isSecurityEnabled()}} is static, so we can't configure a secured client against an unsecured server. But I happen to have such an environment and tested this scenario; I configured: server (unsecured, block access tokens enabled), client (security enabled, block access tokens enabled, fallback enabled). Writing a file succeeds, but *reading the file fails*. Allow falling back to a non-SASL connection on DataTransferProtocol in several edge cases. -- Key: HDFS-7073 URL: https://issues.apache.org/jira/browse/HDFS-7073 Project: Hadoop HDFS Issue Type: Bug Components: datanode, hdfs-client, security Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-7073.1.patch HDFS-2856 implemented general SASL support on DataTransferProtocol. Part of that work also included a fallback mode in case the remote cluster is running under a different configuration without SASL. I've discovered a few edge case configurations that this did not support: * Cluster is unsecured, but has block access tokens enabled. This is not something I've seen done in practice, but I've heard historically it has been allowed. The HDFS-2856 code relied on seeing an empty block access token to trigger fallback, and this doesn't work if the unsecured cluster actually is using block access tokens. * The DataNode has an unpublicized testing configuration property that could be used to skip the privileged port check. However, the HDFS-2856 code is still enforcing requirement of SASL when the ports are not privileged, so this would force existing configurations to make changes to activate SASL. This patch will restore the old behavior so that these edge case configurations will continue to work the same way. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
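To illustrate the null-vs-empty-string pitfall discussed in the comment above, here is a minimal, self-contained sketch. It is not the actual checkSecureConfig code; the class and method names are made up, and only the Hadoop Configuration calls are real API.

{code}
import org.apache.hadoop.conf.Configuration;

public class SaslConfigCheckSketch {
  static boolean isSaslProtectionConfigured(Configuration conf) {
    // A null check is never triggered for a property that is declared in
    // hdfs-default.xml with an empty value: the lookup returns "", not null.
    // Treating empty/blank the same as unset avoids that pitfall.
    String value = conf.getTrimmed("dfs.data.transfer.protection", "");
    return !value.isEmpty();
  }

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.set("dfs.data.transfer.protection", "");  // simulate the hdfs-default.xml default
    System.out.println(isSaslProtectionConfigured(conf));  // prints: false
  }
}
{code}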
[jira] [Commented] (HDFS-7073) Allow falling back to a non-SASL connection on DataTransferProtocol in several edge cases.
[ https://issues.apache.org/jira/browse/HDFS-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138639#comment-14138639 ] Yi Liu commented on HDFS-7073: -- For the first comment, I want to add: even though the follow-on SASL handshake would fail, the error the user sees in the log is not explicit. So it's a good fix to *not* let the DN start successfully on an unprivileged port if {{dfs.data.transfer.protection}} is empty. Allow falling back to a non-SASL connection on DataTransferProtocol in several edge cases. -- Key: HDFS-7073 URL: https://issues.apache.org/jira/browse/HDFS-7073 Project: Hadoop HDFS Issue Type: Bug Components: datanode, hdfs-client, security Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-7073.1.patch HDFS-2856 implemented general SASL support on DataTransferProtocol. Part of that work also included a fallback mode in case the remote cluster is running under a different configuration without SASL. I've discovered a few edge case configurations that this did not support: * Cluster is unsecured, but has block access tokens enabled. This is not something I've seen done in practice, but I've heard historically it has been allowed. The HDFS-2856 code relied on seeing an empty block access token to trigger fallback, and this doesn't work if the unsecured cluster actually is using block access tokens. * The DataNode has an unpublicized testing configuration property that could be used to skip the privileged port check. However, the HDFS-2856 code is still enforcing requirement of SASL when the ports are not privileged, so this would force existing configurations to make changes to activate SASL. This patch will restore the old behavior so that these edge case configurations will continue to work the same way. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7088) Archival Storage: fix TestBalancer and TestBalancerWithMultipleNameNodes
[ https://issues.apache.org/jira/browse/HDFS-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-7088: -- Attachment: h7088_20140918.patch The failures are caused by calling hflush() when writing the id file. h7088_20140918.patch: do not write the id file for unit tests. Archival Storage: fix TestBalancer and TestBalancerWithMultipleNameNodes Key: HDFS-7088 URL: https://issues.apache.org/jira/browse/HDFS-7088 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer, test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h7088_20140918.patch {noformat} java.lang.AssertionError: expected:0 but was:-3 at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.runBalancer(TestBalancerWithMultipleNameNodes.java:163) at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.runTest(TestBalancerWithMultipleNameNodes.java:365) at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testBalancer(TestBalancerWithMultipleNameNodes.java:379) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6584) Support Archival Storage
[ https://issues.apache.org/jira/browse/HDFS-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6584: -- Attachment: h6584_20140918.patch h6584_20140918.patch: with HDFS-7088. Support Archival Storage Key: HDFS-6584 URL: https://issues.apache.org/jira/browse/HDFS-6584 Project: Hadoop HDFS Issue Type: New Feature Components: balancer, namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: HDFS-6584.000.patch, HDFSArchivalStorageDesign20140623.pdf, HDFSArchivalStorageDesign20140715.pdf, archival-storage-testplan.pdf, h6584_20140907.patch, h6584_20140908.patch, h6584_20140908b.patch, h6584_20140911.patch, h6584_20140911b.patch, h6584_20140915.patch, h6584_20140916.patch, h6584_20140916.patch, h6584_20140917.patch, h6584_20140917b.patch, h6584_20140918.patch In most of the Hadoop clusters, as more and more data is stored for longer time, the demand for storage is outstripping the compute. Hadoop needs a cost effective and easy to manage solution to meet this demand for storage. Current solution is: - Delete the old unused data. This comes at operational cost of identifying unnecessary data and deleting them manually. - Add more nodes to the clusters. This adds along with storage capacity unnecessary compute capacity to the cluster. Hadoop needs a solution to decouple growing storage capacity from compute capacity. Nodes with higher density and less expensive storage with low compute power are becoming available and can be used as cold storage in the clusters. Based on policy the data from hot storage can be moved to cold storage. Adding more nodes to the cold storage can grow the storage independent of the compute capacity in the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6584) Support Archival Storage
[ https://issues.apache.org/jira/browse/HDFS-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138658#comment-14138658 ] Hadoop QA commented on HDFS-6584: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12669669/h6584_20140918.patch against trunk revision ee21b13. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8076//console This message is automatically generated. Support Archival Storage Key: HDFS-6584 URL: https://issues.apache.org/jira/browse/HDFS-6584 Project: Hadoop HDFS Issue Type: New Feature Components: balancer, namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: HDFS-6584.000.patch, HDFSArchivalStorageDesign20140623.pdf, HDFSArchivalStorageDesign20140715.pdf, archival-storage-testplan.pdf, h6584_20140907.patch, h6584_20140908.patch, h6584_20140908b.patch, h6584_20140911.patch, h6584_20140911b.patch, h6584_20140915.patch, h6584_20140916.patch, h6584_20140916.patch, h6584_20140917.patch, h6584_20140917b.patch, h6584_20140918.patch In most of the Hadoop clusters, as more and more data is stored for longer time, the demand for storage is outstripping the compute. Hadoop needs a cost effective and easy to manage solution to meet this demand for storage. Current solution is: - Delete the old unused data. This comes at operational cost of identifying unnecessary data and deleting them manually. - Add more nodes to the clusters. This adds along with storage capacity unnecessary compute capacity to the cluster. Hadoop needs a solution to decouple growing storage capacity from compute capacity. Nodes with higher density and less expensive storage with low compute power are becoming available and can be used as cold storage in the clusters. Based on policy the data from hot storage can be moved to cold storage. Adding more nodes to the cold storage can grow the storage independent of the compute capacity in the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138691#comment-14138691 ] Yi Liu commented on HDFS-6606: -- Rebased the patch against latest trunk. [~usrikanth], the JAAS GSSAPI mechanism does indeed support AES, but it's not suitable here: the client also needs to verify that the DN is legitimate. For DIGEST-MD5, the password is generated from the access token or encryption key, so the DN can validate the client and the client can also validate the DN (ensuring the {{block access token}} has not been obtained by a malicious process). With the GSSAPI mechanism we can't ensure this, and it has performance issues. Another reason is that not all users can use a third-party JCE provider; using CryptoCodec is scalable and has built-in AES-NI support in Hadoop. Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-6606.001.patch, HDFS-6606.002.patch, HDFS-6606.003.patch, HDFS-6606.004.patch, OptimizeHdfsEncryptedTransportperformance.pdf In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism (use Qop: auth-conf), it supports three security strength: * high 3des or rc4 (128bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
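For readers unfamiliar with the SASL terminology in the description above, the following sketch shows how a DIGEST-MD5 SASL client requesting confidentiality (QOP "auth-conf", with negotiated cipher strength) is created through the standard javax.security.sasl API. It is illustrative only and is not the HDFS SaslDataTransferClient implementation; the protocol name, server name, and callback handler are placeholders.

{code}
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.callback.*;
import javax.security.sasl.*;

public class DigestMd5QopSketch {
  /** Creates a DIGEST-MD5 SASL client that asks for integrity + confidentiality. */
  public static SaslClient newEncryptedSaslClient(final String user, final char[] password)
      throws SaslException {
    Map<String, String> props = new HashMap<String, String>();
    props.put(Sasl.QOP, "auth-conf");            // auth-conf = wrap the stream with encryption
    props.put(Sasl.STRENGTH, "high,medium,low"); // negotiate high/medium/low cipher strength

    CallbackHandler handler = new CallbackHandler() {
      @Override
      public void handle(Callback[] callbacks) {
        for (Callback cb : callbacks) {
          if (cb instanceof NameCallback) {
            ((NameCallback) cb).setName(user);
          } else if (cb instanceof PasswordCallback) {
            ((PasswordCallback) cb).setPassword(password);
          } else if (cb instanceof RealmCallback) {
            RealmCallback rc = (RealmCallback) cb;
            rc.setText(rc.getDefaultText());     // accept the server-proposed realm
          }
        }
      }
    };

    // "hdfs" and "datanode-host" are placeholder protocol/server names.
    return Sasl.createSaslClient(new String[] {"DIGEST-MD5"}, null, "hdfs",
        "datanode-host", props, handler);
  }
}
{code}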
[jira] [Updated] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-6606: - Attachment: HDFS-6606.005.patch Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-6606.001.patch, HDFS-6606.002.patch, HDFS-6606.003.patch, HDFS-6606.004.patch, HDFS-6606.005.patch, OptimizeHdfsEncryptedTransportperformance.pdf In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism (use Qop: auth-conf), it supports three security strength: * high 3des or rc4 (128bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6970) Move startFile EDEK retries to the DFSClient
[ https://issues.apache.org/jira/browse/HDFS-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138692#comment-14138692 ] Hadoop QA commented on HDFS-6970: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12669643/hdfs-6970.001.patch against trunk revision ee21b13. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8074//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8074//console This message is automatically generated. Move startFile EDEK retries to the DFSClient Key: HDFS-6970 URL: https://issues.apache.org/jira/browse/HDFS-6970 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 2.5.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-6970.001.patch [~sureshms] pointed out that holding on to an RPC handler while talking to the KMS is bad, since it can exhaust the available handlers. Let's avoid this by doing retries at the DFSClient rather than in the RPC handler, and moving EDEK fetching to the background. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7073) Allow falling back to a non-SASL connection on DataTransferProtocol in several edge cases.
[ https://issues.apache.org/jira/browse/HDFS-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138694#comment-14138694 ] Yi Liu commented on HDFS-7073: -- One security issue I can think of: if we allow this type of fallback then, as discussed in HDFS-2856 regarding the attack vector, a malicious task can easily listen on the DN's port after it dies and steal the block access token. So perhaps we'd better not allow the fallback? Allow falling back to a non-SASL connection on DataTransferProtocol in several edge cases. -- Key: HDFS-7073 URL: https://issues.apache.org/jira/browse/HDFS-7073 Project: Hadoop HDFS Issue Type: Bug Components: datanode, hdfs-client, security Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-7073.1.patch HDFS-2856 implemented general SASL support on DataTransferProtocol. Part of that work also included a fallback mode in case the remote cluster is running under a different configuration without SASL. I've discovered a few edge case configurations that this did not support: * Cluster is unsecured, but has block access tokens enabled. This is not something I've seen done in practice, but I've heard historically it has been allowed. The HDFS-2856 code relied on seeing an empty block access token to trigger fallback, and this doesn't work if the unsecured cluster actually is using block access tokens. * The DataNode has an unpublicized testing configuration property that could be used to skip the privileged port check. However, the HDFS-2856 code is still enforcing requirement of SASL when the ports are not privileged, so this would force existing configurations to make changes to activate SASL. This patch will restore the old behavior so that these edge case configurations will continue to work the same way. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6584) Support Archival Storage
[ https://issues.apache.org/jira/browse/HDFS-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6584: -- Attachment: h6584_20140918b.patch h6584_20140918b.patch: excludes hdfs.cmd since the patch command does not work with dos file correctly. Support Archival Storage Key: HDFS-6584 URL: https://issues.apache.org/jira/browse/HDFS-6584 Project: Hadoop HDFS Issue Type: New Feature Components: balancer, namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: HDFS-6584.000.patch, HDFSArchivalStorageDesign20140623.pdf, HDFSArchivalStorageDesign20140715.pdf, archival-storage-testplan.pdf, h6584_20140907.patch, h6584_20140908.patch, h6584_20140908b.patch, h6584_20140911.patch, h6584_20140911b.patch, h6584_20140915.patch, h6584_20140916.patch, h6584_20140916.patch, h6584_20140917.patch, h6584_20140917b.patch, h6584_20140918.patch, h6584_20140918b.patch In most of the Hadoop clusters, as more and more data is stored for longer time, the demand for storage is outstripping the compute. Hadoop needs a cost effective and easy to manage solution to meet this demand for storage. Current solution is: - Delete the old unused data. This comes at operational cost of identifying unnecessary data and deleting them manually. - Add more nodes to the clusters. This adds along with storage capacity unnecessary compute capacity to the cluster. Hadoop needs a solution to decouple growing storage capacity from compute capacity. Nodes with higher density and less expensive storage with low compute power are becoming available and can be used as cold storage in the clusters. Based on policy the data from hot storage can be moved to cold storage. Adding more nodes to the cold storage can grow the storage independent of the compute capacity in the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6727) Refresh data volumes on DataNode based on configuration changes
[ https://issues.apache.org/jira/browse/HDFS-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138726#comment-14138726 ] Hadoop QA commented on HDFS-6727: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12669653/HDFS-6727.007.patch against trunk revision ee21b13. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8075//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8075//console This message is automatically generated. Refresh data volumes on DataNode based on configuration changes --- Key: HDFS-6727 URL: https://issues.apache.org/jira/browse/HDFS-6727 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0, 2.4.1 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Labels: datanode Attachments: HDFS-6727.000.delta-HDFS-6775.txt, HDFS-6727.001.patch, HDFS-6727.002.patch, HDFS-6727.003.patch, HDFS-6727.004.patch, HDFS-6727.005.patch, HDFS-6727.006.patch, HDFS-6727.006.patch, HDFS-6727.007.patch, HDFS-6727.combo.patch, patchFindBugsOutputhadoop-hdfs.txt HDFS-1362 requires DataNode to reload configuration file during the runtime, so that DN can change the data volumes dynamically. This JIRA reuses the reconfiguration framework introduced by HADOOP-7001 to enable DN to reconfigure at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7086) httpfs create files default overwrite behavior is set to true
[ https://issues.apache.org/jira/browse/HDFS-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HDFS-7086: - Component/s: documentation httpfs create files default overwrite behavior is set to true - Key: HDFS-7086 URL: https://issues.apache.org/jira/browse/HDFS-7086 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.0.0-alpha, 2.1.0-beta, 2.2.0, 2.3.0, 2.4.1, 2.5.1 Environment: Linux, Java Reporter: Eric Yang WebHDFS documentation says overwrite flag is default to false, but httpfs set the flag to true by default. This can be different from user's expectation and cause data to be overwritten. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HDFS-7082) When replication factor equals number of data nodes, corrupt replica will never get substituted with good replica
[ https://issues.apache.org/jira/browse/HDFS-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-7082 started by Srikanth Upputuri. --- When replication factor equals number of data nodes, corrupt replica will never get substituted with good replica - Key: HDFS-7082 URL: https://issues.apache.org/jira/browse/HDFS-7082 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Srikanth Upputuri Assignee: Srikanth Upputuri Priority: Minor BlockManager will not invalidate a corrupt replica if this brings down the total number of replicas below replication factor (except if the corrupt replica has a wrong genstamp). On clusters where the replication factor = total data nodes, a new replica can not be created from a live replica as all the available datanodes already have a replica each. Because of this, the corrupt replicas will never be substituted with good replicas, so will never get deleted. Sooner or later all replicas may get corrupt and there will be no live replicas in the cluster for this block. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
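As a rough illustration of the invalidation rule described in the issue above, here is a tiny predicate. It is not the actual BlockManager code; the method and parameter names are assumptions, and it only captures the decision the description spells out.

{code}
public class CorruptReplicaInvalidationSketch {
  /**
   * A corrupt replica can be invalidated right away only when enough live (good)
   * replicas remain to satisfy the replication factor, or when the replica's
   * generation stamp is stale. Otherwise it is kept around.
   */
  static boolean canInvalidateCorruptReplica(int liveReplicas, int replicationFactor,
      boolean hasStaleGenerationStamp) {
    return hasStaleGenerationStamp || liveReplicas >= replicationFactor;
  }

  public static void main(String[] args) {
    // With replication factor == number of datanodes, a corrupt replica with a
    // matching generation stamp is never invalidated (liveReplicas is at most
    // replicationFactor - 1), which is the situation the issue describes.
    System.out.println(canInvalidateCorruptReplica(2, 3, false)); // false
    System.out.println(canInvalidateCorruptReplica(2, 3, true));  // true
  }
}
{code}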
[jira] [Commented] (HDFS-7086) httpfs create files default overwrite behavior is set to true
[ https://issues.apache.org/jira/browse/HDFS-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138755#comment-14138755 ] Steve Loughran commented on HDFS-7086: -- This behaviour is consistent with HDFS and other implementations of {{FileSystem}}, because {{create(Path)}} defaults to overwrite: {code} /** * Create an FSDataOutputStream at the indicated Path. * Files are overwritten by default. * @param f the file to create */ public FSDataOutputStream create(Path f) throws IOException { return create(f, true); } {code} Looking at filesystem.md and {{AbstractContractCreateTest}}, I don't see where this is explicitly called out or tested. Doing both would ensure that, when someone gets round to writing contract tests for WebHDFS, its consistency with HDFS can be validated. Tagging as a documentation and test issue. httpfs create files default overwrite behavior is set to true - Key: HDFS-7086 URL: https://issues.apache.org/jira/browse/HDFS-7086 Project: Hadoop HDFS Issue Type: Bug Components: documentation, test Affects Versions: 2.0.0-alpha, 2.1.0-beta, 2.2.0, 2.3.0, 2.4.1, 2.5.1 Environment: Linux, Java Reporter: Eric Yang WebHDFS documentation says overwrite flag is default to false, but httpfs set the flag to true by default. This can be different from user's expectation and cause data to be overwritten. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
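As a usage note following the quoted {{create(Path)}} code: a client that does not want overwrite semantics can pass the flag explicitly. This sketch is not part of any patch on this issue; it just shows the two-argument {{FileSystem#create(Path, boolean)}} overload against a default filesystem.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CreateNoOverwriteSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path p = new Path("/tmp/example.txt");

    // Passing overwrite=false makes the call fail (typically with
    // FileAlreadyExistsException) if the path already exists, instead of
    // silently replacing it as the single-argument create(Path) would.
    FSDataOutputStream out = fs.create(p, false);
    try {
      out.writeUTF("hello");
    } finally {
      out.close();
    }
  }
}
{code}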
[jira] [Updated] (HDFS-7086) httpfs create files default overwrite behavior is set to true
[ https://issues.apache.org/jira/browse/HDFS-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HDFS-7086: - Component/s: test httpfs create files default overwrite behavior is set to true - Key: HDFS-7086 URL: https://issues.apache.org/jira/browse/HDFS-7086 Project: Hadoop HDFS Issue Type: Bug Components: documentation, test Affects Versions: 2.0.0-alpha, 2.1.0-beta, 2.2.0, 2.3.0, 2.4.1, 2.5.1 Environment: Linux, Java Reporter: Eric Yang WebHDFS documentation says overwrite flag is default to false, but httpfs set the flag to true by default. This can be different from user's expectation and cause data to be overwritten. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6808: Attachment: HDFS-6808.006.patch Hi [~cmccabe], thanks for your great suggestions. An async API makes more sense. I've changed the patch to reflect the discussions. In summary, this patch * Changes the reconfiguration framework from HDFS-7001. It adds {{ReconfigurableBase#startReconfigureTask()}}, which starts a background thread to do the configuration reloading, so that it supports an async API. It also checks whether an active task is already running and, if so, returns an error. * Provides a CLI command (similar to {{btrfs scrub start|status}}) to start and query the status of the reconfiguration work. {noformat} dfsadmin -reconfig -datanode [start|status] host:port {noformat} No {{-reconfig cancel}} is provided, because there is no obvious way to interrupt the reconfiguration process while keeping the {{DN}} consistent. Maybe we can fix it later. * The protobuf protocol for {{-reconfig status}} basically returns a {{Map}} of conf change to error message, with task start and/or end times. It is the caller's (i.e., {{DFSAdmin}}) responsibility to print these error messages, so that it can generate CLI messages, XML, HTML... Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch, HDFS-6808.001.combo.patch, HDFS-6808.001.patch, HDFS-6808.002.combo.patch, HDFS-6808.002.patch, HDFS-6808.003.combo.txt, HDFS-6808.003.patch, HDFS-6808.004.combo.patch, HDFS-6808.004.patch, HDFS-6808.005.combo.patch, HDFS-6808.005.patch, HDFS-6808.006.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
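The comment above describes an asynchronous start/status pattern: a background thread applies the configuration changes, a second start is rejected while one is running, and a status query returns per-property errors plus start/end times. The sketch below illustrates that pattern only; the class body is an assumption and is not the ReconfigurableBase implementation from HDFS-7001/HDFS-6808.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class AsyncReconfigSketch {
  /** Snapshot of one reconfiguration attempt: start/end times plus per-property errors. */
  public static final class Status {
    final long startTimeMs;
    volatile long endTimeMs;                       // 0 while the task is still running
    final Map<String, String> errors = new ConcurrentHashMap<String, String>();
    Status(long start) { this.startTimeMs = start; }
    boolean done() { return endTimeMs != 0; }
  }

  private Status current;                          // guarded by "this"

  /** Starts a background reconfiguration task; rejects the call if one is already running. */
  public synchronized Status startReconfigureTask(final Runnable reloadProperties) {
    if (current != null && !current.done()) {
      throw new IllegalStateException("A reconfiguration task is already running");
    }
    final Status status = new Status(System.currentTimeMillis());
    current = status;
    Thread worker = new Thread(new Runnable() {
      @Override
      public void run() {
        try {
          reloadProperties.run();                  // apply each changed property
        } catch (RuntimeException e) {
          status.errors.put("reload", String.valueOf(e.getMessage()));
        } finally {
          status.endTimeMs = System.currentTimeMillis();
        }
      }
    }, "reconfig-task");
    worker.setDaemon(true);
    worker.start();
    return status;
  }

  /** Returns the most recent task's status (what a "-reconfig status" query would report). */
  public synchronized Status getReconfigureStatus() {
    return current;
  }
}
{code}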
[jira] [Commented] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138764#comment-14138764 ] Hadoop QA commented on HDFS-6808: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12669693/HDFS-6808.006.patch against trunk revision ee21b13. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8080//console This message is automatically generated. Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch, HDFS-6808.001.combo.patch, HDFS-6808.001.patch, HDFS-6808.002.combo.patch, HDFS-6808.002.patch, HDFS-6808.003.combo.txt, HDFS-6808.003.patch, HDFS-6808.004.combo.patch, HDFS-6808.004.patch, HDFS-6808.005.combo.patch, HDFS-6808.005.patch, HDFS-6808.006.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6808: Attachment: HDFS-6808.006.combo.patch Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch, HDFS-6808.001.combo.patch, HDFS-6808.001.patch, HDFS-6808.002.combo.patch, HDFS-6808.002.patch, HDFS-6808.003.combo.txt, HDFS-6808.003.patch, HDFS-6808.004.combo.patch, HDFS-6808.004.patch, HDFS-6808.005.combo.patch, HDFS-6808.005.patch, HDFS-6808.006.combo.patch, HDFS-6808.006.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6705) Create an XAttr that disallows the HDFS admin from accessing a file
[ https://issues.apache.org/jira/browse/HDFS-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138808#comment-14138808 ] Hudson commented on HDFS-6705: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #684 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/684/]) HDFS-6705. Create an XAttr that disallows the HDFS admin from accessing a file. (clamb via wang) (wang: rev ea4e2e843ecadd8019ea35413f4a34b97a424923) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSXAttrBaseTest.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/XAttrPermissionFilter.java * hadoop-hdfs-project/hadoop-hdfs/src/site/apt/ExtendedAttributes.apt.vm * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testXAttrConf.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java Create an XAttr that disallows the HDFS admin from accessing a file --- Key: HDFS-6705 URL: https://issues.apache.org/jira/browse/HDFS-6705 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Fix For: 2.6.0 Attachments: HDFS-6705.001.patch, HDFS-6705.002.patch, HDFS-6705.003.patch, HDFS-6705.004.patch, HDFS-6705.005.patch, HDFS-6705.006.patch, HDFS-6705.007.patch, HDFS-6705.008.patch There needs to be an xattr that specifies that the HDFS admin can not access a file. This is needed for m/r delegation tokens and data at rest encryption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7075) hadoop-fuse-dfs fails because it cannot find JavaKeyStoreProvider$Factory
[ https://issues.apache.org/jira/browse/HDFS-7075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138812#comment-14138812 ] Hudson commented on HDFS-7075: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #684 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/684/]) HDFS-7075. hadoop-fuse-dfs fails because it cannot find JavaKeyStoreProvider$Factory. (cmccabe) (cmccabe: rev f23024852502441fc259012664e444e5e51c604a) * hadoop-common-project/hadoop-common/CHANGES.txt * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/KeyProviderFactory.java hadoop-fuse-dfs fails because it cannot find JavaKeyStoreProvider$Factory - Key: HDFS-7075 URL: https://issues.apache.org/jira/browse/HDFS-7075 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: 2.6.0 Attachments: HDFS-7075.001.patch hadoop-fuse-dfs fails complaining with: {code} java.util.ServiceConfigurationError: org.apache.hadoop.crypto.key.KeyProviderFactory: Provider org.apache.hadoop.crypto.key.JavaKeyStoreProvider$Factory not found {code} Here is an example of the hadoop-fuse-dfs debug output. {code} 14/09/04 13:49:04 WARN crypto.CryptoCodec: Crypto codec org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec is not available. hdfsBuilderConnect(forceNewInstance=1, nn=hdfs://hdfs-cdh5-secure-1.vpc.cloudera.com:8020, port=0, kerbTicketCachePath=/tmp/krb5cc_0, userName=root) error: java.util.ServiceConfigurationError: org.apache.hadoop.crypto.key.KeyProviderFactory: Provider org.apache.hadoop.crypto.key.JavaKeyStoreProvider$Factory not found at java.util.ServiceLoader.fail(ServiceLoader.java:231) at java.util.ServiceLoader.access$300(ServiceLoader.java:181) at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:365) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7004) Update KeyProvider instantiation to create by URI
[ https://issues.apache.org/jira/browse/HDFS-7004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138816#comment-14138816 ] Hudson commented on HDFS-7004: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #684 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/684/]) HDFS-7004. Update KeyProvider instantiation to create by URI. (wang) (wang: rev 10e8602f32b553a1424f1a9b5f9f74f7b68a49d1) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZonesWithHA.java * hadoop-common-project/hadoop-kms/src/test/java/org/apache/hadoop/crypto/key/kms/server/TestKMS.java * hadoop-common-project/hadoop-kms/src/site/apt/index.apt.vm * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReservedRawPaths.java * hadoop-common-project/hadoop-kms/src/test/java/org/apache/hadoop/crypto/key/kms/server/MiniKMS.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZones.java * hadoop-common-project/hadoop-kms/src/main/java/org/apache/hadoop/crypto/key/kms/server/KMSConfiguration.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/cli/TestCryptoAdminCLI.java * hadoop-common-project/hadoop-kms/src/main/conf/kms-site.xml * hadoop-common-project/hadoop-kms/src/main/java/org/apache/hadoop/crypto/key/kms/server/KMSWebApp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/site/apt/TransparentEncryption.apt.vm Update KeyProvider instantiation to create by URI - Key: HDFS-7004 URL: https://issues.apache.org/jira/browse/HDFS-7004 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 2.6.0 Attachments: hdfs-7004.001.patch, hdfs-7004.002.patch, hdfs-7004.004.patch See HADOOP-11054, would be good to update the NN/DFSClient to fetch via this method rather than depending on the URI path lookup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6843) Create FileStatus isEncrypted() method
[ https://issues.apache.org/jira/browse/HDFS-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138811#comment-14138811 ] Hudson commented on HDFS-6843: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #684 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/684/]) HDFS-6843. Create FileStatus isEncrypted() method (clamb via cmccabe) (cmccabe: rev e3803d002c660f18a5c2ecf32344fd6f3f491a5b) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractOpenTest.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/FsPermissionExtension.java * hadoop-common-project/hadoop-common/src/site/markdown/filesystem/filesystem.md * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/FsAclPermission.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/permission/FsPermission.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSAclBaseTest.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZones.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileStatus.java HDFS-6843. Add to CHANGES.txt (cmccabe: rev f24ac429d102777fe021e9852cfff38312643512) * hadoop-common-project/hadoop-common/CHANGES.txt Create FileStatus isEncrypted() method -- Key: HDFS-6843 URL: https://issues.apache.org/jira/browse/HDFS-6843 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Fix For: 2.6.0 Attachments: HDFS-6843.001.patch, HDFS-6843.002.patch, HDFS-6843.003.patch, HDFS-6843.004.patch, HDFS-6843.005.patch, HDFS-6843.005.patch, HDFS-6843.006.patch, HDFS-6843.007.patch, HDFS-6843.008.patch, HDFS-6843.009.patch, HDFS-6843.010.patch FileStatus should have a 'boolean isEncrypted()' method. (it was in the context of discussing with AndreW about FileStatus being a Writable). Having this method would allow MR JobSubmitter do the following: - BOOLEAN intermediateEncryption = false IF jobconf.contains(mr.intermidate.encryption) THEN intermediateEncryption = jobConf.getBoolean(mr.intermidate.encryption) ELSE IF (I/O)Format INSTANCEOF File(I/O)Format THEN intermediateEncryption = ANY File(I/O)Format HAS a Path with status isEncrypted()==TRUE FI jobConf.setBoolean(mr.intermidate.encryption, intermediateEncryption) FI -- This message was sent by Atlassian JIRA (v6.3.4#6332)
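Here is a rough Java rendering of the pseudocode quoted in the HDFS-6843 description above. The config key (spelled "mr.intermidate.encryption" in the description) and the helper method are hypothetical and not an actual MapReduce API; only {{FileStatus#isEncrypted()}}, which this issue adds, and the {{FileSystem}}/{{Configuration}} calls are real.

{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class IntermediateEncryptionSketch {
  // Hypothetical key name, following the pseudocode in the description.
  static final String KEY = "mr.intermediate.encryption";

  static boolean shouldEncryptIntermediate(Configuration conf, Path... inputPaths)
      throws IOException {
    if (conf.get(KEY) != null) {
      return conf.getBoolean(KEY, false);   // an explicit user setting wins
    }
    // Otherwise, turn it on if any input path reports isEncrypted() == true.
    for (Path p : inputPaths) {
      FileSystem fs = p.getFileSystem(conf);
      FileStatus status = fs.getFileStatus(p);
      if (status.isEncrypted()) {
        return true;
      }
    }
    return false;
  }
}
{code}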
[jira] [Commented] (HDFS-7078) Fix listEZs to work correctly with snapshots
[ https://issues.apache.org/jira/browse/HDFS-7078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138815#comment-14138815 ] Hudson commented on HDFS-7078: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #684 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/684/]) HDFS-7078. Fix listEZs to work correctly with snapshots. (wang) (wang: rev 0ecefe60179968984b1892a14411566b7a0c8df3) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZones.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EncryptionZoneManager.java Fix listEZs to work correctly with snapshots Key: HDFS-7078 URL: https://issues.apache.org/jira/browse/HDFS-7078 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 2.6.0 Attachments: hdfs-7078.001.patch, hdfs-7078.002.patch listEZs will list encryption zones that are only present in a snapshot, rather than only the EZs in the current filesystem state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138825#comment-14138825 ] Hadoop QA commented on HDFS-6606: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12669675/HDFS-6606.005.patch against trunk revision ee21b13. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8077//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8077//console This message is automatically generated. Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-6606.001.patch, HDFS-6606.002.patch, HDFS-6606.003.patch, HDFS-6606.004.patch, HDFS-6606.005.patch, OptimizeHdfsEncryptedTransportperformance.pdf In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism (use Qop: auth-conf), it supports three security strength: * high 3des or rc4 (128bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6995) Block should be placed in the client's 'rack-local' node if 'client-local' node is not available
[ https://issues.apache.org/jira/browse/HDFS-6995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138913#comment-14138913 ] Hadoop QA commented on HDFS-6995: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/1262/HDFS-6995-002.patch against trunk revision ee21b13. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1265 javac compiler warnings (more than the trunk's current 677 warnings). {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 109 warning messages. See https://builds.apache.org/job/PreCommit-HDFS-Build/8079//artifact/trunk/patchprocess/diffJavadocWarnings.txt for details. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.crypto.random.TestOsSecureRandom org.apache.hadoop.ha.TestZKFailoverControllerStress org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.server.namenode.TestMetaSave The following test timeouts occurred in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8079//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8079//artifact/trunk/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8079//console This message is automatically generated. Block should be placed in the client's 'rack-local' node if 'client-local' node is not available Key: HDFS-6995 URL: https://issues.apache.org/jira/browse/HDFS-6995 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6995-001.patch, HDFS-6995-002.patch The HDFS cluster is rack aware. The client is on a different node than any datanode, but the same rack contains one or more datanodes. In this case, first preference should be given to selecting a 'rack-local' node. Currently, since no node in the clusterMap corresponds to the client's location, the block placement policy chooses a *random* node as the local node and proceeds with further placements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138914#comment-14138914 ] Hadoop QA commented on HDFS-6808: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12669695/HDFS-6808.006.combo.patch against trunk revision ee21b13. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.crypto.random.TestOsSecureRandom org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.server.namenode.ha.TestInitializeSharedEdits The following test timeouts occurred in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8081//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8081//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8081//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8081//console This message is automatically generated. Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch, HDFS-6808.001.combo.patch, HDFS-6808.001.patch, HDFS-6808.002.combo.patch, HDFS-6808.002.patch, HDFS-6808.003.combo.txt, HDFS-6808.003.patch, HDFS-6808.004.combo.patch, HDFS-6808.004.patch, HDFS-6808.005.combo.patch, HDFS-6808.005.patch, HDFS-6808.006.combo.patch, HDFS-6808.006.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7075) hadoop-fuse-dfs fails because it cannot find JavaKeyStoreProvider$Factory
[ https://issues.apache.org/jira/browse/HDFS-7075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138940#comment-14138940 ] Hudson commented on HDFS-7075: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1900 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1900/]) HDFS-7075. hadoop-fuse-dfs fails because it cannot find JavaKeyStoreProvider$Factory. (cmccabe) (cmccabe: rev f23024852502441fc259012664e444e5e51c604a) * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/KeyProviderFactory.java * hadoop-common-project/hadoop-common/CHANGES.txt hadoop-fuse-dfs fails because it cannot find JavaKeyStoreProvider$Factory - Key: HDFS-7075 URL: https://issues.apache.org/jira/browse/HDFS-7075 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: 2.6.0 Attachments: HDFS-7075.001.patch hadoop-fuse-dfs fails complaining with: {code} java.util.ServiceConfigurationError: org.apache.hadoop.crypto.key.KeyProviderFactory: Provider org.apache.hadoop.crypto.key.JavaKeyStoreProvider$Factory not found {code} Here is an example of the hadoop-fuse-dfs debug output. {code} 14/09/04 13:49:04 WARN crypto.CryptoCodec: Crypto codec org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec is not available. hdfsBuilderConnect(forceNewInstance=1, nn=hdfs://hdfs-cdh5-secure-1.vpc.cloudera.com:8020, port=0, kerbTicketCachePath=/tmp/krb5cc_0, userName=root) error: java.util.ServiceConfigurationError: org.apache.hadoop.crypto.key.KeyProviderFactory: Provider org.apache.hadoop.crypto.key.JavaKeyStoreProvider$Factory not found at java.util.ServiceLoader.fail(ServiceLoader.java:231) at java.util.ServiceLoader.access$300(ServiceLoader.java:181) at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:365) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
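Editor's note: as background on the failure above, KeyProviderFactory implementations are discovered through java.util.ServiceLoader, which reads META-INF/services/org.apache.hadoop.crypto.key.KeyProviderFactory and loads each listed class during iteration. A listed class that the JVM cannot load, as in the FUSE case, surfaces as the ServiceConfigurationError shown in the stack trace. A minimal sketch of that discovery path:
{code}
import java.util.ServiceLoader;
import org.apache.hadoop.crypto.key.KeyProviderFactory;

public class ListKeyProviderFactories {
  public static void main(String[] args) {
    // Iteration is what triggers class loading, so this is where a missing or
    // unloadable provider class throws java.util.ServiceConfigurationError.
    for (KeyProviderFactory factory : ServiceLoader.load(KeyProviderFactory.class)) {
      System.out.println("Found factory: " + factory.getClass().getName());
    }
  }
}
{code}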
[jira] [Commented] (HDFS-7004) Update KeyProvider instantiation to create by URI
[ https://issues.apache.org/jira/browse/HDFS-7004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138944#comment-14138944 ] Hudson commented on HDFS-7004: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1900 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1900/]) HDFS-7004. Update KeyProvider instantiation to create by URI. (wang) (wang: rev 10e8602f32b553a1424f1a9b5f9f74f7b68a49d1) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZones.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-common-project/hadoop-kms/src/test/java/org/apache/hadoop/crypto/key/kms/server/MiniKMS.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-common-project/hadoop-kms/src/main/conf/kms-site.xml * hadoop-common-project/hadoop-kms/src/main/java/org/apache/hadoop/crypto/key/kms/server/KMSWebApp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-common-project/hadoop-kms/src/main/java/org/apache/hadoop/crypto/key/kms/server/KMSConfiguration.java * hadoop-common-project/hadoop-kms/src/site/apt/index.apt.vm * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReservedRawPaths.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZonesWithHA.java * hadoop-hdfs-project/hadoop-hdfs/src/site/apt/TransparentEncryption.apt.vm * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/cli/TestCryptoAdminCLI.java * hadoop-common-project/hadoop-kms/src/test/java/org/apache/hadoop/crypto/key/kms/server/TestKMS.java Update KeyProvider instantiation to create by URI - Key: HDFS-7004 URL: https://issues.apache.org/jira/browse/HDFS-7004 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 2.6.0 Attachments: hdfs-7004.001.patch, hdfs-7004.002.patch, hdfs-7004.004.patch See HADOOP-11054, would be good to update the NN/DFSClient to fetch via this method rather than depending on the URI path lookup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6843) Create FileStatus isEncrypted() method
[ https://issues.apache.org/jira/browse/HDFS-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138939#comment-14138939 ] Hudson commented on HDFS-6843: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1900 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1900/]) HDFS-6843. Create FileStatus isEncrypted() method (clamb via cmccabe) (cmccabe: rev e3803d002c660f18a5c2ecf32344fd6f3f491a5b) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/FsAclPermission.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/permission/FsPermission.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractOpenTest.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSAclBaseTest.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileStatus.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZones.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/FsPermissionExtension.java * hadoop-common-project/hadoop-common/src/site/markdown/filesystem/filesystem.md HDFS-6843. Add to CHANGES.txt (cmccabe: rev f24ac429d102777fe021e9852cfff38312643512) * hadoop-common-project/hadoop-common/CHANGES.txt Create FileStatus isEncrypted() method -- Key: HDFS-6843 URL: https://issues.apache.org/jira/browse/HDFS-6843 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Fix For: 2.6.0 Attachments: HDFS-6843.001.patch, HDFS-6843.002.patch, HDFS-6843.003.patch, HDFS-6843.004.patch, HDFS-6843.005.patch, HDFS-6843.005.patch, HDFS-6843.006.patch, HDFS-6843.007.patch, HDFS-6843.008.patch, HDFS-6843.009.patch, HDFS-6843.010.patch FileStatus should have a 'boolean isEncrypted()' method. (it was in the context of discussing with AndreW about FileStatus being a Writable). Having this method would allow MR JobSubmitter do the following: - BOOLEAN intermediateEncryption = false IF jobconf.contains(mr.intermidate.encryption) THEN intermediateEncryption = jobConf.getBoolean(mr.intermidate.encryption) ELSE IF (I/O)Format INSTANCEOF File(I/O)Format THEN intermediateEncryption = ANY File(I/O)Format HAS a Path with status isEncrypted()==TRUE FI jobConf.setBoolean(mr.intermidate.encryption, intermediateEncryption) FI -- This message was sent by Atlassian JIRA (v6.3.4#6332)
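Editor's note: the description above is written as pseudocode; the following is a hedged Java rendering of it for readability. The property name "mr.intermidate.encryption" and the idea of inspecting the paths owned by a File(Input|Output)Format are taken verbatim from the description and are illustrative, not a real MapReduce configuration key. FileStatus#isEncrypted() is the method this JIRA adds.
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

public class IntermediateEncryptionDecision {
  /** Decide whether intermediate data should be encrypted, per the description above. */
  public static boolean decide(Configuration jobConf, Path[] fileFormatPaths)
      throws IOException {
    if (jobConf.get("mr.intermidate.encryption") != null) {
      return jobConf.getBoolean("mr.intermidate.encryption", false);
    }
    boolean intermediateEncryption = false;
    for (Path p : fileFormatPaths) {  // paths used by a File(Input|Output)Format
      FileStatus status = p.getFileSystem(jobConf).getFileStatus(p);
      if (status.isEncrypted()) {     // the new method proposed by this JIRA
        intermediateEncryption = true;
        break;
      }
    }
    jobConf.setBoolean("mr.intermidate.encryption", intermediateEncryption);
    return intermediateEncryption;
  }
}
{code}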
[jira] [Commented] (HDFS-6705) Create an XAttr that disallows the HDFS admin from accessing a file
[ https://issues.apache.org/jira/browse/HDFS-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138936#comment-14138936 ] Hudson commented on HDFS-6705: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1900 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1900/]) HDFS-6705. Create an XAttr that disallows the HDFS admin from accessing a file. (clamb via wang) (wang: rev ea4e2e843ecadd8019ea35413f4a34b97a424923) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/XAttrPermissionFilter.java * hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testXAttrConf.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * hadoop-hdfs-project/hadoop-hdfs/src/site/apt/ExtendedAttributes.apt.vm * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSXAttrBaseTest.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java Create an XAttr that disallows the HDFS admin from accessing a file --- Key: HDFS-6705 URL: https://issues.apache.org/jira/browse/HDFS-6705 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Fix For: 2.6.0 Attachments: HDFS-6705.001.patch, HDFS-6705.002.patch, HDFS-6705.003.patch, HDFS-6705.004.patch, HDFS-6705.005.patch, HDFS-6705.006.patch, HDFS-6705.007.patch, HDFS-6705.008.patch There needs to be an xattr that specifies that the HDFS admin can not access a file. This is needed for m/r delegation tokens and data at rest encryption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
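Editor's note: to make the feature concrete, the work adds a protected extended attribute which, once set on a file, prevents even the HDFS superuser from reading that file's contents. A minimal usage sketch follows; the attribute name security.hdfs.unreadable.by.superuser is my recollection of what this JIRA introduced and should be verified against the committed ExtendedAttributes documentation.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MarkUnreadableBySuperuser {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // A value-less xattr; once present, the HDFS admin cannot read the file's data.
    fs.setXAttr(new Path("/user/alice/secret.dat"),
        "security.hdfs.unreadable.by.superuser", null);
  }
}
{code}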
[jira] [Commented] (HDFS-7078) Fix listEZs to work correctly with snapshots
[ https://issues.apache.org/jira/browse/HDFS-7078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138943#comment-14138943 ] Hudson commented on HDFS-7078: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1900 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1900/]) HDFS-7078. Fix listEZs to work correctly with snapshots. (wang) (wang: rev 0ecefe60179968984b1892a14411566b7a0c8df3) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZones.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EncryptionZoneManager.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Fix listEZs to work correctly with snapshots Key: HDFS-7078 URL: https://issues.apache.org/jira/browse/HDFS-7078 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 2.6.0 Attachments: hdfs-7078.001.patch, hdfs-7078.002.patch listEZs will list encryption zones that are only present in a snapshot, rather than only the EZs in the current filesystem state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
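Editor's note: listing encryption zones is exposed through HdfsAdmin, and the bug here is that zones removed from the live namespace but still captured in a snapshot kept appearing in the listing. A minimal sketch of the call path, assuming the 2.6.0 HdfsAdmin API (the NameNode URI is a placeholder):
{code}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.RemoteIterator;
import org.apache.hadoop.hdfs.client.HdfsAdmin;
import org.apache.hadoop.hdfs.protocol.EncryptionZone;

public class ListEncryptionZones {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    HdfsAdmin admin = new HdfsAdmin(URI.create("hdfs://namenode:8020"), conf);
    RemoteIterator<EncryptionZone> zones = admin.listEncryptionZones();
    while (zones.hasNext()) {
      // With the fix, only zones present in the current filesystem state are returned.
      System.out.println(zones.next().getPath());
    }
  }
}
{code}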
[jira] [Commented] (HDFS-6843) Create FileStatus isEncrypted() method
[ https://issues.apache.org/jira/browse/HDFS-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138952#comment-14138952 ] Hudson commented on HDFS-6843: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1875 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1875/]) HDFS-6843. Create FileStatus isEncrypted() method (clamb via cmccabe) (cmccabe: rev e3803d002c660f18a5c2ecf32344fd6f3f491a5b) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/FsAclPermission.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java * hadoop-common-project/hadoop-common/src/site/markdown/filesystem/filesystem.md * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/permission/FsPermission.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/FsPermissionExtension.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSAclBaseTest.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractOpenTest.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileStatus.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZones.java HDFS-6843. Add to CHANGES.txt (cmccabe: rev f24ac429d102777fe021e9852cfff38312643512) * hadoop-common-project/hadoop-common/CHANGES.txt Create FileStatus isEncrypted() method -- Key: HDFS-6843 URL: https://issues.apache.org/jira/browse/HDFS-6843 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Fix For: 2.6.0 Attachments: HDFS-6843.001.patch, HDFS-6843.002.patch, HDFS-6843.003.patch, HDFS-6843.004.patch, HDFS-6843.005.patch, HDFS-6843.005.patch, HDFS-6843.006.patch, HDFS-6843.007.patch, HDFS-6843.008.patch, HDFS-6843.009.patch, HDFS-6843.010.patch FileStatus should have a 'boolean isEncrypted()' method. (it was in the context of discussing with AndreW about FileStatus being a Writable). Having this method would allow MR JobSubmitter do the following: - BOOLEAN intermediateEncryption = false IF jobconf.contains(mr.intermidate.encryption) THEN intermediateEncryption = jobConf.getBoolean(mr.intermidate.encryption) ELSE IF (I/O)Format INSTANCEOF File(I/O)Format THEN intermediateEncryption = ANY File(I/O)Format HAS a Path with status isEncrypted()==TRUE FI jobConf.setBoolean(mr.intermidate.encryption, intermediateEncryption) FI -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7078) Fix listEZs to work correctly with snapshots
[ https://issues.apache.org/jira/browse/HDFS-7078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138956#comment-14138956 ] Hudson commented on HDFS-7078: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1875 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1875/]) HDFS-7078. Fix listEZs to work correctly with snapshots. (wang) (wang: rev 0ecefe60179968984b1892a14411566b7a0c8df3) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZones.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EncryptionZoneManager.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Fix listEZs to work correctly with snapshots Key: HDFS-7078 URL: https://issues.apache.org/jira/browse/HDFS-7078 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 2.6.0 Attachments: hdfs-7078.001.patch, hdfs-7078.002.patch listEZs will list encryption zones that are only present in a snapshot, rather than only the EZs in the current filesystem state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7075) hadoop-fuse-dfs fails because it cannot find JavaKeyStoreProvider$Factory
[ https://issues.apache.org/jira/browse/HDFS-7075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138953#comment-14138953 ] Hudson commented on HDFS-7075: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1875 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1875/]) HDFS-7075. hadoop-fuse-dfs fails because it cannot find JavaKeyStoreProvider$Factory. (cmccabe) (cmccabe: rev f23024852502441fc259012664e444e5e51c604a) * hadoop-common-project/hadoop-common/CHANGES.txt * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/KeyProviderFactory.java hadoop-fuse-dfs fails because it cannot find JavaKeyStoreProvider$Factory - Key: HDFS-7075 URL: https://issues.apache.org/jira/browse/HDFS-7075 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: 2.6.0 Attachments: HDFS-7075.001.patch hadoop-fuse-dfs fails complaining with: {code} java.util.ServiceConfigurationError: org.apache.hadoop.crypto.key.KeyProviderFactory: Provider org.apache.hadoop.crypto.key.JavaKeyStoreProvider$Factory not found {code} Here is an example of the hadoop-fuse-dfs debug output. {code} 14/09/04 13:49:04 WARN crypto.CryptoCodec: Crypto codec org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec is not available. hdfsBuilderConnect(forceNewInstance=1, nn=hdfs://hdfs-cdh5-secure-1.vpc.cloudera.com:8020, port=0, kerbTicketCachePath=/tmp/krb5cc_0, userName=root) error: java.util.ServiceConfigurationError: org.apache.hadoop.crypto.key.KeyProviderFactory: Provider org.apache.hadoop.crypto.key.JavaKeyStoreProvider$Factory not found at java.util.ServiceLoader.fail(ServiceLoader.java:231) at java.util.ServiceLoader.access$300(ServiceLoader.java:181) at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:365) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6705) Create an XAttr that disallows the HDFS admin from accessing a file
[ https://issues.apache.org/jira/browse/HDFS-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138949#comment-14138949 ] Hudson commented on HDFS-6705: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1875 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1875/]) HDFS-6705. Create an XAttr that disallows the HDFS admin from accessing a file. (clamb via wang) (wang: rev ea4e2e843ecadd8019ea35413f4a34b97a424923) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/XAttrPermissionFilter.java * hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testXAttrConf.xml * hadoop-hdfs-project/hadoop-hdfs/src/site/apt/ExtendedAttributes.apt.vm * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSXAttrBaseTest.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Create an XAttr that disallows the HDFS admin from accessing a file --- Key: HDFS-6705 URL: https://issues.apache.org/jira/browse/HDFS-6705 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Fix For: 2.6.0 Attachments: HDFS-6705.001.patch, HDFS-6705.002.patch, HDFS-6705.003.patch, HDFS-6705.004.patch, HDFS-6705.005.patch, HDFS-6705.006.patch, HDFS-6705.007.patch, HDFS-6705.008.patch There needs to be an xattr that specifies that the HDFS admin can not access a file. This is needed for m/r delegation tokens and data at rest encryption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7004) Update KeyProvider instantiation to create by URI
[ https://issues.apache.org/jira/browse/HDFS-7004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138957#comment-14138957 ] Hudson commented on HDFS-7004: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1875 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1875/]) HDFS-7004. Update KeyProvider instantiation to create by URI. (wang) (wang: rev 10e8602f32b553a1424f1a9b5f9f74f7b68a49d1) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-common-project/hadoop-kms/src/site/apt/index.apt.vm * hadoop-common-project/hadoop-kms/src/test/java/org/apache/hadoop/crypto/key/kms/server/TestKMS.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReservedRawPaths.java * hadoop-common-project/hadoop-kms/src/test/java/org/apache/hadoop/crypto/key/kms/server/MiniKMS.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-common-project/hadoop-kms/src/main/java/org/apache/hadoop/crypto/key/kms/server/KMSConfiguration.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/cli/TestCryptoAdminCLI.java * hadoop-common-project/hadoop-kms/src/main/conf/kms-site.xml * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZonesWithHA.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * hadoop-common-project/hadoop-kms/src/main/java/org/apache/hadoop/crypto/key/kms/server/KMSWebApp.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZones.java * hadoop-hdfs-project/hadoop-hdfs/src/site/apt/TransparentEncryption.apt.vm Update KeyProvider instantiation to create by URI - Key: HDFS-7004 URL: https://issues.apache.org/jira/browse/HDFS-7004 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 2.6.0 Attachments: hdfs-7004.001.patch, hdfs-7004.002.patch, hdfs-7004.004.patch See HADOOP-11054, would be good to update the NN/DFSClient to fetch via this method rather than depending on the URI path lookup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6970) Move startFile EDEK retries to the DFSClient
[ https://issues.apache.org/jira/browse/HDFS-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138973#comment-14138973 ] Yi Liu commented on HDFS-6970: -- This modification LGTM, thanks [~andrew.wang] Move startFile EDEK retries to the DFSClient Key: HDFS-6970 URL: https://issues.apache.org/jira/browse/HDFS-6970 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 2.5.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-6970.001.patch [~sureshms] pointed out that holding on to an RPC handler while talking to the KMS is bad, since it can exhaust the available handlers. Let's avoid this by doing retries at the DFSClient rather than in the RPC handler, and moving EDEK fetching to the background. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
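Editor's note: the change under review moves the wait for an encryption key from the NameNode's RPC handler to the client. A hedged sketch of that client-side shape follows; it is not the patch itself. The RetryStartFileException name is taken from the later review comments on this issue, and createOnce and CREATE_RETRY_COUNT are hypothetical stand-ins for the single startFile RPC and the retry bound.
{code}
import java.io.IOException;

/** Hedged sketch of the DFSClient-side retry loop described above. */
public abstract class StartFileRetrier {
  /** Thrown by the single-attempt create when the NameNode's EDEK is not yet available. */
  public static class RetryStartFileException extends IOException {}

  private static final int CREATE_RETRY_COUNT = 10;  // illustrative bound

  /** One startFile attempt; in the real client this is an RPC to the NameNode. */
  protected abstract void createOnce(String path) throws IOException;

  public void createWithRetries(String path) throws IOException {
    for (int attempt = 0; attempt < CREATE_RETRY_COUNT; attempt++) {
      try {
        createOnce(path);
        return;
      } catch (RetryStartFileException e) {
        // The NameNode handler has already returned, so no handler thread is held
        // while the KMS catches up; the client simply tries again.
      }
    }
    throw new IOException("Could not create " + path + " after retries");
  }
}
{code}
The design point is exactly the one Suresh raised: a slow KMS should cost the client a retry, not pin a scarce NameNode RPC handler.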
[jira] [Commented] (HDFS-6584) Support Archival Storage
[ https://issues.apache.org/jira/browse/HDFS-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138977#comment-14138977 ] Hadoop QA commented on HDFS-6584: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12669683/h6584_20140918b.patch against trunk revision ee21b13. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 28 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.ipc.TestFairCallQueue org.apache.hadoop.ipc.TestCallQueueManager org.apache.hadoop.crypto.random.TestOsSecureRandom org.apache.hadoop.hdfs.server.mover.TestStorageMover org.apache.hadoop.tracing.TestTracing org.apache.hadoop.hdfs.server.datanode.TestBPOfferService org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer org.apache.hadoop.hdfs.TestHFlush org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8078//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8078//console This message is automatically generated. Support Archival Storage Key: HDFS-6584 URL: https://issues.apache.org/jira/browse/HDFS-6584 Project: Hadoop HDFS Issue Type: New Feature Components: balancer, namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: HDFS-6584.000.patch, HDFSArchivalStorageDesign20140623.pdf, HDFSArchivalStorageDesign20140715.pdf, archival-storage-testplan.pdf, h6584_20140907.patch, h6584_20140908.patch, h6584_20140908b.patch, h6584_20140911.patch, h6584_20140911b.patch, h6584_20140915.patch, h6584_20140916.patch, h6584_20140916.patch, h6584_20140917.patch, h6584_20140917b.patch, h6584_20140918.patch, h6584_20140918b.patch In most of the Hadoop clusters, as more and more data is stored for longer time, the demand for storage is outstripping the compute. Hadoop needs a cost effective and easy to manage solution to meet this demand for storage. Current solution is: - Delete the old unused data. This comes at operational cost of identifying unnecessary data and deleting them manually. - Add more nodes to the clusters. This adds along with storage capacity unnecessary compute capacity to the cluster. Hadoop needs a solution to decouple growing storage capacity from compute capacity. Nodes with higher density and less expensive storage with low compute power are becoming available and can be used as cold storage in the clusters. Based on policy the data from hot storage can be moved to cold storage. Adding more nodes to the cold storage can grow the storage independent of the compute capacity in the cluster. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6584) Support Archival Storage
[ https://issues.apache.org/jira/browse/HDFS-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139122#comment-14139122 ] Hadoop QA commented on HDFS-6584: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12669683/h6584_20140918b.patch against trunk revision ee21b13. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 28 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.server.mover.TestStorageMover org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8082//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8082//console This message is automatically generated. Support Archival Storage Key: HDFS-6584 URL: https://issues.apache.org/jira/browse/HDFS-6584 Project: Hadoop HDFS Issue Type: New Feature Components: balancer, namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: HDFS-6584.000.patch, HDFSArchivalStorageDesign20140623.pdf, HDFSArchivalStorageDesign20140715.pdf, archival-storage-testplan.pdf, h6584_20140907.patch, h6584_20140908.patch, h6584_20140908b.patch, h6584_20140911.patch, h6584_20140911b.patch, h6584_20140915.patch, h6584_20140916.patch, h6584_20140916.patch, h6584_20140917.patch, h6584_20140917b.patch, h6584_20140918.patch, h6584_20140918b.patch In most of the Hadoop clusters, as more and more data is stored for longer time, the demand for storage is outstripping the compute. Hadoop needs a cost effective and easy to manage solution to meet this demand for storage. Current solution is: - Delete the old unused data. This comes at operational cost of identifying unnecessary data and deleting them manually. - Add more nodes to the clusters. This adds along with storage capacity unnecessary compute capacity to the cluster. Hadoop needs a solution to decouple growing storage capacity from compute capacity. Nodes with higher density and less expensive storage with low compute power are becoming available and can be used as cold storage in the clusters. Based on policy the data from hot storage can be moved to cold storage. Adding more nodes to the cold storage can grow the storage independent of the compute capacity in the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7088) Archival Storage: fix TestBalancer and TestBalancerWithMultipleNameNodes
[ https://issues.apache.org/jira/browse/HDFS-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139244#comment-14139244 ] Jing Zhao commented on HDFS-7088: - Thanks for the fix, [~szetszwo]! The patch looks good to me. I've also verified that all the balancer related tests passed with the patch. +1 Archival Storage: fix TestBalancer and TestBalancerWithMultipleNameNodes Key: HDFS-7088 URL: https://issues.apache.org/jira/browse/HDFS-7088 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer, test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h7088_20140918.patch {noformat} java.lang.AssertionError: expected:0 but was:-3 at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.runBalancer(TestBalancerWithMultipleNameNodes.java:163) at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.runTest(TestBalancerWithMultipleNameNodes.java:365) at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testBalancer(TestBalancerWithMultipleNameNodes.java:379) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6841) Use Time.monotonicNow() wherever applicable instead of Time.now()
[ https://issues.apache.org/jira/browse/HDFS-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-6841: Attachment: HDFS-6841-003.patch Updated as per [~cmccabe] comments. Use Time.monotonicNow() wherever applicable instead of Time.now() - Key: HDFS-6841 URL: https://issues.apache.org/jira/browse/HDFS-6841 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6841-001.patch, HDFS-6841-002.patch, HDFS-6841-003.patch {{Time.now()}} used in many places to calculate elapsed time. This should be replaced with {{Time.monotonicNow()}} to avoid effect of System time changes on elapsed time calculations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
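Editor's note: a small example of the pattern this patch applies throughout the codebase. Time.now() and Time.monotonicNow() are real methods on org.apache.hadoop.util.Time; the sleep stands in for whatever work is being timed.
{code}
import org.apache.hadoop.util.Time;

public class ElapsedTimeExample {
  public static void main(String[] args) throws InterruptedException {
    // Time.now() tracks the wall clock and jumps if the system time is changed;
    // Time.monotonicNow() only moves forward, so it is safe for measuring durations.
    long start = Time.monotonicNow();
    Thread.sleep(100);  // placeholder for real work
    long elapsedMs = Time.monotonicNow() - start;
    System.out.println("elapsed ms: " + elapsedMs);
  }
}
{code}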
[jira] [Updated] (HDFS-6995) Block should be placed in the client's 'rack-local' node if 'client-local' node is not available
[ https://issues.apache.org/jira/browse/HDFS-6995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-6995: Attachment: HDFS-6995-003.patch Rebased the patch Block should be placed in the client's 'rack-local' node if 'client-local' node is not available Key: HDFS-6995 URL: https://issues.apache.org/jira/browse/HDFS-6995 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6995-001.patch, HDFS-6995-002.patch, HDFS-6995-003.patch HDFS cluster is rack aware. Client is in different node than of datanode, but Same rack contains one or more datanodes. In this case first preference should be given to select 'rack-local' node. Currently, since no Node in clusterMap corresponds to client's location, blockplacement policy choosing a *random* node as local node and proceeding for further placements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-7088) Archival Storage: fix TestBalancer and TestBalancerWithMultipleNameNodes
[ https://issues.apache.org/jira/browse/HDFS-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao resolved HDFS-7088. - Resolution: Fixed Hadoop Flags: Reviewed I've committed this. Archival Storage: fix TestBalancer and TestBalancerWithMultipleNameNodes Key: HDFS-7088 URL: https://issues.apache.org/jira/browse/HDFS-7088 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer, test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h7088_20140918.patch {noformat} java.lang.AssertionError: expected:0 but was:-3 at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.runBalancer(TestBalancerWithMultipleNameNodes.java:163) at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.runTest(TestBalancerWithMultipleNameNodes.java:365) at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testBalancer(TestBalancerWithMultipleNameNodes.java:379) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6581) Write to single replica in memory
[ https://issues.apache.org/jira/browse/HDFS-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6581: Attachment: HDFS-6581.merge.04.patch Write to single replica in memory - Key: HDFS-6581 URL: https://issues.apache.org/jira/browse/HDFS-6581 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6581.merge.01.patch, HDFS-6581.merge.02.patch, HDFS-6581.merge.03.patch, HDFS-6581.merge.04.patch, HDFSWriteableReplicasInMemory.pdf, Test-Plan-for-HDFS-6581-Memory-Storage.pdf Per discussion with the community on HDFS-5851, we will implement writing to a single replica in DN memory via DataTransferProtocol. This avoids some of the issues with short-circuit writes, which we can revisit at a later time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7073) Allow falling back to a non-SASL connection on DataTransferProtocol in several edge cases.
[ https://issues.apache.org/jira/browse/HDFS-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139272#comment-14139272 ] Chris Nauroth commented on HDFS-7073: - bq. In the patch, fallback for writeblock is handled, but fallback for readblock is not handled. Yes, I spotted the same thing in my testing yesterday and chose to cancel the patch to make it clear that it's not ready. I'm working on a new patch. Thank you for your testing too. bq. The test case for this scenario is hard to write because UserGroupInformation#isSecurityEnabled() is static... Yes, agreed. Unfortunately, until we refactor some of the static stuff inside {{UserGroupInformation}}, it's going to be impossible to put tests covering these kinds of cross-cluster scenarios directly into the source tree. We're having to rely on external system tests to cover this. Last time I looked at refactoring {{UserGroupInformation}}, it looked like it was going to be a big effort, and possibly backwards-incompatible. bq. If we allow this type of fallback, as discussed in HDFS-2856 about the attack vector, a malicious task can easily listen on the DN's port after it dies and steal the block access token. So we'd better not allow the fallback? Thanks, great catch. The difficulty here is that {{ipc.client.fallback-to-simple-auth-allowed}} controls fallback globally regardless of which cluster the client is connecting to. One of the big use cases motivating fallback is distcp between a secure cluster and a non-secure cluster. In that scenario, setting {{ipc.client.fallback-to-simple-auth-allowed}} could accidentally trigger fallback during communication with the secured cluster, when we really only want it for the unsecured cluster. I'm going to explore an alternative implementation that detects if fallback actually occurred during the corresponding NameNode interaction before the DataTransferProtocol call. This would tell us unambiguously if the remote DataNode was unsecured. Doing this would require some additional plumbing at the RPC layer. Allow falling back to a non-SASL connection on DataTransferProtocol in several edge cases. -- Key: HDFS-7073 URL: https://issues.apache.org/jira/browse/HDFS-7073 Project: Hadoop HDFS Issue Type: Bug Components: datanode, hdfs-client, security Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-7073.1.patch HDFS-2856 implemented general SASL support on DataTransferProtocol. Part of that work also included a fallback mode in case the remote cluster is running under a different configuration without SASL. I've discovered a few edge case configurations that this did not support: * Cluster is unsecured, but has block access tokens enabled. This is not something I've seen done in practice, but I've heard historically it has been allowed. The HDFS-2856 code relied on seeing an empty block access token to trigger fallback, and this doesn't work if the unsecured cluster actually is using block access tokens. * The DataNode has an unpublicized testing configuration property that could be used to skip the privileged port check. However, the HDFS-2856 code is still enforcing requirement of SASL when the ports are not privileged, so this would force existing configurations to make changes to activate SASL. This patch will restore the old behavior so that these edge case configurations will continue to work the same way. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
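Editor's note: the difficulty Chris describes is that the fallback knob is a single client-wide flag, so it cannot be scoped to one remote cluster in a distcp between secure and insecure clusters. A minimal sketch of how a client would read it; the property name is taken from the comment above, and the false default is assumed.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class FallbackFlagCheck {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // One global flag: it applies to every connection the client makes.
    boolean fallbackAllowed =
        conf.getBoolean("ipc.client.fallback-to-simple-auth-allowed", false);
    System.out.println("security enabled: " + UserGroupInformation.isSecurityEnabled());
    System.out.println("fallback to simple auth allowed: " + fallbackAllowed);
  }
}
{code}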
[jira] [Resolved] (HDFS-7084) FsDatasetImpl#copyBlockFiles debug log can be improved
[ https://issues.apache.org/jira/browse/HDFS-7084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal resolved HDFS-7084. - Resolution: Fixed Fix Version/s: HDFS-6581 Hadoop Flags: Reviewed +1, committed to the feature branch. Thanks Xiaoyu! The old log message was incorrect. FsDatasetImpl#copyBlockFiles debug log can be improved -- Key: HDFS-7084 URL: https://issues.apache.org/jira/browse/HDFS-7084 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Priority: Minor Fix For: HDFS-6581 Attachments: HDFS-7084.0.patch "addBlock: Moved" should be replaced with "Copied" or "lazyPersistReplica: Copied" to avoid confusion. {code} static File[] copyBlockFiles(long blockId, long genStamp, File srcMeta, File srcFile, File destRoot) { ... if (LOG.isDebugEnabled()) { LOG.debug("addBlock: Moved " + srcMeta + " to " + dstMeta); LOG.debug("addBlock: Moved " + srcFile + " to " + dstFile); } } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
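Editor's note: a hedged sketch of what the corrected messages might look like after the rename the reporter suggests; the exact wording is the committer's choice and is not quoted from the committed patch.
{code}
import java.io.File;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class CopyBlockFilesLogging {
  private static final Log LOG = LogFactory.getLog(CopyBlockFilesLogging.class);

  static void logCopied(File srcMeta, File dstMeta, File srcFile, File dstFile) {
    if (LOG.isDebugEnabled()) {
      // "Copied", not "Moved": copyBlockFiles leaves the source files in place.
      LOG.debug("Copied " + srcMeta + " to " + dstMeta);
      LOG.debug("Copied " + srcFile + " to " + dstFile);
    }
  }
}
{code}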
[jira] [Commented] (HDFS-6584) Support Archival Storage
[ https://issues.apache.org/jira/browse/HDFS-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139273#comment-14139273 ] Jing Zhao commented on HDFS-6584: - Failures of TestEncryptionZonesWithKMS, TestWebHdfsFileSystemContract, and TestPipelinesFailover are also seen in other Jenkins run and should be unrelated. Failure of TestOfflineEditsViewer is expected since we need to update the editsStored binary file. Failure of TestStorageMover cannot be reproduced in my local machine (I run the test 100 times but still could not reproduce the failure). Maybe it's related to the Jenkins environment. We can track it in a separate jira. I think the feature is ready to be merged into trunk once the vote is closed. [~szetszwo], can you close the vote in the dev mailing list? Support Archival Storage Key: HDFS-6584 URL: https://issues.apache.org/jira/browse/HDFS-6584 Project: Hadoop HDFS Issue Type: New Feature Components: balancer, namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: HDFS-6584.000.patch, HDFSArchivalStorageDesign20140623.pdf, HDFSArchivalStorageDesign20140715.pdf, archival-storage-testplan.pdf, h6584_20140907.patch, h6584_20140908.patch, h6584_20140908b.patch, h6584_20140911.patch, h6584_20140911b.patch, h6584_20140915.patch, h6584_20140916.patch, h6584_20140916.patch, h6584_20140917.patch, h6584_20140917b.patch, h6584_20140918.patch, h6584_20140918b.patch In most of the Hadoop clusters, as more and more data is stored for longer time, the demand for storage is outstripping the compute. Hadoop needs a cost effective and easy to manage solution to meet this demand for storage. Current solution is: - Delete the old unused data. This comes at operational cost of identifying unnecessary data and deleting them manually. - Add more nodes to the clusters. This adds along with storage capacity unnecessary compute capacity to the cluster. Hadoop needs a solution to decouple growing storage capacity from compute capacity. Nodes with higher density and less expensive storage with low compute power are becoming available and can be used as cold storage in the clusters. Based on policy the data from hot storage can be moved to cold storage. Adding more nodes to the cold storage can grow the storage independent of the compute capacity in the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6808: Attachment: HDFS-6808.007.patch Update the patch to address findbugs warnings. Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch, HDFS-6808.001.combo.patch, HDFS-6808.001.patch, HDFS-6808.002.combo.patch, HDFS-6808.002.patch, HDFS-6808.003.combo.txt, HDFS-6808.003.patch, HDFS-6808.004.combo.patch, HDFS-6808.004.patch, HDFS-6808.005.combo.patch, HDFS-6808.005.patch, HDFS-6808.006.combo.patch, HDFS-6808.006.patch, HDFS-6808.007.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6808: Attachment: HDFS-6808.007.combo.patch Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch, HDFS-6808.001.combo.patch, HDFS-6808.001.patch, HDFS-6808.002.combo.patch, HDFS-6808.002.patch, HDFS-6808.003.combo.txt, HDFS-6808.003.patch, HDFS-6808.004.combo.patch, HDFS-6808.004.patch, HDFS-6808.005.combo.patch, HDFS-6808.005.patch, HDFS-6808.006.combo.patch, HDFS-6808.006.patch, HDFS-6808.007.combo.patch, HDFS-6808.007.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7046) HA NN can NPE upon transition to active
[ https://issues.apache.org/jira/browse/HDFS-7046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139307#comment-14139307 ] Aaron T. Myers commented on HDFS-7046: -- I agree with Kihwal and Daryn that the benefit of starting the process of leaving safemode while edits are still being processed seems negligible, so it's better to be safe here and just wait for the transition to active to complete. In a steady state cluster it's very unlikely for the standby to be in safemode anyway, since the NN will not enter safemode on its own except immediately after startup, and there's little or no reason for the admin to ever put the standby in safemode anyway. +1, the patch makes sense to me. I agree that it would be pretty difficult to write a test for this case, and now that the issue is pointed out the fix is quite straightforward, so I'm OK committing this without a test. HA NN can NPE upon transition to active --- Key: HDFS-7046 URL: https://issues.apache.org/jira/browse/HDFS-7046 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0, 2.5.0 Reporter: Daryn Sharp Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7046.patch, HDFS-7046_test_reproduce.patch While processing edits, the NN may decide after adjusting block totals to leave safe mode - in the middle of the edit. Going active starts the secret manager which generates a new secret key, which in turn generates an edit, which NPEs because the edit log is not open. # Transitions should _not_ occur in the middle of an edit. # The edit log appears to claim it's open for write when the stream isn't even open -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6727) Refresh data volumes on DataNode based on configuration changes
[ https://issues.apache.org/jira/browse/HDFS-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139311#comment-14139311 ] Colin Patrick McCabe commented on HDFS-6727: Thanks for addressing my comments about addVolumes. Can you add a comment to the declaration of DataNode#dataDirs explaining that it must be accessed while holding the DataNode lock? {code} void recoverTransitionRead(DataNode datanode, String bpID, NamespaceInfo nsInfo, - Collection<StorageLocation> dataDirs, StartupOption startOpt) throws IOException { + final Collection<StorageLocation> dataDirs, StartupOption startOpt) throws IOException { {code} It seems like the patch would be smaller without this... In this comment: {code} + * It should only be used for deactivating disks. {code} I think "It should only be used when deactivating disks" would be clearer. This method doesn't itself deactivate the disk... it's just used when deactivating disks. {code} +// If IOException raises from FsVolumeImpl() or getVolumeMap(), there is +// nothing needed to be rolled back to make various data structures, e.g., +// storageMap and asyncDiskService, consistent. +final FsVolumeImpl fsVolume = new FsVolumeImpl( +this, sd.getStorageUuid(), dir, this.conf, storageType); +final ReplicaMap tempVolumeMap = new ReplicaMap(fsVolume); + +List<IOException> exceptions = Lists.newArrayList(); +for (final String bpid : bpids) { + try { +fsVolume.addBlockPool(bpid, this.conf); +fsVolume.getVolumeMap(bpid, volumeMap); + } catch (IOException e) { {code} I like the idea behind this comment, but maybe putting it inside the catch block would make things clearer. Also, maybe it could be shortened to something like "no rollback is needed here". +1 once those changes are addressed. Thanks, Eddy. Refresh data volumes on DataNode based on configuration changes --- Key: HDFS-6727 URL: https://issues.apache.org/jira/browse/HDFS-6727 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0, 2.4.1 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Labels: datanode Attachments: HDFS-6727.000.delta-HDFS-6775.txt, HDFS-6727.001.patch, HDFS-6727.002.patch, HDFS-6727.003.patch, HDFS-6727.004.patch, HDFS-6727.005.patch, HDFS-6727.006.patch, HDFS-6727.006.patch, HDFS-6727.007.patch, HDFS-6727.combo.patch, patchFindBugsOutputhadoop-hdfs.txt HDFS-1362 requires DataNode to reload configuration file during the runtime, so that DN can change the data volumes dynamically. This JIRA reuses the reconfiguration framework introduced by HADOOP-7001 to enable DN to reconfigure at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
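Editor's note: a self-contained sketch of the error-handling pattern being reviewed above, where per-block-pool failures are collected rather than rolled back and reported together once the loop finishes. BlockPoolRegistrar and registerBlockPool are hypothetical stand-ins for the FsVolumeImpl#addBlockPool and getVolumeMap calls in the patch.
{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public abstract class BlockPoolRegistrar {
  /** Hypothetical stand-in for fsVolume.addBlockPool(...) and fsVolume.getVolumeMap(...). */
  protected abstract void registerBlockPool(String bpid) throws IOException;

  public void registerAll(Collection<String> bpids) throws IOException {
    List<IOException> exceptions = new ArrayList<IOException>();
    for (String bpid : bpids) {
      try {
        registerBlockPool(bpid);
      } catch (IOException e) {
        // No rollback is needed here: nothing shared (e.g. storageMap,
        // asyncDiskService) has been touched yet for this block pool.
        exceptions.add(e);
      }
    }
    if (!exceptions.isEmpty()) {
      throw new IOException(exceptions.size() + " block pool(s) failed to register",
          exceptions.get(0));
    }
  }
}
{code}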
[jira] [Updated] (HDFS-7047) Expose FileStatus#isEncrypted in libhdfs
[ https://issues.apache.org/jira/browse/HDFS-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-7047: --- Resolution: Fixed Fix Version/s: 2.6.0 Status: Resolved (was: Patch Available) Expose FileStatus#isEncrypted in libhdfs Key: HDFS-7047 URL: https://issues.apache.org/jira/browse/HDFS-7047 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Colin Patrick McCabe Fix For: 2.6.0 Attachments: HDFS-7047.001.patch, HDFS-7047.003.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6970) Move startFile EDEK retries to the DFSClient
[ https://issues.apache.org/jira/browse/HDFS-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139331#comment-14139331 ] Colin Patrick McCabe commented on HDFS-6970: This is a nice simplification. In {code} // Flip-flop between two EZs to repeatedly fail -for (int i=0; i < 10; i++) { +for (int i=0; i < 11; i++) { injector.ready.await(); {code} Can you put this 10 as a constant in DFSOutputStream, VisibleForTesting? +1 once that's addressed Move startFile EDEK retries to the DFSClient Key: HDFS-6970 URL: https://issues.apache.org/jira/browse/HDFS-6970 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 2.5.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-6970.001.patch [~sureshms] pointed out that holding on to an RPC handler while talking to the KMS is bad, since it can exhaust the available handlers. Let's avoid this by doing retries at the DFSClient rather than in the RPC handler, and moving EDEK fetching to the background. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6970) Move startFile EDEK retries to the DFSClient
[ https://issues.apache.org/jira/browse/HDFS-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139346#comment-14139346 ] Charles Lamb commented on HDFS-6970: LGTM. Nits only. DFSOutputStream.java newStreamForCreate, buffersize arg is no longer used so perhaps mark it as such either with a comment or by renaming to your favorite version of ignore. FSNamesystem.java Seems like there was some whitespace introduced. RetryStartFileException IntelliJ says the second ctor is unused. Is it there for posterity? TestEncryptionZones#testStartFileRetry - at first blush this seems to fix the timeout problems we've been seeing. Move startFile EDEK retries to the DFSClient Key: HDFS-6970 URL: https://issues.apache.org/jira/browse/HDFS-6970 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 2.5.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-6970.001.patch [~sureshms] pointed out that holding on to an RPC handler while talking to the KMS is bad, since it can exhaust the available handlers. Let's avoid this by doing retries at the DFSClient rather than in the RPC handler, and moving EDEK fetching to the background. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6581) Write to single replica in memory
[ https://issues.apache.org/jira/browse/HDFS-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139352#comment-14139352 ] Hadoop QA commented on HDFS-6581: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12669765/HDFS-6581.merge.04.patch against trunk revision 485c96e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 29 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 112 warning messages. See https://builds.apache.org/job/PreCommit-HDFS-Build/8085//artifact/trunk/patchprocess/diffJavadocWarnings.txt for details. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 4 new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.crypto.random.TestOsSecureRandom org.apache.hadoop.ipc.TestCallQueueManager org.apache.hadoop.ipc.TestFairCallQueue The test build failed in hadoop-hdfs-project/hadoop-hdfs {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8085//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8085//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8085//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8085//console This message is automatically generated. Write to single replica in memory - Key: HDFS-6581 URL: https://issues.apache.org/jira/browse/HDFS-6581 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6581.merge.01.patch, HDFS-6581.merge.02.patch, HDFS-6581.merge.03.patch, HDFS-6581.merge.04.patch, HDFSWriteableReplicasInMemory.pdf, Test-Plan-for-HDFS-6581-Memory-Storage.pdf Per discussion with the community on HDFS-5851, we will implement writing to a single replica in DN memory via DataTransferProtocol. This avoids some of the issues with short-circuit writes, which we can revisit at a later time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139386#comment-14139386 ] Colin Patrick McCabe commented on HDFS-6808: ReconfigurableBase: you have a bunch of comments in here that would be better as JavaDoc. For example: {code} // The timestamp when the codereconfigThread/code starts. private long startTime = 0; // The timestamp when the codereconfigThread/code finishes. private long endTime = 0; {code} instead you could have: {code} /** * The timestamp when the codereconfigThread/code starts. */ private long startTime = 0; /** * The timestamp when the codereconfigThread/code finishes. */ private long endTime = 0; {code} {{ClientDatanodeProtocol}}: this looks good overall. I wonder if StartReconfigurationRequestProto, etc. would be better than StartReconfigureRequestProto? {code} message StartReconfigureRequestProto { } message StartReconfigureResponseProto { enum StartReconfigureResult { SUCCESS = 0; SERVER_STOPPED = 1; EXISTED = 2; } required StartReconfigureResult result = 1; } {code} I guess this is a matter of style, but I don't think you need this enum. Just throw an exception (handled specially by our RPC system) when the server is stopped or when there is already an ongoing reconfiguration. If you throw an IOException, it will make it through to the other side. {code} /** * Start a reconfiguration task to reload configuration in background. */ public StartReconfigureResult startReconfigureTask() { synchronized (this) { if (!shouldRun) { LOG.warn(The server is stopping.); return StartReconfigureResult.SERVER_STOPPED; } if (reconfigThread != null) { LOG.warn(Another reconfigure task is running.); return StartReconfigureResult.EXISTED; } reconfigThread = new ReconfigureThread(this); reconfigThread.start(); startTime = Time.monotonicNow(); } return StartReconfigureResult.SUCCESS; } {code} Similar to the protobuf code, this could simply throw IOException rather than using an enum. Also, since it's pretty much all synchronized (except the return statement?) it could just be a synchronized method. In {{ClientDatanodeProtocol.java}}: {code} /** * Asynchronously reload configuration on disk and apply changes. */ StartReconfigureResult startReconfigure() throws IOException; {code} Similar to the above, this could just return void. (If there is already a reconfiguration in progress, we can throw an IOE.) {code} message GetReconfigureStatusResultProto { required string name = 1; required string oldValue = 2; required string newValue = 3; optional string errorMessage = 4; // It is empty if success. } message GetReconfigureStatusResponseProto { required int64 startTime = 1; optional int64 endTime = 2; repeated GetReconfigureStatusResultProto status = 3; } {code} {{GetReconfigureStatusResultProto}} is kind of a confusing name. This isn't really a result, it's a configuration key that we're changing. How about calling it {{GetReconfigurationStatusConfigChangeProto}}? Also, it seems like {{newValue}} should be marked {{optional}} to fit in with the idea that a configuration key could be removed (that's why you use {{Optional}} in other places, right?) In {{ReconfigurableBase#ReconfigureTaskStatus}}, we have: {code} public final MapPropertyChange, OptionalString getStatus() { return status; } {code} This is a little concerning since this map is mutable... the caller could theoretically modify this, causing chaos. Can we wrap this in an {{ImmutableCollection}} or {{ImmutableMap}} to prevent this? 
Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch, HDFS-6808.001.combo.patch, HDFS-6808.001.patch, HDFS-6808.002.combo.patch, HDFS-6808.002.patch, HDFS-6808.003.combo.txt, HDFS-6808.003.patch, HDFS-6808.004.combo.patch, HDFS-6808.004.patch, HDFS-6808.005.combo.patch, HDFS-6808.005.patch, HDFS-6808.006.combo.patch, HDFS-6808.006.patch, HDFS-6808.007.combo.patch, HDFS-6808.007.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by
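As a concrete illustration of two of the structural suggestions in the review above (throwing an IOException instead of returning a result enum, and handing callers an immutable view of the status map), here is a minimal sketch. The class shape, the simplified String key type, and the Guava imports are assumptions made for illustration; the actual patch defines its own ReconfigurableBase and ReconfigurationThread classes.
{code}
import java.io.IOException;
import java.util.Map;

import com.google.common.base.Optional;
import com.google.common.collect.ImmutableMap;

public abstract class ReconfigurableSketch {

  private Thread reconfigThread;                 // background reconfiguration worker
  private boolean shouldRun = true;
  private Map<String, Optional<String>> status;  // property name -> error message (absent on success)

  /** Start a background reconfiguration; failures are reported via exceptions, not an enum. */
  public synchronized void startReconfigurationTask() throws IOException {
    if (!shouldRun) {
      throw new IOException("The server is stopping.");
    }
    if (reconfigThread != null) {
      throw new IOException("Another reconfiguration task is already running.");
    }
    reconfigThread = new Thread(this::reconfigure, "Reconfiguration Task");
    reconfigThread.start();
  }

  /** Hand out an immutable copy so callers cannot mutate the internal status map. */
  public synchronized Map<String, Optional<String>> getStatus() {
    return status == null ? ImmutableMap.<String, Optional<String>>of()
                          : ImmutableMap.copyOf(status);
  }

  /** Subclasses reload the configuration and record per-property results in {@code status}. */
  protected abstract void reconfigure();
}
{code}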
[jira] [Created] (HDFS-7089) Fix findbugs and release audit warnings in the branch
Arpit Agarwal created HDFS-7089: --- Summary: Fix findbugs and release audit warnings in the branch Key: HDFS-7089 URL: https://issues.apache.org/jira/browse/HDFS-7089 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal The latest Jenkins run flagged some Findbugs and Release Audit warnings. https://builds.apache.org/job/PreCommit-HDFS-Build/8085// -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HDFS-7089) Fix findbugs and release audit warnings in the branch
[ https://issues.apache.org/jira/browse/HDFS-7089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-7089 started by Arpit Agarwal. --- Fix findbugs and release audit warnings in the branch - Key: HDFS-7089 URL: https://issues.apache.org/jira/browse/HDFS-7089 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal The latest Jenkins run flagged some Findbugs and Release Audit warnings. https://builds.apache.org/job/PreCommit-HDFS-Build/8085// -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7089) Fix findbugs and release audit warnings in the HDFS-6581 branch
[ https://issues.apache.org/jira/browse/HDFS-7089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7089: Summary: Fix findbugs and release audit warnings in the HDFS-6581 branch (was: Fix findbugs and release audit warnings in the branch) Fix findbugs and release audit warnings in the HDFS-6581 branch --- Key: HDFS-7089 URL: https://issues.apache.org/jira/browse/HDFS-7089 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal The latest Jenkins run flagged some Findbugs and Release Audit warnings. https://builds.apache.org/job/PreCommit-HDFS-Build/8085// -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7053) Failed to rollback hdfs version from 2.4.1 to 2.2.0
[ https://issues.apache.org/jira/browse/HDFS-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139422#comment-14139422 ] Jing Zhao commented on HDFS-7053: - I guess you're hitting HDFS-5526? Failed to rollback hdfs version from 2.4.1 to 2.2.0 --- Key: HDFS-7053 URL: https://issues.apache.org/jira/browse/HDFS-7053 Project: Hadoop HDFS Issue Type: Bug Components: ha, namenode Affects Versions: 2.4.1 Reporter: sam liu Priority: Blocker I can successfully upgrade from 2.2.0 to 2.4.1 with QJM HA enabled and with downtime, but failed to rollback from 2.4.1 to 2.2.0. The error message: 2014-09-10 16:50:29,599 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join org.apache.hadoop.HadoopIllegalArgumentException: Invalid startup option. Cannot perform DFS upgrade with HA enabled. at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1207) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1320) 2014-09-10 16:50:29,601 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6727) Refresh data volumes on DataNode based on configuration changes
[ https://issues.apache.org/jira/browse/HDFS-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6727: Attachment: HDFS-6727.008.patch [~cmccabe] I've updated the patch to address your comments. Would you mind taking another look? Refresh data volumes on DataNode based on configuration changes --- Key: HDFS-6727 URL: https://issues.apache.org/jira/browse/HDFS-6727 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0, 2.4.1 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Labels: datanode Attachments: HDFS-6727.000.delta-HDFS-6775.txt, HDFS-6727.001.patch, HDFS-6727.002.patch, HDFS-6727.003.patch, HDFS-6727.004.patch, HDFS-6727.005.patch, HDFS-6727.006.patch, HDFS-6727.006.patch, HDFS-6727.007.patch, HDFS-6727.008.patch, HDFS-6727.combo.patch, patchFindBugsOutputhadoop-hdfs.txt HDFS-1362 requires DataNode to reload configuration file during the runtime, so that DN can change the data volumes dynamically. This JIRA reuses the reconfiguration framework introduced by HADOOP-7001 to enable DN to reconfigure at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6947) Document considerations of HAR and Encryption
[ https://issues.apache.org/jira/browse/HDFS-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6947: --- Component/s: documentation Description: Minor changes to the HAR documentation need to be made discussing HAR and encryption. Priority: Minor (was: Major) Target Version/s: 2.6.0 (was: 3.0.0) Summary: Document considerations of HAR and Encryption (was: Enhance HAR integration with encryption zones) Document considerations of HAR and Encryption - Key: HDFS-6947 URL: https://issues.apache.org/jira/browse/HDFS-6947 Project: Hadoop HDFS Issue Type: Sub-task Components: documentation Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Charles Lamb Priority: Minor Minor changes to the HAR documentation need to be made discussing HAR and encryption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6727) Refresh data volumes on DataNode based on configuration changes
[ https://issues.apache.org/jira/browse/HDFS-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139457#comment-14139457 ] Colin Patrick McCabe commented on HDFS-6727: +1 pending jenkins Refresh data volumes on DataNode based on configuration changes --- Key: HDFS-6727 URL: https://issues.apache.org/jira/browse/HDFS-6727 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0, 2.4.1 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Labels: datanode Attachments: HDFS-6727.000.delta-HDFS-6775.txt, HDFS-6727.001.patch, HDFS-6727.002.patch, HDFS-6727.003.patch, HDFS-6727.004.patch, HDFS-6727.005.patch, HDFS-6727.006.patch, HDFS-6727.006.patch, HDFS-6727.007.patch, HDFS-6727.008.patch, HDFS-6727.combo.patch, patchFindBugsOutputhadoop-hdfs.txt HDFS-1362 requires DataNode to reload configuration file during the runtime, so that DN can change the data volumes dynamically. This JIRA reuses the reconfiguration framework introduced by HADOOP-7001 to enable DN to reconfigure at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6841) Use Time.monotonicNow() wherever applicable instead of Time.now()
[ https://issues.apache.org/jira/browse/HDFS-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139483#comment-14139483 ] Hadoop QA commented on HDFS-6841: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12669759/HDFS-6841-003.patch against trunk revision a3d9934. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.TestRollingUpgrade {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8083//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8083//console This message is automatically generated. Use Time.monotonicNow() wherever applicable instead of Time.now() - Key: HDFS-6841 URL: https://issues.apache.org/jira/browse/HDFS-6841 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6841-001.patch, HDFS-6841-002.patch, HDFS-6841-003.patch {{Time.now()}} used in many places to calculate elapsed time. This should be replaced with {{Time.monotonicNow()}} to avoid effect of System time changes on elapsed time calculations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
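For readers unfamiliar with the motivation behind HDFS-6841: wall-clock time (what {{Time.now()}} returns) can jump backwards or forwards when the system clock is adjusted, so elapsed-time arithmetic should use a monotonic source such as {{Time.monotonicNow()}}. A small self-contained sketch of the same idea, using {{System.nanoTime()}} as the monotonic source since Hadoop's {{Time}} utility is not assumed here:
{code}
public class ElapsedTimeExample {
  public static void main(String[] args) throws InterruptedException {
    // Fragile: System.currentTimeMillis() (which Time.now() wraps) can move
    // backwards or forwards if the system clock is adjusted mid-measurement.
    long wallStart = System.currentTimeMillis();

    // Robust: a monotonic clock only moves forward, so differences are always valid.
    long monoStart = System.nanoTime();

    Thread.sleep(100);

    long wallElapsedMs = System.currentTimeMillis() - wallStart;      // can be skewed
    long monoElapsedMs = (System.nanoTime() - monoStart) / 1_000_000; // always >= 0

    System.out.println("wall=" + wallElapsedMs + "ms mono=" + monoElapsedMs + "ms");
  }
}
{code}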
[jira] [Updated] (HDFS-6947) Document considerations of HAR and Encryption
[ https://issues.apache.org/jira/browse/HDFS-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6947: --- Status: Patch Available (was: Open) Document considerations of HAR and Encryption - Key: HDFS-6947 URL: https://issues.apache.org/jira/browse/HDFS-6947 Project: Hadoop HDFS Issue Type: Sub-task Components: documentation Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Charles Lamb Priority: Minor Attachments: HDFS-6947.001.patch Minor changes to the HAR documentation need to be made discussing HAR and encryption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HDFS-6947) Document considerations of HAR and Encryption
[ https://issues.apache.org/jira/browse/HDFS-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-6947 started by Charles Lamb. -- Document considerations of HAR and Encryption - Key: HDFS-6947 URL: https://issues.apache.org/jira/browse/HDFS-6947 Project: Hadoop HDFS Issue Type: Sub-task Components: documentation Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Charles Lamb Priority: Minor Attachments: HDFS-6947.001.patch Minor changes to the HAR documentation need to be made discussing HAR and encryption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work stopped] (HDFS-6947) Document considerations of HAR and Encryption
[ https://issues.apache.org/jira/browse/HDFS-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-6947 stopped by Charles Lamb. -- Document considerations of HAR and Encryption - Key: HDFS-6947 URL: https://issues.apache.org/jira/browse/HDFS-6947 Project: Hadoop HDFS Issue Type: Sub-task Components: documentation Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Charles Lamb Priority: Minor Attachments: HDFS-6947.001.patch Minor changes to the HAR documentation need to be made discussing HAR and encryption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6947) Document considerations of HAR and Encryption
[ https://issues.apache.org/jira/browse/HDFS-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6947: --- Attachment: HDFS-6947.001.patch The .001 patch has the doc change. Document considerations of HAR and Encryption - Key: HDFS-6947 URL: https://issues.apache.org/jira/browse/HDFS-6947 Project: Hadoop HDFS Issue Type: Sub-task Components: documentation Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Charles Lamb Priority: Minor Attachments: HDFS-6947.001.patch Minor changes to the HAR documentation need to be made discussing HAR and encryption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6970) Move startFile EDEK retries to the DFSClient
[ https://issues.apache.org/jira/browse/HDFS-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6970: -- Attachment: hdfs-6970.002.patch Thanks for reviewing! New patch attached, breaks the {{10}} out into a constant like Colin recommended. Charlie, buffersize wasn't touched in this patch, so I'm not sure if we should change it here. The fact that it's being ignored might be a separate bug. I also don't see any unnecessary whitespace introduced in FSN. The constructor in RetryStartFileException is also required for the unwrapping to work properly with RemoteException. Finally, I'm not sure if this will fix the sporadic test timeouts in testStartFileRetry, but there's the possibility that it might. Move startFile EDEK retries to the DFSClient Key: HDFS-6970 URL: https://issues.apache.org/jira/browse/HDFS-6970 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 2.5.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-6970.001.patch, hdfs-6970.002.patch [~sureshms] pointed out that holding on to an RPC handler while talking to the KMS is bad, since it can exhaust the available handlers. Let's avoid this by doing retries at the DFSClient rather than in the RPC handler, and moving EDEK fetching to the background. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
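For context on the mechanism being reviewed: the retries now live in the client as a bounded loop around the create call, with the NameNode signalling {{RetryStartFileException}} while it refetches EDEKs in the background. The sketch below is a simplified, self-contained stand-in; the stub interface, handle type, and constant name are invented for illustration and are not the patch's actual code.
{code}
import java.io.IOException;

public class CreateRetrySketch {

  // Placeholder types standing in for the real HDFS classes.
  static class RetryStartFileException extends IOException {}
  interface NamenodeStub { StreamHandle create(String src) throws IOException; }
  static class StreamHandle {}

  /** The magic number 10 pulled out into a named constant, as suggested in review. */
  private static final int CREATE_RETRY_COUNT = 10;

  static StreamHandle createWithRetry(NamenodeStub namenode, String src) throws IOException {
    for (int attempt = 0; attempt < CREATE_RETRY_COUNT; attempt++) {
      try {
        return namenode.create(src);
      } catch (RetryStartFileException e) {
        // The NameNode could not hand out an encrypted data encryption key (EDEK) yet;
        // it refills its cache in the background and asks the client to try again,
        // so no RPC handler is held while the KMS is consulted.
      }
    }
    throw new IOException("Too many retries while creating " + src);
  }
}
{code}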
[jira] [Commented] (HDFS-6947) Document considerations of HAR and Encryption
[ https://issues.apache.org/jira/browse/HDFS-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139535#comment-14139535 ] Andrew Wang commented on HDFS-6947: --- +1 pending Document considerations of HAR and Encryption - Key: HDFS-6947 URL: https://issues.apache.org/jira/browse/HDFS-6947 Project: Hadoop HDFS Issue Type: Sub-task Components: documentation Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Charles Lamb Priority: Minor Attachments: HDFS-6947.001.patch Minor changes to the HAR documentation need to be made discussing HAR and encryption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139540#comment-14139540 ] Hadoop QA commented on HDFS-6808: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12669769/HDFS-6808.007.combo.patch against trunk revision 485c96e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover The test build failed in hadoop-common-project/hadoop-common {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8086//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8086//console This message is automatically generated. Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch, HDFS-6808.001.combo.patch, HDFS-6808.001.patch, HDFS-6808.002.combo.patch, HDFS-6808.002.patch, HDFS-6808.003.combo.txt, HDFS-6808.003.patch, HDFS-6808.004.combo.patch, HDFS-6808.004.patch, HDFS-6808.005.combo.patch, HDFS-6808.005.patch, HDFS-6808.006.combo.patch, HDFS-6808.006.patch, HDFS-6808.007.combo.patch, HDFS-6808.007.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6987) Move CipherSuite xattr information up to the encryption zone root
[ https://issues.apache.org/jira/browse/HDFS-6987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139556#comment-14139556 ] Charles Lamb commented on HDFS-6987: Hi Zhe, I took a quick look and have some trivial comments: Several lines bust the 80 char limit. EncryptionZoneManager.java: createEncryptionZone adds an extra newline. FSDirectory.java: Should we be using NameNode.LOG instead of FSNamesystem.LOG? NameNode.LOG seems to be the norm in this file. In setFileEncryptionInfo there's a blank line you introduced which probably doesn't enhance the readability. To Andrew's point about Ind, I agree that it's ambiguous. But looking at the .proto file, it looks like you mean Individual rather than INode. If that's the case, then Indiv or Individ might be better? Personally, I like to use final a lot, but that's my own hobby horse. Move CipherSuite xattr information up to the encryption zone root - Key: HDFS-6987 URL: https://issues.apache.org/jira/browse/HDFS-6987 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Reporter: Andrew Wang Assignee: Zhe Zhang Attachments: HDFS-6987-20140917-v1.patch All files within a single EZ need to be encrypted with the same CipherSuite. Because of this, I think we can store the CipherSuite once in the EZ rather than on each file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6970) Move startFile EDEK retries to the DFSClient
[ https://issues.apache.org/jira/browse/HDFS-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139563#comment-14139563 ] Charles Lamb commented on HDFS-6970: Oh, you're right about buffersize. I'm so used to keying in on the magenta for unused that I failed to notice that it wasn't a change. So, yes, I agree it shouldn't be addressed. The whitespace I was thinking of is at line 2482, but in retrospect it improves readability, so NM. Thanks for the clarification on RetryStartFileException. That makes sense. In terms of testStartFileRetry, I'm optimistic. I used to be able to reproduce the hang easily. Now I can't. Anyway, I like this client-side approach a lot better so thanks for working on this. +1, non-binding. Move startFile EDEK retries to the DFSClient Key: HDFS-6970 URL: https://issues.apache.org/jira/browse/HDFS-6970 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 2.5.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-6970.001.patch, hdfs-6970.002.patch [~sureshms] pointed out that holding on to an RPC handler while talking to the KMS is bad, since it can exhaust the available handlers. Let's avoid this by doing retries at the DFSClient rather than in the RPC handler, and moving EDEK fetching to the background. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6947) Document considerations of HAR and Encryption
[ https://issues.apache.org/jira/browse/HDFS-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139566#comment-14139566 ] Hadoop QA commented on HDFS-6947: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12669805/HDFS-6947.001.patch against trunk revision 1cf3198. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8088//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8088//console This message is automatically generated. Document considerations of HAR and Encryption - Key: HDFS-6947 URL: https://issues.apache.org/jira/browse/HDFS-6947 Project: Hadoop HDFS Issue Type: Sub-task Components: documentation Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Charles Lamb Priority: Minor Attachments: HDFS-6947.001.patch Minor changes to the HAR documentation need to be made discussing HAR and encryption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6840) Clients are always sent to the same datanode when read is off rack
[ https://issues.apache.org/jira/browse/HDFS-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6840: -- Attachment: hdfs-6840.003.patch Sorry for the delay in revving this. New patch removes the stale comment as per Jason's feedback. Daryn, do you mind if we fix any seed issues in a separate JIRA? I think we depend on this behavior in other places too, so if/when said JDK change does hit, we could address them all at once. Clients are always sent to the same datanode when read is off rack -- Key: HDFS-6840 URL: https://issues.apache.org/jira/browse/HDFS-6840 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Jason Lowe Assignee: Andrew Wang Priority: Critical Attachments: hdfs-6840.001.patch, hdfs-6840.002.patch, hdfs-6840.003.patch After HDFS-6268 the sorting order of block locations is deterministic for a given block and locality level (e.g.: local, rack. off-rack), so off-rack clients all see the same datanode for the same block. This leads to very poor behavior in distributed cache localization and other scenarios where many clients all want the same block data at approximately the same time. The one datanode is crushed by the load while the other replicas only handle local and rack-local requests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
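Background on the fix under discussion: once block locations are sorted by network distance, replicas at the same distance used to come back in a fixed order, so every off-rack reader was pointed at the same DataNode. Shuffling only within each equal-distance group spreads that load while preserving locality preference. A self-contained sketch of that idea (the types and method names are illustrative, not the actual NetworkTopology code):
{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class ShuffleEqualDistanceReplicas {

  /**
   * Shuffle contiguous runs of locations that share the same distance from the reader.
   * Assumes the list is already sorted by distance, so ties form contiguous groups.
   */
  static void shuffleWithinDistanceGroups(List<String> locations, int[] distances, Random rnd) {
    int groupStart = 0;
    for (int i = 1; i <= locations.size(); i++) {
      if (i == locations.size() || distances[i] != distances[groupStart]) {
        Collections.shuffle(locations.subList(groupStart, i), rnd); // randomize ties only
        groupStart = i;
      }
    }
  }

  public static void main(String[] args) {
    // dn0 is rack-local (distance 2); dn1 and dn2 are both off-rack (distance 4).
    List<String> locs = new ArrayList<>(Arrays.asList("dn0", "dn1", "dn2"));
    int[] dist = {2, 4, 4};
    shuffleWithinDistanceGroups(locs, dist, new Random());
    System.out.println(locs); // dn0 stays first; dn1/dn2 come back in random order
  }
}
{code}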
[jira] [Resolved] (HDFS-6815) Verify that alternate access methods work properly with Data at Rest Encryption
[ https://issues.apache.org/jira/browse/HDFS-6815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang resolved HDFS-6815. --- Resolution: Done We've gone through the various other HDFS access methods at this point, and have JIRAs filed for anything specific that still needs to be fixed. Resolving this. Verify that alternate access methods work properly with Data at Rest Encryption --- Key: HDFS-6815 URL: https://issues.apache.org/jira/browse/HDFS-6815 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Verify that alternative access methods (libhdfs, Httpfs, nfsv3) work properly with Data at Rest Encryption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6947) Document considerations of HAR and Encryption
[ https://issues.apache.org/jira/browse/HDFS-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6947: -- Issue Type: Improvement (was: Sub-task) Parent: (was: HDFS-6891) Document considerations of HAR and Encryption - Key: HDFS-6947 URL: https://issues.apache.org/jira/browse/HDFS-6947 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 2.5.0 Reporter: Andrew Wang Assignee: Charles Lamb Priority: Minor Attachments: HDFS-6947.001.patch Minor changes to the HAR documentation need to be made discussing HAR and encryption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7003) Add NFS Gateway support for reading and writing to encryption zones
[ https://issues.apache.org/jira/browse/HDFS-7003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139591#comment-14139591 ] Charles Lamb commented on HDFS-7003: I ran the three tests that failed in jenkins and they all passed locally for me. Thanks for the review Andrew. Add NFS Gateway support for reading and writing to encryption zones --- Key: HDFS-7003 URL: https://issues.apache.org/jira/browse/HDFS-7003 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption, nfs Affects Versions: 2.6.0 Reporter: Stephen Chu Assignee: Charles Lamb Fix For: 2.6.0 Attachments: HDFS-7003.001.patch, HDFS-7003.002.patch, HDFS-7003.003.patch Currently, reading and writing within encryption zones does not work through the NFS gateway. For example, we have an encryption zone {{/enc}}. Here's the difference of reading the file from hadoop fs and the NFS gateway:
{code}
[hdfs@schu-enc2 ~]$ hadoop fs -cat /enc/hi
hi
[hdfs@schu-enc2 ~]$ cat /hdfs_nfs/enc/hi
??
{code}
If we write a file using the NFS gateway, we'll see behavior like this:
{code}
[hdfs@schu-enc2 ~]$ echo hello > /hdfs_nfs/enc/hello
[hdfs@schu-enc2 ~]$ cat /hdfs_nfs/enc/hello
hello
[hdfs@schu-enc2 ~]$ hdfs dfs -cat /enc/hello
???tp[hdfs@schu-enc2 ~]$
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7003) Add NFS Gateway support for reading and writing to encryption zones
[ https://issues.apache.org/jira/browse/HDFS-7003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-7003: -- Resolution: Fixed Fix Version/s: 2.6.0 Status: Resolved (was: Patch Available) I ran these locally and they passed. Committed to trunk and branch-2, thanks Charles. Add NFS Gateway support for reading and writing to encryption zones --- Key: HDFS-7003 URL: https://issues.apache.org/jira/browse/HDFS-7003 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption, nfs Affects Versions: 2.6.0 Reporter: Stephen Chu Assignee: Charles Lamb Fix For: 2.6.0 Attachments: HDFS-7003.001.patch, HDFS-7003.002.patch, HDFS-7003.003.patch Currently, reading and writing within encryption zones does not work through the NFS gateway. For example, we have an encryption zone {{/enc}}. Here's the difference of reading the file from hadoop fs and the NFS gateway:
{code}
[hdfs@schu-enc2 ~]$ hadoop fs -cat /enc/hi
hi
[hdfs@schu-enc2 ~]$ cat /hdfs_nfs/enc/hi
??
{code}
If we write a file using the NFS gateway, we'll see behavior like this:
{code}
[hdfs@schu-enc2 ~]$ echo hello > /hdfs_nfs/enc/hello
[hdfs@schu-enc2 ~]$ cat /hdfs_nfs/enc/hello
hello
[hdfs@schu-enc2 ~]$ hdfs dfs -cat /enc/hello
???tp[hdfs@schu-enc2 ~]$
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6808: Attachment: HDFS-6808.008.combo.patch Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch, HDFS-6808.001.combo.patch, HDFS-6808.001.patch, HDFS-6808.002.combo.patch, HDFS-6808.002.patch, HDFS-6808.003.combo.txt, HDFS-6808.003.patch, HDFS-6808.004.combo.patch, HDFS-6808.004.patch, HDFS-6808.005.combo.patch, HDFS-6808.005.patch, HDFS-6808.006.combo.patch, HDFS-6808.006.patch, HDFS-6808.007.combo.patch, HDFS-6808.007.patch, HDFS-6808.008.combo.patch, HDFS-6808.008.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6808: Attachment: HDFS-6808.008.patch Update patch to address [~cmccabe]'s comments. Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch, HDFS-6808.001.combo.patch, HDFS-6808.001.patch, HDFS-6808.002.combo.patch, HDFS-6808.002.patch, HDFS-6808.003.combo.txt, HDFS-6808.003.patch, HDFS-6808.004.combo.patch, HDFS-6808.004.patch, HDFS-6808.005.combo.patch, HDFS-6808.005.patch, HDFS-6808.006.combo.patch, HDFS-6808.006.patch, HDFS-6808.007.combo.patch, HDFS-6808.007.patch, HDFS-6808.008.combo.patch, HDFS-6808.008.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7049) TestByteRangeInputStream.testPropagatedClose fails and throw NPE on branch-2
[ https://issues.apache.org/jira/browse/HDFS-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139613#comment-14139613 ] Eric Payne commented on HDFS-7049: -- Hi [~j...@cloudera.com]. Thanks for merging this fix and creating the patch. The patch doesn't apply because it has the 'a/' and 'b/' at the beginning of the filepaths. I downloaded the patch, removed those strings from the filepaths, and I was able to apply the patch cleanly to branch-2. The test also passes cleanly with no NPE. Once you make that change to the patch, it looks good to me. TestByteRangeInputStream.testPropagatedClose fails and throw NPE on branch-2 Key: HDFS-7049 URL: https://issues.apache.org/jira/browse/HDFS-7049 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor Attachments: HDFS-7049-branch-2.patch On branch-2, TestByteRangeInputStream.testPropagatedClose throw NPE when HftpFileSystem$RangeHeaderUrlOpener.connect This is due to fix of HDFS-6143 WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths public ByteRangeInputStream(URLOpener o, URLOpener r) throws IOException { this.originalURL = o; this.resolvedURL = r; getInputStream(); } the getInputStream() will be called in constructor now to verify if file exists. Since we just try to test if ByteRangeInputStream#close is called at proper time, we could mock(ByteRangeInputStream.class, CALLS_REAL_METHODS) for testing to avoid the NPE issue. I believe the trunk version already does this, we just need to merge the test from trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7049) TestByteRangeInputStream.testPropagatedClose fails and throw NPE on branch-2
[ https://issues.apache.org/jira/browse/HDFS-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139615#comment-14139615 ] Eric Payne commented on HDFS-7049: -- Sorry, I forgot to mention that in order to avoid the 'a/' and 'b/' prefix problem, you can use {{git diff --no-prefix}} when creating the patch. TestByteRangeInputStream.testPropagatedClose fails and throw NPE on branch-2 Key: HDFS-7049 URL: https://issues.apache.org/jira/browse/HDFS-7049 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor Attachments: HDFS-7049-branch-2.patch On branch-2, TestByteRangeInputStream.testPropagatedClose throw NPE when HftpFileSystem$RangeHeaderUrlOpener.connect This is due to fix of HDFS-6143 WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths public ByteRangeInputStream(URLOpener o, URLOpener r) throws IOException { this.originalURL = o; this.resolvedURL = r; getInputStream(); } the getInputStream() will be called in constructor now to verify if file exists. Since we just try to test if ByteRangeInputStream#close is called at proper time, we could mock(ByteRangeInputStream.class, CALLS_REAL_METHODS) for testing to avoid the NPE issue. I believe the trunk version already does this, we just need to merge the test from trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139628#comment-14139628 ] Colin Patrick McCabe commented on HDFS-6808: Looks good overall.
{code}
// A map of changed property, error message. If error message is present,
// it contains the messages about the error occurred when applies the particular
// change. Otherwise, it indicates that the change has been successfully applied.
private Map<PropertyChange, Optional<String>> status = null;
{code}
This still needs to be JavaDoc'ed. Similar with reconfigThread. ReconfigurationThread needs to call {{setDaemon}} and also set its name for jstack purposes.
{code}
/**
 * Asynchronously reload configuration on disk and apply changes.
 */
void startReconfigure() throws IOException;
{code}
Rename to {{startReconfiguration}}?
{code}
/**
 * Get the status of the previously issued reconfig task.
 * @see {@link org.apache.hadoop.conf.ReconfigurableBase.ReconfigurationTaskStatus}.
 */
ReconfigurableBase.ReconfigurationTaskStatus getReconfigureStatus() throws IOException;
{code}
Can you make {{ReconfigurationTaskStatus}} a top-level class? Normally return values from RPCs are either top-level classes, or static inner classes defined in the interface file itself. {{DFSAdmin.java}}: does this print anything when starting a reconfiguration? It would be nice to print something like "Started reconfiguration on NameNode 127.0.0.1".
{code}
message GetReconfigurationStatusConfigChangeProto {
  required string name = 1;
{code}
How about calling this key to be more consistent with our other config stuff? Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch, HDFS-6808.001.combo.patch, HDFS-6808.001.patch, HDFS-6808.002.combo.patch, HDFS-6808.002.patch, HDFS-6808.003.combo.txt, HDFS-6808.003.patch, HDFS-6808.004.combo.patch, HDFS-6808.004.patch, HDFS-6808.005.combo.patch, HDFS-6808.005.patch, HDFS-6808.006.combo.patch, HDFS-6808.006.patch, HDFS-6808.007.combo.patch, HDFS-6808.007.patch, HDFS-6808.008.combo.patch, HDFS-6808.008.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
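On the {{setDaemon}} and thread-name point above: a daemon thread will not keep the JVM alive during shutdown, and a descriptive name makes the worker easy to spot in a jstack dump. A minimal illustration of what the suggestion amounts to (the lambda body and class name are placeholders; the real patch uses its own ReconfigurationThread class):
{code}
public class DaemonThreadSketch {
  public static void main(String[] args) {
    Thread reconfigThread = new Thread(() -> {
      // reload the configuration and apply the property changes here
    }, "Reconfiguration Task");     // the name shows up in jstack / thread dumps
    reconfigThread.setDaemon(true); // don't keep the JVM alive for background work
    reconfigThread.start();
  }
}
{code}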
[jira] [Commented] (HDFS-6840) Clients are always sent to the same datanode when read is off rack
[ https://issues.apache.org/jira/browse/HDFS-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139643#comment-14139643 ] Aaron T. Myers commented on HDFS-6840: -- Latest patch looks good to me, +1. I agree that we can reasonably move the improvements to the tests to make them deterministic to another JIRA. Andrew, could you please go ahead and file that? Clients are always sent to the same datanode when read is off rack -- Key: HDFS-6840 URL: https://issues.apache.org/jira/browse/HDFS-6840 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Jason Lowe Assignee: Andrew Wang Priority: Critical Attachments: hdfs-6840.001.patch, hdfs-6840.002.patch, hdfs-6840.003.patch After HDFS-6268 the sorting order of block locations is deterministic for a given block and locality level (e.g.: local, rack. off-rack), so off-rack clients all see the same datanode for the same block. This leads to very poor behavior in distributed cache localization and other scenarios where many clients all want the same block data at approximately the same time. The one datanode is crushed by the load while the other replicas only handle local and rack-local requests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7089) Fix findbugs warnings in the HDFS-6581 branch
[ https://issues.apache.org/jira/browse/HDFS-7089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7089: Summary: Fix findbugs warnings in the HDFS-6581 branch (was: Fix findbugs in the HDFS-6581 branch) Fix findbugs warnings in the HDFS-6581 branch - Key: HDFS-7089 URL: https://issues.apache.org/jira/browse/HDFS-7089 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal The latest Jenkins run flagged some Findbugs and Release Audit warnings. https://builds.apache.org/job/PreCommit-HDFS-Build/8085// -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7089) Fix findbugs in the HDFS-6581 branch
[ https://issues.apache.org/jira/browse/HDFS-7089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7089: Summary: Fix findbugs in the HDFS-6581 branch (was: Fix findbugs and release audit warnings in the HDFS-6581 branch) Fix findbugs in the HDFS-6581 branch Key: HDFS-7089 URL: https://issues.apache.org/jira/browse/HDFS-7089 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal The latest Jenkins run flagged some Findbugs and Release Audit warnings. https://builds.apache.org/job/PreCommit-HDFS-Build/8085// -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7089) Fix findbugs warnings in the HDFS-6581 branch
[ https://issues.apache.org/jira/browse/HDFS-7089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7089: Attachment: HDFS-7089.01.patch Fix findbugs warnings in the HDFS-6581 branch - Key: HDFS-7089 URL: https://issues.apache.org/jira/browse/HDFS-7089 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-7089.01.patch The latest Jenkins run flagged some Findbugs and Release Audit warnings. https://builds.apache.org/job/PreCommit-HDFS-Build/8085// -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7089) Fix findbugs warnings in the HDFS-6581 branch
[ https://issues.apache.org/jira/browse/HDFS-7089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139662#comment-14139662 ] Arpit Agarwal commented on HDFS-7089: - Patch to fix the findbugs warnings. The Release Audit warning can be ignored. There was a leftover CHANGES file in the merge patch. Fix findbugs warnings in the HDFS-6581 branch - Key: HDFS-7089 URL: https://issues.apache.org/jira/browse/HDFS-7089 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-7089.01.patch The latest Jenkins run flagged some Findbugs and Release Audit warnings. https://builds.apache.org/job/PreCommit-HDFS-Build/8085// -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6840) Clients are always sent to the same datanode when read is off rack
[ https://issues.apache.org/jira/browse/HDFS-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139661#comment-14139661 ] Andrew Wang commented on HDFS-6840: --- Filed HADOOP-11107 as a follow-on for the random issue, thanks for reviewing ATM. Clients are always sent to the same datanode when read is off rack -- Key: HDFS-6840 URL: https://issues.apache.org/jira/browse/HDFS-6840 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Jason Lowe Assignee: Andrew Wang Priority: Critical Attachments: hdfs-6840.001.patch, hdfs-6840.002.patch, hdfs-6840.003.patch After HDFS-6268 the sorting order of block locations is deterministic for a given block and locality level (e.g.: local, rack. off-rack), so off-rack clients all see the same datanode for the same block. This leads to very poor behavior in distributed cache localization and other scenarios where many clients all want the same block data at approximately the same time. The one datanode is crushed by the load while the other replicas only handle local and rack-local requests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6808: Attachment: HDFS-6808.009.patch Hey [~cmccabe], thanks for your quick response. I've changed the patch based on most of your comments. bq. How about calling this key to be more consistent with our other config stuff? I think {{name}} might be better since it is used in the XML configuration files and is aligned with the existing reconfiguration framework (i.e., in {{PropertyChange}}). Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch, HDFS-6808.001.combo.patch, HDFS-6808.001.patch, HDFS-6808.002.combo.patch, HDFS-6808.002.patch, HDFS-6808.003.combo.txt, HDFS-6808.003.patch, HDFS-6808.004.combo.patch, HDFS-6808.004.patch, HDFS-6808.005.combo.patch, HDFS-6808.005.patch, HDFS-6808.006.combo.patch, HDFS-6808.006.patch, HDFS-6808.007.combo.patch, HDFS-6808.007.patch, HDFS-6808.008.combo.patch, HDFS-6808.008.patch, HDFS-6808.009.combo.patch, HDFS-6808.009.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6808: Attachment: HDFS-6808.009.combo.patch Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch, HDFS-6808.001.combo.patch, HDFS-6808.001.patch, HDFS-6808.002.combo.patch, HDFS-6808.002.patch, HDFS-6808.003.combo.txt, HDFS-6808.003.patch, HDFS-6808.004.combo.patch, HDFS-6808.004.patch, HDFS-6808.005.combo.patch, HDFS-6808.005.patch, HDFS-6808.006.combo.patch, HDFS-6808.006.patch, HDFS-6808.007.combo.patch, HDFS-6808.007.patch, HDFS-6808.008.combo.patch, HDFS-6808.008.patch, HDFS-6808.009.combo.patch, HDFS-6808.009.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6581) Write to single replica in memory
[ https://issues.apache.org/jira/browse/HDFS-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6581: Attachment: HDFS-6581.merge.05.patch Write to single replica in memory - Key: HDFS-6581 URL: https://issues.apache.org/jira/browse/HDFS-6581 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6581.merge.01.patch, HDFS-6581.merge.02.patch, HDFS-6581.merge.03.patch, HDFS-6581.merge.04.patch, HDFS-6581.merge.05.patch, HDFSWriteableReplicasInMemory.pdf, Test-Plan-for-HDFS-6581-Memory-Storage.pdf Per discussion with the community on HDFS-5851, we will implement writing to a single replica in DN memory via DataTransferProtocol. This avoids some of the issues with short-circuit writes, which we can revisit at a later time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7090) Use unbuffered writes when persisting in-memory replicas
Arpit Agarwal created HDFS-7090: --- Summary: Use unbuffered writes when persisting in-memory replicas Key: HDFS-7090 URL: https://issues.apache.org/jira/browse/HDFS-7090 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal The LazyWriter thread just uses {{FileUtils.copyFile}} to copy block files to persistent storage. It would be better to use unbuffered writes to avoid churning page cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
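For context on why {{FileUtils.copyFile}} is a poor fit here: a plain buffered copy of a large block file pulls the whole file through the OS page cache, evicting hotter data. Truly unbuffered writes need native support (for example O_DIRECT or posix_fadvise via Hadoop's native I/O layer), which plain Java cannot express; the sketch below only shows the milder mitigation of copying in bounded chunks and syncing as it goes, and is an assumption-laden stand-in rather than the eventual fix.
{code}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ChunkedSyncCopy {

  /** Copy src to dst in 1 MB chunks, forcing dirty pages to disk as we go. */
  static void copy(Path src, Path dst) throws IOException {
    ByteBuffer buf = ByteBuffer.allocateDirect(1 << 20);
    try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
         FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
             StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
      while (in.read(buf) > 0) {
        buf.flip();
        while (buf.hasRemaining()) {
          out.write(buf);
        }
        buf.clear();
        out.force(false); // limit how many dirty pages accumulate in the page cache
      }
    }
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 2) {
      System.err.println("usage: ChunkedSyncCopy <src> <dst>");
      return;
    }
    copy(Paths.get(args[0]), Paths.get(args[1]));
  }
}
{code}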