[jira] [Commented] (HDFS-6814) Mistakenly dfs.namenode.list.encryption.zones.num.responses configured as boolean
[ https://issues.apache.org/jira/browse/HDFS-6814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084349#comment-14084349 ]

Vinayakumar B commented on HDFS-6814:
-------------------------------------
+1, lgtm

Mistakenly dfs.namenode.list.encryption.zones.num.responses configured as boolean
Key: HDFS-6814
URL: https://issues.apache.org/jira/browse/HDFS-6814
Project: Hadoop HDFS
Issue Type: Sub-task
Components: security
Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Attachments: HDFS-6814.patch

{code}
<property>
  <name>dfs.namenode.list.encryption.zones.num.responses</name>
  <value>false</value>
  <description>When listing encryption zones, the maximum number of zones
    that will be returned in a batch. Fetching the list incrementally in
    batches improves namenode performance.
  </description>
</property>
{code}

The default value should be 100, the same as:
{code}
public static final int DFS_NAMENODE_LIST_ENCRYPTION_ZONES_NUM_RESPONSES_DEFAULT = 100;
{code}

--
This message was sent by Atlassian JIRA (v6.2#6252)
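To illustrate why a boolean value here is not just cosmetic, here is a minimal standalone sketch (hypothetical code, not Hadoop's actual Configuration class) of how an int-typed property lookup behaves: the built-in default applies only when the key is unset, while a set-but-unparsable value such as "false" surfaces as a NumberFormatException at parse time.

```java
// Hypothetical stand-in for an int-typed config lookup along the lines of
// Configuration.getInt(key, default). Not Hadoop code.
public class IntPropertySketch {
    // Returns the default only when the property is absent; a present but
    // unparsable value (like "false") throws NumberFormatException.
    public static int getInt(String configuredValue, int defaultValue) {
        if (configuredValue == null) {
            return defaultValue;            // key unset: fall back to 100
        }
        return Integer.parseInt(configuredValue.trim()); // "false" throws
    }

    public static void main(String[] args) {
        System.out.println(getInt(null, 100));   // prints 100
        try {
            getInt("false", 100);
        } catch (NumberFormatException e) {
            System.out.println("bad int default: false");
        }
    }
}
```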
[jira] [Commented] (HDFS-5723) Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction
[ https://issues.apache.org/jira/browse/HDFS-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084364#comment-14084364 ]

Vinayakumar B commented on HDFS-5723:
-------------------------------------
Thanks Uma for the review. I will commit this soon.

Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction
Key: HDFS-5723
URL: https://issues.apache.org/jira/browse/HDFS-5723
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.2.0
Reporter: Vinayakumar B
Assignee: Vinayakumar B
Attachments: HDFS-5723.patch, HDFS-5723.patch, HDFS-5723.patch

Scenario:
1. A 3-node cluster with dfs.client.block.write.replace-datanode-on-failure.enable set to false.
2. One file is written with 3 replicas -- blk_id_gs1.
3. One of the datanodes, DN1, goes down.
4. The file is opened with append, and some more data is added and synced (to only the 2 live nodes, DN2 and DN3) -- blk_id_gs2.
5. Now DN1 is restarted.
6. In its block report, DN1 reports the FINALIZED block blk_id_gs1, which should be marked corrupted. But since the NN has the appended block's state as UnderConstruction, at this time it does not detect this block as corrupt and adds it to the valid block locations.

As long as the namenode is alive, this datanode will also be considered a valid replica, and read/append will fail on that datanode.
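The heart of the staleness check can be sketched as a generation-stamp comparison (a simplified illustration, not the actual BlockManager code, which also considers replica state): the FINALIZED replica reported by the restarted DN1 carries the pre-append stamp, so it is older than what the NameNode currently tracks and should be treated as corrupt even while the block is under construction.

```java
// Simplified illustration of a genstamp-based staleness check; the real
// logic lives in BlockManager and is more involved.
public class GenStampCheck {
    // A FINALIZED replica whose generation stamp is older than the stamp
    // the NameNode currently tracks for the block is stale, i.e. corrupt.
    public static boolean isStaleFinalizedReplica(long reportedGenStamp,
                                                  long currentGenStamp) {
        return reportedGenStamp < currentGenStamp;
    }

    public static void main(String[] args) {
        long gs1 = 1001L; // hypothetical stamp of blk_id_gs1 before the append
        long gs2 = 1002L; // stamp bumped by the append pipeline on DN2/DN3
        System.out.println(isStaleFinalizedReplica(gs1, gs2)); // true
        System.out.println(isStaleFinalizedReplica(gs2, gs2)); // false
    }
}
```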
[jira] [Updated] (HDFS-5723) Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction
[ https://issues.apache.org/jira/browse/HDFS-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinayakumar B updated HDFS-5723:
--------------------------------
Resolution: Fixed
Fix Version/s: 2.6.0
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction
Key: HDFS-5723
URL: https://issues.apache.org/jira/browse/HDFS-5723
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.2.0
Reporter: Vinayakumar B
Assignee: Vinayakumar B
Fix For: 2.6.0
Attachments: HDFS-5723.patch, HDFS-5723.patch, HDFS-5723.patch

Scenario:
1. A 3-node cluster with dfs.client.block.write.replace-datanode-on-failure.enable set to false.
2. One file is written with 3 replicas -- blk_id_gs1.
3. One of the datanodes, DN1, goes down.
4. The file is opened with append, and some more data is added and synced (to only the 2 live nodes, DN2 and DN3) -- blk_id_gs2.
5. Now DN1 is restarted.
6. In its block report, DN1 reports the FINALIZED block blk_id_gs1, which should be marked corrupted. But since the NN has the appended block's state as UnderConstruction, at this time it does not detect this block as corrupt and adds it to the valid block locations.

As long as the namenode is alive, this datanode will also be considered a valid replica, and read/append will fail on that datanode.
[jira] [Commented] (HDFS-5723) Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction
[ https://issues.apache.org/jira/browse/HDFS-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084371#comment-14084371 ]

Vinayakumar B commented on HDFS-5723:
-------------------------------------
Committed to trunk and branch-2. Thanks [~umamaheswararao], [~jingzhao] and [~stanley_shi].

Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction
Key: HDFS-5723
URL: https://issues.apache.org/jira/browse/HDFS-5723
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.2.0
Reporter: Vinayakumar B
Assignee: Vinayakumar B
Fix For: 2.6.0
Attachments: HDFS-5723.patch, HDFS-5723.patch, HDFS-5723.patch

Scenario:
1. A 3-node cluster with dfs.client.block.write.replace-datanode-on-failure.enable set to false.
2. One file is written with 3 replicas -- blk_id_gs1.
3. One of the datanodes, DN1, goes down.
4. The file is opened with append, and some more data is added and synced (to only the 2 live nodes, DN2 and DN3) -- blk_id_gs2.
5. Now DN1 is restarted.
6. In its block report, DN1 reports the FINALIZED block blk_id_gs1, which should be marked corrupted. But since the NN has the appended block's state as UnderConstruction, at this time it does not detect this block as corrupt and adds it to the valid block locations.

As long as the namenode is alive, this datanode will also be considered a valid replica, and read/append will fail on that datanode.
[jira] [Commented] (HDFS-6247) Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer
[ https://issues.apache.org/jira/browse/HDFS-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084372#comment-14084372 ]

Vinayakumar B commented on HDFS-6247:
-------------------------------------
Hi [~umamaheswararao], can you please take a look at the patch when possible? Thanks.

Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer
Key: HDFS-6247
URL: https://issues.apache.org/jira/browse/HDFS-6247
Project: Hadoop HDFS
Issue Type: Bug
Components: balancer, datanode
Affects Versions: 2.4.0
Reporter: Vinayakumar B
Assignee: Vinayakumar B
Attachments: HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch

Currently no response is sent from the target Datanode to the Balancer for replaceBlock() calls. Since block movement for balancing is throttled, the complete block movement takes time, and this can result in a timeout at the Balancer, which will be trying to read the status message.

To avoid this, the Datanode can send IN_PROGRESS status messages to the Balancer while the replaceBlock() call is in progress, so that the Balancer does not time out and treat the block movement as failed.
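A back-of-the-envelope sketch of why intermediate responses are needed (the numbers are hypothetical, not from the patch): at a throttled balancer bandwidth, copying one block easily takes longer than a typical socket read timeout, so the Datanode must emit an IN_PROGRESS status at some interval shorter than the Balancer's timeout.

```java
// Hypothetical arithmetic behind the keepalive idea: how long a throttled
// copy takes, and how many IN_PROGRESS messages would span it.
public class KeepAliveSketch {
    public static long copyMillis(long totalBytes, long bytesPerSec) {
        return totalBytes * 1000 / bytesPerSec;
    }

    // Number of intermediate responses if one is sent every intervalMillis
    // during the whole throttled copy.
    public static int keepAlivesNeeded(long totalBytes, long bytesPerSec,
                                       long intervalMillis) {
        return (int) (copyMillis(totalBytes, bytesPerSec) / intervalMillis);
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024;  // a 128 MB block
        long throttle = 1L * 1024 * 1024;     // 1 MB/s balancer bandwidth
        // 128 s for one block: far longer than a typical read timeout, so
        // without keepalives the Balancer gives up mid-transfer.
        System.out.println(copyMillis(blockSize, throttle));              // 128000
        System.out.println(keepAlivesNeeded(blockSize, throttle, 30000)); // 4
    }
}
```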
[jira] [Commented] (HDFS-5723) Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction
[ https://issues.apache.org/jira/browse/HDFS-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084375#comment-14084375 ]

Hudson commented on HDFS-5723:
------------------------------
FAILURE: Integrated in Hadoop-trunk-Commit #6005 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6005/])
HDFS-5723. Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction. Contributed by Vinayakumar B. (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1615491)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestRetryCacheWithHA.java

Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction
Key: HDFS-5723
URL: https://issues.apache.org/jira/browse/HDFS-5723
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.2.0
Reporter: Vinayakumar B
Assignee: Vinayakumar B
Fix For: 2.6.0
Attachments: HDFS-5723.patch, HDFS-5723.patch, HDFS-5723.patch

Scenario:
1. A 3-node cluster with dfs.client.block.write.replace-datanode-on-failure.enable set to false.
2. One file is written with 3 replicas -- blk_id_gs1.
3. One of the datanodes, DN1, goes down.
4. The file is opened with append, and some more data is added and synced (to only the 2 live nodes, DN2 and DN3) -- blk_id_gs2.
5. Now DN1 is restarted.
6. In its block report, DN1 reports the FINALIZED block blk_id_gs1, which should be marked corrupted. But since the NN has the appended block's state as UnderConstruction, at this time it does not detect this block as corrupt and adds it to the valid block locations.

As long as the namenode is alive, this datanode will also be considered a valid replica, and read/append will fail on that datanode.
[jira] [Commented] (HDFS-5185) DN fails to startup if one of the data dir is full
[ https://issues.apache.org/jira/browse/HDFS-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084398#comment-14084398 ]

Hadoop QA commented on HDFS-5185:
---------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12659611/HDFS-5185-003.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7552//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7552//console
This message is automatically generated.

DN fails to startup if one of the data dir is full
Key: HDFS-5185
URL: https://issues.apache.org/jira/browse/HDFS-5185
Project: Hadoop HDFS
Issue Type: Bug
Components: datanode
Reporter: Vinayakumar B
Assignee: Vinayakumar B
Priority: Critical
Attachments: HDFS-5185-002.patch, HDFS-5185-003.patch, HDFS-5185.patch

DataNode fails to start up if one of the configured data dirs is out of space. It fails with the following exception:
{noformat}
2013-09-11 17:48:43,680 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool registering (storage id DS-308316523-xx.xx.xx.xx-64015-1378896293604) service to /nn1:65110
java.io.IOException: Mkdirs failed to create /opt/nish/data/current/BP-123456-1234567/tmp
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.init(BlockPoolSlice.java:105)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.addBlockPool(FsVolumeImpl.java:216)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.addBlockPool(FsVolumeList.java:155)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.addBlockPool(FsDatasetImpl.java:1593)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:834)
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:311)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:217)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
	at java.lang.Thread.run(Thread.java:662)
{noformat}
It should continue to start up with the other available data dirs.
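The intended startup behavior can be sketched as follows (an illustrative sketch only; names and structure are hypothetical, not the actual FsDatasetImpl code): instead of aborting when any one data dir fails to initialize, collect the usable volumes and fail only when none remain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Illustrative sketch of volume-failure tolerance at DataNode startup.
public class VolumeInitSketch {
    // Keep every data dir that initializes; abort startup only when no
    // volume at all is usable.
    public static List<String> usableVolumes(List<String> dataDirs,
                                             Predicate<String> canInit) {
        List<String> ok = new ArrayList<>();
        for (String dir : dataDirs) {
            if (canInit.test(dir)) {
                ok.add(dir);
            } // else: log and skip this volume instead of throwing
        }
        if (ok.isEmpty()) {
            throw new IllegalStateException("all data dirs failed to initialize");
        }
        return ok;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList("/data1", "/data2", "/data3");
        // Pretend /data2 is the full disk where Mkdirs failed.
        System.out.println(usableVolumes(dirs, d -> !d.equals("/data2")));
    }
}
```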
[jira] [Commented] (HDFS-5185) DN fails to startup if one of the data dir is full
[ https://issues.apache.org/jira/browse/HDFS-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084433#comment-14084433 ]

Vinayakumar B commented on HDFS-5185:
-------------------------------------
The test failure is related to HDFS-6694. Will commit this soon.

DN fails to startup if one of the data dir is full
Key: HDFS-5185
URL: https://issues.apache.org/jira/browse/HDFS-5185
Project: Hadoop HDFS
Issue Type: Bug
Components: datanode
Reporter: Vinayakumar B
Assignee: Vinayakumar B
Priority: Critical
Attachments: HDFS-5185-002.patch, HDFS-5185-003.patch, HDFS-5185.patch

DataNode fails to start up if one of the configured data dirs is out of space. It fails with the following exception:
{noformat}
2013-09-11 17:48:43,680 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool registering (storage id DS-308316523-xx.xx.xx.xx-64015-1378896293604) service to /nn1:65110
java.io.IOException: Mkdirs failed to create /opt/nish/data/current/BP-123456-1234567/tmp
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.init(BlockPoolSlice.java:105)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.addBlockPool(FsVolumeImpl.java:216)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.addBlockPool(FsVolumeList.java:155)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.addBlockPool(FsDatasetImpl.java:1593)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:834)
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:311)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:217)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
	at java.lang.Thread.run(Thread.java:662)
{noformat}
It should continue to start up with the other available data dirs.
[jira] [Updated] (HDFS-5185) DN fails to startup if one of the data dir is full
[ https://issues.apache.org/jira/browse/HDFS-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinayakumar B updated HDFS-5185:
--------------------------------
Resolution: Fixed
Fix Version/s: 2.6.0
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

Committed to trunk and branch-2. Thanks [~umamaheswararao] for the review.

DN fails to startup if one of the data dir is full
Key: HDFS-5185
URL: https://issues.apache.org/jira/browse/HDFS-5185
Project: Hadoop HDFS
Issue Type: Bug
Components: datanode
Reporter: Vinayakumar B
Assignee: Vinayakumar B
Priority: Critical
Fix For: 2.6.0
Attachments: HDFS-5185-002.patch, HDFS-5185-003.patch, HDFS-5185.patch

DataNode fails to start up if one of the configured data dirs is out of space. It fails with the following exception:
{noformat}
2013-09-11 17:48:43,680 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool registering (storage id DS-308316523-xx.xx.xx.xx-64015-1378896293604) service to /nn1:65110
java.io.IOException: Mkdirs failed to create /opt/nish/data/current/BP-123456-1234567/tmp
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.init(BlockPoolSlice.java:105)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.addBlockPool(FsVolumeImpl.java:216)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.addBlockPool(FsVolumeList.java:155)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.addBlockPool(FsDatasetImpl.java:1593)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:834)
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:311)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:217)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
	at java.lang.Thread.run(Thread.java:662)
{noformat}
It should continue to start up with the other available data dirs.
[jira] [Commented] (HDFS-5185) DN fails to startup if one of the data dir is full
[ https://issues.apache.org/jira/browse/HDFS-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1408#comment-1408 ]

Hudson commented on HDFS-5185:
------------------------------
FAILURE: Integrated in Hadoop-trunk-Commit #6006 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6006/])
HDFS-5185. DN fails to startup if one of the data dir is full. Contributed by Vinayakumar B. (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1615504)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDiskError.java

DN fails to startup if one of the data dir is full
Key: HDFS-5185
URL: https://issues.apache.org/jira/browse/HDFS-5185
Project: Hadoop HDFS
Issue Type: Bug
Components: datanode
Reporter: Vinayakumar B
Assignee: Vinayakumar B
Priority: Critical
Fix For: 2.6.0
Attachments: HDFS-5185-002.patch, HDFS-5185-003.patch, HDFS-5185.patch

DataNode fails to start up if one of the configured data dirs is out of space. It fails with the following exception:
{noformat}
2013-09-11 17:48:43,680 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool registering (storage id DS-308316523-xx.xx.xx.xx-64015-1378896293604) service to /nn1:65110
java.io.IOException: Mkdirs failed to create /opt/nish/data/current/BP-123456-1234567/tmp
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.init(BlockPoolSlice.java:105)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.addBlockPool(FsVolumeImpl.java:216)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.addBlockPool(FsVolumeList.java:155)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.addBlockPool(FsDatasetImpl.java:1593)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:834)
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:311)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:217)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
	at java.lang.Thread.run(Thread.java:662)
{noformat}
It should continue to start up with the other available data dirs.
[jira] [Resolved] (HDFS-6814) Mistakenly dfs.namenode.list.encryption.zones.num.responses configured as boolean
[ https://issues.apache.org/jira/browse/HDFS-6814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uma Maheswara Rao G resolved HDFS-6814.
---------------------------------------
Resolution: Fixed
Fix Version/s: fs-encryption (HADOOP-10150 and HDFS-6134)
Hadoop Flags: Reviewed

I have committed this to the branch.

Mistakenly dfs.namenode.list.encryption.zones.num.responses configured as boolean
Key: HDFS-6814
URL: https://issues.apache.org/jira/browse/HDFS-6814
Project: Hadoop HDFS
Issue Type: Sub-task
Components: security
Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
Attachments: HDFS-6814.patch

{code}
<property>
  <name>dfs.namenode.list.encryption.zones.num.responses</name>
  <value>false</value>
  <description>When listing encryption zones, the maximum number of zones
    that will be returned in a batch. Fetching the list incrementally in
    batches improves namenode performance.
  </description>
</property>
{code}

The default value should be 100, the same as:
{code}
public static final int DFS_NAMENODE_LIST_ENCRYPTION_ZONES_NUM_RESPONSES_DEFAULT = 100;
{code}
[jira] [Commented] (HDFS-5723) Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction
[ https://issues.apache.org/jira/browse/HDFS-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084548#comment-14084548 ]

Hudson commented on HDFS-5723:
------------------------------
SUCCESS: Integrated in Hadoop-Yarn-trunk #633 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/633/])
HDFS-5723. Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction. Contributed by Vinayakumar B. (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1615491)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestRetryCacheWithHA.java

Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction
Key: HDFS-5723
URL: https://issues.apache.org/jira/browse/HDFS-5723
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.2.0
Reporter: Vinayakumar B
Assignee: Vinayakumar B
Fix For: 2.6.0
Attachments: HDFS-5723.patch, HDFS-5723.patch, HDFS-5723.patch

Scenario:
1. A 3-node cluster with dfs.client.block.write.replace-datanode-on-failure.enable set to false.
2. One file is written with 3 replicas -- blk_id_gs1.
3. One of the datanodes, DN1, goes down.
4. The file is opened with append, and some more data is added and synced (to only the 2 live nodes, DN2 and DN3) -- blk_id_gs2.
5. Now DN1 is restarted.
6. In its block report, DN1 reports the FINALIZED block blk_id_gs1, which should be marked corrupted. But since the NN has the appended block's state as UnderConstruction, at this time it does not detect this block as corrupt and adds it to the valid block locations.

As long as the namenode is alive, this datanode will also be considered a valid replica, and read/append will fail on that datanode.
[jira] [Commented] (HDFS-5185) DN fails to startup if one of the data dir is full
[ https://issues.apache.org/jira/browse/HDFS-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084549#comment-14084549 ]

Hudson commented on HDFS-5185:
------------------------------
SUCCESS: Integrated in Hadoop-Yarn-trunk #633 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/633/])
HDFS-5185. DN fails to startup if one of the data dir is full. Contributed by Vinayakumar B. (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1615504)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDiskError.java

DN fails to startup if one of the data dir is full
Key: HDFS-5185
URL: https://issues.apache.org/jira/browse/HDFS-5185
Project: Hadoop HDFS
Issue Type: Bug
Components: datanode
Reporter: Vinayakumar B
Assignee: Vinayakumar B
Priority: Critical
Fix For: 2.6.0
Attachments: HDFS-5185-002.patch, HDFS-5185-003.patch, HDFS-5185.patch

DataNode fails to start up if one of the configured data dirs is out of space. It fails with the following exception:
{noformat}
2013-09-11 17:48:43,680 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool registering (storage id DS-308316523-xx.xx.xx.xx-64015-1378896293604) service to /nn1:65110
java.io.IOException: Mkdirs failed to create /opt/nish/data/current/BP-123456-1234567/tmp
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.init(BlockPoolSlice.java:105)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.addBlockPool(FsVolumeImpl.java:216)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.addBlockPool(FsVolumeList.java:155)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.addBlockPool(FsDatasetImpl.java:1593)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:834)
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:311)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:217)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
	at java.lang.Thread.run(Thread.java:662)
{noformat}
It should continue to start up with the other available data dirs.
[jira] [Commented] (HDFS-6811) Fix bad default value for dfs.namenode.list.encryption.zones.num.responses
[ https://issues.apache.org/jira/browse/HDFS-6811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084579#comment-14084579 ]

Uma Maheswara Rao G commented on HDFS-6811:
-------------------------------------------
Oh, I committed this fix in HDFS-6814. Sorry, I did not notice this JIRA first :(. I think I missed it because I searched only among the sub-tasks for an existing JIRA. Closing this as a duplicate of HDFS-6814.

Fix bad default value for dfs.namenode.list.encryption.zones.num.responses
Key: HDFS-6811
URL: https://issues.apache.org/jira/browse/HDFS-6811
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
Reporter: Andrew Wang
Assignee: Andrew Wang
Attachments: hdfs-6811.001.patch

This came from a typo in HDFS-6780; "false" is not a valid integer.
[jira] [Resolved] (HDFS-6811) Fix bad default value for dfs.namenode.list.encryption.zones.num.responses
[ https://issues.apache.org/jira/browse/HDFS-6811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uma Maheswara Rao G resolved HDFS-6811.
---------------------------------------
Resolution: Duplicate

Fix bad default value for dfs.namenode.list.encryption.zones.num.responses
Key: HDFS-6811
URL: https://issues.apache.org/jira/browse/HDFS-6811
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
Reporter: Andrew Wang
Assignee: Andrew Wang
Attachments: hdfs-6811.001.patch

This came from a typo in HDFS-6780; "false" is not a valid integer.
[jira] [Commented] (HDFS-5723) Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction
[ https://issues.apache.org/jira/browse/HDFS-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084664#comment-14084664 ]

Hudson commented on HDFS-5723:
------------------------------
FAILURE: Integrated in Hadoop-Hdfs-trunk #1827 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1827/])
HDFS-5723. Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction. Contributed by Vinayakumar B. (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1615491)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestRetryCacheWithHA.java

Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction
Key: HDFS-5723
URL: https://issues.apache.org/jira/browse/HDFS-5723
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.2.0
Reporter: Vinayakumar B
Assignee: Vinayakumar B
Fix For: 2.6.0
Attachments: HDFS-5723.patch, HDFS-5723.patch, HDFS-5723.patch

Scenario:
1. A 3-node cluster with dfs.client.block.write.replace-datanode-on-failure.enable set to false.
2. One file is written with 3 replicas -- blk_id_gs1.
3. One of the datanodes, DN1, goes down.
4. The file is opened with append, and some more data is added and synced (to only the 2 live nodes, DN2 and DN3) -- blk_id_gs2.
5. Now DN1 is restarted.
6. In its block report, DN1 reports the FINALIZED block blk_id_gs1, which should be marked corrupted. But since the NN has the appended block's state as UnderConstruction, at this time it does not detect this block as corrupt and adds it to the valid block locations.

As long as the namenode is alive, this datanode will also be considered a valid replica, and read/append will fail on that datanode.
[jira] [Commented] (HDFS-5185) DN fails to startup if one of the data dir is full
[ https://issues.apache.org/jira/browse/HDFS-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084665#comment-14084665 ]

Hudson commented on HDFS-5185:
------------------------------
FAILURE: Integrated in Hadoop-Hdfs-trunk #1827 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1827/])
HDFS-5185. DN fails to startup if one of the data dir is full. Contributed by Vinayakumar B. (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1615504)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDiskError.java

DN fails to startup if one of the data dir is full
Key: HDFS-5185
URL: https://issues.apache.org/jira/browse/HDFS-5185
Project: Hadoop HDFS
Issue Type: Bug
Components: datanode
Reporter: Vinayakumar B
Assignee: Vinayakumar B
Priority: Critical
Fix For: 2.6.0
Attachments: HDFS-5185-002.patch, HDFS-5185-003.patch, HDFS-5185.patch

DataNode fails to start up if one of the configured data dirs is out of space. It fails with the following exception:
{noformat}
2013-09-11 17:48:43,680 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool registering (storage id DS-308316523-xx.xx.xx.xx-64015-1378896293604) service to /nn1:65110
java.io.IOException: Mkdirs failed to create /opt/nish/data/current/BP-123456-1234567/tmp
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.init(BlockPoolSlice.java:105)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.addBlockPool(FsVolumeImpl.java:216)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.addBlockPool(FsVolumeList.java:155)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.addBlockPool(FsDatasetImpl.java:1593)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:834)
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:311)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:217)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
	at java.lang.Thread.run(Thread.java:662)
{noformat}
It should continue to start up with the other available data dirs.
[jira] [Commented] (HDFS-6813) WebHdfsFileSystem#OffsetUrlInputStream should implement PositionedReadable in a thread-safe manner.
[ https://issues.apache.org/jira/browse/HDFS-6813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084688#comment-14084688 ] Tsz Wo Nicholas Sze commented on HDFS-6813: --- The javadoc in PositionedReadable may be outdated. Even DFSInputStream does not seem to be thread-safe. WebHdfsFileSystem#OffsetUrlInputStream should implement PositionedReadable in a thread-safe manner. --- Key: HDFS-6813 URL: https://issues.apache.org/jira/browse/HDFS-6813 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.6.0 Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-6813.001.patch The {{PositionedReadable}} definition requires that implementations of its interfaces be thread-safe. OffsetUrlInputStream (the WebHdfsFileSystem input stream) doesn't implement these interfaces in a thread-safe way; this JIRA is to fix that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (HDFS-6813) WebHdfsFileSystem#OffsetUrlInputStream should implement PositionedReadable in a thread-safe manner.
[ https://issues.apache.org/jira/browse/HDFS-6813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084056#comment-14084056 ] Uma Maheswara Rao G edited comment on HDFS-6813 at 8/4/14 2:18 PM: --- Going by the PositionedReadable doc, this looks reasonable to me. But I also noticed that FSInputStream has these APIs without synchronized, and the read API in DFSInputStream is not synchronized either; I'm not sure whether that was left unsynchronized intentionally. [~szetszwo], can you please confirm whether there is a reason these are not synchronized and do not follow the PositionedReadable javadoc? Thanks was (Author: umamaheswararao): I think from PositionedReadable doc, this looks reasonable to me. But I also noticed FsInputStream also have the APIs with out synchronized. Also the read api in DFSInputStream also not synchronized, but sure that was left with synchronization with intention. [~szetszwo], can you please confirm if there is any reason for not synchronized and did not follow the PositionedReadable java doc? Thanks WebHdfsFileSystem#OffsetUrlInputStream should implement PositionedReadable in a thread-safe manner. --- Key: HDFS-6813 URL: https://issues.apache.org/jira/browse/HDFS-6813 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.6.0 Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-6813.001.patch The {{PositionedReadable}} definition requires that implementations of its interfaces be thread-safe. OffsetUrlInputStream (the WebHdfsFileSystem input stream) doesn't implement these interfaces in a thread-safe way; this JIRA is to fix that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5723) Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction
[ https://issues.apache.org/jira/browse/HDFS-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084738#comment-14084738 ] Hudson commented on HDFS-5723: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1852 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1852/]) HDFS-5723. Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction. Contributed by Vinayakumar B. (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1615491) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestRetryCacheWithHA.java Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction Key: HDFS-5723 URL: https://issues.apache.org/jira/browse/HDFS-5723 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.2.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-5723.patch, HDFS-5723.patch, HDFS-5723.patch Scenario:
1. 3-node cluster with dfs.client.block.write.replace-datanode-on-failure.enable set to false.
2. One file is written with 3 replicas, blk_id_gs1.
3. One of the datanodes, DN1, is down.
4. The file is opened for append, and some more data is added and synced (to only the 2 live nodes, DN2 and DN3) -- blk_id_gs2.
5. Now DN1 is restarted.
6. In its block report, DN1 reports the FINALIZED block blk_id_gs1. This should be marked corrupt, but since the NN has the appended block's state as UnderConstruction, it does not detect the block as corrupt and adds it to the valid block locations.
As long as the namenode stays up, this datanode will continue to be considered a valid replica location, and read/append against that datanode will fail. -- This message was sent by Atlassian JIRA (v6.2#6252)
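The genstamp check described in step 6 can be reduced to a single condition -- an illustrative sketch, not the actual BlockManager change from the patch:

```java
// Hypothetical sketch of the report-time check: a FINALIZED replica reported
// with a stale generation stamp against a block the NN still considers
// under construction should be treated as corrupt, not as a valid location.
enum ReplicaState { FINALIZED, RBW }

class ReportedReplicaCheck {
    static boolean isCorrupt(long storedGenStamp, boolean underConstruction,
                             long reportedGenStamp, ReplicaState reported) {
        // DN1's blk_id_gs1 report lands here: FINALIZED, but gs1 < gs2.
        return underConstruction
            && reported == ReplicaState.FINALIZED
            && reportedGenStamp < storedGenStamp;
    }
}
```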
[jira] [Commented] (HDFS-5185) DN fails to startup if one of the data dir is full
[ https://issues.apache.org/jira/browse/HDFS-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084739#comment-14084739 ] Hudson commented on HDFS-5185: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1852 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1852/]) HDFS-5185. DN fails to startup if one of the data dir is full. Contributed by Vinayakumar B. (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1615504) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDiskError.java DN fails to startup if one of the data dir is full -- Key: HDFS-5185 URL: https://issues.apache.org/jira/browse/HDFS-5185 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Critical Fix For: 2.6.0 Attachments: HDFS-5185-002.patch, HDFS-5185-003.patch, HDFS-5185.patch DataNode fails to startup if one of the data dirs configured is out of space. 
fails with following exception {noformat}2013-09-11 17:48:43,680 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool registering (storage id DS-308316523-xx.xx.xx.xx-64015-1378896293604) service to /nn1:65110 java.io.IOException: Mkdirs failed to create /opt/nish/data/current/BP-123456-1234567/tmp at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.init(BlockPoolSlice.java:105) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.addBlockPool(FsVolumeImpl.java:216) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.addBlockPool(FsVolumeList.java:155) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.addBlockPool(FsDatasetImpl.java:1593) at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:834) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:311) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:217) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660) at java.lang.Thread.run(Thread.java:662) {noformat} It should continue to start-up with other data dirs available. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6813) WebHdfsFileSystem#OffsetUrlInputStream should implement PositionedReadable in a thread-safe manner.
[ https://issues.apache.org/jira/browse/HDFS-6813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084734#comment-14084734 ] Yi Liu commented on HDFS-6813: -- Hi [~szetszwo], thanks for your comments. I think the original intention in {{DFSInputStream}} was to make the PositionedReadable implementation thread-safe; we can see there is *synchronized* on the other methods. The normal read is synchronized, and there is no reason for the positioned read not to be (a few object variables are used in this method -- maybe synchronizing it was simply forgotten?). I have checked the other input streams in Hadoop that implement PositionedReadable, and almost all are thread-safe. If we don't want thread safety, it's better to remove synchronized; then it's a bit more efficient. What's your suggestion about this? WebHdfsFileSystem#OffsetUrlInputStream should implement PositionedReadable in a thread-safe manner. --- Key: HDFS-6813 URL: https://issues.apache.org/jira/browse/HDFS-6813 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.6.0 Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-6813.001.patch The {{PositionedReadable}} definition requires that implementations of its interfaces be thread-safe. OffsetUrlInputStream (the WebHdfsFileSystem input stream) doesn't implement these interfaces in a thread-safe way; this JIRA is to fix that. -- This message was sent by Atlassian JIRA (v6.2#6252)
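What the PositionedReadable contract asks for can be shown with a toy stream -- a simplified, hypothetical stand-in, not the actual OffsetUrlInputStream code: both the cursor-moving read and the positioned read synchronize on the same monitor, so a positioned read can never interleave with a seek/read on another thread, and it never moves the shared cursor.

```java
// Toy in-memory stream illustrating the thread-safety contract.
class OffsetStreamSketch {
    private final byte[] data;   // backing content
    private long pos;            // shared cursor used by normal reads

    OffsetStreamSketch(byte[] data) { this.data = data; }

    // Normal read: advances the shared cursor, hence synchronized.
    synchronized int read() {
        return pos < data.length ? data[(int) pos++] & 0xff : -1;
    }

    // Positioned read: synchronized on the same monitor, and it deliberately
    // leaves 'pos' untouched, matching PositionedReadable semantics.
    synchronized int read(long position, byte[] buf, int off, int len) {
        if (position >= data.length) return -1;
        int n = (int) Math.min(len, data.length - position);
        System.arraycopy(data, (int) position, buf, off, n);
        return n;
    }
}
```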
[jira] [Updated] (HDFS-6787) Remove duplicate code in FSDirectory#unprotectedConcat
[ https://issues.apache.org/jira/browse/HDFS-6787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-6787: -- Resolution: Fixed Fix Version/s: 2.6.0 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I have committed this to trunk and branch-2. Thanks a lot Yi! Remove duplicate code in FSDirectory#unprotectedConcat -- Key: HDFS-6787 URL: https://issues.apache.org/jira/browse/HDFS-6787 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 3.0.0, 2.6.0 Attachments: HDFS-6787.001.patch {code}
// update inodeMap
removeFromInodeMap(Arrays.asList(allSrcInodes));
{code} This snippet of code is duplicated, since we already have the same logic above it: {code}
for (INodeFile nodeToRemove : allSrcInodes) {
  if (nodeToRemove == null) continue;
  nodeToRemove.setBlocks(null);
  trgParent.removeChild(nodeToRemove, trgLatestSnapshot);
  inodeMap.remove(nodeToRemove);
  count++;
}
{code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6787) Remove duplicate code in FSDirectory#unprotectedConcat
[ https://issues.apache.org/jira/browse/HDFS-6787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084810#comment-14084810 ] Hudson commented on HDFS-6787: -- FAILURE: Integrated in Hadoop-trunk-Commit #6008 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6008/]) HDFS-6787. Remove duplicate code in FSDirectory#unprotectedConcat. Contributed by Yi Liu. (umamahesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1615622) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java Remove duplicate code in FSDirectory#unprotectedConcat -- Key: HDFS-6787 URL: https://issues.apache.org/jira/browse/HDFS-6787 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 3.0.0, 2.6.0 Attachments: HDFS-6787.001.patch {code}
// update inodeMap
removeFromInodeMap(Arrays.asList(allSrcInodes));
{code} This snippet of code is duplicated, since we already have the same logic above it: {code}
for (INodeFile nodeToRemove : allSrcInodes) {
  if (nodeToRemove == null) continue;
  nodeToRemove.setBlocks(null);
  trgParent.removeChild(nodeToRemove, trgLatestSnapshot);
  inodeMap.remove(nodeToRemove);
  count++;
}
{code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6757) Simplify lease manager with INodeID
[ https://issues.apache.org/jira/browse/HDFS-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084900#comment-14084900 ] Daryn Sharp commented on HDFS-6757: --- bq. Should successfully replaying OP_CLOSE require that a corresponding lease be found? It hasn't in the past. [...] I guess perhaps we could say that OP_CLOSE is reserved for normal, clean file close operations As best I can tell, we do currently require a lease to close, and an edit op is generated only for clean closes. completeFileInternal throws if there is no lease, but special logic for RPC retries will discard the lease exception and return success if the client is closing the last block with its current genstamp. However, no edit is generated, so it's not a consideration here. A close op may also be generated by commit block sync, but it looks like the NN should have already reassigned the lease to itself during block recovery. I believe the design is that a UC file must always have a lease held by either a client or the NN. The general concern is whether it's prudent to start masking possible bugs during edit replay. It makes me very uncomfortable. I'd rather the NN fail while processing illegal edit sequences, because edit bugs can lead to data loss. Otherwise we are implicitly willing to let the standby silently become inconsistent with the active. Simplify lease manager with INodeID --- Key: HDFS-6757 URL: https://issues.apache.org/jira/browse/HDFS-6757 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-6757.000.patch, HDFS-6757.001.patch, HDFS-6757.002.patch, HDFS-6757.003.patch, HDFS-6757.004.patch Currently the lease manager records leases based on paths instead of inode ids. Therefore, the lease manager needs to carefully keep track of the paths of active leases during renames and deletes. This can be a non-trivial task. This jira proposes to simplify the logic by tracking leases using inode ids instead of paths.
-- This message was sent by Atlassian JIRA (v6.2#6252)
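The simplification proposed in HDFS-6757 can be reduced to a toy sketch -- not the actual LeaseManager code: once leases are keyed by inode id, a rename changes only the path, and the lease table needs no update at all.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lease table keyed by inode id instead of path.
class LeaseByInodeId {
    private final Map<Long, String> leases = new HashMap<>(); // inodeId -> holder

    void addLease(long inodeId, String holder) { leases.put(inodeId, holder); }
    String getHolder(long inodeId) { return leases.get(inodeId); }
    void removeLease(long inodeId) { leases.remove(inodeId); }
    // Note: there is intentionally no renameLease() -- a rename does not
    // change the inode id, so the table is already correct.
}
```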
[jira] [Created] (HDFS-6815) Verify that alternate access methods work properly with Data at Rest Encryption
Charles Lamb created HDFS-6815: -- Summary: Verify that alternate access methods work properly with Data at Rest Encryption Key: HDFS-6815 URL: https://issues.apache.org/jira/browse/HDFS-6815 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Charles Lamb Assignee: Charles Lamb Verify that alternative access methods (libhdfs, Httpfs, nfsv3) work properly with Data at Rest Encryption. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6451) NFS should not return NFS3ERR_IO for AccessControlException
[ https://issues.apache.org/jira/browse/HDFS-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084936#comment-14084936 ] Brandon Li commented on HDFS-6451: -- +1. The patch looks very nice. NFS should not return NFS3ERR_IO for AccessControlException Key: HDFS-6451 URL: https://issues.apache.org/jira/browse/HDFS-6451 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Brandon Li Attachments: HDFS-6451.002.patch, HDFS-6451.003.patch, HDFS-6451.patch As [~jingzhao] pointed out in HDFS-6411, we need to catch the AccessControlException from the HDFS calls, and return NFS3ERR_PERM instead of NFS3ERR_IO for it. Another possible improvement is to have a single class/method for the common exception handling process, instead of repeating the same exception handling process in different NFS methods. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6451) NFS should not return NFS3ERR_IO for AccessControlException
[ https://issues.apache.org/jira/browse/HDFS-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6451: - Description: As [~jingzhao] pointed out in HDFS-6411, we need to catch the AccessControlException from the HDFS calls, and return NFS3ERR_ACCESS instead of NFS3ERR_IO for it. Another possible improvement is to have a single class/method for the common exception handling process, instead of repeating the same exception handling process in different NFS methods. was: As [~jingzhao] pointed out in HDFS-6411, we need to catch the AccessControlException from the HDFS calls, and return NFS3ERR_PERM instead of NFS3ERR_IO for it. Another possible improvement is to have a single class/method for the common exception handling process, instead of repeating the same exception handling process in different NFS methods. NFS should not return NFS3ERR_IO for AccessControlException Key: HDFS-6451 URL: https://issues.apache.org/jira/browse/HDFS-6451 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Brandon Li Assignee: Abhiraj Butala Attachments: HDFS-6451.002.patch, HDFS-6451.003.patch, HDFS-6451.patch As [~jingzhao] pointed out in HDFS-6411, we need to catch the AccessControlException from the HDFS calls, and return NFS3ERR_ACCESS instead of NFS3ERR_IO for it. Another possible improvement is to have a single class/method for the common exception handling process, instead of repeating the same exception handling process in different NFS methods. -- This message was sent by Atlassian JIRA (v6.2#6252)
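The "single class/method for the common exception handling" idea from the description could look roughly like this hypothetical helper. SecurityException stands in for Hadoop's AccessControlException (which extends IOException, so in real code it must be matched before the generic IOException case); the numeric status values follow RFC 1813:

```java
import java.io.IOException;

// Hypothetical central mapper from exceptions to NFSv3 status codes, so each
// NFS procedure does not repeat the same catch logic.
class Nfs3ErrorMapper {
    static final int NFS3ERR_IO = 5;       // generic I/O failure
    static final int NFS3ERR_ACCESS = 13;  // spelled NFS3ERR_ACCES in RFC 1813

    static int mapException(Exception e) {
        if (e instanceof SecurityException) {  // stand-in for AccessControlException
            return NFS3ERR_ACCESS;             // permission problem, not an I/O fault
        }
        if (e instanceof IOException) {
            return NFS3ERR_IO;
        }
        return NFS3ERR_IO;                     // conservative default
    }
}
```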
[jira] [Updated] (HDFS-6451) NFS should not return NFS3ERR_IO for AccessControlException
[ https://issues.apache.org/jira/browse/HDFS-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6451: - Assignee: Abhiraj Butala NFS should not return NFS3ERR_IO for AccessControlException Key: HDFS-6451 URL: https://issues.apache.org/jira/browse/HDFS-6451 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Brandon Li Assignee: Abhiraj Butala Attachments: HDFS-6451.002.patch, HDFS-6451.003.patch, HDFS-6451.patch As [~jingzhao] pointed out in HDFS-6411, we need to catch the AccessControlException from the HDFS calls, and return NFS3ERR_PERM instead of NFS3ERR_IO for it. Another possible improvement is to have a single class/method for the common exception handling process, instead of repeating the same exception handling process in different NFS methods. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6451) NFS should not return NFS3ERR_IO for AccessControlException
[ https://issues.apache.org/jira/browse/HDFS-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084959#comment-14084959 ] Brandon Li commented on HDFS-6451: -- I've committed the patch. Thank you, [~abutala], for the contribution! NFS should not return NFS3ERR_IO for AccessControlException Key: HDFS-6451 URL: https://issues.apache.org/jira/browse/HDFS-6451 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Abhiraj Butala Fix For: 2.6.0 Attachments: HDFS-6451.002.patch, HDFS-6451.003.patch, HDFS-6451.patch As [~jingzhao] pointed out in HDFS-6411, we need to catch the AccessControlException from the HDFS calls, and return NFS3ERR_ACCESS instead of NFS3ERR_IO for it. Another possible improvement is to have a single class/method for the common exception handling process, instead of repeating the same exception handling process in different NFS methods. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6451) NFS should not return NFS3ERR_IO for AccessControlException
[ https://issues.apache.org/jira/browse/HDFS-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6451: - Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) NFS should not return NFS3ERR_IO for AccessControlException Key: HDFS-6451 URL: https://issues.apache.org/jira/browse/HDFS-6451 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Abhiraj Butala Fix For: 2.6.0 Attachments: HDFS-6451.002.patch, HDFS-6451.003.patch, HDFS-6451.patch As [~jingzhao] pointed out in HDFS-6411, we need to catch the AccessControlException from the HDFS calls, and return NFS3ERR_ACCESS instead of NFS3ERR_IO for it. Another possible improvement is to have a single class/method for the common exception handling process, instead of repeating the same exception handling process in different NFS methods. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6451) NFS should not return NFS3ERR_IO for AccessControlException
[ https://issues.apache.org/jira/browse/HDFS-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084961#comment-14084961 ] Hudson commented on HDFS-6451: -- SUCCESS: Integrated in Hadoop-trunk-Commit #6009 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6009/]) HDFS-6451. NFS should not return NFS3ERR_IO for AccessControlException. Contributed by Abhiraj Butala (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1615702) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestRpcProgramNfs3.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt NFS should not return NFS3ERR_IO for AccessControlException Key: HDFS-6451 URL: https://issues.apache.org/jira/browse/HDFS-6451 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Abhiraj Butala Fix For: 2.6.0 Attachments: HDFS-6451.002.patch, HDFS-6451.003.patch, HDFS-6451.patch As [~jingzhao] pointed out in HDFS-6411, we need to catch the AccessControlException from the HDFS calls, and return NFS3ERR_ACCESS instead of NFS3ERR_IO for it. Another possible improvement is to have a single class/method for the common exception handling process, instead of repeating the same exception handling process in different NFS methods. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6451) NFS should not return NFS3ERR_IO for AccessControlException
[ https://issues.apache.org/jira/browse/HDFS-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6451: - Fix Version/s: 2.6.0 NFS should not return NFS3ERR_IO for AccessControlException Key: HDFS-6451 URL: https://issues.apache.org/jira/browse/HDFS-6451 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Abhiraj Butala Fix For: 2.6.0 Attachments: HDFS-6451.002.patch, HDFS-6451.003.patch, HDFS-6451.patch As [~jingzhao] pointed out in HDFS-6411, we need to catch the AccessControlException from the HDFS calls, and return NFS3ERR_ACCESS instead of NFS3ERR_IO for it. Another possible improvement is to have a single class/method for the common exception handling process, instead of repeating the same exception handling process in different NFS methods. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6451) NFS should not return NFS3ERR_IO for AccessControlException
[ https://issues.apache.org/jira/browse/HDFS-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6451: - Affects Version/s: 2.2.0 NFS should not return NFS3ERR_IO for AccessControlException Key: HDFS-6451 URL: https://issues.apache.org/jira/browse/HDFS-6451 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Abhiraj Butala Fix For: 2.6.0 Attachments: HDFS-6451.002.patch, HDFS-6451.003.patch, HDFS-6451.patch As [~jingzhao] pointed out in HDFS-6411, we need to catch the AccessControlException from the HDFS calls, and return NFS3ERR_ACCESS instead of NFS3ERR_IO for it. Another possible improvement is to have a single class/method for the common exception handling process, instead of repeating the same exception handling process in different NFS methods. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6790) DFSUtil Should Use configuration.getPassword for SSL passwords
[ https://issues.apache.org/jira/browse/HDFS-6790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Larry McCay updated HDFS-6790: -- Status: Open (was: Patch Available) DFSUtil Should Use configuration.getPassword for SSL passwords -- Key: HDFS-6790 URL: https://issues.apache.org/jira/browse/HDFS-6790 Project: Hadoop HDFS Issue Type: Bug Reporter: Larry McCay Attachments: HDFS-6790.patch, HDFS-6790.patch As part of HADOOP-10904, DFSUtil should be changed to leverage the new method on Configuration for acquiring known passwords for SSL. The getPassword method will leverage the credential provider API and/or fallback to the clear text value stored in ssl-server.xml. This will provide an alternative to clear text passwords on disk while maintaining backward compatibility for this behavior. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6790) DFSUtil Should Use configuration.getPassword for SSL passwords
[ https://issues.apache.org/jira/browse/HDFS-6790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Larry McCay updated HDFS-6790: -- Status: Patch Available (was: Open) DFSUtil Should Use configuration.getPassword for SSL passwords -- Key: HDFS-6790 URL: https://issues.apache.org/jira/browse/HDFS-6790 Project: Hadoop HDFS Issue Type: Bug Reporter: Larry McCay Attachments: HDFS-6790.patch, HDFS-6790.patch As part of HADOOP-10904, DFSUtil should be changed to leverage the new method on Configuration for acquiring known passwords for SSL. The getPassword method will leverage the credential provider API and/or fallback to the clear text value stored in ssl-server.xml. This will provide an alternative to clear text passwords on disk while maintaining backward compatibility for this behavior. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6811) Fix bad default value for dfs.namenode.list.encryption.zones.num.responses
[ https://issues.apache.org/jira/browse/HDFS-6811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084967#comment-14084967 ] Andrew Wang commented on HDFS-6811: --- No worries, I should have filed it as a subtask :) Fix bad default value for dfs.namenode.list.encryption.zones.num.responses -- Key: HDFS-6811 URL: https://issues.apache.org/jira/browse/HDFS-6811 Project: Hadoop HDFS Issue Type: Bug Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-6811.001.patch This came from a typo in HDFS-6780; false is not a valid integer. -- This message was sent by Atlassian JIRA (v6.2#6252)
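For reference, with the 100 default from DFSConfigKeys (per HDFS-6814), the corrected hdfs-default.xml entry would look like:

```xml
<property>
  <name>dfs.namenode.list.encryption.zones.num.responses</name>
  <value>100</value>
  <description>When listing encryption zones, the maximum number of zones
    that will be returned in a batch. Fetching the list incrementally in
    batches improves namenode performance.
  </description>
</property>
```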
[jira] [Commented] (HDFS-6634) inotify in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084973#comment-14084973 ] Hadoop QA commented on HDFS-6634: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12659307/inotify-design.2.pdf against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7553//console This message is automatically generated. inotify in HDFS --- Key: HDFS-6634 URL: https://issues.apache.org/jira/browse/HDFS-6634 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client, namenode, qjm Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6634.patch, inotify-design.2.pdf, inotify-design.pdf, inotify-intro.2.pdf, inotify-intro.pdf Design a mechanism for applications like search engines to access the HDFS edit stream. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6634) inotify in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Thomas updated HDFS-6634: --- Attachment: HDFS-6634.2.patch inotify in HDFS --- Key: HDFS-6634 URL: https://issues.apache.org/jira/browse/HDFS-6634 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client, namenode, qjm Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6634.2.patch, HDFS-6634.patch, inotify-design.2.pdf, inotify-design.pdf, inotify-intro.2.pdf, inotify-intro.pdf Design a mechanism for applications like search engines to access the HDFS edit stream. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6634) inotify in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085051#comment-14085051 ] Andrew Wang commented on HDFS-6634: --- Hey James, thanks for posting this. I took a quick look, have some surface-level comments:

Nits:
- Need license headers on new files
- Class javadoc for new classes would be nice
- InterfaceAudience and InterfaceStability annotations on new classes
- Could we pack each SubEvent as a nested class in Event? Then we can javadoc just Event. Could also squeeze EventList in there if you're feeling it.
- Javadoc has things like @link for referring to other classes and methods, @return and @param too. These might clarify things, especially for user-facing classes
- It'd be good to have an EventsList PB definition so it matches up with the Java classes.

Meta:
- Need a public class where we can get the new EventStream. My advice is putting it in DistributedFileSystem and then exposing it to users in HdfsAdmin.
- Slapping a big fat javadoc on the new user method in HdfsAdmin would be good, since it'll need to explain its intended purpose and how to properly use the API.
- If you included an event code or event type enum for each event, you could switch over that instead of using instanceof in PBHelper. It'd be kind of like how FSEditLogOpCodes work.

DFSInotifyEventInputStream:
- I think it'd be better if we had an API that did a get-with-timeout like {{Future#get}}; that seems like what an app would want.
- I also don't follow why we need resync. If we get an IOException in next(), what can the app do besides resync and retry?
- At a high level, it'd be good to think about how people will use this API and deal with failures. This is a good thing to experiment with in the test cases, forcing certain error conditions and seeing what the user code has to do to handle them.

NNRpcServer:
- We very likely need to do getEditsFromTxid with the FSNamesystem read lock held. However, we also don't want to do any kind of RPC with the FSN lock held.
- That while-loop condition looks hairy; could you split it up and add comments?
- What if a client is way behind when this is called? It looks like it ends up getting all the edit logs, which could be GBs. This API should be batched to only send a configurable # of events.

IFSELOTranslator:
- Can we do a switch/case on the op's opcode? It's an enum, so that would be faster/better than doing instanceof.

inotify in HDFS --- Key: HDFS-6634 URL: https://issues.apache.org/jira/browse/HDFS-6634 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client, namenode, qjm Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6634.2.patch, HDFS-6634.patch, inotify-design.2.pdf, inotify-design.pdf, inotify-intro.2.pdf, inotify-intro.pdf Design a mechanism for applications like search engines to access the HDFS edit stream. -- This message was sent by Atlassian JIRA (v6.2#6252)
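The get-with-timeout shape suggested above for DFSInotifyEventInputStream (like {{Future#get}}) could look like the following toy sketch, with a plain BlockingQueue standing in for the real edit-stream source -- not the actual API:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical event stream with a bounded wait instead of a blocking next().
class EventStreamSketch<E> {
    private final BlockingQueue<E> events = new LinkedBlockingQueue<>();

    void offer(E event) { events.add(event); }   // producer side (edit tailer)

    // Returns the next event, or null if none arrives within the timeout,
    // so callers can poll without blocking forever.
    E poll(long timeout, TimeUnit unit) throws InterruptedException {
        return events.poll(timeout, unit);
    }
}
```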
[jira] [Updated] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6134: --- Attachment: HDFS-6134.001.patch Submitting a first cut of the branch merge patch just to get a jenkins run going. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFS-6134.001.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6694) TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms
[ https://issues.apache.org/jira/browse/HDFS-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-6694: Attachment: HDFS-6694.001.dbg.patch Upload a version to dump out dbg message. TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms Key: HDFS-6694 URL: https://issues.apache.org/jira/browse/HDFS-6694 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Yongjun Zhang Attachments: HDFS-6694.001.dbg.patch, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover-output.txt, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover.txt TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms. Typical failures are described in first comment. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6694) TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms
[ https://issues.apache.org/jira/browse/HDFS-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-6694: Assignee: Yongjun Zhang Status: Patch Available (was: Open) TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms Key: HDFS-6694 URL: https://issues.apache.org/jira/browse/HDFS-6694 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6694.001.dbg.patch, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover-output.txt, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover.txt TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms. Typical failures are described in first comment. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6790) DFSUtil Should Use configuration.getPassword for SSL passwords
[ https://issues.apache.org/jira/browse/HDFS-6790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Larry McCay updated HDFS-6790: -- Attachment: HDFS-6790.patch Resubmit for another jenkins run. DFSUtil Should Use configuration.getPassword for SSL passwords -- Key: HDFS-6790 URL: https://issues.apache.org/jira/browse/HDFS-6790 Project: Hadoop HDFS Issue Type: Bug Reporter: Larry McCay Attachments: HDFS-6790.patch, HDFS-6790.patch, HDFS-6790.patch As part of HADOOP-10904, DFSUtil should be changed to leverage the new method on Configuration for acquiring known passwords for SSL. The getPassword method will leverage the credential provider API and/or fallback to the clear text value stored in ssl-server.xml. This will provide an alternative to clear text passwords on disk while maintaining backward compatibility for this behavior. -- This message was sent by Atlassian JIRA (v6.2#6252)
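The lookup contract described above (prefer the credential provider, fall back to the clear-text value in ssl-server.xml) can be sketched in plain Java. The maps below are hypothetical stand-ins for the credential store and the config file, not Hadoop's actual Configuration internals.

```java
import java.util.HashMap;
import java.util.Map;

class PasswordLookup {
    // Stand-ins for the credential provider store and ssl-server.xml values.
    final Map<String, char[]> credentialStore = new HashMap<>();
    final Map<String, String> clearTextConf = new HashMap<>();

    // Mirrors the described getPassword contract: prefer the credential
    // provider, fall back to the clear-text property, otherwise null.
    char[] getPassword(String name) {
        char[] fromProvider = credentialStore.get(name);
        if (fromProvider != null) {
            return fromProvider;
        }
        String clear = clearTextConf.get(name);
        return clear == null ? null : clear.toCharArray();
    }
}
```

The fallback arm is what preserves backward compatibility: deployments that never configure a credential provider keep reading the password straight from ssl-server.xml.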
[jira] [Updated] (HDFS-6790) DFSUtil Should Use configuration.getPassword for SSL passwords
[ https://issues.apache.org/jira/browse/HDFS-6790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Larry McCay updated HDFS-6790: -- Status: Patch Available (was: Open) DFSUtil Should Use configuration.getPassword for SSL passwords -- Key: HDFS-6790 URL: https://issues.apache.org/jira/browse/HDFS-6790 Project: Hadoop HDFS Issue Type: Bug Reporter: Larry McCay Attachments: HDFS-6790.patch, HDFS-6790.patch, HDFS-6790.patch As part of HADOOP-10904, DFSUtil should be changed to leverage the new method on Configuration for acquiring known passwords for SSL. The getPassword method will leverage the credential provider API and/or fallback to the clear text value stored in ssl-server.xml. This will provide an alternative to clear text passwords on disk while maintaining backward compatibility for this behavior. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6790) DFSUtil Should Use configuration.getPassword for SSL passwords
[ https://issues.apache.org/jira/browse/HDFS-6790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Larry McCay updated HDFS-6790: -- Status: Open (was: Patch Available) DFSUtil Should Use configuration.getPassword for SSL passwords -- Key: HDFS-6790 URL: https://issues.apache.org/jira/browse/HDFS-6790 Project: Hadoop HDFS Issue Type: Bug Reporter: Larry McCay Attachments: HDFS-6790.patch, HDFS-6790.patch, HDFS-6790.patch As part of HADOOP-10904, DFSUtil should be changed to leverage the new method on Configuration for acquiring known passwords for SSL. The getPassword method will leverage the credential provider API and/or fallback to the clear text value stored in ssl-server.xml. This will provide an alternative to clear text passwords on disk while maintaining backward compatibility for this behavior. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6134: --- Assignee: Charles Lamb (was: Alejandro Abdelnur) Affects Version/s: 3.0.0 Status: Patch Available (was: Reopened) submitting patch to get a jenkins run. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0, 3.0.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6425) Large postponedMisreplicatedBlocks has impact on blockReport latency
[ https://issues.apache.org/jira/browse/HDFS-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6425: -- Attachment: HDFS-6425-Test-Case.pdf HDFS-6425.patch Here is the initial patch. 1. Have HeartbeatManager compute # of stale storages periodically. 2. Have BlockManager's ReplicationMonitor rescan postponedMisreplicatedBlocks only if # of stale storages drops below the defined threshold. 3. Reset postponedMisreplicatedBlocks and postponedMisreplicatedBlocksCount upon fail over. This is to fix the SBN metrics so that the new SBN has metrics value of zero. Large postponedMisreplicatedBlocks has impact on blockReport latency Key: HDFS-6425 URL: https://issues.apache.org/jira/browse/HDFS-6425 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-6425-Test-Case.pdf, HDFS-6425.patch Sometimes we have large number of over replicates when NN fails over. When the new active NN took over, over replicated blocks will be put to postponedMisreplicatedBlocks until all DNs for that block aren't stale anymore. We have a case where NNs flip flop. Before postponedMisreplicatedBlocks became empty, NN fail over again and again. So postponedMisreplicatedBlocks just kept increasing until the cluster is stable. In addition, large postponedMisreplicatedBlocks could make rescanPostponedMisreplicatedBlocks slow. rescanPostponedMisreplicatedBlocks takes write lock. So it could slow down the block report processing. -- This message was sent by Atlassian JIRA (v6.2#6252)
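The gating logic in step 2 above (only rescan postponedMisreplicatedBlocks once the number of stale storages drops below a threshold, so the write-lock-holding scan stops competing with block report processing) can be sketched in plain Java. The class, field names, and fixed threshold here are illustrative stand-ins, not the actual BlockManager code.

```java
class ReplicationMonitorSketch {
    // Hypothetical fixed threshold; the patch describes a configurable one.
    static final int STALE_STORAGE_THRESHOLD = 10;

    int staleStorageCount;   // recomputed periodically (step 1, HeartbeatManager)
    int postponedBlockCount; // size of postponedMisreplicatedBlocks

    // Returns true if the rescan ran. While many storages are still stale,
    // skip the scan entirely so the write lock stays available for
    // block report processing.
    boolean maybeRescanPostponedBlocks() {
        if (staleStorageCount >= STALE_STORAGE_THRESHOLD) {
            return false;
        }
        // Placeholder for the real rescan; here we just pretend every
        // postponed block was resolved.
        postponedBlockCount = 0;
        return true;
    }
}
```

Step 3 (resetting the postponed set and its counter on failover) is what keeps the new standby's metrics at zero instead of carrying over the old active's backlog.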
[jira] [Updated] (HDFS-6425) Large postponedMisreplicatedBlocks has impact on blockReport latency
[ https://issues.apache.org/jira/browse/HDFS-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6425: -- Attachment: (was: HDFS-6425.patch) Large postponedMisreplicatedBlocks has impact on blockReport latency Key: HDFS-6425 URL: https://issues.apache.org/jira/browse/HDFS-6425 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-6425-Test-Case.pdf, HDFS-6425.patch Sometimes we have large number of over replicates when NN fails over. When the new active NN took over, over replicated blocks will be put to postponedMisreplicatedBlocks until all DNs for that block aren't stale anymore. We have a case where NNs flip flop. Before postponedMisreplicatedBlocks became empty, NN fail over again and again. So postponedMisreplicatedBlocks just kept increasing until the cluster is stable. In addition, large postponedMisreplicatedBlocks could make rescanPostponedMisreplicatedBlocks slow. rescanPostponedMisreplicatedBlocks takes write lock. So it could slow down the block report processing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6425) Large postponedMisreplicatedBlocks has impact on blockReport latency
[ https://issues.apache.org/jira/browse/HDFS-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6425: -- Status: Patch Available (was: Open) Large postponedMisreplicatedBlocks has impact on blockReport latency Key: HDFS-6425 URL: https://issues.apache.org/jira/browse/HDFS-6425 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-6425-Test-Case.pdf, HDFS-6425.patch Sometimes we have large number of over replicates when NN fails over. When the new active NN took over, over replicated blocks will be put to postponedMisreplicatedBlocks until all DNs for that block aren't stale anymore. We have a case where NNs flip flop. Before postponedMisreplicatedBlocks became empty, NN fail over again and again. So postponedMisreplicatedBlocks just kept increasing until the cluster is stable. In addition, large postponedMisreplicatedBlocks could make rescanPostponedMisreplicatedBlocks slow. rescanPostponedMisreplicatedBlocks takes write lock. So it could slow down the block report processing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085232#comment-14085232 ] Yongjun Zhang commented on HDFS-6776: - The posted patch also supports distcp from secure cluster to insecure cluster (as long as both clusters are hadoop 2, and issue the command from secure cluster side). distcp from insecure cluster (source) to secure cluster (destination) doesn't work -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 
ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640) at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at
[jira] [Commented] (HDFS-6634) inotify in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085271#comment-14085271 ] Hadoop QA commented on HDFS-6634: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12659683/HDFS-6634.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 12 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7555//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/7555//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/7555//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7555//console This message is automatically generated. 
inotify in HDFS --- Key: HDFS-6634 URL: https://issues.apache.org/jira/browse/HDFS-6634 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client, namenode, qjm Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6634.2.patch, HDFS-6634.patch, inotify-design.2.pdf, inotify-design.pdf, inotify-intro.2.pdf, inotify-intro.pdf Design a mechanism for applications like search engines to access the HDFS edit stream. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6634) inotify in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085267#comment-14085267 ] Hadoop QA commented on HDFS-6634: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12659683/HDFS-6634.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 12 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7554//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/7554//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/7554//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7554//console This message is automatically generated. 
inotify in HDFS --- Key: HDFS-6634 URL: https://issues.apache.org/jira/browse/HDFS-6634 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client, namenode, qjm Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6634.2.patch, HDFS-6634.patch, inotify-design.2.pdf, inotify-design.pdf, inotify-intro.2.pdf, inotify-intro.pdf Design a mechanism for applications like search engines to access the HDFS edit stream. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6394) HDFS encryption documentation
[ https://issues.apache.org/jira/browse/HDFS-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6394: -- Attachment: hdfs-6394.001.patch Patch attached. This is a little basic, but it'd be good to get something out there. We can do further improvements in a later JIRA (like adding diagrams and better setup and configuration instructions). HDFS encryption documentation - Key: HDFS-6394 URL: https://issues.apache.org/jira/browse/HDFS-6394 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Andrew Wang Attachments: hdfs-6394.001.patch Documentation for HDFS encryption behavior and configuration -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6425) Large postponedMisreplicatedBlocks has impact on blockReport latency
[ https://issues.apache.org/jira/browse/HDFS-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085278#comment-14085278 ] Hadoop QA commented on HDFS-6425: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12659706/HDFS-6425-Test-Case.pdf against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7559//console This message is automatically generated. Large postponedMisreplicatedBlocks has impact on blockReport latency Key: HDFS-6425 URL: https://issues.apache.org/jira/browse/HDFS-6425 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-6425-Test-Case.pdf, HDFS-6425.patch Sometimes we have large number of over replicates when NN fails over. When the new active NN took over, over replicated blocks will be put to postponedMisreplicatedBlocks until all DNs for that block aren't stale anymore. We have a case where NNs flip flop. Before postponedMisreplicatedBlocks became empty, NN fail over again and again. So postponedMisreplicatedBlocks just kept increasing until the cluster is stable. In addition, large postponedMisreplicatedBlocks could make rescanPostponedMisreplicatedBlocks slow. rescanPostponedMisreplicatedBlocks takes write lock. So it could slow down the block report processing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6394) HDFS encryption documentation
[ https://issues.apache.org/jira/browse/HDFS-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085320#comment-14085320 ] Charles Lamb commented on HDFS-6394: Nice writeup [~andrew.wang]. Here are some suggested changes:

bq. Once configured, data read from and written to HDFS will be transparently encrypted and decrypted without requiring changes to user application code.
s/will be/is/

bq. This encryption is also end-to-end, which means the data can only encrypted and decrypted by the client.
s/can only/is only/

bq. HDFS never stores or has access to unencrypted data or data encryption keys.
s/stores or has access to/stores, or has access to,/

bq. Having transparent encryption built-in to HDFS makes it easier for organizations to comply with these regulations.
s/built-in to/built into/

bq. Encryption can also be done at the application-level, but integrating it into HDFS means that existing HDFS applications can operate on encrypted data without changes.
but by integrating it into HDFS, existing HDFS applications can ...

bq. Integrating directly into HDFS also means we can provide stronger semantics about the handling of encrypted files, as well as better integration with other HDFS functionality.
Integrating directly also means that HDFS can provide stronger...

bq. The KMS implements additional functionality which enables creation and decryption of encrypted encryption keys (EEKs).
which enables encrypted encryption keys (EEKs) to be created and decrypted.

bq. When creating a new EEK, the KMS will generate a new random key, encrypt it with the specified key, and return the EEK to the client.
s/will generate/generates/, s/encrypt it/encrypts it/, s/return/returns/

bq. When decrypting an EEK, the KMS will check that the user has access to the encryption key, uses it to decrypt the EEK, and returns the decrypted encryption key.
s/will check/checks/

bq. When creating a new file in an encryption zone, the NameNode will ask the KMS to generate a new EDEK encrypted with the encryption zone's key.
s/will ask/asks/

bq. Assuming that is successful, the client can finally use the DEK to decrypt the file's contents.
s/can finally use/uses/

bq. All of the above steps for the read and write path happens automatically through interactions between the DFSClient, the NameNode, and the KMS.
s/happens/happen/

bq. It should be noted that access to encrypted file data and metadata is controlled by normal HDFS filesystem permissions.
s/It should be noted that access/Access/

bq. This means compromising HDFS (e.g., gaining access to an HDFS superuser account) allows access to ciphertext and encrypted keys.
This means that if the HDFS superuser account is compromised, access is gained to ciphertext and encrypted keys.

HDFS encryption documentation - Key: HDFS-6394 URL: https://issues.apache.org/jira/browse/HDFS-6394 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Andrew Wang Attachments: hdfs-6394.001.patch Documentation for HDFS encryption behavior and configuration -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6425) Large postponedMisreplicatedBlocks has impact on blockReport latency
[ https://issues.apache.org/jira/browse/HDFS-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6425: -- Attachment: (was: HDFS-6425.patch) Large postponedMisreplicatedBlocks has impact on blockReport latency Key: HDFS-6425 URL: https://issues.apache.org/jira/browse/HDFS-6425 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-6425-Test-Case.pdf Sometimes we have large number of over replicates when NN fails over. When the new active NN took over, over replicated blocks will be put to postponedMisreplicatedBlocks until all DNs for that block aren't stale anymore. We have a case where NNs flip flop. Before postponedMisreplicatedBlocks became empty, NN fail over again and again. So postponedMisreplicatedBlocks just kept increasing until the cluster is stable. In addition, large postponedMisreplicatedBlocks could make rescanPostponedMisreplicatedBlocks slow. rescanPostponedMisreplicatedBlocks takes write lock. So it could slow down the block report processing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6717) Jira HDFS-5804 breaks default nfs-gateway behavior for unsecured config
[ https://issues.apache.org/jira/browse/HDFS-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6717: - Attachment: HDFS-6717.more-change3.patch Jira HDFS-5804 breaks default nfs-gateway behavior for unsecured config --- Key: HDFS-6717 URL: https://issues.apache.org/jira/browse/HDFS-6717 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 2.4.0 Reporter: Jeff Hansen Assignee: Brandon Li Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6717.001.patch, HDFS-6717.more-change.patch, HDFS-6717.more-change2.patch, HDFS-6717.more-change3.patch, HdfsNfsGateway.html I believe this is just a matter of needing to update documentation. As a result of https://issues.apache.org/jira/browse/HDFS-5804, the secure and unsecure code paths appear to have been merged -- this is great because it means less code to test. However, it means that the default unsecure behavior requires additional configuration that needs to be documented. I'm not the first to have trouble following the instructions documented in http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html I kept hitting a RemoteException with the message that hdfs user cannot impersonate root -- apparently under the old code, there was no impersonation going on, so the nfs3 service could and should be run under the same user id that runs hadoop (I assumed this meant the user id hdfs). However, with the new unified code path, that would require hdfs to be able to impersonate root (because root is always the local user that mounts a drive). The comments in jira hdfs-5804 seem to indicate nobody has a problem with requiring the nfsserver user to impersonate root -- if that means it's necessary for the configuration to include root as a user nfsserver can impersonate, that should be included in the setup instructions. 
More to the point, it appears to be absolutely necessary now to provision a user named nfsserver in order to be able to give that nfsserver ability to impersonate other users. Alternatively I think we'd need to configure hdfs to be able to proxy other users. I'm not really sure what the best practice should be, but it should be documented since it wasn't needed in the past. -- This message was sent by Atlassian JIRA (v6.2#6252)
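The impersonation failure discussed above ("hdfs user cannot impersonate root") is the kind of thing usually addressed with proxy-user settings in core-site.xml. A hedged sketch, assuming the gateway runs as a user named nfsserver (the hadoop.proxyuser.* property names follow Hadoop's convention, but the user name and the wildcard values are deployment-specific choices, not a recommendation from this issue):

```xml
<!-- core-site.xml: allow the (assumed) gateway user "nfsserver" to
     impersonate the users who access the mount, including root. -->
<property>
  <name>hadoop.proxyuser.nfsserver.groups</name>
  <value>*</value>
  <description>Groups whose members the NFS gateway may impersonate.</description>
</property>
<property>
  <name>hadoop.proxyuser.nfsserver.hosts</name>
  <value>*</value>
  <description>Hosts from which the NFS gateway may impersonate users.</description>
</property>
```

Tightening the wildcards to specific groups and gateway hosts would be the safer production choice.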
[jira] [Commented] (HDFS-6717) Jira HDFS-5804 breaks default nfs-gateway behavior for unsecured config
[ https://issues.apache.org/jira/browse/HDFS-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085326#comment-14085326 ] Hudson commented on HDFS-6717: -- FAILURE: Integrated in Hadoop-trunk-Commit #6011 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6011/]) commit the additional doc change for HDFS-6717 (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1615801) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsNfsGateway.apt.vm Jira HDFS-5804 breaks default nfs-gateway behavior for unsecured config --- Key: HDFS-6717 URL: https://issues.apache.org/jira/browse/HDFS-6717 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 2.4.0 Reporter: Jeff Hansen Assignee: Brandon Li Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6717.001.patch, HDFS-6717.more-change.patch, HDFS-6717.more-change2.patch, HDFS-6717.more-change3.patch, HdfsNfsGateway.html I believe this is just a matter of needing to update documentation. As a result of https://issues.apache.org/jira/browse/HDFS-5804, the secure and unsecure code paths appear to have been merged -- this is great because it means less code to test. However, it means that the default unsecure behavior requires additional configuration that needs to be documented. I'm not the first to have trouble following the instructions documented in http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html I kept hitting a RemoteException with the message that hdfs user cannot impersonate root -- apparently under the old code, there was no impersonation going on, so the nfs3 service could and should be run under the same user id that runs hadoop (I assumed this meant the user id hdfs). However, with the new unified code path, that would require hdfs to be able to impersonate root (because root is always the local user that mounts a drive). 
The comments in JIRA HDFS-5804 seem to indicate nobody has a problem with requiring the nfsserver user to impersonate root -- if that means the configuration must list root as a user that nfsserver can impersonate, that should be included in the setup instructions. More to the point, it now appears to be absolutely necessary to provision a user named nfsserver in order to give that user the ability to impersonate other users. Alternatively, I think we'd need to configure hdfs to be able to proxy other users. I'm not really sure what the best practice should be, but it should be documented, since it wasn't needed in the past. -- This message was sent by Atlassian JIRA (v6.2#6252)
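One plausible shape for the missing setup, sketched from the discussion above: proxy-user entries in core-site.xml for the account that runs the nfs3 service. "nfsserver" is the example user from the thread, and the wildcard values are permissive placeholders that a production setup would narrow:

```xml
<!-- core-site.xml fragment (illustrative): allow the gateway user to
     impersonate the users, including root (which mounts the export),
     that access the NFS gateway. Substitute the account that actually
     runs the nfs3 service for "nfsserver". -->
<property>
  <name>hadoop.proxyuser.nfsserver.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.nfsserver.hosts</name>
  <value>*</value>
</property>
```

Restricting `groups` and `hosts` to the actual NFS client hosts and user groups is the usual hardening step once this works.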
[jira] [Commented] (HDFS-6717) Jira HDFS-5804 breaks default nfs-gateway behavior for unsecured config
[ https://issues.apache.org/jira/browse/HDFS-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085331#comment-14085331 ] Brandon Li commented on HDFS-6717: -- I've committed the additional doc change to trunk/branch2/2.5. Thank you, [~dscheffy]. Jira HDFS-5804 breaks default nfs-gateway behavior for unsecured config --- Key: HDFS-6717 URL: https://issues.apache.org/jira/browse/HDFS-6717 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 2.4.0 Reporter: Jeff Hansen Assignee: Brandon Li Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6717.001.patch, HDFS-6717.more-change.patch, HDFS-6717.more-change2.patch, HDFS-6717.more-change3.patch, HdfsNfsGateway.html -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6394) HDFS encryption documentation
[ https://issues.apache.org/jira/browse/HDFS-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085342#comment-14085342 ] Andrew Wang commented on HDFS-6394: --- Could you just post a diff with your desired changes? will be easier to review, and I can just apply them. HDFS encryption documentation - Key: HDFS-6394 URL: https://issues.apache.org/jira/browse/HDFS-6394 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Andrew Wang Attachments: hdfs-6394.001.patch Documentation for HDFS encryption behavior and configuration -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6694) TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms
[ https://issues.apache.org/jira/browse/HDFS-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085346#comment-14085346 ] Hadoop QA commented on HDFS-6694: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12659692/HDFS-6694.001.dbg.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7556//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7556//console This message is automatically generated. 
TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms Key: HDFS-6694 URL: https://issues.apache.org/jira/browse/HDFS-6694 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6694.001.dbg.patch, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover-output.txt, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover.txt TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms. Typical failures are described in the first comment. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6790) DFSUtil Should Use configuration.getPassword for SSL passwords
[ https://issues.apache.org/jira/browse/HDFS-6790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085365#comment-14085365 ] Hadoop QA commented on HDFS-6790: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12659697/HDFS-6790.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7557//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7557//console This message is automatically generated. DFSUtil Should Use configuration.getPassword for SSL passwords -- Key: HDFS-6790 URL: https://issues.apache.org/jira/browse/HDFS-6790 Project: Hadoop HDFS Issue Type: Bug Reporter: Larry McCay Attachments: HDFS-6790.patch, HDFS-6790.patch, HDFS-6790.patch As part of HADOOP-10904, DFSUtil should be changed to leverage the new method on Configuration for acquiring known passwords for SSL. 
The getPassword method will leverage the credential provider API and/or fall back to the clear-text value stored in ssl-server.xml. This will provide an alternative to clear-text passwords on disk while maintaining backward compatibility for this behavior. -- This message was sent by Atlassian JIRA (v6.2#6252)
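The fallback semantics described here can be shown with a toy model. This is a hedged sketch of the pattern only, a miniature stand-in for (not the real) org.apache.hadoop.conf.Configuration: try a credential provider first, then fall back to the clear-text property value.

```java
import java.util.HashMap;
import java.util.Map;

public class PasswordLookupSketch {
    // Stand-ins: "provider" models a credential-provider keystore,
    // "conf" models the clear-text ssl-server.xml properties.
    final Map<String, char[]> provider = new HashMap<>();
    final Map<String, String> conf = new HashMap<>();

    // Mirrors the described getPassword behavior: credential provider
    // first, then the clear-text value, else null.
    char[] getPassword(String name) {
        char[] fromProvider = provider.get(name);
        if (fromProvider != null) return fromProvider;
        String clear = conf.get(name);
        return clear == null ? null : clear.toCharArray();
    }

    public static void main(String[] args) {
        PasswordLookupSketch c = new PasswordLookupSketch();
        c.conf.put("ssl.server.keystore.password", "cleartext-secret");
        // No provider entry exists, so this falls back to the clear-text value.
        System.out.println(new String(c.getPassword("ssl.server.keystore.password")));
    }
}
```

Returning `char[]` (as the real API does) lets callers zero the buffer after use instead of leaving an immutable String in memory.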
[jira] [Updated] (HDFS-6394) HDFS encryption documentation
[ https://issues.apache.org/jira/browse/HDFS-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6394: --- Attachment: hdfs-6394.002.patch Suggested changes attached. HDFS encryption documentation - Key: HDFS-6394 URL: https://issues.apache.org/jira/browse/HDFS-6394 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Andrew Wang Attachments: hdfs-6394.001.patch, hdfs-6394.002.patch Documentation for HDFS encryption behavior and configuration -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6694) TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms
[ https://issues.apache.org/jira/browse/HDFS-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085395#comment-14085395 ] Yongjun Zhang commented on HDFS-6694: - My last test run indicated that the ulimit number of open files is 1024 on the machine {{Slave H5 (Build slave for Hadoop project builds : asf905.gq1.ygridcore.net)}}. However, when I ran the test locally, the number of open files is 4096. {code}
YJD ulimit -a contents:
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 8192
coredump(blocks) 0
memory(kbytes) unlimited
locked memory(kbytes) 64
process 386178
nofiles 1024 ==
vmemory(kbytes) unlimited
locks unlimited
{code} -- This message was sent by Atlassian JIRA (v6.2#6252)
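The limit difference described above is easy to check before a test run. A small portable-sh sketch (the 4096 threshold is just the local value quoted in the comment, not an official requirement):

```shell
#!/bin/sh
# Report the per-process open-file limits; the Jenkins slave showed a soft
# limit of 1024 vs 4096 locally, which matters for file-heavy stress tests.
soft=$(ulimit -Sn)
hard=$(ulimit -Hn)
echo "nofiles: soft=$soft hard=$hard"
if [ "$soft" != "unlimited" ] && [ "$soft" -lt 4096 ]; then
  echo "WARNING: soft nofile limit below 4096; stress tests may hit 'Too many open files'"
fi
```

Raising the soft limit up to the hard limit (`ulimit -Sn 4096`) is possible within a session; raising the hard limit needs administrator configuration.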
[jira] [Updated] (HDFS-6425) Large postponedMisreplicatedBlocks has impact on blockReport latency
[ https://issues.apache.org/jira/browse/HDFS-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6425: -- Attachment: HDFS-6425.patch Large postponedMisreplicatedBlocks has impact on blockReport latency Key: HDFS-6425 URL: https://issues.apache.org/jira/browse/HDFS-6425 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-6425-Test-Case.pdf, HDFS-6425.patch Sometimes we have a large number of over-replicated blocks when the NN fails over. When the new active NN takes over, over-replicated blocks are put into postponedMisreplicatedBlocks until no DN for that block is stale anymore. We have a case where the NNs flip-flop: before postponedMisreplicatedBlocks became empty, the NN failed over again and again, so postponedMisreplicatedBlocks just kept increasing until the cluster was stable. In addition, a large postponedMisreplicatedBlocks could make rescanPostponedMisreplicatedBlocks slow; rescanPostponedMisreplicatedBlocks takes the write lock, so it could slow down block report processing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6790) DFSUtil Should Use configuration.getPassword for SSL passwords
[ https://issues.apache.org/jira/browse/HDFS-6790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085408#comment-14085408 ] Larry McCay commented on HDFS-6790: --- Hmmm - failure must be related - I need to investigate further. DFSUtil Should Use configuration.getPassword for SSL passwords -- Key: HDFS-6790 URL: https://issues.apache.org/jira/browse/HDFS-6790 Project: Hadoop HDFS Issue Type: Bug Reporter: Larry McCay Attachments: HDFS-6790.patch, HDFS-6790.patch, HDFS-6790.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6673) Add Delimited format support for PB OIV tool
[ https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6673: Status: Open (was: Patch Available) Add Delimited format support for PB OIV tool - Key: HDFS-6673 URL: https://issues.apache.org/jira/browse/HDFS-6673 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch The new oiv tool, which is designed for the Protobuf fsimage, lacks a few features supported in the old {{oiv}} tool. This task adds support for the _Delimited_ processor to the oiv tool. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6537) Tests for Crypto filesystem decorating HDFS
[ https://issues.apache.org/jira/browse/HDFS-6537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-6537: - Issue Type: Test (was: Sub-task) Parent: (was: HDFS-6134) Tests for Crypto filesystem decorating HDFS --- Key: HDFS-6537 URL: https://issues.apache.org/jira/browse/HDFS-6537 Project: Hadoop HDFS Issue Type: Test Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Yi Liu Assignee: Yi Liu Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) Attachments: HDFS-6537.patch {{CryptoFileSystem}} targets other filesystems. But currently the other built-in Hadoop filesystems don't have XAttr support, so this JIRA uses HDFS for the tests. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6790) DFSUtil Should Use configuration.getPassword for SSL passwords
[ https://issues.apache.org/jira/browse/HDFS-6790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085625#comment-14085625 ] Larry McCay commented on HDFS-6790: --- It runs cleanly locally and I don't see any way in which this patch would have affected this test. Going back to my original assertion that it is unrelated. Has this been a flaky test lately? DFSUtil Should Use configuration.getPassword for SSL passwords -- Key: HDFS-6790 URL: https://issues.apache.org/jira/browse/HDFS-6790 Project: Hadoop HDFS Issue Type: Bug Reporter: Larry McCay Attachments: HDFS-6790.patch, HDFS-6790.patch, HDFS-6790.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6790) DFSUtil Should Use configuration.getPassword for SSL passwords
[ https://issues.apache.org/jira/browse/HDFS-6790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085627#comment-14085627 ] Larry McCay commented on HDFS-6790: --- Okay - this seems to be a known issue being addressed in HDFS-6694. This patch is fine. DFSUtil Should Use configuration.getPassword for SSL passwords -- Key: HDFS-6790 URL: https://issues.apache.org/jira/browse/HDFS-6790 Project: Hadoop HDFS Issue Type: Bug Reporter: Larry McCay Attachments: HDFS-6790.patch, HDFS-6790.patch, HDFS-6790.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6425) Large postponedMisreplicatedBlocks has impact on blockReport latency
[ https://issues.apache.org/jira/browse/HDFS-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085630#comment-14085630 ] Hadoop QA commented on HDFS-6425: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12659743/HDFS-6425.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7560//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7560//console This message is automatically generated. Large postponedMisreplicatedBlocks has impact on blockReport latency Key: HDFS-6425 URL: https://issues.apache.org/jira/browse/HDFS-6425 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-6425-Test-Case.pdf, HDFS-6425.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
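The latency mechanism in HDFS-6425, a long write-lock hold while rescanning a large postponed set, is commonly mitigated by scanning in bounded batches and releasing the lock between batches. A simplified, self-contained sketch of that idea (not the actual BlockManager code; `BATCH_SIZE` and the block ids are illustrative):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class PostponedRescanSketch {
    static final int BATCH_SIZE = 3; // illustrative; real code would tune this
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    final Deque<Long> postponed = new ArrayDeque<>();

    // Rescan in bounded batches, releasing the write lock between batches so
    // that block-report processing (which also needs the lock) is not starved.
    int rescanPostponed() {
        int processed = 0;
        while (true) {
            lock.writeLock().lock();
            try {
                if (postponed.isEmpty()) return processed;
                for (int i = 0; i < BATCH_SIZE && !postponed.isEmpty(); i++) {
                    postponed.poll(); // placeholder for the per-block re-check
                    processed++;
                }
            } finally {
                lock.writeLock().unlock();
            }
        }
    }

    public static void main(String[] args) {
        PostponedRescanSketch s = new PostponedRescanSketch();
        for (long b = 0; b < 10; b++) s.postponed.add(b);
        System.out.println("processed=" + s.rescanPostponed()); // processed=10
    }
}
```

The trade-off is that other writers may mutate state between batches, so each batch must re-validate the block it dequeues rather than trust state observed earlier.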
[jira] [Work started] (HDFS-6394) HDFS encryption documentation
[ https://issues.apache.org/jira/browse/HDFS-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-6394 started by Andrew Wang. HDFS encryption documentation - Key: HDFS-6394 URL: https://issues.apache.org/jira/browse/HDFS-6394 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Andrew Wang Attachments: hdfs-6394.001.patch, hdfs-6394.002.patch, hdfs-6394.003.patch Documentation for HDFS encryption behavior and configuration -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6394) HDFS encryption documentation
[ https://issues.apache.org/jira/browse/HDFS-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6394: -- Attachment: hdfs-6394.003.patch Nice, thanks Charles. I took almost all of your edits with a few additional tweaks, lemme know if it looks good. HDFS encryption documentation - Key: HDFS-6394 URL: https://issues.apache.org/jira/browse/HDFS-6394 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Andrew Wang Attachments: hdfs-6394.001.patch, hdfs-6394.002.patch, hdfs-6394.003.patch Documentation for HDFS encryption behavior and configuration -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6813) WebHdfsFileSystem#OffsetUrlInputStream should implement PositionedReadable in a thread-safe way
[ https://issues.apache.org/jira/browse/HDFS-6813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085634#comment-14085634 ] Tsz Wo Nicholas Sze commented on HDFS-6813: --- I think requiring PositionedReadable to be thread-safe may be overkill. I believe few user applications need thread safety and, if it is needed, it is very easy to synchronize in the user application. I have just checked Java's InputStream and DataInputStream; they do not enforce thread safety. The [DataInputStream javadoc|http://docs.oracle.com/javase/7/docs/api/java/io/DataInputStream.html] even emphasizes that it is not thread-safe: bq. DataInputStream is not necessarily safe for multithreaded access. Thread safety is optional and is the responsibility of users of methods in this class. I suggest simply updating the javadoc of PositionedReadable. What do you think? WebHdfsFileSystem#OffsetUrlInputStream should implement PositionedReadable in a thread-safe way --- Key: HDFS-6813 URL: https://issues.apache.org/jira/browse/HDFS-6813 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.6.0 Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-6813.001.patch The {{PositionedReadable}} definition requires that implementations of its interfaces be thread-safe. OffsetUrlInputStream (the WebHdfsFileSystem input stream) doesn't implement these interfaces in a thread-safe way; this JIRA is to fix that. -- This message was sent by Atlassian JIRA (v6.2#6252)
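The hazard under discussion becomes concrete when a positioned read is implemented as seek-then-read over a shared offset. A minimal sketch with toy classes (not the actual WebHdfs code) showing why the lock matters: without it, two threads interleaving seek and read would clobber each other's offset.

```java
public class PreadSketch {
    // Toy seekable stream: one shared offset, like a sequential input stream.
    static class SeekableBytes {
        private final byte[] data;
        private int pos;
        SeekableBytes(byte[] data) { this.data = data; }

        // Unsynchronized sequential read that advances the shared offset.
        int read() { return pos < data.length ? data[pos++] & 0xff : -1; }

        // Positioned read: seek + copy + restore must be atomic, or a
        // concurrent caller observes a half-moved offset.
        synchronized int read(long position, byte[] buf, int off, int len) {
            int saved = pos;
            try {
                pos = (int) position;
                int n = Math.min(len, data.length - pos);
                if (n <= 0) return -1;
                System.arraycopy(data, pos, buf, off, n);
                return n;
            } finally {
                pos = saved; // positioned reads must not move the stream offset
            }
        }
    }

    public static void main(String[] args) {
        SeekableBytes s = new SeekableBytes(new byte[] {10, 20, 30, 40});
        byte[] b = new byte[2];
        s.read(2, b, 0, 2);
        System.out.println(b[0] + "," + b[1] + " next=" + s.read());
        // prints 30,40 next=10 : the pread did not disturb the offset
    }
}
```

This also illustrates the counter-argument in the comment: the same `synchronized` wrapper is trivially added by a caller, which is why documenting the contract may suffice.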
[jira] [Commented] (HDFS-6694) TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms
[ https://issues.apache.org/jira/browse/HDFS-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085643#comment-14085643 ] Larry McCay commented on HDFS-6694: --- I am seeing this with my patch for HDFS-6790 {code}
Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 82.647 sec FAILURE! - in org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
testPipelineRecoveryStress(org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover) Time elapsed: 33.705 sec ERROR!
java.lang.RuntimeException: Deferred
at org.apache.hadoop.test.MultithreadedTestUtil$TestContext.checkException(MultithreadedTestUtil.java:130)
at org.apache.hadoop.test.MultithreadedTestUtil$TestContext.waitFor(MultithreadedTestUtil.java:121)
at org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover.testPipelineRecoveryStress(TestPipelinesFailover.java:456)
Caused by: org.apache.hadoop.ipc.RemoteException: File /test-7 could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1486)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2801)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:613)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:462)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:607)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:932)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2099)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2095)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1626)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2093)
at org.apache.hadoop.ipc.Client.call(Client.java:1411)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy17.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:372)
at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
at com.sun.proxy.$Proxy18.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1442)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1265)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:521)
{code} TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms Key: HDFS-6694 URL: https://issues.apache.org/jira/browse/HDFS-6694 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6694.001.dbg.patch, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover-output.txt, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover.txt -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085649#comment-14085649 ] Hadoop QA commented on HDFS-6134: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12659689/HDFS-6134.001.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 38 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1262 javac compiler warnings (more than the trunk's current 1259 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 9 new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient: org.apache.hadoop.ha.TestZKFailoverControllerStress org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. 
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7558//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/7558//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/7558//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/7558//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/7558//artifact/trunk/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7558//console This message is automatically generated. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 3.0.0, 2.3.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-4754) Add an API in the namenode to mark a datanode as stale
[ https://issues.apache.org/jira/browse/HDFS-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-4754: - Target Version/s: 3.0.0, 2.5.0 Fix Version/s: (was: 2.5.0) (was: 3.0.0) Add an API in the namenode to mark a datanode as stale -- Key: HDFS-4754 URL: https://issues.apache.org/jira/browse/HDFS-4754 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, namenode Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Critical Attachments: 4754.v1.patch, 4754.v2.patch, 4754.v4.patch, 4754.v4.patch There has been stale-datanode detection in HDFS since HDFS-3703, with a timeout defaulting to 30s. There are two reasons to add an API to mark a node as stale even if the timeout has not yet been reached: 1) ZooKeeper can detect that a client is dead at any moment. So, for HBase, we sometimes start the recovery before a node is marked stale (even with reasonable settings such as stale: 20s; HBase ZK timeout: 30s). 2) Some third parties could detect that a node is dead before the timeout, hence saving us the cost of retrying. An example of such hardware is Arista, presented here by [~tsuna] http://tsunanet.net/~tsuna/fsf-hbase-meetup-april13.pdf, and confirmed in HBASE-6290. As usual, even if the node is dead it can come back before the 10-minute limit. So I would propose to set a time bound. The API would be namenode.markStale(String ipAddress, int port, long durationInMs); After durationInMs, the namenode would again rely only on its heartbeat to decide. Thoughts? If there are no objections, and if nobody in the hdfs dev team has the time to spend some time on it, I will give it a try for branch 2 & 3. -- This message was sent by Atlassian JIRA (v6.2#6252)
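The time-bounded semantics proposed above can be sketched with a small tracker. The method name and duration parameter follow the proposal, but everything else here is a hypothetical illustration, not an actual NameNode API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StaleNodeTracker {
    private final Map<String, Long> staleUntil = new ConcurrentHashMap<>();

    // Mark ipAddress:port as stale for durationInMs; afterwards the namenode
    // would again rely only on heartbeats (per the proposal).
    public void markStale(String ipAddress, int port, long durationInMs) {
        staleUntil.put(ipAddress + ":" + port, System.currentTimeMillis() + durationInMs);
    }

    public boolean isMarkedStale(String ipAddress, int port) {
        Long until = staleUntil.get(ipAddress + ":" + port);
        return until != null && until > System.currentTimeMillis();
    }

    public static void main(String[] args) {
        StaleNodeTracker t = new StaleNodeTracker();
        t.markStale("10.0.0.5", 50010, 20_000); // e.g. HBase's 20s stale window
        System.out.println(t.isMarkedStale("10.0.0.5", 50010)); // true
        System.out.println(t.isMarkedStale("10.0.0.6", 50010)); // false
    }
}
```

Expiring by wall-clock timestamp, as here, gives exactly the "after durationInMs, fall back to heartbeats" behavior without needing a cleanup thread.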
[jira] [Updated] (HDFS-6636) NameNode should remove block replica out from corrupted replica map when adding block under construction
[ https://issues.apache.org/jira/browse/HDFS-6636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gordon Wang updated HDFS-6636:
Attachment: (was: TestCompleteFileWithCorruptBlock.java)

NameNode should remove block replica out from corrupted replica map when adding block under construction
Key: HDFS-6636
URL: https://issues.apache.org/jira/browse/HDFS-6636
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.2.0
Reporter: Gordon Wang

In our test environment, we found that the namenode cannot handle an incremental block report correctly when the block replica is under construction and the replica is marked as corrupt. Here is our scenario.
* The block had 3 replicas by default. But because one datanode was down, there were only 2 available replicas, on the two alive datanodes DN1 and DN2.
* A client tried to append data to the block. During the append, something went wrong with the pipeline, so the client did a pipeline recovery; only one datanode, DN1, was left in the pipeline.
* For some unknown reason (possibly an IO error), DN2 got a checksum error when receiving block data from DN1, so DN2 reported the replica on DN1 as a bad block to the NameNode. But the client was actually still appending data to the replica on DN1, and the replica was good.
* The NameNode marked the replica on DN1 as corrupt.
* When the client finished appending, DN1 checked the data in the replica and found it OK. DN1 finalized the replica and reported the block as a received block to the NameNode.
* The NameNode handled the incremental block report from DN1; because the block was under construction, it called addStoredBlockUnderConstruction in the block manager. But the replica on DN1 was never removed from the corrupt replica map, so the number of live replicas for the block was 0 and the number of corrupt replicas was 1.
* The client could not complete the file because the number of live replicas for the last block was smaller than the minimal replica number.
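The fix the title asks for can be modeled in a few lines: when a replica is reported for an under-construction block, first drop any stale corrupt record for that (block, datanode) pair before counting it as live. This is a toy model with illustrative names, not the real BlockManager code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of the proposed fix: a fresh report of a replica for an
// under-construction block supersedes an earlier corrupt marking for the
// same datanode. Names are hypothetical; the real logic lives in BlockManager.
class CorruptReplicaModel {
    private final Map<String, Set<String>> corruptReplicas = new HashMap<>();
    private final Map<String, Set<String>> liveReplicas = new HashMap<>();

    // DN2's checksum-error report lands here.
    void markCorrupt(String blockId, String datanode) {
        corruptReplicas.computeIfAbsent(blockId, k -> new HashSet<>()).add(datanode);
    }

    // Analogue of addStoredBlockUnderConstruction: remove the reporting
    // datanode from the corrupt map before adding it as a live replica.
    void addStoredBlockUnderConstruction(String blockId, String datanode) {
        Set<String> corrupt = corruptReplicas.get(blockId);
        if (corrupt != null) {
            corrupt.remove(datanode);
        }
        liveReplicas.computeIfAbsent(blockId, k -> new HashSet<>()).add(datanode);
    }

    int liveCount(String blockId) {
        return liveReplicas.getOrDefault(blockId, new HashSet<>()).size();
    }

    int corruptCount(String blockId) {
        return corruptReplicas.getOrDefault(blockId, new HashSet<>()).size();
    }
}
```

Without the remove step, the scenario in the description ends with 0 live replicas and 1 corrupt replica for the last block, so completeFile fails; with it, DN1's finalized replica counts as live again.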
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-4754) Add an API in the namenode to mark a datanode as stale
[ https://issues.apache.org/jira/browse/HDFS-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085718#comment-14085718 ] Hadoop QA commented on HDFS-4754:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12596540/4754.v4.patch against trunk revision .
{color:red}-1 patch{color}. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7561//console
This message is automatically generated.

Add an API in the namenode to mark a datanode as stale
--
Key: HDFS-4754
URL: https://issues.apache.org/jira/browse/HDFS-4754
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs-client, namenode
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
Priority: Critical
Attachments: 4754.v1.patch, 4754.v2.patch, 4754.v4.patch, 4754.v4.patch

HDFS has had stale datanode detection since HDFS-3703, based on a timeout that defaults to 30s. There are two reasons to add an API that marks a node as stale even before the timeout is reached:
1) ZooKeeper can detect that a client is dead at any moment. So, for HBase, we sometimes start the recovery before a node is marked stale (even with reasonable settings such as: stale: 20s; HBase ZK timeout: 30s).
2) Some third parties could detect that a node is dead before the timeout, hence saving us the cost of retrying. An example of such hardware is Arista, presented here by [~tsuna] http://tsunanet.net/~tsuna/fsf-hbase-meetup-april13.pdf, and confirmed in HBASE-6290.
As usual, even if the node is dead it can come back before the 10-minute limit. So I would propose to set a time bound. The API would be namenode.markStale(String ipAddress, int port, long durationInMs); after durationInMs, the namenode would again rely only on its heartbeat to decide. Thoughts?
If there are no objections, and if nobody on the hdfs dev team has the time to spend on it, I will give it a try for branch 2 3. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6816) Standby NN new webUI Startup Progress tab displays Failed to retrieve data from ... error
Ming Ma created HDFS-6816:
Summary: Standby NN new webUI Startup Progress tab displays "Failed to retrieve data from ..." error
Key: HDFS-6816
URL: https://issues.apache.org/jira/browse/HDFS-6816
Project: Hadoop HDFS
Issue Type: Sub-task
Reporter: Ming Ma

The standby NN webUI dfshealth.html#tab-startup-progress displays a "Failed to retrieve data from ..." message.
-- This message was sent by Atlassian JIRA (v6.2#6252)