[jira] [Commented] (HDFS-8721) Add a metric for number of encryption zones
[ https://issues.apache.org/jira/browse/HDFS-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637050#comment-14637050 ] Hudson commented on HDFS-8721: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #261 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/261/]) HDFS-8721. Add a metric for number of encryption zones. Contributed by Rakesh R. (cnauroth: rev cb03768b1b2250b9b5a7944cf6ef918e8a974e20) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSNamesystemMBean.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EncryptionZoneManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZones.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/metrics/FSNamesystemMBean.java * hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md Add a metric for number of encryption zones --- Key: HDFS-8721 URL: https://issues.apache.org/jira/browse/HDFS-8721 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.8.0 Attachments: HDFS-8721-00.patch, HDFS-8721-01.patch Would be good to expose the number of encryption zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8795) Improve InvalidateBlocks#node2blocks
[ https://issues.apache.org/jira/browse/HDFS-8795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637052#comment-14637052 ] Hudson commented on HDFS-8795: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #261 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/261/]) HDFS-8795. Improve InvalidateBlocks#node2blocks. (yliu) (yliu: rev 4025326288c0167ff300d4f7ecc96f84ed141912) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/InvalidateBlocks.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Improve InvalidateBlocks#node2blocks Key: HDFS-8795 URL: https://issues.apache.org/jira/browse/HDFS-8795 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HDFS-8795.001.patch, HDFS-8795.002.patch Currently we use a {{TreeMap}} for {{node2blocks}}, but only two places may need sorted order: {{dump}} and {{getDatanodes}}. {{getDatanodes}} is called by {{computeInvalidateWork}}, which shuffles the result, so the sort is unnecessary there. For {{dump}} we only need a few modifications. So we can use a {{HashMap}}: from a memory and performance standpoint, {{HashMap}} is better than {{TreeMap}}. A similar optimization was done in HDFS-7433. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
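The idea above — keep the map unsorted and pay for sorting only on the rare dump path — can be sketched as follows. This is a simplified, hypothetical illustration; the class and method names are illustrative and not the actual Hadoop {{InvalidateBlocks}} implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the HDFS-8795 idea: node2blocks does not need a TreeMap,
// because the only consumer that needs sorted order is dump().
// Names (Node2Blocks, add, dump, getDatanodes) are illustrative.
class Node2Blocks {
  // HashMap: O(1) lookup/insert vs TreeMap's O(log n), and less memory per entry.
  private final Map<String, List<Long>> node2blocks = new HashMap<>();

  void add(String datanode, long blockId) {
    node2blocks.computeIfAbsent(datanode, k -> new ArrayList<>()).add(blockId);
  }

  // Callers of getDatanodes() shuffle the result anyway, so no sorted order is needed.
  List<String> getDatanodes() {
    return new ArrayList<>(node2blocks.keySet());
  }

  // dump() is rare, so sort on demand instead of keeping the whole map sorted.
  List<String> dump() {
    List<String> keys = new ArrayList<>(node2blocks.keySet());
    Collections.sort(keys);
    return keys;
  }
}
```

The trade-off is the usual one: a `TreeMap` keeps every mutation at O(log n) to preserve an ordering that, per the comment above, only `dump` actually consumes.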
[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks
[ https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637056#comment-14637056 ] Hudson commented on HDFS-8344: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #261 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/261/]) Revert HDFS-8344. NameNode doesn't recover lease for files with missing blocks (raviprak) (raviprak: rev 5137b388fc9d4d716f780daf6d04292feeb9df96) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestLeaseRecovery.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java NameNode doesn't recover lease for files with missing blocks Key: HDFS-8344 URL: https://issues.apache.org/jira/browse/HDFS-8344 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Ravi Prakash Assignee: Ravi Prakash Fix For: 2.8.0 Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch, HDFS-8344.06.patch, HDFS-8344.07.patch, HDFS-8344.08.patch I found another\(?) instance in which the lease is not recovered. This is easily reproducible on a pseudo-distributed single-node cluster.
# Before you start, it helps to set the following. This is not necessary, but it reduces how long you have to wait:
{code}
public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
public static final long LEASE_HARDLIMIT_PERIOD = 2 * LEASE_SOFTLIMIT_PERIOD;
{code}
# Client starts to write a file.
(could be less than 1 block, but it hflushed, so some of the data has landed on the datanodes). (I'm copying the client code I am using. I generate a jar and run it using $ hadoop jar TestHadoop.jar.)
# Client crashes. (I simulate this by running kill -9 on the hadoop jar TestHadoop.jar process after it has printed "Wrote to the bufferedWriter".)
# Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was only 1.)
I believe the lease should be recovered and the block should be marked missing. However, this is not happening: the lease is never recovered. The effect of this bug for us was that nodes could not be decommissioned cleanly. Although we knew that the client had crashed, the NameNode never released the leases (even after restarting the NameNode, even months afterwards). There are actually several other cases where we don't consider what happens if ALL the datanodes die while the file is being written, but I am going to punt on those for another time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8773) Few FSNamesystem metrics are not documented in the Metrics page
[ https://issues.apache.org/jira/browse/HDFS-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637046#comment-14637046 ] Hudson commented on HDFS-8773: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #261 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/261/]) HDFS-8773. Few FSNamesystem metrics are not documented in the Metrics page. Contributed by Rakesh R. (cnauroth: rev a26cc66f38daec2342215a66b599bf59cee1112c) * hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Few FSNamesystem metrics are not documented in the Metrics page --- Key: HDFS-8773 URL: https://issues.apache.org/jira/browse/HDFS-8773 Project: Hadoop HDFS Issue Type: Bug Components: documentation Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.8.0 Attachments: HDFS-8773-00.patch This jira is to document missing metrics in the [Metrics page|https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Metrics.html#FSNamesystem]. Following are not documented: {code} MissingReplOneBlocks NumFilesUnderConstruction NumActiveClients HAState FSState {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8495) Consolidate append() related implementation into a single class
[ https://issues.apache.org/jira/browse/HDFS-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637054#comment-14637054 ] Hudson commented on HDFS-8495: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #261 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/261/]) HDFS-8495. Consolidate append() related implementation into a single class. Contributed by Rakesh R. (wheat9: rev 31f117138a00794de4951ee8433e304d72b04094) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirAppendOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirTruncateOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirWriteFileOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirStatAndListingOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java Consolidate append() related implementation into a single class --- Key: HDFS-8495 URL: https://issues.apache.org/jira/browse/HDFS-8495 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.8.0 Attachments: HDFS-8495-000.patch, HDFS-8495-001.patch, HDFS-8495-002.patch, HDFS-8495-003.patch, HDFS-8495-003.patch, HDFS-8495-004.patch, HDFS-8495-005.patch, HDFS-8495-006.patch This jira proposes to consolidate {{FSNamesystem#append()}} related methods into a single class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8800) hdfs --daemon stop namenode corrupts logs
[ https://issues.apache.org/jira/browse/HDFS-8800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637095#comment-14637095 ] Hudson commented on HDFS-8800: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2210 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2210/]) HDFS-8800. hdfs --daemon stop namenode corrupts logs (John Smith via aw) (aw: rev c9507fe6c12491f3aef5cd4142b4d466bd6b71c3) * hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt hdfs --daemon stop namenode corrupts logs - Key: HDFS-8800 URL: https://issues.apache.org/jira/browse/HDFS-8800 Project: Hadoop HDFS Issue Type: Bug Components: scripts Affects Versions: 3.0.0 Reporter: John Smith Assignee: John Smith Fix For: 3.0.0 Attachments: HDFS-8800 namenode stop creates stack traces and extra gc logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated HDFS-6407: --- Attachment: HDFS-6407.4.patch [~wheat9], could you please take a look at my patch and provide some feedback? I still use the datatable plugin, but I work around the sorting concerns you raised by not letting the dust template do the parsing; instead, datatable uses the raw data to sort and displays the data with units. new namenode UI, lost ability to sort columns in datanode tab - Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Benoy Antony Priority: Minor Labels: BB2015-05-TBR Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.4.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637138#comment-14637138 ] Hadoop QA commented on HDFS-6407: -
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 0m 0s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | release audit | 0m 17s | The applied patch generated 2 release audit warnings. |
| {color:red}-1{color} | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| | | 0m 20s | |
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12746563/HDFS-6407.4.patch |
| Optional Tests | |
| git revision | trunk / 4025326 |
| Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/11788/artifact/patchprocess/patchReleaseAuditProblems.txt |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11788/artifact/patchprocess/whitespace.txt |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11788/console |
This message was automatically generated.
new namenode UI, lost ability to sort columns in datanode tab - Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Benoy Antony Priority: Minor Labels: BB2015-05-TBR Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.4.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8800) hdfs --daemon stop namenode corrupts logs
[ https://issues.apache.org/jira/browse/HDFS-8800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637057#comment-14637057 ] Hudson commented on HDFS-8800: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #261 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/261/]) HDFS-8800. hdfs --daemon stop namenode corrupts logs (John Smith via aw) (aw: rev c9507fe6c12491f3aef5cd4142b4d466bd6b71c3) * hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt hdfs --daemon stop namenode corrupts logs - Key: HDFS-8800 URL: https://issues.apache.org/jira/browse/HDFS-8800 Project: Hadoop HDFS Issue Type: Bug Components: scripts Affects Versions: 3.0.0 Reporter: John Smith Assignee: John Smith Fix For: 3.0.0 Attachments: HDFS-8800 namenode stop creates stack traces and extra gc logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8795) Improve InvalidateBlocks#node2blocks
[ https://issues.apache.org/jira/browse/HDFS-8795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637090#comment-14637090 ] Hudson commented on HDFS-8795: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2210 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2210/]) HDFS-8795. Improve InvalidateBlocks#node2blocks. (yliu) (yliu: rev 4025326288c0167ff300d4f7ecc96f84ed141912) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/InvalidateBlocks.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Improve InvalidateBlocks#node2blocks Key: HDFS-8795 URL: https://issues.apache.org/jira/browse/HDFS-8795 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HDFS-8795.001.patch, HDFS-8795.002.patch Currently we use a {{TreeMap}} for {{node2blocks}}, but only two places may need sorted order: {{dump}} and {{getDatanodes}}. {{getDatanodes}} is called by {{computeInvalidateWork}}, which shuffles the result, so the sort is unnecessary there. For {{dump}} we only need a few modifications. So we can use a {{HashMap}}: from a memory and performance standpoint, {{HashMap}} is better than {{TreeMap}}. A similar optimization was done in HDFS-7433. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks
[ https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637094#comment-14637094 ] Hudson commented on HDFS-8344: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2210 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2210/]) Revert HDFS-8344. NameNode doesn't recover lease for files with missing blocks (raviprak) (raviprak: rev 5137b388fc9d4d716f780daf6d04292feeb9df96) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestLeaseRecovery.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java NameNode doesn't recover lease for files with missing blocks Key: HDFS-8344 URL: https://issues.apache.org/jira/browse/HDFS-8344 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Ravi Prakash Assignee: Ravi Prakash Fix For: 2.8.0 Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch, HDFS-8344.06.patch, HDFS-8344.07.patch, HDFS-8344.08.patch I found another\(?) instance in which the lease is not recovered. This is easily reproducible on a pseudo-distributed single-node cluster.
# Before you start, it helps to set the following. This is not necessary, but it reduces how long you have to wait:
{code}
public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
public static final long LEASE_HARDLIMIT_PERIOD = 2 * LEASE_SOFTLIMIT_PERIOD;
{code}
# Client starts to write a file.
(could be less than 1 block, but it hflushed, so some of the data has landed on the datanodes). (I'm copying the client code I am using. I generate a jar and run it using $ hadoop jar TestHadoop.jar.)
# Client crashes. (I simulate this by running kill -9 on the hadoop jar TestHadoop.jar process after it has printed "Wrote to the bufferedWriter".)
# Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was only 1.)
I believe the lease should be recovered and the block should be marked missing. However, this is not happening: the lease is never recovered. The effect of this bug for us was that nodes could not be decommissioned cleanly. Although we knew that the client had crashed, the NameNode never released the leases (even after restarting the NameNode, even months afterwards). There are actually several other cases where we don't consider what happens if ALL the datanodes die while the file is being written, but I am going to punt on those for another time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8773) Few FSNamesystem metrics are not documented in the Metrics page
[ https://issues.apache.org/jira/browse/HDFS-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637084#comment-14637084 ] Hudson commented on HDFS-8773: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2210 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2210/]) HDFS-8773. Few FSNamesystem metrics are not documented in the Metrics page. Contributed by Rakesh R. (cnauroth: rev a26cc66f38daec2342215a66b599bf59cee1112c) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md Few FSNamesystem metrics are not documented in the Metrics page --- Key: HDFS-8773 URL: https://issues.apache.org/jira/browse/HDFS-8773 Project: Hadoop HDFS Issue Type: Bug Components: documentation Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.8.0 Attachments: HDFS-8773-00.patch This jira is to document missing metrics in the [Metrics page|https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Metrics.html#FSNamesystem]. Following are not documented: {code} MissingReplOneBlocks NumFilesUnderConstruction NumActiveClients HAState FSState {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8495) Consolidate append() related implementation into a single class
[ https://issues.apache.org/jira/browse/HDFS-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637092#comment-14637092 ] Hudson commented on HDFS-8495: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2210 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2210/]) HDFS-8495. Consolidate append() related implementation into a single class. Contributed by Rakesh R. (wheat9: rev 31f117138a00794de4951ee8433e304d72b04094) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirStatAndListingOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirAppendOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirWriteFileOp.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirTruncateOp.java Consolidate append() related implementation into a single class --- Key: HDFS-8495 URL: https://issues.apache.org/jira/browse/HDFS-8495 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.8.0 Attachments: HDFS-8495-000.patch, HDFS-8495-001.patch, HDFS-8495-002.patch, HDFS-8495-003.patch, HDFS-8495-003.patch, HDFS-8495-004.patch, HDFS-8495-005.patch, HDFS-8495-006.patch This jira proposes to consolidate {{FSNamesystem#append()}} related methods into a single class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8721) Add a metric for number of encryption zones
[ https://issues.apache.org/jira/browse/HDFS-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637088#comment-14637088 ] Hudson commented on HDFS-8721: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2210 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2210/]) HDFS-8721. Add a metric for number of encryption zones. Contributed by Rakesh R. (cnauroth: rev cb03768b1b2250b9b5a7944cf6ef918e8a974e20) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZones.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSNamesystemMBean.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/metrics/FSNamesystemMBean.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EncryptionZoneManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java Add a metric for number of encryption zones --- Key: HDFS-8721 URL: https://issues.apache.org/jira/browse/HDFS-8721 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.8.0 Attachments: HDFS-8721-00.patch, HDFS-8721-01.patch Would be good to expose the number of encryption zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8790) Add Filesystem level stress tests
[ https://issues.apache.org/jira/browse/HDFS-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637340#comment-14637340 ] Steve Loughran commented on HDFS-8790: -- James, it's about concurrency. Specifically:
# if positioned read is a seek+read+seek-back sequence, is that intermediate position visible?
# do >1 operations interfere?
This situation arose in encrypted HDFS and HBase. Add Filesystem level stress tests - Key: HDFS-8790 URL: https://issues.apache.org/jira/browse/HDFS-8790 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: James Clampffer Assignee: James Clampffer I propose adding stress tests on the libhdfs(3) compatibility layer as well as the async calls. These can also be used for basic performance metrics and as inputs to profiling tools to see improvements over time. I'd like to make these tests a separate executable, or set of them, so that they can be used for longer-running tests on dedicated clusters that may already exist. Each should provide a simple command line interface for scripted or manual use. Basic tests would be: looped open-read-close, sequential scans, small random reads. All tests will be parameterized for number of threads, read size, and upper and lower offset bounds for a specified file. This will make it much easier to detect and reproduce threading issues and resource leaks, as well as provide a simple executable (or set of executables) that can be run with valgrind to gain high confidence that the code is operating correctly. I'd appreciate suggestions for any other simple stress tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
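The kind of parameterized positioned-read stress test described above (threads, read size, offset bounds) can be sketched in a few lines. This is a hypothetical harness over an in-memory "file", not the actual libhdfs test code; the class and method names are invented for illustration:

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

// Sketch of a minimal positioned-read stress test in the spirit of HDFS-8790:
// many threads issue random positioned reads over a shared "file" and verify
// the bytes they get back, which would catch the seek+read+seek-back
// interference Steve describes if the reads shared mutable cursor state.
class PreadStress {
  // Returns the number of failed verifications across all threads.
  static int run(byte[] file, int threads, int readSize, int readsPerThread)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    ConcurrentLinkedQueue<Long> failures = new ConcurrentLinkedQueue<>();
    for (int t = 0; t < threads; t++) {
      pool.execute(() -> {
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        for (int i = 0; i < readsPerThread; i++) {
          long off = rnd.nextLong(file.length - readSize + 1);
          // "positioned read": copy readSize bytes at off, no shared cursor.
          byte[] buf = new byte[readSize];
          System.arraycopy(file, (int) off, buf, 0, readSize);
          for (int j = 0; j < readSize; j++) {
            if (buf[j] != file[(int) (off + j)]) { failures.add(off); break; }
          }
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.MINUTES);
    return failures.size();
  }
}
```

A real harness would replace the `System.arraycopy` with the filesystem's pread call and take the file path, thread count, read size, and offset bounds from the command line, as the description proposes.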
[jira] [Updated] (HDFS-8797) WebHdfsFileSystem creates too many connections for pread
[ https://issues.apache.org/jira/browse/HDFS-8797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8797: -- Hadoop Flags: Reviewed +1 the new patch looks good. WebHdfsFileSystem creates too many connections for pread Key: HDFS-8797 URL: https://issues.apache.org/jira/browse/HDFS-8797 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-8797.000.patch, HDFS-8797.001.patch, HDFS-8797.002.patch, HDFS-8797.003.patch While running a test we found that WebHdfsFileSystem can create several thousand connections when doing a positioned read of a 200MB file. For each connection the client will connect to the DataNode again and the DataNode will create a new DFSClient instance to handle the read request. This also leads to several thousand {{getBlockLocations}} calls to the NameNode. The cause of the issue is that in {{FSInputStream#read(long, byte[], int, int)}}, each time the input stream reads some data, it seeks back to the old position and resets its state to SEEK. Thus the next read will regenerate the connection.
{code}
public int read(long position, byte[] buffer, int offset, int length) throws IOException {
  synchronized (this) {
    long oldPos = getPos();
    int nread = -1;
    try {
      seek(position);
      nread = read(buffer, offset, length);
    } finally {
      seek(oldPos);
    }
    return nread;
  }
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
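The connection churn described above can be modeled without HDFS at all. The toy stream below (purely illustrative; none of these names are the real WebHDFS classes) shows why the default seek+read+seek-back pread opens a new "connection" per call, while a positioned read that leaves the cursor alone does not:

```java
// Toy model of the HDFS-8797 problem (illustrative only, not the real
// WebHDFS classes): the default positioned read seeks, reads, then seeks
// back, so a stateful stream re-opens its connection on every pread.
class ToyRangeStream {
  private final byte[] data;
  private long pos = 0;
  int connections = 0;        // counts (re)opened "connections"
  private boolean open = false;

  ToyRangeStream(byte[] data) { this.data = data; }

  void seek(long newPos) {
    if (newPos != pos) { open = false; }  // any seek invalidates the connection
    pos = newPos;
  }

  long getPos() { return pos; }

  // Sequential read: reconnects if the connection was invalidated by a seek.
  int read(byte[] buf, int off, int len) {
    if (!open) { open = true; connections++; }
    int n = Math.min(len, data.length - (int) pos);
    if (n <= 0) return -1;
    System.arraycopy(data, (int) pos, buf, off, n);
    pos += n;
    return n;
  }

  // Default-style pread: seek + read + seek back => one new connection per call.
  int preadDefault(long position, byte[] buf, int off, int len) {
    long oldPos = getPos();
    try { seek(position); return read(buf, off, len); }
    finally { seek(oldPos); }
  }

  // Fixed-style pread: read directly at the position, leaving pos/state alone,
  // in the spirit of overriding read in ByteRangeInputStream.
  int preadDirect(long position, byte[] buf, int off, int len) {
    int n = Math.min(len, data.length - (int) position);
    if (n <= 0) return -1;
    System.arraycopy(data, (int) position, buf, off, n);
    return n;  // no seek, no connection churn
  }
}
```

With the default-style pread, N positioned reads cost N reconnects; the direct variant costs none, which is the behavior the patch aims for.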
[jira] [Updated] (HDFS-8797) WebHdfsFileSystem creates too many connections for pread
[ https://issues.apache.org/jira/browse/HDFS-8797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-8797: Attachment: HDFS-8797.003.patch Yeah, it will also be good to keep read/readFully consistent. Updated the patch to override {{read}} in {{ByteRangeInputStream}}. WebHdfsFileSystem creates too many connections for pread Key: HDFS-8797 URL: https://issues.apache.org/jira/browse/HDFS-8797 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-8797.000.patch, HDFS-8797.001.patch, HDFS-8797.002.patch, HDFS-8797.003.patch While running a test we found that WebHdfsFileSystem can create several thousand connections when doing a positioned read of a 200MB file. For each connection the client will connect to the DataNode again and the DataNode will create a new DFSClient instance to handle the read request. This also leads to several thousand {{getBlockLocations}} calls to the NameNode. The cause of the issue is that in {{FSInputStream#read(long, byte[], int, int)}}, each time the input stream reads some data, it seeks back to the old position and resets its state to SEEK. Thus the next read will regenerate the connection.
{code}
public int read(long position, byte[] buffer, int offset, int length) throws IOException {
  synchronized (this) {
    long oldPos = getPos();
    int nread = -1;
    try {
      seek(position);
      nread = read(buffer, offset, length);
    } finally {
      seek(oldPos);
    }
    return nread;
  }
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8753) Ozone: Unify StorageContainerConfiguration with ozone-default.xml ozone-site.xml
[ https://issues.apache.org/jira/browse/HDFS-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637230#comment-14637230 ] Anu Engineer commented on HDFS-8753: +1, changes look good to me. I will commit shortly. Ozone: Unify StorageContainerConfiguration with ozone-default.xml ozone-site.xml --- Key: HDFS-8753 URL: https://issues.apache.org/jira/browse/HDFS-8753 Project: Hadoop HDFS Issue Type: Sub-task Reporter: kanaka kumar avvaru Assignee: kanaka kumar avvaru Attachments: HDFS-8753-HDFS-7240.00.patch This JIRA proposes adding ozone-default.xml to main resources and ozone-site.xml to test resources, with the default known parameters as of now. Also, we need to unify {{StorageContainerConfiguration}} to initialize the conf with both files, as at present there are two classes with this name:
{code}
hadoop-hdfs-project\hadoop-hdfs\src\main\java\org\apache\hadoop\ozone\StorageContainerConfiguration.java loads only ozone-site.xml
hadoop-hdfs-project\hadoop-hdfs\src\main\java\org\apache\hadoop\storagecontainer\StorageContainerConfiguration.java loads only storage-container-site.xml
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8802) dfs.checksum.type is not described in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-8802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637283#comment-14637283 ] Hadoop QA commented on HDFS-8802: -
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 14m 46s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 37s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 22s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | native | 3m 1s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 160m 20s | Tests failed in hadoop-hdfs. |
| | | 197m 42s | |
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot |
| | hadoop.hdfs.TestDistributedFileSystem |
| | hadoop.hdfs.TestAppendSnapshotTruncate |
| | hadoop.tools.TestHdfsConfigFields |
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12746541/HDFS-8802.patch |
| Optional Tests | javadoc javac unit |
| git revision | trunk / 4025326 |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11785/artifact/patchprocess/whitespace.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11785/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11785/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11785/console |
This message was automatically generated. dfs.checksum.type is not described in hdfs-default.xml -- Key: HDFS-8802 URL: https://issues.apache.org/jira/browse/HDFS-8802 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.7.1 Reporter: Tsuyoshi Ozawa Assignee: Gururaj Shetty Attachments: HDFS-8802.patch It's a good time to check the other configurations in hdfs-default.xml here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8622) Implement GETCONTENTSUMMARY operation for WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637392#comment-14637392 ] Hadoop QA commented on HDFS-8622: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 18m 4s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 40s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 3m 0s | Site still builds. | | {color:green}+1{color} | checkstyle | 0m 37s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 29s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 30s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 2s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 160m 51s | Tests failed in hadoop-hdfs. 
| | | | 207m 46s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestDistributedFileSystem | | | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12746553/HDFS-8622-03.patch | | Optional Tests | javadoc javac unit findbugs checkstyle site | | git revision | trunk / 4025326 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11786/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11786/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11786/console | This message was automatically generated. Implement GETCONTENTSUMMARY operation for WebImageViewer Key: HDFS-8622 URL: https://issues.apache.org/jira/browse/HDFS-8622 Project: Hadoop HDFS Issue Type: New Feature Reporter: Jagadesh Kiran N Assignee: Jagadesh Kiran N Attachments: HDFS-8622-00.patch, HDFS-8622-01.patch, HDFS-8622-02.patch, HDFS-8622-03.patch it would be better for administrators if {code} GETCONTENTSUMMARY {code} are supported. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637207#comment-14637207 ] Haohui Mai commented on HDFS-6407: -- bq. I still use the plugin datatable... I made it quite explicit why using datatable is a bad idea. Please see https://issues.apache.org/jira/browse/HDFS-6407?focusedCommentId=14232267page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14232267 The key issue here is that the view does not map directly to the data. The only reasonable approach to me is to sort internally and to generate the data that powers the views, rather than working around it by parsing pretty-formatted data and sorting that. new namenode UI, lost ability to sort columns in datanode tab - Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Benoy Antony Priority: Minor Labels: BB2015-05-TBR Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.4.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8803) Move DfsClientConf to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8803: - Attachment: HDFS-8803.002.patch Move DfsClientConf to hdfs-client - Key: HDFS-8803 URL: https://issues.apache.org/jira/browse/HDFS-8803 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8803.000.patch, HDFS-8803.001.patch, HDFS-8803.002.patch This jira tracks the effort of moving the {{DfsClientConf}} class into the hdfs-client module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks
[ https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637279#comment-14637279 ] Hadoop QA commented on HDFS-8344: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 26m 19s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 9m 48s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 12m 26s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 25s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 48s | The applied patch generated 4 new checkstyle issues (total was 432, now 434). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 55s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 42s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 57s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 1s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 89m 18s | Tests failed in hadoop-hdfs. 
| | | | 148m 44s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestDistributedFileSystem | | Timed out tests | org.apache.hadoop.hdfs.TestDFSStorageStateRecovery | | | org.apache.hadoop.hdfs.TestPread | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12746550/HDFS-8344.08.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 4025326 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11787/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11787/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11787/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11787/console | This message was automatically generated. NameNode doesn't recover lease for files with missing blocks Key: HDFS-8344 URL: https://issues.apache.org/jira/browse/HDFS-8344 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Ravi Prakash Assignee: Ravi Prakash Fix For: 2.8.0 Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch, HDFS-8344.06.patch, HDFS-8344.07.patch, HDFS-8344.08.patch I found another\(?) instance in which the lease is not recovered. This is reproducible easily on a pseudo-distributed single node cluster # Before you start it helps if you set. This is not necessary, but simply reduces how long you have to wait {code} public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000; public static final long LEASE_HARDLIMIT_PERIOD = 2 * LEASE_SOFTLIMIT_PERIOD; {code} # Client starts to write a file. 
(could be less than 1 block, but it has hflushed, so some of the data has landed on the datanodes) (I'm copying the client code I am using. I generate a jar and run it using $ hadoop jar TestHadoop.jar) # Client crashes. (I simulate this by kill -9 the $(hadoop jar TestHadoop.jar) process after it has printed "Wrote to the bufferedWriter") # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was only 1) I believe the lease should be recovered and the block should be marked missing. However this is not happening. The lease is never recovered. The effect of this bug for us was that nodes could not be decommissioned cleanly. Although we knew that the client had crashed, the Namenode never released the leases (even after restarting the Namenode) (even months afterwards). There are actually several other cases too where we don't consider what happens if ALL the datanodes die while the file is being written, but I am going to punt on that for another time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8791) block ID-based DN storage layout can be very slow for datanode on ext4
[ https://issues.apache.org/jira/browse/HDFS-8791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637440#comment-14637440 ] Colin Patrick McCabe commented on HDFS-8791: bq. Not a linux filesystem expert, but here's where I think the confusion is... Thanks for the explanation. It is great that you used blktrace as well... very good information. bq. I'm a little confused about iterating to find the meta file. Don't we already keep track of the genstamp we discovered during startup? If so, it seems like a simple stat is sufficient. That's a fair point. There are a lot of cases where we don't scan the directory because we have cached the genstamp value. This corresponds to calls to {{FsDatasetUtil#getMetaFile}}. However, there are a few other cases like {{DataNode#transferReplicaForPipelineRecovery}} and {{VolumeScanner#scanBlock}} which do end up calling {{FsDatasetUtil#findMetaFile}}. If we moved to really big directories, we might need to somehow avoid all of those cases. bq. I haven't tried xfs, but that would also be a REALLY heavy hammer in our case I think most people would consider a layout version upgrade a heavier hammer than using XFS... but maybe I'm wrong :) I would actually really like to know if this problem affects XFS too, or if it manages the cache in a different way. I guess that information might be tough to get, since you'd have to reformat everything. If you want to experiment with changing the HDFS sharding, you should be able to just change {{DatanodeUtil#idToBlockDir}}. I am curious how well a simple 1-level sharding scheme would work on ext4. Of course, you'd also have to come up with an upgrade process to the new layout version... 
block ID-based DN storage layout can be very slow for datanode on ext4 -- Key: HDFS-8791 URL: https://issues.apache.org/jira/browse/HDFS-8791 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Reporter: Nathan Roberts Priority: Critical We are seeing cases where the new directory layout causes the datanode to basically make the disks seek for 10s of minutes. This can be when the datanode is running du, and it can also be when it is performing a checkDirs(). Both of these operations currently scan all directories in the block pool and that's very expensive in the new layout. The new layout creates 256 subdirs, each with 256 subdirs. Essentially 64K leaf directories where block files are placed. So, what we have on disk is: - 256 inodes for the first level directories - 256 directory blocks for the first level directories - 256*256 inodes for the second level directories - 256*256 directory blocks for the second level directories - Then the inodes and blocks to store the HDFS blocks themselves. The main problem is the 256*256 directory blocks. inodes and dentries will be cached by linux and one can configure how likely the system is to prune those entries (vfs_cache_pressure). However, ext4 relies on the buffer cache to cache the directory blocks and I'm not aware of any way to tell linux to favor buffer cache pages (even if it did I'm not sure I would want it to in general). Also, ext4 tries hard to spread directories evenly across the entire volume; this basically means the 64K directory blocks are probably randomly spread across the entire disk. A du-type scan will look at directories one at a time, so the ioscheduler can't optimize the corresponding seeks, meaning the seeks will be random and far. In a system I was using to diagnose this, I had 60K blocks. A DU when things are hot is less than 1 second. When things are cold, about 20 minutes. How do things get cold? - A large set of tasks run on the node. 
This pushes almost all of the buffer cache out, causing the next DU to hit this situation. We are seeing cases where a large job can cause a seek storm across the entire cluster. Why didn't the previous layout see this? - It might have but it wasn't nearly as pronounced. The previous layout would be a few hundred directory blocks. Even when completely cold, these would only take a few hundred seeks, which would mean single-digit seconds. - With only a few hundred directories, the odds of the directory blocks getting modified is quite high; this keeps those blocks hot and much less likely to be evicted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
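The two-level sharding discussed above can be sketched as follows. This is a hedged, simplified stand-in, not a verbatim copy of {{DatanodeUtil#idToBlockDir}}: the exact bit positions and the "subdir" prefix are assumptions for illustration, but it shows how two bytes of the block ID pick the 256 x 256 = 64K leaf directories. Colin's suggested 1-level experiment would amount to dropping the second level.

```java
import java.io.File;

public class BlockDirSketch {
    // Two bytes of the block ID select the two directory levels,
    // giving 256 first-level dirs x 256 second-level dirs = 64K leaves.
    // (Bit layout and prefix are illustrative assumptions.)
    static File idToBlockDir(File root, long blockId) {
        int d1 = (int) ((blockId >> 16) & 0xFF);  // first-level dir, 0..255
        int d2 = (int) ((blockId >> 8) & 0xFF);   // second-level dir, 0..255
        return new File(root, "subdir" + d1 + File.separator + "subdir" + d2);
    }

    public static void main(String[] args) {
        long blockId = (5L << 16) | (7L << 8) | 3L;  // shards to dirs 5 and 7
        System.out.println(idToBlockDir(new File("/data/current"), blockId));
    }
}
```

A 1-level scheme would compute only d1, trading many small directory blocks (the cache-pressure problem described above) for fewer, larger directories.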
[jira] [Updated] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated HDFS-6407: --- Attachment: HDFS-6407.5.patch new namenode UI, lost ability to sort columns in datanode tab - Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Benoy Antony Priority: Minor Labels: BB2015-05-TBR Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637499#comment-14637499 ] Chang Li commented on HDFS-6407: bq. parsing pretty-formatted data and sorting them I do not parse the formatted data and then sort it. Currently the dust template pulls the data and formats it, but I disable the formatting in my patch:
{code}
-<td>{capacity|fmt_bytes}</td>
-<td>{used|fmt_bytes}</td>
-<td>{nonDfsUsedSpace|fmt_bytes}</td>
-<td>{remaining|fmt_bytes}</td>
+<td>{capacity}</td>
+<td>{used}</td>
+<td>{nonDfsUsedSpace}</td>
+<td>{remaining}</td>
 <td>{numBlocks}</td>
-<td>{blockPoolUsed|fmt_bytes} ({blockPoolUsedPercent|fmt_percentage})</td>
+<td>{blockPoolUsed} ({blockPoolUsedPercent|fmt_percentage})</td>
{code}
Thus the data pulled by dust is stored in its raw format, as in JMX, in the HTML table, so dataTable sorts the same data as in JMX. When dataTable displays the data, I use dataTable's render function, which takes the input data from the table, then parses and displays the formatted file-size value without changing the raw data stored in the HTML table.
{code}
+render: function (data, type, full, meta) {
+  var colIndex = meta.col;
+  var v;
+  if (colIndex == 8) {
+    var comp = data;
+    var res = comp.split(" ");
+    var percent = res[1];
+    v = res[0];
+  } else {
+    v = data;
+  }
+
+  if (type == 'display') {
+    //var v = data;
+    var UNITS = ['B', 'KB', 'MB', 'GB', 'TB', 'PB', 'ZB'];
+    var prev = 0, i = 0;
+    while (Math.floor(v) > 0 && i < UNITS.length) {
+      prev = v;
+      v /= 1024;
+      i += 1;
+    }
+
+    if (i > 0 && i < UNITS.length) {
+      v = prev;
+      i -= 1;
+    }
+    var size = Math.round(v * 100) / 100 + ' ' + UNITS[i];
+    if (colIndex == 8) {
+      return size + ' ' + percent;
+    } else {
+      return size;
+    }
+  }
+  return v;
+}
+}]
+});
{code}
I see you are worried about the error-prone parsing issue, but I don't see it existing in my patch. Also, losing the ability to sort in 2.7, together with losing the legacy UI, is preventing my company from using 2.7. 
I think we should treat this issue as critical and not postpone it anymore. new namenode UI, lost ability to sort columns in datanode tab - Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Benoy Antony Priority: Minor Labels: BB2015-05-TBR Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637504#comment-14637504 ] Chang Li commented on HDFS-6407: [~wheat9] new namenode UI, lost ability to sort columns in datanode tab - Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Benoy Antony Priority: Minor Labels: BB2015-05-TBR Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated HDFS-6407: --- Priority: Critical (was: Minor) new namenode UI, lost ability to sort columns in datanode tab - Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Benoy Antony Priority: Critical Labels: BB2015-05-TBR Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8806) Inconsistent metrics: number of missing blocks with replication factor 1 not properly cleared
Zhe Zhang created HDFS-8806: --- Summary: Inconsistent metrics: number of missing blocks with replication factor 1 not properly cleared Key: HDFS-8806 URL: https://issues.apache.org/jira/browse/HDFS-8806 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.1 Reporter: Zhe Zhang Assignee: Zhe Zhang HDFS-7165 introduced a new metric for _number of missing blocks with replication factor 1_. It is maintained as {{UnderReplicatedBlocks#corruptReplOneBlocks}}. However, that variable is not reset when other {{UnderReplicatedBlocks}} are cleared. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8805) Archival Storage: it is unable to get StoragePolicy of a directory
[ https://issues.apache.org/jira/browse/HDFS-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637225#comment-14637225 ] Jing Zhao commented on HDFS-8805: - Thanks for reporting the issue, Hui! The reason for the issue is that currently we only expose the storage policy information to the super user ({{FSDirStatAndListing#getFileInfo}}). I think it's ok to remove this restriction. Archival Storage: it is unable to get StoragePolicy of a directory -- Key: HDFS-8805 URL: https://issues.apache.org/jira/browse/HDFS-8805 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover, namenode Reporter: Hui Zheng Assignee: Brahma Reddy Battula Fix For: 2.6.0 The result of the getStoragePolicy command is always 'unspecified' even when we have set a StoragePolicy on a directory. But the real placement of blocks is correct. The result of fsck is not correct either. {code} $ hdfs storagepolicies -setStoragePolicy -path /tmp/cold -policy COLD Set storage policy COLD on /tmp/cold $ hdfs storagepolicies -getStoragePolicy -path /tmp/cold The storage policy of /tmp/cold is unspecified $ hdfs fsck -storagepolicies /tmp/cold Blocks NOT satisfying the specified storage policy: Storage Policy Specified Storage Policy # of blocks % of blocks ARCHIVE:4(COLD) HOT 5 55.5556% ARCHIVE:3(COLD) HOT 4 44.% {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8753) Ozone: Unify StorageContainerConfiguration with ozone-default.xml ozone-site.xml
[ https://issues.apache.org/jira/browse/HDFS-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDFS-8753: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: HDFS-7240 Status: Resolved (was: Patch Available) Ozone: Unify StorageContainerConfiguration with ozone-default.xml ozone-site.xml --- Key: HDFS-8753 URL: https://issues.apache.org/jira/browse/HDFS-8753 Project: Hadoop HDFS Issue Type: Sub-task Reporter: kanaka kumar avvaru Assignee: kanaka kumar avvaru Fix For: HDFS-7240 Attachments: HDFS-8753-HDFS-7240.00.patch This JIRA proposes adding ozone-default.xml to main resources ozone-site.xml to test resources with default known parameters as of now. Also, need to unify {{StorageContainerConfiguration}} to initialize conf with both the files as at present there are two classes with this name. {code} hadoop-hdfs-project\hadoop-hdfs\src\main\java\org\apache\hadoop\ozone\StorageContainerConfiguration.java loads only ozone-site.xml hadoop-hdfs-project\hadoop-hdfs\src\main\java\org\apache\hadoop\storagecontainer\StorageContainerConfiguration.java loads only storage-container-site.xml {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8735) Inotify : All events classes should implement toString() API.
[ https://issues.apache.org/jira/browse/HDFS-8735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637322#comment-14637322 ] Colin Patrick McCabe commented on HDFS-8735: Good idea... I think the subclasses should have a \@Public annotation. I think all the functions are stable except for toString. Inotify : All events classes should implement toString() API. - Key: HDFS-8735 URL: https://issues.apache.org/jira/browse/HDFS-8735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.7.0 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Attachments: HDFS-8735.01.patch, HDFS-8735.patch The Event classes are used by clients, so it's good to implement the toString() API. {code} for (Event event : events) { System.out.println(event.toString()); } {code} Currently this gives output like this {code} org.apache.hadoop.hdfs.inotify.Event$CreateEvent@6916d97d {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
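A minimal illustration of the requested change; the class shape and fields below are hypothetical stand-ins, not the actual inotify Event subclasses, which have different fields and builders:

```java
public class EventToStringSketch {
    // Hypothetical stand-in for an inotify event subclass.
    static class CreateEvent {
        private final String path;
        private final long ctime;

        CreateEvent(String path, long ctime) {
            this.path = path;
            this.ctime = ctime;
        }

        // Overriding toString() replaces the default ClassName@hashcode
        // output with something clients can actually log and read.
        @Override
        public String toString() {
            return "CreateEvent [path=" + path + ", ctime=" + ctime + "]";
        }
    }

    public static void main(String[] args) {
        // prints: CreateEvent [path=/tmp/f1, ctime=42]
        System.out.println(new CreateEvent("/tmp/f1", 42L));
    }
}
```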
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637494#comment-14637494 ] Jing Zhao commented on HDFS-7858: - Thanks for updating the patch, [~asuresh]! The current approach looks good to me. Some quick comments about the patch: # In {{RequestHedgingInvocationHandler#invoke}}, instead of polling the tasks every 10ms, can we use {{CompletionService}} here? # For {{RequestHedgingProxyProvider#performFailover}}, if the original successfulProxy is not null, we can exclude it for the next retry. Improve HA Namenode Failover detection on the client Key: HDFS-7858 URL: https://issues.apache.org/jira/browse/HDFS-7858 Project: Hadoop HDFS Issue Type: Improvement Reporter: Arun Suresh Assignee: Arun Suresh Labels: BB2015-05-TBR Attachments: HDFS-7858.1.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, HDFS-7858.3.patch, HDFS-7858.4.patch, HDFS-7858.5.patch, HDFS-7858.6.patch In an HA deployment, clients are configured with the hostnames of both the Active and Standby Namenodes. Clients will first try one of the NNs (non-deterministically) and if it's a standby NN, then it will respond to the client to retry the request on the other Namenode. If the client happens to talk to the Standby first, and the standby is undergoing some GC / is busy, then those clients might not get a response soon enough to try the other NN. Proposed approach to solve this: 1) Since Zookeeper is already used as the failover controller, the clients could talk to ZK and find out which is the active namenode before contacting it. 2) Long-lived DFSClients would have a ZK watch configured which fires when there is a failover so they do not have to query ZK every time to find out the active NN 3) Clients can also cache the last active NN in the user's home directory (~/.lastNN) so that short-lived clients can try that Namenode first before querying ZK -- This message was sent by Atlassian JIRA (v6.3.4#6332)
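Jing Zhao's first suggestion, replacing a polling loop with a {{CompletionService}}, can be sketched like this. The names and the Callable stand-ins for the NN proxies are illustrative assumptions, not the actual RequestHedgingInvocationHandler code; the point is that take() blocks until whichever call finishes first, so no 10ms polling interval is needed.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class HedgedInvokeSketch {
    // Submit the same request against every proxy; return the first
    // successful result, falling through failed completions (e.g. the
    // standby NN rejecting the call) to the next one.
    static <T> T invokeFirst(ExecutorService pool, List<Callable<T>> proxies)
            throws Exception {
        CompletionService<T> cs = new ExecutorCompletionService<>(pool);
        for (Callable<T> p : proxies) {
            cs.submit(p);
        }
        Exception last = null;
        for (int i = 0; i < proxies.size(); i++) {
            try {
                return cs.take().get();  // first completed result wins
            } catch (ExecutionException e) {
                last = e;                // that proxy failed; await the next
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        String winner = invokeFirst(pool, List.of(
            () -> { Thread.sleep(500); return "slow-standby"; },
            () -> "fast-active"));
        System.out.println(winner);  // the fast call wins
        pool.shutdownNow();
    }
}
```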
[jira] [Updated] (HDFS-8803) Move DfsClientConf to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8803: - Attachment: HDFS-8803.001.patch Move DfsClientConf to hdfs-client - Key: HDFS-8803 URL: https://issues.apache.org/jira/browse/HDFS-8803 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8803.000.patch, HDFS-8803.001.patch, HDFS-8803.001.patch This jira tracks the effort of moving the {{DfsClientConf}} class into the hdfs-client module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8803) Move DfsClientConf to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8803: - Attachment: HDFS-8803.001.patch Move DfsClientConf to hdfs-client - Key: HDFS-8803 URL: https://issues.apache.org/jira/browse/HDFS-8803 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8803.000.patch, HDFS-8803.001.patch, HDFS-8803.001.patch This jira tracks the effort of moving the {{DfsClientConf}} class into the hdfs-client module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6682) Add a metric to expose the timestamp of the oldest under-replicated block
[ https://issues.apache.org/jira/browse/HDFS-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-6682: Attachment: HDFS-6682.004.patch 004 patch * Rebased for the latest trunk. * Use {{LinkedHashMap}} to keep the insertion order. That way we can avoid calling {{Collections.min}} to get the smallest timestamp. Add a metric to expose the timestamp of the oldest under-replicated block - Key: HDFS-6682 URL: https://issues.apache.org/jira/browse/HDFS-6682 Project: Hadoop HDFS Issue Type: Improvement Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: BB2015-05-TBR Attachments: HDFS-6682.002.patch, HDFS-6682.003.patch, HDFS-6682.004.patch, HDFS-6682.patch In the following case, the data in the HDFS is lost and a client needs to put the same file again. # A Client puts a file to HDFS # A DataNode crashes before replicating a block of the file to other DataNodes I propose a metric to expose the timestamp of the oldest under-replicated/corrupt block. That way client can know what file to retain for the re-try. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
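The {{LinkedHashMap}} change in the 004 patch relies on insertion order: the first entry in iteration order is always the oldest, so no {{Collections.min}} scan is needed. A minimal illustration follows; the key/value types here are hypothetical, not the patch's actual fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OldestTimestampSketch {
    public static void main(String[] args) {
        // Hypothetical blockId -> time the block became under-replicated.
        // LinkedHashMap iterates in insertion order, so the first entry
        // is always the oldest surviving one.
        Map<Long, Long> timestamps = new LinkedHashMap<>();
        timestamps.put(101L, 1000L);  // inserted first => oldest
        timestamps.put(102L, 2000L);
        timestamps.put(103L, 3000L);
        timestamps.remove(102L);      // removals keep the remaining order

        // O(1) lookup of the oldest timestamp instead of Collections.min
        long oldest = timestamps.values().iterator().next();
        System.out.println(oldest);   // prints 1000
    }
}
```

A block that is fixed and later becomes under-replicated again is re-inserted at the tail with its new timestamp, which is exactly the ordering the metric wants.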
[jira] [Updated] (HDFS-6682) Add a metric to expose the timestamp of the oldest under-replicated block
[ https://issues.apache.org/jira/browse/HDFS-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-6682: Attachment: HDFS-6682.005.patch 005 patch * Updated the document. Add a metric to expose the timestamp of the oldest under-replicated block - Key: HDFS-6682 URL: https://issues.apache.org/jira/browse/HDFS-6682 Project: Hadoop HDFS Issue Type: Improvement Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: BB2015-05-TBR Attachments: HDFS-6682.002.patch, HDFS-6682.003.patch, HDFS-6682.004.patch, HDFS-6682.005.patch, HDFS-6682.patch In the following case, data in HDFS is lost and a client needs to put the same file again. # A client puts a file to HDFS # A DataNode crashes before replicating a block of the file to other DataNodes I propose a metric to expose the timestamp of the oldest under-replicated/corrupt block. That way the client can know which file to retain for the retry. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-3767) Finer grained locking in DN
[ https://issues.apache.org/jira/browse/HDFS-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yong Zhang reassigned HDFS-3767: Assignee: Yong Zhang Finer grained locking in DN --- Key: HDFS-3767 URL: https://issues.apache.org/jira/browse/HDFS-3767 Project: Hadoop HDFS Issue Type: Improvement Components: performance Affects Versions: 3.0.0 Reporter: Todd Lipcon Assignee: Yong Zhang In testing a high-write-throughput workload, I see the DN maintain good performance most of the time, except that occasionally one thread will block for a few seconds in {{finalizeReplica}}. It does so holding the FSDatasetImpl lock, which causes all other writer threads to block behind it. HDFS-1148 (making it a rw lock) would help here, but a bigger help would be to do finer-grained locking (e.g. per-block or per-subdir). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8795) Improve InvalidateBlocks#node2blocks
[ https://issues.apache.org/jira/browse/HDFS-8795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8795: - Resolution: Fixed Hadoop Flags: Reviewed Target Version/s: 2.8.0 (was: 2.7.1) Status: Resolved (was: Patch Available) Thanks [~vinayrpet] for the review. The test failure is not related; committed to trunk and branch-2. Improve InvalidateBlocks#node2blocks Key: HDFS-8795 URL: https://issues.apache.org/jira/browse/HDFS-8795 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8795.001.patch, HDFS-8795.002.patch Currently we use {{TreeMap}} for {{node2blocks}}, but actually there are only two places that may need sorting: {{dump}} and {{getDatanodes}}. However, {{getDatanodes}} is called by {{computeInvalidateWork}}, and we do a shuffle there, so the sort is unnecessary. For {{dump}}, we certainly need to make a few modifications. So we can use {{HashMap}}. From a memory and performance point of view, {{HashMap}} is better than {{TreeMap}}; a similar optimization is HDFS-7433. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8795) Improve InvalidateBlocks#node2blocks
[ https://issues.apache.org/jira/browse/HDFS-8795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8795: - Fix Version/s: 2.8.0 Improve InvalidateBlocks#node2blocks Key: HDFS-8795 URL: https://issues.apache.org/jira/browse/HDFS-8795 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HDFS-8795.001.patch, HDFS-8795.002.patch Currently we use {{TreeMap}} for {{node2blocks}}, but actually there are only two places that may need sorting: {{dump}} and {{getDatanodes}}. However, {{getDatanodes}} is called by {{computeInvalidateWork}}, and we do a shuffle there, so the sort is unnecessary. For {{dump}}, we certainly need to make a few modifications. So we can use {{HashMap}}. From a memory and performance point of view, {{HashMap}} is better than {{TreeMap}}; a similar optimization is HDFS-7433. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8795) Improve InvalidateBlocks#node2blocks
[ https://issues.apache.org/jira/browse/HDFS-8795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636441#comment-14636441 ] Hudson commented on HDFS-8795: -- FAILURE: Integrated in Hadoop-trunk-Commit #8200 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8200/]) HDFS-8795. Improve InvalidateBlocks#node2blocks. (yliu) (yliu: rev 4025326288c0167ff300d4f7ecc96f84ed141912) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/InvalidateBlocks.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Improve InvalidateBlocks#node2blocks Key: HDFS-8795 URL: https://issues.apache.org/jira/browse/HDFS-8795 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HDFS-8795.001.patch, HDFS-8795.002.patch Currently we use {{TreeMap}} for {{node2blocks}}, but actually there are only two places that may need sorting: {{dump}} and {{getDatanodes}}. However, {{getDatanodes}} is called by {{computeInvalidateWork}}, and we do a shuffle there, so the sort is unnecessary. For {{dump}}, we certainly need to make a few modifications. So we can use {{HashMap}}. From a memory and performance point of view, {{HashMap}} is better than {{TreeMap}}; a similar optimization is HDFS-7433. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
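A rough sketch of the shape of this optimization (names below are illustrative, not the actual {{InvalidateBlocks}} code): keep the map unsorted and sort keys only in the rarely used dump path, since the hot {{getDatanodes}} path is shuffled by its caller anyway.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: HashMap for the hot paths, local sort in dump().
public class Node2BlocksSketch {
    private final Map<String, List<Long>> node2blocks = new HashMap<>();

    void add(String datanode, long blockId) {
        node2blocks.computeIfAbsent(datanode, k -> new ArrayList<>()).add(blockId);
    }

    // getDatanodes(): callers shuffle the result, so no sorted order needed here.
    List<String> getDatanodes() {
        return new ArrayList<>(node2blocks.keySet());
    }

    // dump(): the one place a deterministic order matters; sort a local copy.
    void dump(StringBuilder out) {
        List<String> nodes = new ArrayList<>(node2blocks.keySet());
        Collections.sort(nodes);
        for (String n : nodes) {
            out.append(n).append(": ").append(node2blocks.get(n)).append('\n');
        }
    }
}
```

The trade is O(log n) per mutation (TreeMap) for O(1) amortized (HashMap), paying a one-off O(n log n) sort only when dumping.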
[jira] [Updated] (HDFS-7541) Upgrade Domains in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-7541: -- Attachment: UpgradeDomains_Design_v3.pdf Thanks [~eddyxu]! These are very good points. Here is the updated design doc, which answers some of your questions in detail. Please find specific replies below. bq. How about call it Availability Domain Availability might be too general in this context. The service can become unavailable due to an unplanned event such as a TOR outage, or planned maintenance such as a software upgrade. Both can impact availability. If we define an Availability Domain such that the service can still function when all machines in that domain are unavailable, then the machines belonging to one rack could also be considered one availability domain. bq. Is this upgrade domain on each DN a soft state or a hard state? It is a hard state, just like the network location of the node. While admins will likely keep upgrade domains unchanged during common operations, the design allows admins to move machines around as long as the machines are decommissioned properly in the first place; thus, when machines rejoin under different upgrade domains, the proper replicas will be removed. The updated design doc provides more details on this. bq. What do you anticipate as a good strategy to choose upgrade domains UDs? The updated design doc has more on this. The number of upgrade domains affects data loss, replica recovery time and rolling upgrade parallelism. bq. Regarding the performance impact The # of racks is on the order of 100, the # of upgrade domains is in the ballpark of 40, and the # of addBlocks operations is around 1000 ops / sec at peak. bq. In design v2.pdf, would you mind rephrasing the process of the replica delete operation? The updated design adds more description. bq. The last one maybe not relevant: would this design work well with erasure coding (HDFS-7285)? 
Similar question was asked in HDFS-7613, how we can reuse different block placement policies. Like you said, we can address this issue separately. Upgrade Domains in HDFS --- Key: HDFS-7541 URL: https://issues.apache.org/jira/browse/HDFS-7541 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ming Ma Attachments: HDFS-7541-2.patch, HDFS-7541.patch, SupportforfastHDFSdatanoderollingupgrade.pdf, UpgradeDomains_Design_v3.pdf, UpgradeDomains_design_v2.pdf Current HDFS DN rolling upgrade step requires sequential DN restart to minimize the impact on data availability and read/write operations. The side effect is longer upgrade duration for large clusters. This might be acceptable for DN JVM quick restart to update hadoop code/configuration. However, for OS upgrade that requires machine reboot, the overall upgrade duration will be too long if we continue to do sequential DN rolling restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yong Zhang reassigned HDFS-1148: Assignee: Yong Zhang (was: Dave Thompson) Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, performance Reporter: Todd Lipcon Assignee: Yong Zhang Attachments: hdfs-1148-old.txt, hdfs-1148-trunk.txt, patch-HDFS-1148-rel0.20.2.txt In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement in both latency and ops/second. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
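The monitor-to-read/write-lock conversion described here follows a standard pattern; a minimal sketch (not the actual FSDataset code, names are invented) looks like this — reads proceed in parallel, only writes are exclusive:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative sketch of replacing a single synchronized monitor with a
// ReentrantReadWriteLock so concurrent readers no longer serialize.
public class DatasetLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private long blockCount;

    // read path: many threads may hold the read lock simultaneously
    public long getBlockCount() {
        lock.readLock().lock();
        try {
            return blockCount;
        } finally {
            lock.readLock().unlock();
        }
    }

    // write path: exclusive, blocks all readers and other writers
    public void addBlock() {
        lock.writeLock().lock();
        try {
            blockCount++;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

For a read-heavy random-read workload this removes reader-vs-reader contention, which is consistent with the ~25% improvement reported above.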
[jira] [Commented] (HDFS-8797) WebHdfsFileSystem creates too many connections for pread
[ https://issues.apache.org/jira/browse/HDFS-8797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636479#comment-14636479 ] Yi Liu commented on HDFS-8797: -- {quote} readFully calls read repeatedly, so that is the problem; read itself seems fine. Yeah, looks like the main issue is with readFully here. So currently I keep the original read unchanged. {quote} Sorry, I was dizzy here: {{readFully}} is the main issue, but would it be a bit more efficient if we used the same approach for a normal pread? # The new approach here is to open a separate new connection for the pread, then close it when finished. When the client does a stateful read again, the original connection is not affected. # {{seek}} + {{read}} + {{seek}}: this closes the original connection held by the stateful read and opens a new connection for the pread. But when the client does a stateful read again, the connection has to be closed and opened yet again. So #2 ({{seek}} + {{read}} + {{seek}}) requires one additional connection close/open for a normal pread? WebHdfsFileSystem creates too many connections for pread Key: HDFS-8797 URL: https://issues.apache.org/jira/browse/HDFS-8797 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-8797.000.patch, HDFS-8797.001.patch, HDFS-8797.002.patch While running a test we found that WebHdfsFileSystem can create several thousand connections when doing a position read of a 200MB file. For each connection the client will connect to the DataNode again and the DataNode will create a new DFSClient instance to handle the read request. This also leads to several thousand {{getBlockLocations}} calls to the NameNode. The cause of the issue is that in {{FSInputStream#read(long, byte[], int, int)}}, each time the input stream reads some data, it seeks back to the old position and resets its state to SEEK. Thus the next read will re-create the connection. 
{code}
public int read(long position, byte[] buffer, int offset, int length)
    throws IOException {
  synchronized (this) {
    long oldPos = getPos();
    int nread = -1;
    try {
      seek(position);
      nread = read(buffer, offset, length);
    } finally {
      seek(oldPos);
    }
    return nread;
  }
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
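Approach #1 above (a dedicated short-lived connection per pread) can be illustrated generically, outside WebHDFS, with a plain local-file sketch (class and method names are hypothetical): the positioned read uses its own handle, so the caller's stateful stream never seeks and, in the WebHDFS case, never drops its connection.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical illustration of approach #1: serve each positioned read
// from its own short-lived handle so the caller's stateful stream keeps
// its position (and its underlying connection) untouched.
public class PreadSketch {
    public static int pread(File f, long position, byte[] buf, int off, int len)
            throws IOException {
        // fresh handle per positioned read; closed automatically on exit
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            raf.seek(position);          // seek only the private handle
            return raf.read(buf, off, len);
        }
    }
}
```

Contrast with the seek/read/seek pattern in the quoted code, which disturbs the shared stream's position twice per pread and therefore forces the connection churn described above.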
[jira] [Commented] (HDFS-8753) Ozone: Unify StorageContainerConfiguration with ozone-default.xml ozone-site.xml
[ https://issues.apache.org/jira/browse/HDFS-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636355#comment-14636355 ] kanaka kumar avvaru commented on HDFS-8753: --- Thanks for firing a new build, [~anu]; the findbugs failure appears again. BTW, my previous comment was related to the broken {{hadoop-hdfs}} build during tests. As some of the {{.proto}} files moved to {{hadoop-hdfs-client}}, the hadoop-hdfs compilation may have failed because the client project is not built along with hadoop-hdfs. Anyway, it seems compilation is fine now. The test failures in the latest build look unrelated to the code changes in this patch. Ozone: Unify StorageContainerConfiguration with ozone-default.xml ozone-site.xml --- Key: HDFS-8753 URL: https://issues.apache.org/jira/browse/HDFS-8753 Project: Hadoop HDFS Issue Type: Sub-task Reporter: kanaka kumar avvaru Assignee: kanaka kumar avvaru Attachments: HDFS-8753-HDFS-7240.00.patch This JIRA proposes adding ozone-default.xml to the main resources and ozone-site.xml to the test resources, with the default known parameters as of now. Also, we need to unify {{StorageContainerConfiguration}} to initialize the conf with both files, as at present there are two classes with this name:
{code}
hadoop-hdfs-project\hadoop-hdfs\src\main\java\org\apache\hadoop\ozone\StorageContainerConfiguration.java loads only ozone-site.xml
hadoop-hdfs-project\hadoop-hdfs\src\main\java\org\apache\hadoop\storagecontainer\StorageContainerConfiguration.java loads only storage-container-site.xml
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8803) Move DfsClientConf to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8803: - Attachment: (was: HDFS-8803.001.patch) Move DfsClientConf to hdfs-client - Key: HDFS-8803 URL: https://issues.apache.org/jira/browse/HDFS-8803 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8803.000.patch, HDFS-8803.001.patch This jira tracks the effort of moving the {{DfsClientConf}} class into the hdfs-client module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3570) Balancer shouldn't rely on DFS Space Used % as that ignores non-DFS used space
[ https://issues.apache.org/jira/browse/HDFS-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-3570: Attachment: HDFS-3570.003.patch Rebased for the latest trunk. (without test) Balancer shouldn't rely on DFS Space Used % as that ignores non-DFS used space Key: HDFS-3570 URL: https://issues.apache.org/jira/browse/HDFS-3570 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Akira AJISAKA Priority: Minor Attachments: HDFS-3570.003.patch, HDFS-3570.2.patch, HDFS-3570.aash.1.patch Report from a user here: https://groups.google.com/a/cloudera.org/d/msg/cdh-user/pIhNyDVxdVY/b7ENZmEvBjIJ, post archived at http://pastebin.com/eVFkk0A0 This user had a specific DN with large non-DFS usage among its dfs.data.dirs, and very little DFS usage (which is computed against total possible capacity). The Balancer apparently only looks at that usage, and fails to consider that non-DFS usage may also be high on a DN/cluster. Hence, it thinks that if a DFS usage report from a DN is only 8%, the DN has a lot of free space to write more blocks, when that isn't true, as this user's case shows. It went on scheduling writes to the DN to balance it out, but the DN simply can't accept any more blocks as a result of its disks' state. I think it would be better if we _computed_ the actual utilization as {{(capacity - actual remaining space) / (capacity)}}, as opposed to the current {{(dfs used) / (capacity)}}. Thoughts? This isn't very critical, however, because it is very rare to see DN space being used for non-DN data, but it does expose a valid bug. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
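The mismatch Harsh describes can be made concrete with a tiny sketch (the numbers below are invented for illustration, not taken from the linked report):

```java
// Illustrative sketch: why "DFS used / capacity" understates how full a
// DataNode really is when non-DFS data occupies most of the disks.
public class UtilizationSketch {
    // formula the Balancer currently uses
    public static double currentUtilization(long capacity, long dfsUsed) {
        return 100.0 * dfsUsed / capacity;
    }

    // proposed formula: count everything that is not actually free
    public static double proposedUtilization(long capacity, long remaining) {
        return 100.0 * (capacity - remaining) / capacity;
    }

    public static void main(String[] args) {
        long capacity = 1000L;  // GB of raw disk
        long dfsUsed = 80L;     // GB of DFS blocks -> looks only "8% full"
        long remaining = 50L;   // GB truly free; non-DFS data ate the rest
        System.out.printf("current=%.1f%% proposed=%.1f%%%n",
            currentUtilization(capacity, dfsUsed),
            proposedUtilization(capacity, remaining)); // 8.0% vs 95.0%
    }
}
```

With these numbers the current formula reports 8% and would schedule more writes, while the proposed formula reports 95% and correctly treats the DN as nearly full.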
[jira] [Commented] (HDFS-8769) Erasure Coding: unit test for SequentialBlockGroupIdGenerator
[ https://issues.apache.org/jira/browse/HDFS-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636645#comment-14636645 ] Walter Su commented on HDFS-8769: - The contiguous version has a test, {{testTriggerBlockIdCollision()}}. The point of that test is to avoid a collision when the id is occupied by a legacy block (which has a randomly generated id); the randomly generated id could be negative. So, could you add a test case where the id is occupied by a contiguous file? You might have to hack {{BlockIdGenerator}} so that you can create a contiguous file with a negative ID. The 2 test cases in the 00 patch look good. Thanks, [~rakeshr]! Erasure Coding: unit test for SequentialBlockGroupIdGenerator - Key: HDFS-8769 URL: https://issues.apache.org/jira/browse/HDFS-8769 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Rakesh R Attachments: HDFS-8769-HDFS-7285-00.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8803) Move DfsClientConf to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636600#comment-14636600 ] Hadoop QA commented on HDFS-8803: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 47s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 40 new or modified test files. | | {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 36s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 23s | The applied patch generated 6 new checkstyle issues (total was 41, now 46). | | {color:red}-1{color} | whitespace | 0m 30s | The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 31s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 24s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 1s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 160m 48s | Tests failed in hadoop-hdfs. | | {color:green}+1{color} | hdfs tests | 0m 26s | Tests passed in hadoop-hdfs-client. 
| | | | 210m 2s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.ha.TestDNFencing | | | hadoop.hdfs.TestDistributedFileSystem | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12746475/HDFS-8803.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 94c6a4a | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11781/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11781/artifact/patchprocess/whitespace.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11781/artifact/patchprocess/testrun_hadoop-hdfs.txt | | hadoop-hdfs-client test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11781/artifact/patchprocess/testrun_hadoop-hdfs-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11781/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11781/console | This message was automatically generated. Move DfsClientConf to hdfs-client - Key: HDFS-8803 URL: https://issues.apache.org/jira/browse/HDFS-8803 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8803.000.patch, HDFS-8803.001.patch This jira tracks the effort of moving the {{DfsClientConf}} class into the hdfs-client module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8773) Few FSNamesystem metrics are not documented in the Metrics page
[ https://issues.apache.org/jira/browse/HDFS-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636670#comment-14636670 ] Hudson commented on HDFS-8773: -- FAILURE: Integrated in Hadoop-Yarn-trunk #994 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/994/]) HDFS-8773. Few FSNamesystem metrics are not documented in the Metrics page. Contributed by Rakesh R. (cnauroth: rev a26cc66f38daec2342215a66b599bf59cee1112c) * hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Few FSNamesystem metrics are not documented in the Metrics page --- Key: HDFS-8773 URL: https://issues.apache.org/jira/browse/HDFS-8773 Project: Hadoop HDFS Issue Type: Bug Components: documentation Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.8.0 Attachments: HDFS-8773-00.patch This jira is to document missing metrics in the [Metrics page|https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Metrics.html#FSNamesystem]. Following are not documented: {code} MissingReplOneBlocks NumFilesUnderConstruction NumActiveClients HAState FSState {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8795) Improve InvalidateBlocks#node2blocks
[ https://issues.apache.org/jira/browse/HDFS-8795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636676#comment-14636676 ] Hudson commented on HDFS-8795: -- FAILURE: Integrated in Hadoop-Yarn-trunk #994 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/994/]) HDFS-8795. Improve InvalidateBlocks#node2blocks. (yliu) (yliu: rev 4025326288c0167ff300d4f7ecc96f84ed141912) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/InvalidateBlocks.java Improve InvalidateBlocks#node2blocks Key: HDFS-8795 URL: https://issues.apache.org/jira/browse/HDFS-8795 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HDFS-8795.001.patch, HDFS-8795.002.patch Currently we use {{TreeMap}} for {{node2blocks}}, but actually there are only two places that may need sorting: {{dump}} and {{getDatanodes}}. However, {{getDatanodes}} is called by {{computeInvalidateWork}}, and we do a shuffle there, so the sort is unnecessary. For {{dump}}, we certainly need to make a few modifications. So we can use {{HashMap}}. From a memory and performance point of view, {{HashMap}} is better than {{TreeMap}}; a similar optimization is HDFS-7433. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks
[ https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636680#comment-14636680 ] Hudson commented on HDFS-8344: -- FAILURE: Integrated in Hadoop-Yarn-trunk #994 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/994/]) Revert HDFS-8344. NameNode doesn't recover lease for files with missing blocks (raviprak) (raviprak: rev 5137b388fc9d4d716f780daf6d04292feeb9df96) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestLeaseRecovery.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt NameNode doesn't recover lease for files with missing blocks Key: HDFS-8344 URL: https://issues.apache.org/jira/browse/HDFS-8344 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Ravi Prakash Assignee: Ravi Prakash Fix For: 2.8.0 Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch, HDFS-8344.06.patch, HDFS-8344.07.patch I found another(?) instance in which the lease is not recovered. This is easily reproducible on a pseudo-distributed single-node cluster. # Before you start, it helps if you set the following. This is not necessary, but simply reduces how long you have to wait
{code}
public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
public static final long LEASE_HARDLIMIT_PERIOD = 2 * LEASE_SOFTLIMIT_PERIOD;
{code}
# Client starts to write a file. 
(could be less than 1 block, but it was hflushed, so some of the data has landed on the datanodes) (I'm copying the client code I am using. I generate a jar and run it using $ hadoop jar TestHadoop.jar) # Client crashes. (I simulate this by kill -9 of the $(hadoop jar TestHadoop.jar) process after it has printed "Wrote to the bufferedWriter") # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was only 1.) I believe the lease should be recovered and the block should be marked missing. However this is not happening; the lease is never recovered. The effect of this bug for us was that nodes could not be decommissioned cleanly. Although we knew that the client had crashed, the NameNode never released the leases (even after restarting the NameNode) (even months afterwards). There are actually several other cases too where we don't consider what happens if ALL the datanodes die while the file is being written, but I am going to punt on that for another time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8495) Consolidate append() related implementation into a single class
[ https://issues.apache.org/jira/browse/HDFS-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636678#comment-14636678 ] Hudson commented on HDFS-8495: -- FAILURE: Integrated in Hadoop-Yarn-trunk #994 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/994/]) HDFS-8495. Consolidate append() related implementation into a single class. Contributed by Rakesh R. (wheat9: rev 31f117138a00794de4951ee8433e304d72b04094) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirAppendOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirWriteFileOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirTruncateOp.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirStatAndListingOp.java Consolidate append() related implementation into a single class --- Key: HDFS-8495 URL: https://issues.apache.org/jira/browse/HDFS-8495 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.8.0 Attachments: HDFS-8495-000.patch, HDFS-8495-001.patch, HDFS-8495-002.patch, HDFS-8495-003.patch, HDFS-8495-003.patch, HDFS-8495-004.patch, HDFS-8495-005.patch, HDFS-8495-006.patch This jira proposes to consolidate {{FSNamesystem#append()}} related methods into a single class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8800) hdfs --daemon stop namenode corrupts logs
[ https://issues.apache.org/jira/browse/HDFS-8800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636681#comment-14636681 ] Hudson commented on HDFS-8800: -- FAILURE: Integrated in Hadoop-Yarn-trunk #994 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/994/]) HDFS-8800. hdfs --daemon stop namenode corrupts logs (John Smith via aw) (aw: rev c9507fe6c12491f3aef5cd4142b4d466bd6b71c3) * hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt hdfs --daemon stop namenode corrupts logs - Key: HDFS-8800 URL: https://issues.apache.org/jira/browse/HDFS-8800 Project: Hadoop HDFS Issue Type: Bug Components: scripts Affects Versions: 3.0.0 Reporter: John Smith Assignee: John Smith Fix For: 3.0.0 Attachments: HDFS-8800 namenode stop creates stack traces and extra gc logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8799) Erasure Coding: add tests for namenode processing corrupt striped blocks
[ https://issues.apache.org/jira/browse/HDFS-8799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8799: Attachment: HDFS-8799-HDFS-7285.01.patch If you restart a DN, you can't corrupt blocks any more: the indices of the datanode list in memory are mismatched with the indices of the data dirs on disk. It's a defect of MiniDFSCluster. That's why the patch includes many private util functions. Uploaded the 01 patch. Please let me know if we need more cases. Erasure Coding: add tests for namenode processing corrupt striped blocks Key: HDFS-8799 URL: https://issues.apache.org/jira/browse/HDFS-8799 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Walter Su Assignee: Walter Su Priority: Minor Attachments: HDFS-8799-HDFS-7285.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8799) Erasure Coding: add tests for namenode processing corrupt striped blocks
[ https://issues.apache.org/jira/browse/HDFS-8799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8799: Status: Patch Available (was: Open) Erasure Coding: add tests for namenode processing corrupt striped blocks Key: HDFS-8799 URL: https://issues.apache.org/jira/browse/HDFS-8799 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Walter Su Assignee: Walter Su Priority: Minor Attachments: HDFS-8799-HDFS-7285.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8805) Archival Storage: it is unable to get StoragePolicy of a directory
Hui Zheng created HDFS-8805: --- Summary: Archival Storage: it is unable to get StoragePolicy of a directory Key: HDFS-8805 URL: https://issues.apache.org/jira/browse/HDFS-8805 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Hui Zheng The result of the getStoragePolicy command is always 'unspecified' even though we have set a StoragePolicy on a directory. But the real placement of blocks is correct. The result of fsck is not correct either.
{code}
$ hdfs storagepolicies -setStoragePolicy -path /tmp/cold -policy COLD
Set storage policy COLD on /tmp/cold
$ hdfs storagepolicies -getStoragePolicy -path /tmp/cold
The storage policy of /tmp/cold is unspecified
$ hdfs fsck -storagepolicies /tmp/cold
Blocks NOT satisfying the specified storage policy:
Storage Policy        Specified Storage Policy    # of blocks    % of blocks
ARCHIVE:4(COLD)       HOT                         5              55.5556%
ARCHIVE:3(COLD)       HOT                         4              44.%
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8795) Improve InvalidateBlocks#node2blocks
[ https://issues.apache.org/jira/browse/HDFS-8795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636703#comment-14636703 ] Hudson commented on HDFS-8795: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #264 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/264/]) HDFS-8795. Improve InvalidateBlocks#node2blocks. (yliu) (yliu: rev 4025326288c0167ff300d4f7ecc96f84ed141912) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/InvalidateBlocks.java Improve InvalidateBlocks#node2blocks Key: HDFS-8795 URL: https://issues.apache.org/jira/browse/HDFS-8795 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HDFS-8795.001.patch, HDFS-8795.002.patch Currently we use {{TreeMap}} for {{node2blocks}}, but actually there are only two places that may need sorting: {{dump}} and {{getDatanodes}}. However, {{getDatanodes}} is called by {{computeInvalidateWork}}, and we do a shuffle there, so the sort is unnecessary. For {{dump}}, we certainly need to make a few modifications. So we can use {{HashMap}}. From a memory and performance point of view, {{HashMap}} is better than {{TreeMap}}; a similar optimization is HDFS-7433. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks
[ https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636707#comment-14636707 ] Hudson commented on HDFS-8344: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #264 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/264/]) Revert HDFS-8344. NameNode doesn't recover lease for files with missing blocks (raviprak) (raviprak: rev 5137b388fc9d4d716f780daf6d04292feeb9df96) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestLeaseRecovery.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java NameNode doesn't recover lease for files with missing blocks Key: HDFS-8344 URL: https://issues.apache.org/jira/browse/HDFS-8344 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Ravi Prakash Assignee: Ravi Prakash Fix For: 2.8.0 Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch, HDFS-8344.06.patch, HDFS-8344.07.patch I found another\(?) instance in which the lease is not recovered. This is easily reproducible on a pseudo-distributed single-node cluster. # Before you start, it helps if you set the following. This is not necessary, but simply reduces how long you have to wait: {code} public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000; public static final long LEASE_HARDLIMIT_PERIOD = 2 * LEASE_SOFTLIMIT_PERIOD; {code} # Client starts to write a file.
(could be less than 1 block, but it hflushed so some of the data has landed on the datanodes) (I'm copying the client code I am using. I generate a jar and run it using $ hadoop jar TestHadoop.jar) # Client crashes. (I simulate this by kill -9'ing the $(hadoop jar TestHadoop.jar) process after it has printed "Wrote to the bufferedWriter") # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was only 1.) I believe the lease should be recovered and the block should be marked missing. However, this is not happening. The lease is never recovered. The effect of this bug for us was that nodes could not be decommissioned cleanly. Although we knew that the client had crashed, the Namenode never released the leases (even after restarting the Namenode, even months afterwards). There are actually several other cases too where we don't consider what happens if ALL the datanodes die while the file is being written, but I am going to punt on that for another time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
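The soft/hard lease-limit interplay behind this repro can be sketched in plain Java. The constants mirror the ones quoted in the steps above; the class and method names are illustrative only, since the real logic lives in the NameNode's LeaseManager:

```java
// Sketch of lease-expiry bookkeeping: a lease not renewed within the soft
// limit may be recovered on behalf of another client; once past the hard
// limit, the NameNode itself is expected to trigger recovery. The bug report
// above describes a case where neither happens after the writer crashes.
class LeaseCheck {
  static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
  static final long LEASE_HARDLIMIT_PERIOD = 2 * LEASE_SOFTLIMIT_PERIOD;

  // True when another client may force recovery of this lease.
  static boolean expiredSoftLimit(long lastRenewalMs, long nowMs) {
    return nowMs - lastRenewalMs > LEASE_SOFTLIMIT_PERIOD;
  }

  // True when the NameNode itself should recover the lease.
  static boolean expiredHardLimit(long lastRenewalMs, long nowMs) {
    return nowMs - lastRenewalMs > LEASE_HARDLIMIT_PERIOD;
  }
}
```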
[jira] [Commented] (HDFS-8495) Consolidate append() related implementation into a single class
[ https://issues.apache.org/jira/browse/HDFS-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636705#comment-14636705 ] Hudson commented on HDFS-8495: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #264 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/264/]) HDFS-8495. Consolidate append() related implementation into a single class. Contributed by Rakesh R. (wheat9: rev 31f117138a00794de4951ee8433e304d72b04094) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirWriteFileOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirStatAndListingOp.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirTruncateOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirAppendOp.java Consolidate append() related implementation into a single class --- Key: HDFS-8495 URL: https://issues.apache.org/jira/browse/HDFS-8495 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.8.0 Attachments: HDFS-8495-000.patch, HDFS-8495-001.patch, HDFS-8495-002.patch, HDFS-8495-003.patch, HDFS-8495-003.patch, HDFS-8495-004.patch, HDFS-8495-005.patch, HDFS-8495-006.patch This jira proposes to consolidate {{FSNamesystem#append()}} related methods into a single class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8773) Few FSNamesystem metrics are not documented in the Metrics page
[ https://issues.apache.org/jira/browse/HDFS-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636697#comment-14636697 ] Hudson commented on HDFS-8773: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #264 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/264/]) HDFS-8773. Few FSNamesystem metrics are not documented in the Metrics page. Contributed by Rakesh R. (cnauroth: rev a26cc66f38daec2342215a66b599bf59cee1112c) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md Few FSNamesystem metrics are not documented in the Metrics page --- Key: HDFS-8773 URL: https://issues.apache.org/jira/browse/HDFS-8773 Project: Hadoop HDFS Issue Type: Bug Components: documentation Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.8.0 Attachments: HDFS-8773-00.patch This jira is to document missing metrics in the [Metrics page|https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Metrics.html#FSNamesystem]. Following are not documented: {code} MissingReplOneBlocks NumFilesUnderConstruction NumActiveClients HAState FSState {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8721) Add a metric for number of encryption zones
[ https://issues.apache.org/jira/browse/HDFS-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636701#comment-14636701 ] Hudson commented on HDFS-8721: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #264 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/264/]) HDFS-8721. Add a metric for number of encryption zones. Contributed by Rakesh R. (cnauroth: rev cb03768b1b2250b9b5a7944cf6ef918e8a974e20) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSNamesystemMBean.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EncryptionZoneManager.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/metrics/FSNamesystemMBean.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZones.java Add a metric for number of encryption zones --- Key: HDFS-8721 URL: https://issues.apache.org/jira/browse/HDFS-8721 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.8.0 Attachments: HDFS-8721-00.patch, HDFS-8721-01.patch Would be good to expose the number of encryption zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8800) hdfs --daemon stop namenode corrupts logs
[ https://issues.apache.org/jira/browse/HDFS-8800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636708#comment-14636708 ] Hudson commented on HDFS-8800: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #264 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/264/]) HDFS-8800. hdfs --daemon stop namenode corrupts logs (John Smith via aw) (aw: rev c9507fe6c12491f3aef5cd4142b4d466bd6b71c3) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs hdfs --daemon stop namenode corrupts logs - Key: HDFS-8800 URL: https://issues.apache.org/jira/browse/HDFS-8800 Project: Hadoop HDFS Issue Type: Bug Components: scripts Affects Versions: 3.0.0 Reporter: John Smith Assignee: John Smith Fix For: 3.0.0 Attachments: HDFS-8800 namenode stop creates stack traces and extra gc logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-8805) Archival Storage: it is unable to get StoragePolicy of a directory
[ https://issues.apache.org/jira/browse/HDFS-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula reassigned HDFS-8805: -- Assignee: Brahma Reddy Battula Archival Storage: it is unable to get StoragePolicy of a directory -- Key: HDFS-8805 URL: https://issues.apache.org/jira/browse/HDFS-8805 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover, namenode Reporter: Hui Zheng Assignee: Brahma Reddy Battula Fix For: 2.6.0 The result of the getStoragePolicy command is always 'unspecified' even when we have set a StoragePolicy on a directory. But the real placement of blocks is correct. The result of fsck is not correct either. {code} $ hdfs storagepolicies -setStoragePolicy -path /tmp/cold -policy COLD Set storage policy COLD on /tmp/cold $ hdfs storagepolicies -getStoragePolicy -path /tmp/cold The storage policy of /tmp/cold is unspecified $ hdfs fsck -storagepolicies /tmp/cold Blocks NOT satisfying the specified storage policy: Storage Policy Specified Storage Policy # of blocks % of blocks ARCHIVE:4(COLD) HOT 5 55.5556% ARCHIVE:3(COLD) HOT 4 44.% {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8802) dfs.checksum.type is not described in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-8802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636685#comment-14636685 ] Gururaj Shetty commented on HDFS-8802: -- After verifying, I found that one more configuration, 'dfs.namenode.decommission.nodes.per.interval', is also missing in hdfs-default.xml. dfs.checksum.type is not described in hdfs-default.xml -- Key: HDFS-8802 URL: https://issues.apache.org/jira/browse/HDFS-8802 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.7.1 Reporter: Tsuyoshi Ozawa Assignee: Gururaj Shetty It's a good time to check the other configurations in hdfs-default.xml here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8202) Improve end to end striping file test to add erasure recovering test
[ https://issues.apache.org/jira/browse/HDFS-8202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinwei Qin updated HDFS-8202: -- Attachment: HDFS-8202-HDFS-7285.004.patch Thanks [~zhz] for the comments. Updated the 004 patch to address these comments. bq. Can we write a loop instead of manually adding all possibilities for testReadWithDNFailure and testReadCorruptedData? {{testReadCorruptedData}} could be modified to a loop as long as the corrupted files are different, but we cannot do this for {{testReadWithDNFailure}}, since each of its test cases needs a new cluster with no dead Datanode. Improve end to end striping file test to add erasure recovering test - Key: HDFS-8202 URL: https://issues.apache.org/jira/browse/HDFS-8202 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-8202-HDFS-7285.003.patch, HDFS-8202-HDFS-7285.004.patch, HDFS-8202.001.patch, HDFS-8202.002.patch This is to follow on HDFS-8201 to add an erasure recovery test in the end-to-end striping file test: * After writing certain blocks to the test file, delete some block file; * Read the file content and compare; see if there is any recovery issue, i.e. verify whether the erasure recovery works or not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
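The constraint discussed above, that each DN-failure case needs a fresh cluster with no pre-existing dead Datanode, can still be expressed as a loop if the cluster is recreated per iteration. A generic sketch; the {{Cluster}} interface and method names here are illustrative stand-ins, not the MiniDFSCluster API:

```java
import java.util.function.Supplier;

// Sketch: corrupted-data cases can share one loop body, and DN-failure cases
// can too, provided the loop creates a brand-new cluster for each failed-DN
// index so that no earlier case leaves a dead DN behind.
class DnFailureLoop {
  interface Cluster {
    void killDataNode(int dnIndex);
    void shutdown();
  }

  static int runAllFailureCases(int numDataNodes, Supplier<Cluster> newCluster) {
    int casesRun = 0;
    for (int dn = 0; dn < numDataNodes; dn++) {
      Cluster cluster = newCluster.get();   // fresh cluster: no pre-existing dead DN
      try {
        cluster.killDataNode(dn);           // the single failure this case tests
        casesRun++;
      } finally {
        cluster.shutdown();                 // tear down before the next case
      }
    }
    return casesRun;
  }
}
```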
[jira] [Commented] (HDFS-8704) Erasure Coding: client fails to write large file when one datanode fails
[ https://issues.apache.org/jira/browse/HDFS-8704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636721#comment-14636721 ] Xinwei Qin commented on HDFS-8704: --- Hi, [~libo-intel], even if the file is smaller than a block group (i.e. {{filelength = blocksize * dataBlocks - 123}}), when the index of the failed DN is 0, {{testDatanodeFailure0}} also fails. Error logs: {code} java.lang.AssertionError: org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): commitBlock length is less than the stored one 524165 vs. 1045504 at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.commitBlock(BlockManager.java:635) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.commitOrCompleteLastBlock(BlockManager.java:665) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitOrCompleteLastBlock(FSNamesystem.java:3672) at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.completeFileInternal(FSDirWriteFileOp.java:773) at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.completeFile(FSDirWriteFileOp.java:720) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:3084) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:771) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:541) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636) {code} Erasure Coding: client fails to write large file when one datanode fails Key: HDFS-8704 URL: https://issues.apache.org/jira/browse/HDFS-8704 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo Assignee: Li Bo Attachments: HDFS-8704-000.patch I tested the current code on a 5-node cluster using RS(3,2).
When a datanode is corrupt, the client succeeds in writing a file smaller than a block group but fails to write a large one. {{TestDFSStripeOutputStreamWithFailure}} only tests files smaller than a block group; this jira will add more test situations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8735) Inotify : All events classes should implement toString() API.
[ https://issues.apache.org/jira/browse/HDFS-8735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636571#comment-14636571 ] Akira AJISAKA commented on HDFS-8735: - Thanks [~surendrasingh] for updating the patch. Four comments: 1. I still see some unnecessary output in the test (modified to log the events). {{mtime=0, atime=0, replication=0}} is confusing to me. {noformat} 2015-07-22 17:49:45,769 [Thread-0] INFO hdfs.TestDFSInotifyEventInputStream (TestDFSInotifyEventInputStream.java:testBasic(289)) - MetadataUpdateEvent [path=/dir, metadataType=PERMS, mtime=0, atime=0, replication=0, perms=rw-rw-rw-, xAttrsRemoved=false] 2015-07-22 17:49:45,770 [Thread-0] INFO hdfs.TestDFSInotifyEventInputStream (TestDFSInotifyEventInputStream.java:testBasic(302)) - MetadataUpdateEvent [path=/dir, metadataType=OWNER, mtime=0, atime=0, replication=0, ownerName=username, groupName=groupname, xAttrsRemoved=false] {noformat} Can we switch by {{metadataType}} and output only the necessary information, as follows? {code} @Override public String toString() { StringBuilder content = new StringBuilder(); content.append("MetadataUpdateEvent [path=" + path + ", metadataType=" + metadataType); switch (metadataType) { case TIMES: content.append(", mtime=" + mtime + ", atime=" + atime); break; case REPLICATION: content.append(", replication=" + replication); break; ... } content.append(']'); return content.toString(); } {code} 2. I think logging the event in the test is good for debugging. {code} Assert.assertTrue(re.toString().contains("RenameEvent [srcPath=")); {code} 3. (minor) {{re.toString().startsWith}} is more strict. It should be used instead of {{String.contains}}. {code} Event.CloseEvent ce3 = (Event.CloseEvent) batch.getEvents()[2]; Assert.assertTrue(ce3.getPath().equals("/file2")); Assert.assertTrue(ce3.getTimestamp() > 0); + Assert.assertTrue(ue2.toString().contains("UnlinkEvent [path=")); {code} 4.
(minor) Would you move the assertion before defining {{ce3}}? --- I found some assertions are wrong in the test. Here is an example. {code} Assert.assertTrue(batch.getEvents()[0].getEventType() == Event.EventType.RENAME); Event.RenameEvent re3 = (Event.RenameEvent) batch.getEvents()[0]; Assert.assertTrue(re3.getDstPath().equals("/dir/file5")); Assert.assertTrue(re3.getSrcPath().equals("/file5")); Assert.assertTrue(re.getTimestamp() > 0); {code} We should fix them in a separate jira. Inotify : All events classes should implement toString() API. - Key: HDFS-8735 URL: https://issues.apache.org/jira/browse/HDFS-8735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.7.0 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Attachments: HDFS-8735.01.patch, HDFS-8735.patch Event classes are used by clients; it's good to implement the toString() API. {code} for(Event event : events){ System.out.println(event.toString()); } {code} This will give output like this: {code} org.apache.hadoop.hdfs.inotify.Event$CreateEvent@6916d97d {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
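The toString() pattern proposed in the review above can be made self-contained; the enum and fields here are simplified stand-ins for the real MetadataUpdateEvent, kept only to show the switch-by-type idea:

```java
// Simplified stand-in for MetadataUpdateEvent: print only the fields that the
// metadataType actually populates, instead of always dumping mtime/atime/replication.
class MetadataUpdateEventSketch {
  enum MetadataType { TIMES, REPLICATION, PERMS }

  final String path;
  final MetadataType metadataType;
  final long mtime, atime;
  final int replication;

  MetadataUpdateEventSketch(String path, MetadataType type,
                            long mtime, long atime, int replication) {
    this.path = path; this.metadataType = type;
    this.mtime = mtime; this.atime = atime; this.replication = replication;
  }

  @Override
  public String toString() {
    StringBuilder content = new StringBuilder();
    content.append("MetadataUpdateEvent [path=").append(path)
           .append(", metadataType=").append(metadataType);
    switch (metadataType) {
      case TIMES:
        content.append(", mtime=").append(mtime).append(", atime=").append(atime);
        break;
      case REPLICATION:
        content.append(", replication=").append(replication);
        break;
      default:
        break;
    }
    content.append(']');
    return content.toString();
  }
}
```

With this shape, a PERMS event no longer prints the confusing {{mtime=0, atime=0, replication=0}} noise from the log excerpt above.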
[jira] [Commented] (HDFS-8735) Inotify : All events classes should implement toString() API.
[ https://issues.apache.org/jira/browse/HDFS-8735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636576#comment-14636576 ] Akira AJISAKA commented on HDFS-8735: - Thanks [~cmccabe] for the comment. bq. I would caution people not to start parsing the toString representation since it might change over time (if we add new optional fields to the Events). But it is nice for debugging and logging. Is adding {{@Public}} and {{@Unstable}} to the subclasses good for you? The {{Event}} class is annotated, but the subclasses are not, and that's why they don't exist in the API doc. Inotify : All events classes should implement toString() API. - Key: HDFS-8735 URL: https://issues.apache.org/jira/browse/HDFS-8735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.7.0 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Attachments: HDFS-8735.01.patch, HDFS-8735.patch Event classes are used by clients; it's good to implement the toString() API. {code} for(Event event : events){ System.out.println(event.toString()); } {code} This will give output like this: {code} org.apache.hadoop.hdfs.inotify.Event$CreateEvent@6916d97d {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3570) Balancer shouldn't rely on DFS Space Used % as that ignores non-DFS used space
[ https://issues.apache.org/jira/browse/HDFS-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636597#comment-14636597 ] Hadoop QA commented on HDFS-3570: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 18m 25s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 39s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 40s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 3m 1s | Site still builds. | | {color:green}+1{color} | checkstyle | 0m 51s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 23s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 32s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 2s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 161m 27s | Tests failed in hadoop-hdfs. 
| | | | 208m 59s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestDistributedFileSystem | | | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12746476/HDFS-3570.003.patch | | Optional Tests | javadoc javac unit findbugs checkstyle site | | git revision | trunk / 94c6a4a | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11782/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11782/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11782/console | This message was automatically generated. Balancer shouldn't rely on DFS Space Used % as that ignores non-DFS used space Key: HDFS-3570 URL: https://issues.apache.org/jira/browse/HDFS-3570 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Akira AJISAKA Priority: Minor Attachments: HDFS-3570.003.patch, HDFS-3570.2.patch, HDFS-3570.aash.1.patch Report from a user here: https://groups.google.com/a/cloudera.org/d/msg/cdh-user/pIhNyDVxdVY/b7ENZmEvBjIJ, post archived at http://pastebin.com/eVFkk0A0 This user had a specific DN that had a large non-DFS usage among dfs.data.dirs, and very little DFS usage (which is computed against total possible capacity). The Balancer apparently only looks at the usage, and fails to consider that non-DFS usage may also be high on a DN/cluster. Hence, it thinks that if the DFS Usage reported by a DN is only 8%, it's got a lot of free space to write more blocks, when that isn't true, as shown by the case of this user.
It went on scheduling writes to the DN to balance it out, but the DN simply can't accept any more blocks as a result of its disks' state. I think it would be better if we _computed_ the actual utilization based on {{(100-(actual remaining space))/(capacity)}}, as opposed to the current {{(dfs used)/(capacity)}}. Thoughts? This isn't very critical, however, because it is very rare to see DN space being used for non-DN data, but it does expose a valid bug. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
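The two utilization views under discussion can be compared side by side in a small sketch (the numbers and class name are illustrative, not taken from the Balancer code):

```java
// Sketch of the two DN-utilization formulas from the report above: with heavy
// non-DFS usage on the same disks, dfsUsed/capacity looks nearly empty even
// though very little space actually remains for new blocks.
class DnUtilization {
  // Current view: only DFS-used bytes count against capacity.
  static double dfsUsedPct(long dfsUsed, long capacity) {
    return 100.0 * dfsUsed / capacity;
  }

  // Proposed view: derive utilization from actual remaining space, which
  // implicitly accounts for non-DFS files eating the same disks.
  static double actualUsedPct(long remaining, long capacity) {
    return 100.0 * (capacity - remaining) / capacity;
  }
}
```

For a DN with capacity 100, only 8 DFS-used but just 10 remaining (82 non-DFS), the first formula reports 8% while the second reports 90%, which is the discrepancy that misled the Balancer.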
[jira] [Commented] (HDFS-8721) Add a metric for number of encryption zones
[ https://issues.apache.org/jira/browse/HDFS-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636674#comment-14636674 ] Hudson commented on HDFS-8721: -- FAILURE: Integrated in Hadoop-Yarn-trunk #994 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/994/]) HDFS-8721. Add a metric for number of encryption zones. Contributed by Rakesh R. (cnauroth: rev cb03768b1b2250b9b5a7944cf6ef918e8a974e20) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSNamesystemMBean.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZones.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/metrics/FSNamesystemMBean.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EncryptionZoneManager.java Add a metric for number of encryption zones --- Key: HDFS-8721 URL: https://issues.apache.org/jira/browse/HDFS-8721 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.8.0 Attachments: HDFS-8721-00.patch, HDFS-8721-01.patch Would be good to expose the number of encryption zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
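The metric added by HDFS-8721 amounts to reporting the size of the zone map the manager already keeps. A minimal sketch, assuming a map keyed by inode id; the class and field names are illustrative, not the actual EncryptionZoneManager internals:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: the number-of-encryption-zones metric is derived from the zone map
// the manager already maintains; the getter (exposed through the MBean in the
// actual patch) just reports its size.
class EncryptionZoneManagerSketch {
  private final Map<Long, String> encryptionZones = new ConcurrentHashMap<>();

  void addZone(long inodeId, String keyName) {
    encryptionZones.put(inodeId, keyName);
  }

  void removeZone(long inodeId) {
    encryptionZones.remove(inodeId);
  }

  // The metric: current count of encryption zones.
  int getNumEncryptionZones() {
    return encryptionZones.size();
  }
}
```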
[jira] [Commented] (HDFS-6682) Add a metric to expose the timestamp of the oldest under-replicated block
[ https://issues.apache.org/jira/browse/HDFS-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636750#comment-14636750 ] Hadoop QA commented on HDFS-6682: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 22m 59s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 57s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 50s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 3m 9s | Site still builds. | | {color:red}-1{color} | checkstyle | 2m 41s | The applied patch generated 4 new checkstyle issues (total was 457, now 458). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 24s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 41s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | common tests | 22m 8s | Tests passed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 160m 3s | Tests failed in hadoop-hdfs. 
| | | | 235m 52s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.ha.TestBootstrapStandby | | | hadoop.hdfs.TestDistributedFileSystem | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12746488/HDFS-6682.005.patch | | Optional Tests | site javadoc javac unit findbugs checkstyle | | git revision | trunk / 4025326 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11783/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11783/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11783/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11783/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11783/console | This message was automatically generated. Add a metric to expose the timestamp of the oldest under-replicated block - Key: HDFS-6682 URL: https://issues.apache.org/jira/browse/HDFS-6682 Project: Hadoop HDFS Issue Type: Improvement Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: BB2015-05-TBR Attachments: HDFS-6682.002.patch, HDFS-6682.003.patch, HDFS-6682.004.patch, HDFS-6682.005.patch, HDFS-6682.patch In the following case, the data in the HDFS is lost and a client needs to put the same file again. # A Client puts a file to HDFS # A DataNode crashes before replicating a block of the file to other DataNodes I propose a metric to expose the timestamp of the oldest under-replicated/corrupt block. That way the client can know which file to retain for the retry. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
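The proposed metric can be sketched with an insertion-ordered map (illustrative only; the actual patch would hook into the NameNode's under-replicated block bookkeeping rather than a standalone class):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: record when each block first became under-replicated. LinkedHashMap
// iterates in insertion order, so the first entry is always the oldest and
// the metric read is O(1).
class UnderReplicatedTimestamps {
  private final Map<Long, Long> blockToTimestamp = new LinkedHashMap<>();

  void markUnderReplicated(long blockId, long nowMs) {
    blockToTimestamp.putIfAbsent(blockId, nowMs);  // keep the earliest sighting
  }

  void markRecovered(long blockId) {
    blockToTimestamp.remove(blockId);
  }

  // The metric: 0 when nothing is under-replicated, else the oldest timestamp.
  long getOldestTimestamp() {
    for (Map.Entry<Long, Long> e : blockToTimestamp.entrySet()) {
      return e.getValue();
    }
    return 0L;
  }
}
```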
[jira] [Commented] (HDFS-8721) Add a metric for number of encryption zones
[ https://issues.apache.org/jira/browse/HDFS-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636799#comment-14636799 ] Rakesh R commented on HDFS-8721: Thanks [~wheat9] for the reviews. Thanks [~cnauroth] for reviewing and committing the changes! Add a metric for number of encryption zones --- Key: HDFS-8721 URL: https://issues.apache.org/jira/browse/HDFS-8721 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.8.0 Attachments: HDFS-8721-00.patch, HDFS-8721-01.patch Would be good to expose the number of encryption zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8802) dfs.checksum.type is not described in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-8802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636914#comment-14636914 ] Tsuyoshi Ozawa commented on HDFS-8802: -- [~gururaj] Great. Do you mind creating a patch? dfs.checksum.type is not described in hdfs-default.xml -- Key: HDFS-8802 URL: https://issues.apache.org/jira/browse/HDFS-8802 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.7.1 Reporter: Tsuyoshi Ozawa Assignee: Gururaj Shetty It's a good time to check the other configurations in hdfs-default.xml here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8802) dfs.checksum.type is not described in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-8802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gururaj Shetty updated HDFS-8802: - Status: Patch Available (was: Open) dfs.checksum.type is not described in hdfs-default.xml -- Key: HDFS-8802 URL: https://issues.apache.org/jira/browse/HDFS-8802 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.7.1 Reporter: Tsuyoshi Ozawa Assignee: Gururaj Shetty Attachments: HDFS-8802.patch It's a good time to check the other configurations in hdfs-default.xml here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8802) dfs.checksum.type is not described in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-8802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gururaj Shetty updated HDFS-8802: - Attachment: HDFS-8802.patch dfs.checksum.type is not described in hdfs-default.xml -- Key: HDFS-8802 URL: https://issues.apache.org/jira/browse/HDFS-8802 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.7.1 Reporter: Tsuyoshi Ozawa Assignee: Gururaj Shetty Attachments: HDFS-8802.patch It's a good time to check the other configurations in hdfs-default.xml here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8802) dfs.checksum.type is not described in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-8802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636932#comment-14636932 ] Gururaj Shetty commented on HDFS-8802: -- Hi [~ozawa], attached the patch; kindly review and let me know of any changes. dfs.checksum.type is not described in hdfs-default.xml -- Key: HDFS-8802 URL: https://issues.apache.org/jira/browse/HDFS-8802 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.7.1 Reporter: Tsuyoshi Ozawa Assignee: Gururaj Shetty Attachments: HDFS-8802.patch It's a good time to check the other configurations in hdfs-default.xml here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8795) Improve InvalidateBlocks#node2blocks
[ https://issues.apache.org/jira/browse/HDFS-8795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636380#comment-14636380 ] Vinayakumar B commented on HDFS-8795: - +1 Improve InvalidateBlocks#node2blocks Key: HDFS-8795 URL: https://issues.apache.org/jira/browse/HDFS-8795 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8795.001.patch, HDFS-8795.002.patch Currently we use {{TreeMap}} for {{node2blocks}}, but there are only two places that may need sorted order: {{dump}} and {{getDatanodes}}. {{getDatanodes}} is called by {{computeInvalidateWork}}, and we do a shuffle there, so the sort is unnecessary. For {{dump}} we certainly need to make a few modifications. So we can use {{HashMap}}. From a memory and performance view, {{HashMap}} is better than {{TreeMap}}; a similar optimization was done in HDFS-7433. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
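The shuffle argument above can be sketched in plain Java. The mock below is illustrative, not the actual {{InvalidateBlocks}} code: when the consumer shuffles the key set anyway, the sorted iteration order a {{TreeMap}} maintains is wasted work, so a {{HashMap}} serves equally well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class Node2BlocksSketch {
    // Mimics how computeInvalidateWork consumes node2blocks: the key set
    // is shuffled before use, so a HashMap is as good as a sorted TreeMap.
    public static List<String> pickNodes(Map<String, List<Long>> node2blocks) {
        List<String> nodes = new ArrayList<>(node2blocks.keySet());
        Collections.shuffle(nodes, new Random()); // order is randomized anyway
        return nodes;
    }

    // Tiny driver: two datanodes with pending block IDs to invalidate.
    public static int demoSize() {
        Map<String, List<Long>> node2blocks = new HashMap<>();
        node2blocks.put("dn1", Arrays.asList(1L, 2L));
        node2blocks.put("dn2", Arrays.asList(3L));
        return pickNodes(node2blocks).size();
    }

    public static void main(String[] args) {
        System.out.println(demoSize()); // 2
    }
}
```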
[jira] [Commented] (HDFS-8799) Erasure Coding: add tests for namenode processing corrupt striped blocks
[ https://issues.apache.org/jira/browse/HDFS-8799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636868#comment-14636868 ] Hadoop QA commented on HDFS-8799: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 5m 34s | Findbugs (version ) appears to be broken on HDFS-7285. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 25s | There were no new javac warning messages. | | {color:red}-1{color} | release audit | 0m 12s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 39s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 35s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 31s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 24s | The patch appears to introduce 5 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 1m 20s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 176m 19s | Tests failed in hadoop-hdfs. 
| | | | 197m 4s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12746517/HDFS-8799-HDFS-7285.01.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | HDFS-7285 / 5956d23 | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/11784/artifact/patchprocess/patchReleaseAuditProblems.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/11784/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11784/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11784/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11784/console | This message was automatically generated. Erasure Coding: add tests for namenode processing corrupt striped blocks Key: HDFS-8799 URL: https://issues.apache.org/jira/browse/HDFS-8799 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Walter Su Assignee: Walter Su Priority: Minor Attachments: HDFS-8799-HDFS-7285.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8769) Erasure Coding: unit test for SequentialBlockGroupIdGenerator
[ https://issues.apache.org/jira/browse/HDFS-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636888#comment-14636888 ] Rakesh R commented on HDFS-8769: Thanks a lot [~walter.k.su] for the reviews. bq. You might have to hack BlockIdGenerator so that you could create a contiguous file with negative ID. I can see that {{SequentialBlockIdGenerator}} legacy block IDs grow toward positive integers and an exception is thrown on a negative integer. Likewise, {{SequentialBlockGroupIdGenerator}} block group IDs grow toward negative integers and an exception is thrown on a positive integer. Since we have a clear separation between legacy blocks and EC block groups, do we really need to hijack the generator and create a contiguous file with a negative ID? Instead, I feel we could add tests verifying that the generator validates the range and throws {{IllegalStateException}} properly in both cases. Does this sound good to you? Erasure Coding: unit test for SequentialBlockGroupIdGenerator - Key: HDFS-8769 URL: https://issues.apache.org/jira/browse/HDFS-8769 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Rakesh R Attachments: HDFS-8769-HDFS-7285-00.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8797) WebHdfsFileSystem creates too many connections for pread
[ https://issues.apache.org/jira/browse/HDFS-8797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637650#comment-14637650 ] Hadoop QA commented on HDFS-8797: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 37s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 36s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 36s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 33s | The applied patch generated 5 new checkstyle issues (total was 25, now 30). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 23s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 22s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 0s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 160m 24s | Tests failed in hadoop-hdfs. | | {color:green}+1{color} | hdfs tests | 0m 26s | Tests passed in hadoop-hdfs-client. 
| | | | 208m 57s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot | | | hadoop.hdfs.TestDistributedFileSystem | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12746589/HDFS-8797.003.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 4025326 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11790/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11790/artifact/patchprocess/testrun_hadoop-hdfs.txt | | hadoop-hdfs-client test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11790/artifact/patchprocess/testrun_hadoop-hdfs-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11790/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11790/console | This message was automatically generated. WebHdfsFileSystem creates too many connections for pread Key: HDFS-8797 URL: https://issues.apache.org/jira/browse/HDFS-8797 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-8797.000.patch, HDFS-8797.001.patch, HDFS-8797.002.patch, HDFS-8797.003.patch While running a test we found that WebHdfsFileSystem can create several thousand connections when doing a position read of a 200MB file. For each connection the client will connect to the DataNode again and the DataNode will create a new DFSClient instance to handle the read request. This also leads to several thousand {{getBlockLocations}} call to the NameNode. 
The cause of the issue is that in {{FSInputStream#read(long, byte[], int, int)}}, each time the input stream reads some data, it seeks back to the old position and resets its state to SEEK. Thus the next read will re-create the connection.
{code}
public int read(long position, byte[] buffer, int offset, int length)
    throws IOException {
  synchronized (this) {
    long oldPos = getPos();
    int nread = -1;
    try {
      seek(position);
      nread = read(buffer, offset, length);
    } finally {
      seek(oldPos);
    }
    return nread;
  }
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
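To make the effect concrete, here is a minimal self-contained mock of the seek-back pattern above (class and field names are illustrative, not Hadoop's): because the {{finally}} block seeks away from the just-read position, the stream's connection state is dropped, so every positional read reconnects.

```java
public class PreadSketch {
    // Illustrative stand-in for a WebHDFS-style input stream where any
    // seek away from the current position discards the open connection.
    static class MockWebHdfsStream {
        long pos = 0;
        boolean connected = false;
        int connections = 0;
        private final byte[] data;

        MockWebHdfsStream(byte[] data) { this.data = data; }

        void seek(long newPos) {
            if (newPos != pos) {
                connected = false; // state reset: connection dropped
                pos = newPos;
            }
        }

        int read(byte[] buf, int off, int len) {
            if (!connected) {      // reconnect on demand
                connections++;
                connected = true;
            }
            int n = Math.min(len, data.length - (int) pos);
            System.arraycopy(data, (int) pos, buf, off, n);
            pos += n;
            return n;
        }

        // The generic pread pattern: seek, read, seek back.
        synchronized int read(long position, byte[] buf, int off, int len) {
            long oldPos = pos;
            try {
                seek(position);
                return read(buf, off, len);
            } finally {
                seek(oldPos); // this seek-back forces the next reconnect
            }
        }
    }

    public static int connectionsFor(int numPreads) {
        MockWebHdfsStream in = new MockWebHdfsStream(new byte[1024]);
        byte[] buf = new byte[4];
        for (int i = 0; i < numPreads; i++) {
            in.read(i * 4L, buf, 0, 4); // sequential positional reads
        }
        return in.connections;
    }

    public static void main(String[] args) {
        // One connection per pread: 100 preads open 100 connections.
        System.out.println(connectionsFor(100));
    }
}
```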
[jira] [Updated] (HDFS-8807) dfs.datanode.data.dir does not handle spaces between storageType and URI correctly
[ https://issues.apache.org/jira/browse/HDFS-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDFS-8807: --- Status: Patch Available (was: Open) dfs.datanode.data.dir does not handle spaces between storageType and URI correctly -- Key: HDFS-8807 URL: https://issues.apache.org/jira/browse/HDFS-8807 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.8.0 Reporter: Anu Engineer Assignee: Anu Engineer Fix For: 2.8.0 Attachments: HDFS-8807.001.patch If you add a space between the storage type and the file URI, datanodes fail during startup. Here is an example of a misconfiguration that leads to datanode failure:
{code}
<property>
  <name>dfs.datanode.data.dir</name>
  <value>[DISK] file://tmp/hadoop-aengineer/disk1/dfs/data</value>
</property>
{code}
Here is the fixed version. Please *note* the lack of a space between \[DISK\] and the file URI:
{code}
<property>
  <name>dfs.datanode.data.dir</name>
  <value>[DISK]file://tmp/hadoop-aengineer/disk1/dfs/data</value>
</property>
{code}
We fail with a parsing error; here is the info from the datanode logs.
{code}
2015-07-22 13:34:36,812 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.lang.IllegalArgumentException: java.net.URISyntaxException: Illegal character in scheme name at index 0: file://tmp/hadoop-aengineer/disk1/dfs/data
        at org.apache.hadoop.fs.Path.initialize(Path.java:204)
        at org.apache.hadoop.fs.Path.<init>(Path.java:170)
        at org.apache.hadoop.hdfs.server.datanode.StorageLocation.parse(StorageLocation.java:97)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.getStorageLocations(DataNode.java:2314)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2298)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2349)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2529)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2553)
Caused by: java.net.URISyntaxException: Illegal character in scheme name at index 0: file://tmp/hadoop-aengineer/disk1/dfs/data
        at java.net.URI$Parser.fail(URI.java:2829)
        at java.net.URI$Parser.checkChars(URI.java:3002)
        at java.net.URI$Parser.checkChar(URI.java:3012)
        at java.net.URI$Parser.parse(URI.java:3028)
        at java.net.URI.<init>(URI.java:753)
        at org.apache.hadoop.fs.Path.initialize(Path.java:201)
        ... 7 more
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
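The root cause in the log can be reproduced directly with {{java.net.URI}}: once the {{[DISK]}} tag is stripped, the leftover leading space becomes the first character of the scheme, which is exactly the "Illegal character in scheme name at index 0" failure. A small sketch (the helper name is illustrative, not the actual {{StorageLocation}} code):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class DataDirParseSketch {
    // Returns true if the value parses as a URI; this is the check that
    // the datanode's StorageLocation parsing ultimately hits via java.net.URI.
    public static boolean parses(String value) {
        try {
            new URI(value);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // With a space after "[DISK]", the stripped-off URI starts with a
        // space, so the scheme has an illegal character at index 0.
        System.out.println(parses(" file://tmp/hadoop/disk1/dfs/data")); // false
        System.out.println(parses("file://tmp/hadoop/disk1/dfs/data"));  // true
    }
}
```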
[jira] [Commented] (HDFS-8785) TestDistributedFileSystem is failing in trunk
[ https://issues.apache.org/jira/browse/HDFS-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14638038#comment-14638038 ] Hadoop QA commented on HDFS-8785: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 7m 36s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 32s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 20s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 18s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 21s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 31s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 28s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 1m 6s | Pre-build of native portion | | {color:green}+1{color} | hdfs tests | 158m 43s | Tests passed in hadoop-hdfs. 
| | | | 180m 59s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12746649/HDFS-8785.01.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / 06e5dd2 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11793/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11793/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11793/console | This message was automatically generated. TestDistributedFileSystem is failing in trunk - Key: HDFS-8785 URL: https://issues.apache.org/jira/browse/HDFS-8785 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.8.0 Reporter: Arpit Agarwal Assignee: Xiaoyu Yao Attachments: HDFS-8785.00.patch, HDFS-8785.01.patch A newly added test case {{TestDistributedFileSystem#testDFSClientPeerWriteTimeout}} is failing in trunk. e.g. run https://builds.apache.org/job/PreCommit-HDFS-Build/11716/testReport/org.apache.hadoop.hdfs/TestDistributedFileSystem/testDFSClientPeerWriteTimeout/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
[ https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14638042#comment-14638042 ] Brahma Reddy Battula commented on HDFS-8808: The default value is zero. I hope you mean when {{dfs.image.transfer.bandwidthPerSec}} is configured to some value? dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby Key: HDFS-8808 URL: https://issues.apache.org/jira/browse/HDFS-8808 Project: Hadoop HDFS Issue Type: Bug Reporter: Gautam Gopalakrishnan The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the speed with which the fsimage is copied between the namenodes during regular use. However, as a side effect, this also limits transfers when the {{-bootstrapStandby}} option is used. This option is often used during upgrades and could potentially slow down the entire workflow. The request here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth setting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8809) HDFS fsck reports file corruption (Missing blocks) when HBase is running, with Hadoop 2.7.1 and HBase 1.1.1
[ https://issues.apache.org/jira/browse/HDFS-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Foley updated HDFS-8809: - Reporter: Sudhir Prakash (was: Matt Foley) HDFS fsck reports file corruption (Missing blocks) when HBase is running, with Hadoop 2.7.1 and HBase 1.1.1 --- Key: HDFS-8809 URL: https://issues.apache.org/jira/browse/HDFS-8809 Project: Hadoop HDFS Issue Type: Bug Components: tools Affects Versions: 2.7.0 Environment: Hadoop 2.7.1 and HBase 1.1.1, on SUSE11sp3 (other Linuxes not tested, probably not platform-dependent). This did NOT happen with Hadoop 2.4 and HBase 0.98. Reporter: Sudhir Prakash Whenever HBase is running, the hdfs fsck / reports four hbase-related files in the path hbase/data/WALs/ as CORRUPT. Even after letting the cluster sit idle for a couple hours, it is still in the corrupt state. If HBase is shut down, the problem goes away. If HBase is then restarted, the problem recurs. {code} hades1:/var/opt/teradata/packages # su hdfs hdfs@hades1:/var/opt/teradata/packages hdfs fsck / Connecting to namenode via http://hades1.labs.teradata.com:50070/fsck?ugi=hdfspath=%2F FSCK started by hdfs (auth:SIMPLE) from /39.0.8.2 for path / at Wed Jun 24 20:40:17 GMT 2015 ... /apps/hbase/data/WALs/hades4.labs.teradata.com,16020,1435168292684/hades4.labs.teradata.com%2C16020%2C1435168292684.default.1435175500556: MISSING 1 blocks of total size 83 B. /apps/hbase/data/WALs/hades5.labs.teradata.com,16020,1435168290466/hades5.labs.teradata.com%2C16020%2C1435168290466..meta.1435175562144.meta: MISSING 1 blocks of total size 83 B. /apps/hbase/data/WALs/hades5.labs.teradata.com,16020,1435168290466/hades5.labs.teradata.com%2C16020%2C1435168290466.default.1435175498500: MISSING 1 blocks of total size 83 B. /apps/hbase/data/WALs/hades6.labs.teradata.com,16020,1435168292373/hades6.labs.teradata.com%2C16020%2C1435168292373.default.1435175500301: MISSING 1 blocks of total size 83 B.. 
Status: CORRUPT
 Total size:    723977553 B (Total open files size: 332 B)
 Total dirs:    79
 Total files:   388
 Total symlinks:                0 (Files currently being written: 5)
 Total blocks (validated):      387 (avg. block size 1870743 B) (Total open file blocks (not validated): 4)
  UNDER MIN REPL'D BLOCKS:      4 (1.0335917 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES:        4
  MISSING BLOCKS:       4
  MISSING SIZE:         332 B
 Minimally replicated blocks:   387 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     3.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          3
 Number of racks:               1
FSCK ended at Wed Jun 24 20:40:17 GMT 2015 in 7 milliseconds

The filesystem under path '/' is CORRUPT
hdfs@hades1:/var/opt/teradata/packages
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8735) Inotify : All events classes should implement toString() API.
[ https://issues.apache.org/jira/browse/HDFS-8735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14638079#comment-14638079 ] Akira AJISAKA commented on HDFS-8735: - bq. I think the subclasses should have a \@Public annotation. I think all the functions are stable except for toString. Agree. Hi [~surendrasingh], would you add \@Public annotation to the subclasses and \@Unstable annotation to {{toString()}}? Inotify : All events classes should implement toString() API. - Key: HDFS-8735 URL: https://issues.apache.org/jira/browse/HDFS-8735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.7.0 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Attachments: HDFS-8735.01.patch, HDFS-8735.patch The Event classes are used by clients, so it's good to implement the toString() API.
{code}
for (Event event : events) {
  System.out.println(event.toString());
}
{code}
This will give output like this:
{code}
org.apache.hadoop.hdfs.inotify.Event$CreateEvent@6916d97d
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
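For illustration, here is a minimal sketch of the kind of {{toString()}} override being requested, versus the default {{Object.toString()}} "ClassName@hashcode" form shown above. This {{CreateEvent}} is a stand-in, not the real inotify class:

```java
public class ToStringSketch {
    // Without an override, Object.toString() yields "ClassName@hashcode",
    // which is what the inotify Event subclasses currently print.
    static class CreateEvent {
        final String path;

        CreateEvent(String path) { this.path = path; }

        @Override
        public String toString() {
            // A readable form exposing the event's fields.
            return "CreateEvent [path=" + path + "]";
        }
    }

    public static String describe(String path) {
        return new CreateEvent(path).toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("/tmp/a")); // CreateEvent [path=/tmp/a]
    }
}
```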
[jira] [Commented] (HDFS-8807) dfs.datanode.data.dir does not handle spaces between storageType and URI correctly
[ https://issues.apache.org/jira/browse/HDFS-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14638048#comment-14638048 ] Hadoop QA commented on HDFS-8807: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 20m 48s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 40s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 46s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 21s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 25s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 36s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 9s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 160m 32s | Tests failed in hadoop-hdfs. 
| | | | 208m 19s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.ha.TestHAStateTransitions | | | hadoop.hdfs.server.datanode.TestDataNodeMetrics | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.TestDistributedFileSystem | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12746654/HDFS-8807.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 06e5dd2 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11792/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11792/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11792/console | This message was automatically generated. dfs.datanode.data.dir does not handle spaces between storageType and URI correctly -- Key: HDFS-8807 URL: https://issues.apache.org/jira/browse/HDFS-8807 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.8.0 Reporter: Anu Engineer Assignee: Anu Engineer Fix For: 2.8.0 Attachments: HDFS-8807.001.patch if you add a space between the storage type and file URI then datanodes fail during startup. Here is an example of mis-configration that leads to datanode failure. {code} property namedfs.datanode.data.dir/name value [DISK] file://tmp/hadoop-aengineer/disk1/dfs/data /value /property {code} Here is the fixed version. Please *note* the lack of space between \[DISK\] and file URI {code} property namedfs.datanode.data.dir/name value [DISK]file://tmp/hadoop-aengineer/disk1/dfs/data /value /property {code} we fail with a parsing error, here is the info from the datanode logs. 
{code} 2015-07-22 13:34:36,812 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain java.lang.IllegalArgumentException: java.net.URISyntaxException: Illegal character in scheme name at index 0: file://tmp/hadoop-aengineer/disk1/dfs/data at org.apache.hadoop.fs.Path.initialize(Path.java:204) at org.apache.hadoop.fs.Path.init(Path.java:170) at org.apache.hadoop.hdfs.server.datanode.StorageLocation.parse(StorageLocation.java:97) at org.apache.hadoop.hdfs.server.datanode.DataNode.getStorageLocations(DataNode.java:2314) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2298) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2349) at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2529) at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2553) Caused by: java.net.URISyntaxException: Illegal character in scheme name at index 0: file://tmp/hadoop-aengineer/disk1/dfs/data at java.net.URI$Parser.fail(URI.java:2829) at
[jira] [Created] (HDFS-8809) HDFS fsck reports file corruption (Missing blocks) when HBase is running, with Hadoop 2.7.1 and HBase 1.1.1
Matt Foley created HDFS-8809: Summary: HDFS fsck reports file corruption (Missing blocks) when HBase is running, with Hadoop 2.7.1 and HBase 1.1.1 Key: HDFS-8809 URL: https://issues.apache.org/jira/browse/HDFS-8809 Project: Hadoop HDFS Issue Type: Bug Components: tools Affects Versions: 2.7.0 Environment: Hadoop 2.7.1 and HBase 1.1.1, on SUSE11sp3 (other Linuxes not tested, probably not platform-dependent). This did NOT happen with Hadoop 2.4 and HBase 0.98. Reporter: Matt Foley Whenever HBase is running, the hdfs fsck / reports four hbase-related files in the path hbase/data/WALs/ as CORRUPT. Even after letting the cluster sit idle for a couple hours, it is still in the corrupt state. If HBase is shut down, the problem goes away. If HBase is then restarted, the problem recurs. {code} hades1:/var/opt/teradata/packages # su hdfs hdfs@hades1:/var/opt/teradata/packages hdfs fsck / Connecting to namenode via http://hades1.labs.teradata.com:50070/fsck?ugi=hdfspath=%2F FSCK started by hdfs (auth:SIMPLE) from /39.0.8.2 for path / at Wed Jun 24 20:40:17 GMT 2015 ... /apps/hbase/data/WALs/hades4.labs.teradata.com,16020,1435168292684/hades4.labs.teradata.com%2C16020%2C1435168292684.default.1435175500556: MISSING 1 blocks of total size 83 B. /apps/hbase/data/WALs/hades5.labs.teradata.com,16020,1435168290466/hades5.labs.teradata.com%2C16020%2C1435168290466..meta.1435175562144.meta: MISSING 1 blocks of total size 83 B. /apps/hbase/data/WALs/hades5.labs.teradata.com,16020,1435168290466/hades5.labs.teradata.com%2C16020%2C1435168290466.default.1435175498500: MISSING 1 blocks of total size 83 B. /apps/hbase/data/WALs/hades6.labs.teradata.com,16020,1435168292373/hades6.labs.teradata.com%2C16020%2C1435168292373.default.1435175500301: MISSING 1 blocks of total size 83 B.. Status: CORRUPT Total size:723977553 B (Total open files size: 332 B) Total dirs:79 Total files: 388 Total symlinks:0 (Files currently being written: 5) Total blocks (validated): 387 (avg. 
block size 1870743 B) (Total open file blocks (not validated): 4)
  UNDER MIN REPL'D BLOCKS:      4 (1.0335917 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES:        4
  MISSING BLOCKS:       4
  MISSING SIZE:         332 B
 Minimally replicated blocks:   387 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     3.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          3
 Number of racks:               1
FSCK ended at Wed Jun 24 20:40:17 GMT 2015 in 7 milliseconds

The filesystem under path '/' is CORRUPT
hdfs@hades1:/var/opt/teradata/packages
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8663) sys cpu usage high on namenode server
[ https://issues.apache.org/jira/browse/HDFS-8663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tangjunjie updated HDFS-8663: - Component/s: fs sys cpu usage high on namenode server - Key: HDFS-8663 URL: https://issues.apache.org/jira/browse/HDFS-8663 Project: Hadoop HDFS Issue Type: Bug Components: fs, namenode Affects Versions: 2.3.0 Environment: hadoop 2.3.0 centos5.8 Reporter: tangjunjie High sys CPU usage on the namenode server caused jobs to run very slowly. Using ps -elf I saw many zombie processes. Checking the HDFS log I found many exceptions like:
{code}
org.apache.hadoop.util.Shell$ExitCodeException: id: sem_410: No such user
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
        at org.apache.hadoop.util.Shell.run(Shell.java:418)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:739)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:722)
        at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:83)
        at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:52)
        at org.apache.hadoop.security.Groups.getGroups(Groups.java:139)
        at org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1409)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.<init>(FSPermissionChecker.java:81)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getPermissionChecker(FSNamesystem.java:3310)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3491)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:764)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:764)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)
{code}
After I created all the users (such as sem_410) that appeared in the exceptions, the sys CPU usage on the namenode went down. BTW, my Hadoop 2.3.0 enables Hadoop ACLs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
[ https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14638053#comment-14638053 ] Gautam Gopalakrishnan commented on HDFS-8808: - That's right, when the setting is at a non-default value. In one situation it was set to 600KB/s, which was fine for normal checkpointing but was slowing down {{-bootstrapStandby}}. dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby Key: HDFS-8808 URL: https://issues.apache.org/jira/browse/HDFS-8808 Project: Hadoop HDFS Issue Type: Bug Reporter: Gautam Gopalakrishnan The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the speed with which the fsimage is copied between the namenodes during regular use. However, as a side effect, this also limits transfers when the {{-bootstrapStandby}} option is used. This option is often used during upgrades and could potentially slow down the entire workflow. The request here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth setting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
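The arithmetic behind the slowdown is easy to sketch. A bytes-per-second throttle in the spirit of Hadoop's image-transfer limiter (the helper below is illustrative, not the actual throttler code) forces the sender to pause whenever it gets ahead of the configured rate, so a 600KB/s cap stretches a multi-gigabyte fsimage copy during {{-bootstrapStandby}} to hours:

```java
public class ThrottleSketch {
    // Given bytes already sent, elapsed time, and a bytes/sec limit,
    // compute how long the sender must sleep to stay under the limit.
    public static long requiredSleepMillis(long bytesSent, long elapsedMillis,
                                           long bytesPerSec) {
        if (bytesPerSec <= 0) {
            return 0; // zero/negative means unthrottled (the default)
        }
        // Minimum wall-clock time this many bytes is allowed to take.
        long minMillis = bytesSent * 1000L / bytesPerSec;
        return Math.max(0, minMillis - elapsedMillis);
    }

    public static void main(String[] args) {
        // At 600 KB/s, sending 1.2 MB in 1 s forces a ~1 s pause.
        System.out.println(requiredSleepMillis(1_200_000L, 1000L, 600_000L));
    }
}
```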
[jira] [Updated] (HDFS-8805) Archival Storage: getStoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Zheng updated HDFS-8805: Summary: Archival Storage: getStoragePolicy (was: Archival Storage: it is unable to get StoragePolicy of a directory) Archival Storage: getStoragePolicy -- Key: HDFS-8805 URL: https://issues.apache.org/jira/browse/HDFS-8805 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover, namenode Reporter: Hui Zheng Assignee: Brahma Reddy Battula Fix For: 2.6.0 The result of the getStoragePolicy command is always 'unspecified' even when we have set a StoragePolicy on a directory, although the real placement of blocks is correct. The result of fsck is not correct either.
{code}
$ hdfs storagepolicies -setStoragePolicy -path /tmp/cold -policy COLD
Set storage policy COLD on /tmp/cold
$ hdfs storagepolicies -getStoragePolicy -path /tmp/cold
The storage policy of /tmp/cold is unspecified
$ hdfs fsck -storagepolicies /tmp/cold
Blocks NOT satisfying the specified storage policy:
Storage Policy        Specified Storage Policy      # of blocks     % of blocks
ARCHIVE:4(COLD)       HOT                           5               55.5556%
ARCHIVE:3(COLD)       HOT                           4               44.%
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8805) Archival Storage: getStoragePolicy should not need superuser privilege
[ https://issues.apache.org/jira/browse/HDFS-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14638056#comment-14638056 ] Hui Zheng commented on HDFS-8805: - Hi [~jingzhao] I tried it as the super user and found it works well. So this issue should become: getStoragePolicy should not need superuser privilege. Archival Storage: getStoragePolicy should not need superuser privilege -- Key: HDFS-8805 URL: https://issues.apache.org/jira/browse/HDFS-8805 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover, namenode Reporter: Hui Zheng Assignee: Brahma Reddy Battula Fix For: 2.6.0 The result of the getStoragePolicy command is always 'unspecified' even though we have set a StoragePolicy on a directory. But the real placement of blocks is correct. The result of fsck is not correct either. {code} $ hdfs storagepolicies -setStoragePolicy -path /tmp/cold -policy COLD Set storage policy COLD on /tmp/cold $ hdfs storagepolicies -getStoragePolicy -path /tmp/cold The storage policy of /tmp/cold is unspecified $ hdfs fsck -storagepolicies /tmp/cold Blocks NOT satisfying the specified storage policy: Storage Policy Specified Storage Policy # of blocks % of blocks ARCHIVE:4(COLD) HOT 5 55.5556% ARCHIVE:3(COLD) HOT 4 44.% {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8805) Archival Storage: getStoragePolicy should not need superuser privilege
[ https://issues.apache.org/jira/browse/HDFS-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Zheng updated HDFS-8805: Summary: Archival Storage: getStoragePolicy should not need superuser privilege (was: Archival Storage: getStoragePolicy) Archival Storage: getStoragePolicy should not need superuser privilege -- Key: HDFS-8805 URL: https://issues.apache.org/jira/browse/HDFS-8805 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover, namenode Reporter: Hui Zheng Assignee: Brahma Reddy Battula Fix For: 2.6.0 The result of the getStoragePolicy command is always 'unspecified' even though we have set a StoragePolicy on a directory. But the real placement of blocks is correct. The result of fsck is not correct either. {code} $ hdfs storagepolicies -setStoragePolicy -path /tmp/cold -policy COLD Set storage policy COLD on /tmp/cold $ hdfs storagepolicies -getStoragePolicy -path /tmp/cold The storage policy of /tmp/cold is unspecified $ hdfs fsck -storagepolicies /tmp/cold Blocks NOT satisfying the specified storage policy: Storage Policy Specified Storage Policy # of blocks % of blocks ARCHIVE:4(COLD) HOT 5 55.5556% ARCHIVE:3(COLD) HOT 4 44.% {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6682) Add a metric to expose the timestamp of the oldest under-replicated block
[ https://issues.apache.org/jira/browse/HDFS-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-6682: Attachment: HDFS-6682.006.patch 006 patch * Fixes checkstyle warnings. Add a metric to expose the timestamp of the oldest under-replicated block - Key: HDFS-6682 URL: https://issues.apache.org/jira/browse/HDFS-6682 Project: Hadoop HDFS Issue Type: Improvement Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: BB2015-05-TBR Attachments: HDFS-6682.002.patch, HDFS-6682.003.patch, HDFS-6682.004.patch, HDFS-6682.005.patch, HDFS-6682.006.patch, HDFS-6682.patch In the following case, the data in HDFS is lost and a client needs to put the same file again. # A client puts a file to HDFS # A DataNode crashes before replicating a block of the file to other DataNodes I propose a metric to expose the timestamp of the oldest under-replicated/corrupt block. That way the client can know which file to retain for the retry. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
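The bookkeeping behind such a metric can be sketched as follows. This is a hedged illustration only, not the HDFS-6682 patch itself; the class `OldestUnderReplicatedTracker` and its method names are hypothetical:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch: track when each block first became under-replicated
// and expose the oldest timestamp. A LinkedHashMap keeps entries in
// insertion order, so the first entry is always the oldest tracked block.
class OldestUnderReplicatedTracker {
  private final Map<Long, Long> blockToTimestamp = new LinkedHashMap<>();

  // Record when a block became under-replicated; re-adding an already
  // tracked block keeps its original timestamp (putIfAbsent).
  synchronized void add(long blockId, long nowMillis) {
    blockToTimestamp.putIfAbsent(blockId, nowMillis);
  }

  // Called once the block is fully replicated again.
  synchronized void remove(long blockId) {
    blockToTimestamp.remove(blockId);
  }

  // Timestamp of the oldest under-replicated block, or 0 when there is none.
  synchronized long getOldestTimestamp() {
    for (Map.Entry<Long, Long> e : blockToTimestamp.entrySet()) {
      return e.getValue();
    }
    return 0L;
  }
}
```

With this layout both add and remove stay O(1), and reading the metric only inspects the first map entry, which is why insertion order rather than a sorted structure suffices here.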
[jira] [Updated] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated HDFS-6407: --- Attachment: HDFS-6407.6.patch new namenode UI, lost ability to sort columns in datanode tab - Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Benoy Antony Priority: Critical Labels: BB2015-05-TBR Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting table.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated HDFS-6407: --- Attachment: HDFS-6407.7.patch new namenode UI, lost ability to sort columns in datanode tab - Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Benoy Antony Priority: Critical Labels: BB2015-05-TBR Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting 2.png, sorting table.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8810) Correct assertions in TestDFSInotifyEventInputStream class.
[ https://issues.apache.org/jira/browse/HDFS-8810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore updated HDFS-8810: - Summary: Correct assertions in TestDFSInotifyEventInputStream class. (was: Correct assertions for TestDFSInotifyEventInputStream.) Correct assertions in TestDFSInotifyEventInputStream class. --- Key: HDFS-8810 URL: https://issues.apache.org/jira/browse/HDFS-8810 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Some assertions are wrong in the test. Here is an example. {code} Assert.assertTrue(batch.getEvents()[0].getEventType() == Event.EventType.RENAME); Event.RenameEvent re3 = (Event.RenameEvent) batch.getEvents()[0]; Assert.assertTrue(re3.getDstPath().equals("/dir/file5")); Assert.assertTrue(re3.getSrcPath().equals("/file5")); Assert.assertTrue(re.getTimestamp() > 0); {code} https://issues.apache.org/jira/browse/HDFS-8735?focusedCommentId=14636571page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14636571 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14638108#comment-14638108 ] Hadoop QA commented on HDFS-6407: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 0m 0s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | release audit | 0m 18s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. | | | | 0m 22s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12746713/HDFS-6407.7.patch | | Optional Tests | | | git revision | trunk / ee98d63 | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11799/artifact/patchprocess/whitespace.txt | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11799/console | This message was automatically generated. 
new namenode UI, lost ability to sort columns in datanode tab - Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Benoy Antony Priority: Critical Labels: BB2015-05-TBR Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting 2.png, sorting table.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14638110#comment-14638110 ] Chang Li commented on HDFS-6407: please see sorting 2.png for latest effect new namenode UI, lost ability to sort columns in datanode tab - Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Benoy Antony Priority: Critical Labels: BB2015-05-TBR Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting 2.png, sorting table.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated HDFS-6407: --- Attachment: sorting 2.png new namenode UI, lost ability to sort columns in datanode tab - Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Benoy Antony Priority: Critical Labels: BB2015-05-TBR Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting 2.png, sorting table.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)