[jira] [Updated] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinwei Qin updated HDFS-7859: -- Attachment: HDFS-7859-HDFS-7285.003.patch Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch, HDFS-7859.001.patch, HDFS-7859.002.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command
[ https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-8137 started by Uma Maheswara Rao G. - Sends the EC schema to DataNode as well in EC encoding/recovering command - Key: HDFS-8137 URL: https://issues.apache.org/jira/browse/HDFS-8137 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Uma Maheswara Rao G Attachments: HDFS-8137-0.patch Discussed with [~umamaheswararao] and [~vinayrpet]: we should also send the EC schema to the DataNode, contained in the EC encoding/recovering command. The target DataNode will use it to guide the execution of the task. Alternatively, the DataNode could request the schema actively through a separate RPC call; as an optimization, the DataNode may cache schemas to avoid repeatedly asking for the same schema. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
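The caching alternative discussed above can be sketched as follows. This is a minimal illustration, not HDFS code: the ECSchema type, the fetch callback, and all names are hypothetical stand-ins for the NameNode RPC the comment describes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of the DataNode-side schema cache: look a schema up by
// name, issuing the (stand-in) RPC fetch only on the first miss.
class SchemaCache {
  static final class ECSchema {
    final String name;
    final int dataUnits, parityUnits;
    ECSchema(String name, int d, int p) {
      this.name = name; this.dataUnits = d; this.parityUnits = p;
    }
  }

  private final Map<String, ECSchema> cache = new ConcurrentHashMap<>();
  private final Function<String, ECSchema> rpcFetch; // stand-in for the NameNode call
  int fetches = 0;                                   // exposed so the example can show cache hits

  SchemaCache(Function<String, ECSchema> rpcFetch) { this.rpcFetch = rpcFetch; }

  ECSchema get(String name) {
    // computeIfAbsent runs the fetch at most once per key, even under races
    return cache.computeIfAbsent(name, n -> { fetches++; return rpcFetch.apply(n); });
  }
}
```

With this shape, repeated commands naming the same schema cost one RPC total, which is the "avoid repeatedly asking" optimization the comment mentions.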
[jira] [Resolved] (HDFS-8183) Erasure Coding: Improve DFSStripedOutputStream closing of datastreamer threads
[ https://issues.apache.org/jira/browse/HDFS-8183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang resolved HDFS-8183. - Resolution: Fixed Fix Version/s: HDFS-7285 Hadoop Flags: Reviewed The patch LGTM. +1 and I just committed it to the branch (since the change is simple we can probably watch Jenkins later). Thanks Rakesh for the contribution! Erasure Coding: Improve DFSStripedOutputStream closing of datastreamer threads -- Key: HDFS-8183 URL: https://issues.apache.org/jira/browse/HDFS-8183 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Fix For: HDFS-7285 Attachments: HDFS-8183-001.patch, HDFS-8183-002.patch The idea of this task is to improve the closing of all the streamers. Presently, if any of the streamers throws an exception, it returns immediately, leaving all the other streamer threads running. Instead, it is better to handle the exceptions of each streamer independently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
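The close pattern this task describes can be sketched like this. The names are illustrative, not the actual DFSStripedOutputStream code: attempt to close every streamer, remember the first failure, and only rethrow after all of them have been attempted, so no streamer thread is left running.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.List;

// Sketch: close all streamers independently instead of returning on the
// first exception (hypothetical helper, not the committed patch).
class CloseAll {
  static void closeStreamers(List<? extends Closeable> streamers) throws IOException {
    IOException first = null;
    for (Closeable s : streamers) {
      try {
        s.close();
      } catch (IOException e) {
        if (first == null) first = e;  // keep going; don't leave the rest running
      }
    }
    if (first != null) throw first;    // surface the failure only after all closes
  }
}
```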
[jira] [Updated] (HDFS-7949) WebImageViewer needs to support file size calculation with striped blocks
[ https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-7949: --- Status: Patch Available (was: In Progress) WebImageViewer need support file size calculation with striped blocks - Key: HDFS-7949 URL: https://issues.apache.org/jira/browse/HDFS-7949 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Hui Zheng Assignee: Rakesh R Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, HDFS-7949-003.patch, HDFS-7949-004.patch, HDFS-7949-005.patch, HDFS-7949-006.patch, HDFS-7949-007.patch, HDFS-7949-HDFS-7285.08.patch The file size calculation should be changed when the blocks of the file are striped in WebImageViewer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8276) LazyPersistFileScrubber should be disabled if scrubber interval configured zero
[ https://issues.apache.org/jira/browse/HDFS-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521152#comment-14521152 ] Hadoop QA commented on HDFS-8276: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 40s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | javac | 7m 29s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 36s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 7m 43s | The applied patch generated 1 additional checkstyle issues. | | {color:green}+1{color} | install | 1m 35s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 7s | The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 14s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 224m 19s | Tests failed in hadoop-hdfs. 
| | | | 272m 48s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | | Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from an Exception, even though it is named as such At DataStreamer.java:from an Exception, even though it is named as such At DataStreamer.java:[lines 177-201] | | Failed unit tests | hadoop.hdfs.TestFileLengthOnClusterRestart | | | hadoop.cli.TestHDFSCLI | | | hadoop.hdfs.TestMultiThreadedHflush | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation | | | hadoop.hdfs.TestDFSOutputStream | | | hadoop.hdfs.TestQuota | | | hadoop.hdfs.TestCrcCorruption | | | hadoop.hdfs.TestClose | | | hadoop.hdfs.server.datanode.TestBlockRecovery | | | hadoop.hdfs.TestDFSClientRetries | | | hadoop.hdfs.server.namenode.TestDeleteRace | | Timed out tests | org.apache.hadoop.hdfs.TestDataTransferProtocol | | | org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache | | | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery | | | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12729099/HDFS-8276.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / aa22450 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/10468/artifact/patchprocess/checkstyle-result-diff.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/10468/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10468/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10468/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10468/console | This message was automatically generated. LazyPersistFileScrubber should be disabled if scrubber interval configured zero --- Key: HDFS-8276 URL: https://issues.apache.org/jira/browse/HDFS-8276 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Attachments: HDFS-8276.patch bq. but I think it is simple enough to change the meaning of the value so that zero means 'never scrub'. Let me post an updated patch. As discussed in [HDFS-6929|https://issues.apache.org/jira/browse/HDFS-6929], the scrubber should be disabled if *dfs.namenode.lazypersist.file.scrub.interval.sec* is zero. Currently, NameNode startup fails if the interval is configured as zero {code} 2015-04-27 23:47:31,744 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed. java.lang.IllegalArgumentException:
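The "zero means never scrub" behavior discussed above can be sketched as a guard before scheduling. Class and method names here are hypothetical; a zero or negative period passed straight to a scheduler is what would trigger the IllegalArgumentException at startup.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: interpret 0 as "scrubber disabled" instead of handing
// it to the scheduler (not the actual FSNamesystem code).
class ScrubberConfig {
  static boolean shouldSchedule(int intervalSec) {
    if (intervalSec < 0) {
      throw new IllegalArgumentException("interval must be >= 0: " + intervalSec);
    }
    return intervalSec > 0;  // 0 disables the LazyPersistFileScrubber
  }

  static void maybeSchedule(ScheduledExecutorService exec, Runnable scrubber, int intervalSec) {
    if (shouldSchedule(intervalSec)) {
      exec.scheduleAtFixedRate(scrubber, intervalSec, intervalSec, TimeUnit.SECONDS);
    }
    // else: never scrub; NameNode startup proceeds normally
  }
}
```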
[jira] [Updated] (HDFS-8178) QJM doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8178: Description: When a QJM crashes, the in-progress edit log file at that time remains in the file system. When the node comes back, it will accept new edit logs and those stale in-progress files are never cleaned up. QJM treats them as regular in-progress edit log files and tries to finalize them, which potentially causes high memory usage. This JIRA aims to move aside those stale edit log files to avoid this scenario. (was: HDFS-5919 fixes the issue for {{FileJournalManager}}. A similar fix is needed for QJM.) QJM doesn't purge empty and corrupt inprogress edits files -- Key: HDFS-8178 URL: https://issues.apache.org/jira/browse/HDFS-8178 Project: Hadoop HDFS Issue Type: Bug Components: qjm Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8178.000.patch When a QJM crashes, the in-progress edit log file at that time remains in the file system. When the node comes back, it will accept new edit logs and those stale in-progress files are never cleaned up. QJM treats them as regular in-progress edit log files and tries to finalize them, which potentially causes high memory usage. This JIRA aims to move aside those stale edit log files to avoid this scenario. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8178) QJM doesn't move aside stale inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521005#comment-14521005 ] Zhe Zhang commented on HDFS-8178: - Oops last paragraph was added by mistake, please ignore it. QJM doesn't move aside stale inprogress edits files --- Key: HDFS-8178 URL: https://issues.apache.org/jira/browse/HDFS-8178 Project: Hadoop HDFS Issue Type: Bug Components: qjm Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8178.000.patch When a QJM crashes, the in-progress edit log file at that time remains in the file system. When the node comes back, it will accept new edit logs and those stale in-progress files are never cleaned up. QJM treats them as regular in-progress edit log files and tries to finalize them, which potentially causes high memory usage. This JIRA aims to move aside those stale edit log files to avoid this scenario. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8178) QJM doesn't move aside stale inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521002#comment-14521002 ] Zhe Zhang commented on HDFS-8178: - Thanks ATM for the helpful review! After looking at HDFS-5919 more closely, we are actually trying to solve a different problem here. The objective of HDFS-5919 is solely to save disk space (since FJM doesn't try to process those corrupt/empty files anyway). It's a safe cleanup, making sure the tx IDs of empty / corrupt files are old enough before purging. So I think we should do the same in QJM. Our main target here is _stale_ in-progress edit log files, which are not necessarily empty/corrupt (so they won't be marked as such). As the updated description states, we want to properly take care of those files so QJM doesn't try to process them. I like your proposal of renaming / moving aside those files and removing them when they are older than {{minTxIdToKeep}}. I'll update the patch based on this idea. I also propose we do the same for corrupt / empty files, for both FJM and QJM. QJM doesn't move aside stale inprogress edits files --- Key: HDFS-8178 URL: https://issues.apache.org/jira/browse/HDFS-8178 Project: Hadoop HDFS Issue Type: Bug Components: qjm Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8178.000.patch When a QJM crashes, the in-progress edit log file at that time remains in the file system. When the node comes back, it will accept new edit logs and those stale in-progress files are never cleaned up. QJM treats them as regular in-progress edit log files and tries to finalize them, which potentially causes high memory usage. This JIRA aims to move aside those stale edit log files to avoid this scenario. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
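The rename / move-aside proposal in the comment above can be sketched over file names alone. The "edits_inprogress_" prefix matches the usual segment naming, but the ".stale" suffix and helper names are assumptions for illustration, not QJM's actual scheme.

```java
// Hypothetical sketch: move a stale in-progress segment aside so it is never
// finalized, then purge it once its first txid falls below minTxIdToKeep.
class StaleEdits {
  static final String PREFIX = "edits_inprogress_";
  static final String SUFFIX = ".stale";

  // Step 1: a stale segment is renamed, not treated as a regular in-progress file.
  static String moveAsideName(String name) {
    if (!name.startsWith(PREFIX)) throw new IllegalArgumentException(name);
    return name + SUFFIX;
  }

  // Step 2: purge a moved-aside segment only when it is old enough to be safe.
  static boolean shouldPurge(String staleName, long minTxIdToKeep) {
    if (!staleName.startsWith(PREFIX) || !staleName.endsWith(SUFFIX)) return false;
    String txPart = staleName.substring(PREFIX.length(), staleName.length() - SUFFIX.length());
    long firstTxId = Long.parseLong(txPart);
    return firstTxId < minTxIdToKeep;
  }
}
```

The age check mirrors the HDFS-5919 caution quoted above: nothing is deleted until its transactions are known to be superseded.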
[jira] [Updated] (HDFS-8294) Erasure Coding: Fix Findbug warnings present in erasure coding
[ https://issues.apache.org/jira/browse/HDFS-8294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8294: --- Attachment: HDFS-8294-HDFS-7285.00.patch Erasure Coding: Fix Findbug warnings present in erasure coding -- Key: HDFS-8294 URL: https://issues.apache.org/jira/browse/HDFS-8294 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8294-HDFS-7285.00.patch Following are the findbug warnings :- # Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) {code} Bug type NP_NULL_ON_SOME_PATH (click for details) In class org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction In method org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) Value loaded from arr$ Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] Known null at BlockInfoStripedUnderConstruction.java:[line 200] {code} # Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes() Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new String(byte[]) {code} Bug type DM_DEFAULT_ENCODING (click for details) In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager In method org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema) Called method String.getBytes() At ErasureCodingZoneManager.java:[line 116] Bug type DM_DEFAULT_ENCODING (click for details) In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager In method org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath) Called method new String(byte[]) At ErasureCodingZoneManager.java:[line 
81] {code} # Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time {code} Bug type IS2_INCONSISTENT_SYNC (click for details) In class org.apache.hadoop.hdfs.DFSOutputStream Field org.apache.hadoop.hdfs.DFSOutputStream.streamer Synchronized 90% of the time Unsynchronized access at DFSOutputStream.java:[line 142] Unsynchronized access at DFSOutputStream.java:[line 853] Unsynchronized access at DFSOutputStream.java:[line 617] Unsynchronized access at DFSOutputStream.java:[line 620] Unsynchronized access at DFSOutputStream.java:[line 630] Unsynchronized access at DFSOutputStream.java:[line 338] Unsynchronized access at DFSOutputStream.java:[line 734] Unsynchronized access at DFSOutputStream.java:[line 897] {code} # Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() {code} Bug type DLS_DEAD_LOCAL_STORE (click for details) In class org.apache.hadoop.hdfs.StripedDataStreamer In method org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() Local variable named offSuccess At StripedDataStreamer.java:[line 105] {code} # Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() {code} Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) In class org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped In method org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() At BlockInfoStriped.java:[line 208] {code} # Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) {code} Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) In class org.apache.hadoop.hdfs.util.StripedBlockUtil In method org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) At StripedBlockUtil.java:[line 85] {code} # Switch statement found in 
org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRange(long, long, long, byte[], int, Map) where default case is missing {code} Bug type SF_SWITCH_NO_DEFAULT (click for details) In class org.apache.hadoop.hdfs.DFSStripedInputStream In method org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRange(long, long, long, byte[], int, Map) At DFSStripedInputStream.java:[lines 468-491] {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
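Two of the warning patterns listed above have standard one-line fixes, sketched below with placeholder method names (the real erasure-coding methods are not reproduced here): DM_DEFAULT_ENCODING is resolved by naming an explicit charset, and ICAST_INTEGER_MULTIPLY_CAST_TO_LONG by widening to long before the multiplication rather than after.

```java
import java.nio.charset.StandardCharsets;

// Illustrative fixes for the DM_DEFAULT_ENCODING and
// ICAST_INTEGER_MULTIPLY_CAST_TO_LONG patterns (names are placeholders).
class FindbugsFixes {
  // DM_DEFAULT_ENCODING: never rely on the platform default charset.
  static byte[] zoneKey(String path) {
    return path.getBytes(StandardCharsets.UTF_8);    // not path.getBytes()
  }
  static String zonePath(byte[] raw) {
    return new String(raw, StandardCharsets.UTF_8);  // not new String(raw)
  }

  // ICAST: cast an operand first so the multiply happens in long arithmetic;
  // casting the int result would preserve the already-overflowed value.
  static long spaceConsumed(int cellSize, int cellsPerBlock, int blocks) {
    return (long) cellSize * cellsPerBlock * blocks;
  }
}
```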
[jira] [Commented] (HDFS-8283) DataStreamer cleanup and some minor improvement
[ https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14520986#comment-14520986 ] Chris Nauroth commented on HDFS-8283: - These test failures might be related too: https://builds.apache.org/job/PreCommit-HDFS-Build/10455/testReport/ DataStreamer cleanup and some minor improvement --- Key: HDFS-8283 URL: https://issues.apache.org/jira/browse/HDFS-8283 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 2.8.0 Attachments: h8283_20150428.patch - When throwing an exception: always set lastException, and always create a new exception so that it has the new stack trace. - Add LOG. - Add final to isAppend and favoredNodes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8290) WebHDFS calls before namesystem initialization can cause NullPointerException.
[ https://issues.apache.org/jira/browse/HDFS-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14520990#comment-14520990 ] Chris Nauroth commented on HDFS-8290: - The Findbugs warning is in an unrelated part of the codebase. It's possible that both the Findbugs warning and the test failures were introduced by HDFS-8283. I'm waiting for confirmation before I commit this. WebHDFS calls before namesystem initialization can cause NullPointerException. -- Key: HDFS-8290 URL: https://issues.apache.org/jira/browse/HDFS-8290 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.6.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor Attachments: HDFS-8290.001.patch The NameNode has a brief window of time when the HTTP server has been initialized, but the namesystem has not been initialized. During this window, a WebHDFS call can cause a {{NullPointerException}}. We can catch this condition and return a more meaningful error. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7949) WebImageViewer needs to support file size calculation with striped blocks
[ https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521014#comment-14521014 ] Zhe Zhang commented on HDFS-7949: - Thanks Rakesh! The patch LGTM, +1 pending a Jenkins run. Do you mind clicking Submit Patch and renaming the patch to HDFS-7949-HDFS-7285.007.patch? WebImageViewer need support file size calculation with striped blocks - Key: HDFS-7949 URL: https://issues.apache.org/jira/browse/HDFS-7949 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Hui Zheng Assignee: Rakesh R Priority: Minor Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, HDFS-7949-003.patch, HDFS-7949-004.patch, HDFS-7949-005.patch, HDFS-7949-006.patch, HDFS-7949-007.patch The file size calculation should be changed when the blocks of the file are striped in WebImageViewer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command
[ https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-8137: -- Attachment: HDFS-8137-0.patch I generated an initial patch for review! We are supposed to get schema values from ECSchemaManager, but right now I don't see a better way to get them from ECSchemaManager, so I added an API to get them from BlockCollection itself, like the isStriped API in it. This is because BlockManager communicates with the namesystem via the Namesystem interface, and I don't think it's right to add APIs there for every new feature. BlockCollection is another interface like that, so I added the API there. Logically, Namesystem may be the correct place to add getECSchema for a file path, but I am not too strong on that. I would like to hear suggestions on that, if any. Sends the EC schema to DataNode as well in EC encoding/recovering command - Key: HDFS-8137 URL: https://issues.apache.org/jira/browse/HDFS-8137 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Uma Maheswara Rao G Attachments: HDFS-8137-0.patch Discussed with [~umamaheswararao] and [~vinayrpet]: we should also send the EC schema to the DataNode, contained in the EC encoding/recovering command. The target DataNode will use it to guide the execution of the task. Alternatively, the DataNode could request the schema actively through a separate RPC call; as an optimization, the DataNode may cache schemas to avoid repeatedly asking for the same schema. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8282) Erasure coding: move striped reading logic to StripedBlockUtil
[ https://issues.apache.org/jira/browse/HDFS-8282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14520994#comment-14520994 ] Zhe Zhang commented on HDFS-8282: - Thanks Yi for reviewing again! I just committed it to the branch. Erasure coding: move striped reading logic to StripedBlockUtil -- Key: HDFS-8282 URL: https://issues.apache.org/jira/browse/HDFS-8282 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8282-HDFS-7285.00.patch, HDFS-8282-HDFS-7285.01.patch, HDFS-8282-HDFS-7285.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8282) Erasure coding: move striped reading logic to StripedBlockUtil
[ https://issues.apache.org/jira/browse/HDFS-8282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8282: Resolution: Fixed Fix Version/s: HDFS-7285 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) [~hitliuyi] We need to rebase both HDFS-7678 and HDFS-7348 against this change. Erasure coding: move striped reading logic to StripedBlockUtil -- Key: HDFS-8282 URL: https://issues.apache.org/jira/browse/HDFS-8282 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Fix For: HDFS-7285 Attachments: HDFS-8282-HDFS-7285.00.patch, HDFS-8282-HDFS-7285.01.patch, HDFS-8282-HDFS-7285.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8183) Erasure Coding: Improve DFSStripedOutputStream closing of datastreamer threads
[ https://issues.apache.org/jira/browse/HDFS-8183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521053#comment-14521053 ] Rakesh R commented on HDFS-8183: Thank you [~zhz] for reviewing and committing the changes. Erasure Coding: Improve DFSStripedOutputStream closing of datastreamer threads -- Key: HDFS-8183 URL: https://issues.apache.org/jira/browse/HDFS-8183 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Fix For: HDFS-7285 Attachments: HDFS-8183-001.patch, HDFS-8183-002.patch The idea of this task is to improve the closing of all the streamers. Presently, if any of the streamers throws an exception, it returns immediately, leaving all the other streamer threads running. Instead, it is better to handle the exceptions of each streamer independently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HDFS-8078: Status: Open (was: Patch Available) HDFS client gets errors trying to to connect to IPv6 DataNode - Key: HDFS-8078 URL: https://issues.apache.org/jira/browse/HDFS-8078 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.6.0 Reporter: Nate Edel Assignee: Nate Edel Labels: ipv6 Attachments: HDFS-8078.7.patch 1st exception, on put: 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception java.lang.IllegalArgumentException: Does not contain a valid host:port authority: 2401:db00:1010:70ba:face:0:8:0:50010 at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) Appears to actually stem from code in DataNodeID which assumes it's safe to append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for IPv6. NetUtils.createSocketAddr( ) assembles a Java URI object, which requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 Currently using InetAddress.getByName() to validate IPv6 (guava InetAddresses.forString has been flaky) but could also use our own parsing. 
(From logging this, it seems like a low-enough frequency call that the extra object creation shouldn't be problematic, and for me the slight risk of passing in bad input that is not actually an IPv4 or IPv6 address and thus calling an external DNS lookup is outweighed by getting the address normalized and avoiding rewriting parsing.) Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() --- 2nd exception (on datanode) 15/04/13 13:18:07 ERROR datanode.DataNode: dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: /2401:db00:11:d010:face:0:2f:0:50010 java.io.EOFException at java.io.DataInputStream.readShort(DataInputStream.java:315) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) at java.lang.Thread.run(Thread.java:745) Which also comes as client error -get: 2401 is not an IP string literal. This one has existing parsing logic which needs to shift to the last colon rather than the first. Should also be a tiny bit faster by using lastIndexOf rather than split. Could alternatively use the techniques above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
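Both fixes described above can be sketched with hypothetical helpers (these are not the NetUtils or DataNodeID methods themselves): when building an authority string, an IPv6 literal must be wrapped in brackets because it contains colons itself, and when parsing, the host/port split must use the last colon rather than the first.

```java
// Sketch of the two parsing fixes discussed in this issue (helper names are
// illustrative, not the actual Hadoop code).
class HostPort {
  // Building: bracket IPv6 literals so "host:port" stays unambiguous for URIs.
  static String toAuthority(String ip, int port) {
    String host = ip.contains(":") ? "[" + ip + "]" : ip;
    return host + ":" + port;
  }

  // Parsing: split at the LAST colon; split(":")[0] would truncate an IPv6
  // address to its first group (the "2401 is not an IP string literal" error).
  static String[] splitHostPort(String authority) {
    int i = authority.lastIndexOf(':');
    if (i < 0) throw new IllegalArgumentException("no port in: " + authority);
    String host = authority.substring(0, i);
    if (host.startsWith("[") && host.endsWith("]")) {
      host = host.substring(1, host.length() - 1);  // strip IPv6 brackets
    }
    return new String[] { host, authority.substring(i + 1) };
  }
}
```

lastIndexOf also avoids the per-call array allocation of split, matching the minor performance note above.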
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HDFS-8078: Attachment: (was: HDFS-8078.6.patch) HDFS client gets errors trying to to connect to IPv6 DataNode - Key: HDFS-8078 URL: https://issues.apache.org/jira/browse/HDFS-8078 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.6.0 Reporter: Nate Edel Assignee: Nate Edel Labels: ipv6 Attachments: HDFS-8078.7.patch 1st exception, on put: 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception java.lang.IllegalArgumentException: Does not contain a valid host:port authority: 2401:db00:1010:70ba:face:0:8:0:50010 at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) Appears to actually stem from code in DataNodeID which assumes it's safe to append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for IPv6. NetUtils.createSocketAddr( ) assembles a Java URI object, which requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 Currently using InetAddress.getByName() to validate IPv6 (guava InetAddresses.forString has been flaky) but could also use our own parsing. 
(From logging this, it seems like a low-enough frequency call that the extra object creation shouldn't be problematic, and for me the slight risk of passing in bad input that is not actually an IPv4 or IPv6 address and thus calling an external DNS lookup is outweighed by getting the address normalized and avoiding rewriting parsing.) Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() --- 2nd exception (on datanode) 15/04/13 13:18:07 ERROR datanode.DataNode: dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: /2401:db00:11:d010:face:0:2f:0:50010 java.io.EOFException at java.io.DataInputStream.readShort(DataInputStream.java:315) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) at java.lang.Thread.run(Thread.java:745) Which also comes as client error -get: 2401 is not an IP string literal. This one has existing parsing logic which needs to shift to the last colon rather than the first. Should also be a tiny bit faster by using lastIndexOf rather than split. Could alternatively use the techniques above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HDFS-8078: Status: Patch Available (was: Open) HDFS client gets errors trying to to connect to IPv6 DataNode - Key: HDFS-8078 URL: https://issues.apache.org/jira/browse/HDFS-8078 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.6.0 Reporter: Nate Edel Assignee: Nate Edel Labels: ipv6 Attachments: HDFS-8078.7.patch 1st exception, on put: 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception java.lang.IllegalArgumentException: Does not contain a valid host:port authority: 2401:db00:1010:70ba:face:0:8:0:50010 at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) Appears to actually stem from code in DataNodeID which assumes it's safe to append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for IPv6. NetUtils.createSocketAddr( ) assembles a Java URI object, which requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 Currently using InetAddress.getByName() to validate IPv6 (guava InetAddresses.forString has been flaky) but could also use our own parsing. 
(From logging this, it seems like a low-enough frequency call that the extra object creation shouldn't be problematic, and for me the slight risk of passing in bad input that is not actually an IPv4 or IPv6 address and thus calling an external DNS lookup is outweighed by getting the address normalized and avoiding rewriting parsing.) Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() --- 2nd exception (on datanode) 15/04/13 13:18:07 ERROR datanode.DataNode: dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: /2401:db00:11:d010:face:0:2f:0:50010 java.io.EOFException at java.io.DataInputStream.readShort(DataInputStream.java:315) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) at java.lang.Thread.run(Thread.java:745) Which also comes as client error -get: 2401 is not an IP string literal. This one has existing parsing logic which needs to shift to the last colon rather than the first. Should also be a tiny bit faster by using lastIndexOf rather than split. Could alternatively use the techniques above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
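The two parsing changes discussed above (splitting host:port at the last colon, and bracketing IPv6 literals so that URI-based parsing accepts them) can be sketched as follows. This is a hypothetical standalone helper, not the actual HDFS-8078 patch; the class and method names are illustrative:

```java
import java.net.URI;

// Hypothetical helper, NOT the actual HDFS-8078 patch: split "host:port"
// at the LAST colon so IPv6 literals survive, and bracket bare IPv6
// literals so URI-based parsing (as in NetUtils.createSocketAddr) works.
public class HostPortUtil {

    /** Split "host:port" at the last colon; IPv4, hostnames, and IPv6 all work. */
    public static String[] splitHostPort(String addr) {
        int i = addr.lastIndexOf(':');
        if (i < 0) {
            throw new IllegalArgumentException("no port in: " + addr);
        }
        return new String[] { addr.substring(0, i), addr.substring(i + 1) };
    }

    /** Re-join host and port, bracketing bare IPv6 literals for URI use. */
    public static String toUriAuthority(String host, int port) {
        // A colon in an unbracketed host marks an IPv6 literal.
        if (host.indexOf(':') >= 0 && !host.startsWith("[")) {
            return "[" + host + "]:" + port;
        }
        return host + ":" + port;
    }

    public static void main(String[] args) {
        String[] hp = splitHostPort("2401:db00:1010:70ba:face:0:8:0:50010");
        String authority = toUriAuthority(hp[0], Integer.parseInt(hp[1]));
        // java.net.URI accepts the bracketed form without complaint.
        URI uri = URI.create("hdfs://" + authority);
        System.out.println(uri.getHost() + " port " + uri.getPort());
    }
}
```

Splitting with lastIndexOf keeps the IPv6 address intact and is a touch cheaper than split(); bracketing the literal matches the proto://[addr]:port form that java.net.URI requires.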
[jira] [Updated] (HDFS-8178) QJM doesn't move aside stale inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8178: Summary: QJM doesn't move aside stale inprogress edits files (was: QJM doesn't purge empty and corrupt inprogress edits files) QJM doesn't move aside stale inprogress edits files --- Key: HDFS-8178 URL: https://issues.apache.org/jira/browse/HDFS-8178 Project: Hadoop HDFS Issue Type: Bug Components: qjm Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8178.000.patch When a QJM crashes, the in-progress edit log file at that time remains in the file system. When the node comes back, it will accept new edit logs and those stale in-progress files are never cleaned up. QJM treats them as regular in-progress edit log files and tries to finalize them, which potentially causes high memory usage. This JIRA aims to move aside those stale edit log files to avoid this scenario. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8242: --- Status: Patch Available (was: In Progress) Erasure Coding: XML based end-to-end test for ECCli commands Key: HDFS-8242 URL: https://issues.apache.org/jira/browse/HDFS-8242 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch This JIRA is to add test cases with the CLI test f/w for the commands present in {{ECCli}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-7348) Erasure Coding: striped block recovery
[ https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521026#comment-14521026 ] Yi Liu edited comment on HDFS-7348 at 4/30/15 7:06 AM: --- Thanks Zhe and Bo for further discussion. I will rebase the patch, and make buffer size configurable and add decode part as Zhe's suggestion. For sequential vs. parallel reading, I will file a follow-on and target in phase2. For local read (if the source is local) and local write (if the target is local), you guys can do them as follow-on in your JIRAs and target to Phase2. was (Author: hitliuyi): Thanks Zhe and Bo for further discussion. I will rebase the patch, and make buffer size configurable and add decode part as Zhe's suggestion. For sequential vs. parallel reading, I will file a follow-on and target in phase2. For local read and local write, you guys can do them as follow-on in your JIRAs and target to Phase2. Erasure Coding: striped block recovery -- Key: HDFS-7348 URL: https://issues.apache.org/jira/browse/HDFS-7348 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Kai Zheng Assignee: Yi Liu Attachments: ECWorker.java, HDFS-7348.001.patch This JIRA is to recover one or more missing striped blocks in the striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8295) Add MODIFY and REMOVE ECSchema editlog operations
[ https://issues.apache.org/jira/browse/HDFS-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinwei Qin updated HDFS-8295: -- Issue Type: Sub-task (was: Task) Parent: HDFS-8031 Add MODIFY and REMOVE ECSchema editlog operations - Key: HDFS-8295 URL: https://issues.apache.org/jira/browse/HDFS-8295 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Xinwei Qin Assignee: Xinwei Qin If MODIFY and REMOVE ECSchema operations are supported, then add these editlog operations to persist them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7348) Erasure Coding: striped block recovery
[ https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521026#comment-14521026 ] Yi Liu commented on HDFS-7348: -- Thanks Zhe and Bo for further discussion. I will rebase the patch, and make buffer size configurable and add decode part as Zhe's suggestion. For sequential vs. parallel reading, I will file a follow-on and target in phase2. For local read and local write, you guys can do them as follow-on in your JIRAs and target to Phase2. Erasure Coding: striped block recovery -- Key: HDFS-7348 URL: https://issues.apache.org/jira/browse/HDFS-7348 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Kai Zheng Assignee: Yi Liu Attachments: ECWorker.java, HDFS-7348.001.patch This JIRA is to recover one or more missing striped blocks in the striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7949) WebImageViewer need support file size calculation with striped blocks
[ https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-7949: --- Attachment: HDFS-7949-HDFS-7285.08.patch WebImageViewer need support file size calculation with striped blocks - Key: HDFS-7949 URL: https://issues.apache.org/jira/browse/HDFS-7949 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Hui Zheng Assignee: Rakesh R Priority: Minor Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, HDFS-7949-003.patch, HDFS-7949-004.patch, HDFS-7949-005.patch, HDFS-7949-006.patch, HDFS-7949-007.patch, HDFS-7949-HDFS-7285.08.patch The file size calculation should be changed when the blocks of the file are striped in WebImageViewer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8229) LAZY_PERSIST file gets deleted after NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] surendra singh lilhore updated HDFS-8229: - Attachment: HDFS-8229_2.patch LAZY_PERSIST file gets deleted after NameNode restart. -- Key: HDFS-8229 URL: https://issues.apache.org/jira/browse/HDFS-8229 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Attachments: HDFS-8229.patch, HDFS-8229_1.patch, HDFS-8229_2.patch {code} 2015-04-20 10:26:55,180 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Removing lazyPersist file /LAZY_PERSIST/smallfile with no replicas. {code} After a namenode restart and before the DNs' registration, if {{LazyPersistFileScrubber}} runs it will delete the lazy persist file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8282) Erasure coding: move striped reading logic to StripedBlockUtil
[ https://issues.apache.org/jira/browse/HDFS-8282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14520989#comment-14520989 ] Yi Liu commented on HDFS-8282: -- yes, +1 Erasure coding: move striped reading logic to StripedBlockUtil -- Key: HDFS-8282 URL: https://issues.apache.org/jira/browse/HDFS-8282 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8282-HDFS-7285.00.patch, HDFS-8282-HDFS-7285.01.patch, HDFS-8282-HDFS-7285.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8294) Erasure Coding: Fix Findbug warnings present in erasure coding
[ https://issues.apache.org/jira/browse/HDFS-8294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8294: --- Status: Patch Available (was: Open) Erasure Coding: Fix Findbug warnings present in erasure coding -- Key: HDFS-8294 URL: https://issues.apache.org/jira/browse/HDFS-8294 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8294-HDFS-7285.00.patch Following are the findbug warnings :- # Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) {code} Bug type NP_NULL_ON_SOME_PATH (click for details) In class org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction In method org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) Value loaded from arr$ Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] Known null at BlockInfoStripedUnderConstruction.java:[line 200] {code} # Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes() Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new String(byte[]) {code} Bug type DM_DEFAULT_ENCODING (click for details) In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager In method org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema) Called method String.getBytes() At ErasureCodingZoneManager.java:[line 116] Bug type DM_DEFAULT_ENCODING (click for details) In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager In method org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath) Called method new String(byte[]) At ErasureCodingZoneManager.java:[line 81] 
{code} # Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time {code} Bug type IS2_INCONSISTENT_SYNC (click for details) In class org.apache.hadoop.hdfs.DFSOutputStream Field org.apache.hadoop.hdfs.DFSOutputStream.streamer Synchronized 90% of the time Unsynchronized access at DFSOutputStream.java:[line 142] Unsynchronized access at DFSOutputStream.java:[line 853] Unsynchronized access at DFSOutputStream.java:[line 617] Unsynchronized access at DFSOutputStream.java:[line 620] Unsynchronized access at DFSOutputStream.java:[line 630] Unsynchronized access at DFSOutputStream.java:[line 338] Unsynchronized access at DFSOutputStream.java:[line 734] Unsynchronized access at DFSOutputStream.java:[line 897] {code} # Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() {code} Bug type DLS_DEAD_LOCAL_STORE (click for details) In class org.apache.hadoop.hdfs.StripedDataStreamer In method org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() Local variable named offSuccess At StripedDataStreamer.java:[line 105] {code} # Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() {code} Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) In class org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped In method org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() At BlockInfoStriped.java:[line 208] {code} # Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) {code} Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) In class org.apache.hadoop.hdfs.util.StripedBlockUtil In method org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) At StripedBlockUtil.java:[line 85] {code} # Switch statement found in 
org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRange(long, long, long, byte[], int, Map) where default case is missing {code} Bug type SF_SWITCH_NO_DEFAULT (click for details) In class org.apache.hadoop.hdfs.DFSStripedInputStream In method org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRange(long, long, long, byte[], int, Map) At DFSStripedInputStream.java:[lines 468-491] {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
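Illustrative only (not the actual HDFS-8294 patch): the generic fix pattern for the two ICAST_INTEGER_MULTIPLY_CAST_TO_LONG warnings above is to cast one operand to long before multiplying, so the product is computed in 64-bit arithmetic rather than overflowing as a 32-bit int first. Method names here are made up for the demo:

```java
// Demo of the flagged pattern vs. the fixed pattern; the method names
// are hypothetical, not the real BlockInfoStriped/StripedBlockUtil code.
public class CastDemo {
    // Flagged pattern: the 32-bit multiply overflows, THEN widens to long.
    static long spaceConsumedWrong(int cellSize, int numCells) {
        return cellSize * numCells;
    }

    // Fixed pattern: widen one operand first, multiply in 64 bits.
    static long spaceConsumedRight(int cellSize, int numCells) {
        return (long) cellSize * numCells;
    }

    public static void main(String[] args) {
        int cellSize = 1 << 20;   // 1 MiB cells
        int numCells = 4096;      // product is 2^32, which overflows int
        System.out.println(spaceConsumedWrong(cellSize, numCells));  // prints 0
        System.out.println(spaceConsumedRight(cellSize, numCells));  // prints 4294967296
    }
}
```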
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521029#comment-14521029 ] Xinwei Qin commented on HDFS-7859: --- The 003 patch removes MODIFY and REMOVE ECSchema editlog operations, these operations will be added by another JIRA(HDFS-8295) later when they are supported. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch, HDFS-7859.001.patch, HDFS-7859.002.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7949) WebImageViewer need support file size calculation with striped blocks
[ https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-7949: --- Priority: Major (was: Minor) Target Version/s: HDFS-7285 WebImageViewer need support file size calculation with striped blocks - Key: HDFS-7949 URL: https://issues.apache.org/jira/browse/HDFS-7949 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Hui Zheng Assignee: Rakesh R Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, HDFS-7949-003.patch, HDFS-7949-004.patch, HDFS-7949-005.patch, HDFS-7949-006.patch, HDFS-7949-007.patch, HDFS-7949-HDFS-7285.08.patch The file size calculation should be changed when the blocks of the file are striped in WebImageViewer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8295) Add MODIFY and REMOVE ECSchema editlog operations
Xinwei Qin created HDFS-8295: - Summary: Add MODIFY and REMOVE ECSchema editlog operations Key: HDFS-8295 URL: https://issues.apache.org/jira/browse/HDFS-8295 Project: Hadoop HDFS Issue Type: Task Reporter: Xinwei Qin Assignee: Xinwei Qin If MODIFY and REMOVE ECSchema operations are supported, then add these editlog operations to persist them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8294) Erasure Coding: Fix Findbug warnings present in erasure coding
[ https://issues.apache.org/jira/browse/HDFS-8294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521047#comment-14521047 ] Hadoop QA commented on HDFS-8294: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12729419/HDFS-8294-HDFS-7285.00.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / 5a83838 | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10472/console | This message was automatically generated. Erasure Coding: Fix Findbug warnings present in erasure coding -- Key: HDFS-8294 URL: https://issues.apache.org/jira/browse/HDFS-8294 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8294-HDFS-7285.00.patch Following are the findbug warnings :- # Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) {code} Bug type NP_NULL_ON_SOME_PATH (click for details) In class org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction In method org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) Value loaded from arr$ Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] Known null at BlockInfoStripedUnderConstruction.java:[line 200] {code} # Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes() Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new String(byte[]) {code} Bug type DM_DEFAULT_ENCODING 
(click for details) In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager In method org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema) Called method String.getBytes() At ErasureCodingZoneManager.java:[line 116] Bug type DM_DEFAULT_ENCODING (click for details) In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager In method org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath) Called method new String(byte[]) At ErasureCodingZoneManager.java:[line 81] {code} # Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time {code} Bug type IS2_INCONSISTENT_SYNC (click for details) In class org.apache.hadoop.hdfs.DFSOutputStream Field org.apache.hadoop.hdfs.DFSOutputStream.streamer Synchronized 90% of the time Unsynchronized access at DFSOutputStream.java:[line 142] Unsynchronized access at DFSOutputStream.java:[line 853] Unsynchronized access at DFSOutputStream.java:[line 617] Unsynchronized access at DFSOutputStream.java:[line 620] Unsynchronized access at DFSOutputStream.java:[line 630] Unsynchronized access at DFSOutputStream.java:[line 338] Unsynchronized access at DFSOutputStream.java:[line 734] Unsynchronized access at DFSOutputStream.java:[line 897] {code} # Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() {code} Bug type DLS_DEAD_LOCAL_STORE (click for details) In class org.apache.hadoop.hdfs.StripedDataStreamer In method org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() Local variable named offSuccess At StripedDataStreamer.java:[line 105] {code} # Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() {code} Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) In class org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped In method 
org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() At BlockInfoStriped.java:[line 208] {code} # Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) {code} Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) In class org.apache.hadoop.hdfs.util.StripedBlockUtil In method org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) At StripedBlockUtil.java:[line 85] {code} # Switch statement found in org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRange(long, long, long, byte[], int, Map) where default case is missing
[jira] [Updated] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8242: --- Attachment: HDFS-8242-HDFS-7285.04.patch Attached the previous patch again to see the jenkins report. Erasure Coding: XML based end-to-end test for ECCli commands Key: HDFS-8242 URL: https://issues.apache.org/jira/browse/HDFS-8242 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch This JIRA is to add test cases with the CLI test f/w for the commands present in {{ECCli}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command
[ https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521123#comment-14521123 ] Kai Zheng commented on HDFS-8137: - Uma thanks for the patch and good comments. I'd like to look at this and give my thoughts later today. Sends the EC schema to DataNode as well in EC encoding/recovering command - Key: HDFS-8137 URL: https://issues.apache.org/jira/browse/HDFS-8137 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Uma Maheswara Rao G Attachments: HDFS-8137-0.patch Discussed with [~umamaheswararao] and [~vinayrpet]: we should also send the EC schema to the DataNode, contained in the EC encoding/recovering command. The target DataNode will use it to guide the execution of the task. Another way would be for the DataNode to request the schema actively through a separate RPC call; as an optimization, the DataNode may cache schemas to avoid asking for the same schema twice. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7897) Shutdown metrics when stopping JournalNode
[ https://issues.apache.org/jira/browse/HDFS-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521202#comment-14521202 ] zhouyingchao commented on HDFS-7897: Any updates regarding this simple patch? Shutdown metrics when stopping JournalNode -- Key: HDFS-7897 URL: https://issues.apache.org/jira/browse/HDFS-7897 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: zhouyingchao Assignee: zhouyingchao Attachments: HDFS-7897-001.patch In JournalNode.stop(), the metrics system is not shut down. The issue was found while reading the code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7348) Erasure Coding: striped block recovery
[ https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14520951#comment-14520951 ] Li Bo commented on HDFS-7348: - bq. We can do local writing and local reading logics as follow-on under HDFS-8031. Agree. We can do optimization of write and read logics later. Erasure Coding: striped block recovery -- Key: HDFS-7348 URL: https://issues.apache.org/jira/browse/HDFS-7348 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Kai Zheng Assignee: Yi Liu Attachments: ECWorker.java, HDFS-7348.001.patch This JIRA is to recover one or more missing striped blocks in the striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip
[ https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521004#comment-14521004 ] Akira AJISAKA commented on HDFS-5574: - Looks like jenkins ran the tests in the hadoop-hdfs project with hadoop-common-3.0.0-date.jar, which does not have {{FSInputChecker#readAndDiscard}}. I could reproduce the error with the following commands:
{code}
$ cd hadoop-hdfs-project/hadoop-hdfs
$ mvn test -Dtest=TestDFSInputStream
{code}
Remove buffer copy in BlockReader.skip -- Key: HDFS-5574 URL: https://issues.apache.org/jira/browse/HDFS-5574 Project: Hadoop HDFS Issue Type: Improvement Reporter: Binglin Chang Assignee: Binglin Chang Priority: Trivial Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch BlockReaderLocal.skip and RemoteBlockReader.skip use a temp buffer to read data into, which is not necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8296) BlockManager.getUnderReplicatedBlocksCount() is not giving correct count if namenode in safe mode.
surendra singh lilhore created HDFS-8296: Summary: BlockManager.getUnderReplicatedBlocksCount() is not giving correct count if namenode in safe mode. Key: HDFS-8296 URL: https://issues.apache.org/jira/browse/HDFS-8296 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore {{underReplicatedBlocksCount}} is updated by the {{updateState()}} API:
{code}
void updateState() {
  pendingReplicationBlocksCount = pendingReplications.size();
  underReplicatedBlocksCount = neededReplications.size();
  corruptReplicaBlocksCount = corruptReplicas.size();
}
{code}
but this is never called while the NN is in safe mode, because {{computeDatanodeWork()}} returns 0 before reaching it:
{code}
int computeDatanodeWork() {
  .
  if (namesystem.isInSafeMode()) {
    return 0;
  }
  this.updateState();
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
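A minimal sketch of the fix direction this report suggests, using a simplified stand-in class (NOT the real BlockManager): call updateState() before the safe-mode early return, so the counters stay current even though no replication work is scheduled:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in, NOT the real BlockManager: demonstrates moving the
// updateState() call ahead of the safe-mode early return so the
// under-replicated count stays correct while the NN is in safe mode.
public class BlockManagerSketch {
    final List<Object> pendingReplications = new ArrayList<>();
    final List<Object> neededReplications = new ArrayList<>();
    final List<Object> corruptReplicas = new ArrayList<>();

    long pendingReplicationBlocksCount;
    long underReplicatedBlocksCount;
    long corruptReplicaBlocksCount;

    boolean inSafeMode;

    void updateState() {
        pendingReplicationBlocksCount = pendingReplications.size();
        underReplicatedBlocksCount = neededReplications.size();
        corruptReplicaBlocksCount = corruptReplicas.size();
    }

    int computeDatanodeWork() {
        updateState();          // moved before the safe-mode check
        if (inSafeMode) {
            return 0;           // still schedule no replication work
        }
        // ... schedule replication/invalidation work here ...
        return (int) underReplicatedBlocksCount;
    }

    long getUnderReplicatedBlocksCount() {
        return underReplicatedBlocksCount;
    }
}
```

With this ordering, getUnderReplicatedBlocksCount() reflects neededReplications even when computeDatanodeWork() exits early in safe mode.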
[jira] [Commented] (HDFS-8161) Both Namenodes are in standby State
[ https://issues.apache.org/jira/browse/HDFS-8161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521064#comment-14521064 ] Brahma Reddy Battula commented on HDFS-8161: [~vinayrpet], [~jnp] and [~arpitagarwal], any thoughts on this? As there is no checksum verification on the ZK side, and no one seems interested in a checksum feature there (since I have not seen any comment in ZOOKEEPER-2175), can we have some mechanism here? Both Namenodes are in standby State --- Key: HDFS-8161 URL: https://issues.apache.org/jira/browse/HDFS-8161 Project: Hadoop HDFS Issue Type: Bug Components: auto-failover Affects Versions: 2.6.0 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Attachments: ACTIVEBreadcumb and StandbyElector.txt Suspected scenario: Start a cluster with three nodes. Reboot the machine where ZKFC is not running (here the active NN's ZKFC should open a session with this ZK). Now the active NN's ZKFC session expires and it tries to re-establish a connection with another ZK. By that time the standby NN's ZKFC will try to fence the old active, create the active breadcrumb, and make the SNN active. But immediately it is fenced back to standby state (here is the doubt). Hence both will be in standby state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8295) Add MODIFY and REMOVE ECSchema editlog operations
[ https://issues.apache.org/jira/browse/HDFS-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinwei Qin updated HDFS-8295: -- Attachment: HDFS-8295.001.patch An initial patch based on HDFS-7859. Add MODIFY and REMOVE ECSchema editlog operations - Key: HDFS-8295 URL: https://issues.apache.org/jira/browse/HDFS-8295 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Xinwei Qin Assignee: Xinwei Qin Attachments: HDFS-8295.001.patch If MODIFY and REMOVE ECSchema operations are supported, then add these editlog operations to persist them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8229) LAZY_PERSIST file gets deleted after NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] surendra singh lilhore updated HDFS-8229: - Status: Patch Available (was: Open) LAZY_PERSIST file gets deleted after NameNode restart. -- Key: HDFS-8229 URL: https://issues.apache.org/jira/browse/HDFS-8229 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Attachments: HDFS-8229.patch, HDFS-8229_1.patch, HDFS-8229_2.patch {code} 2015-04-20 10:26:55,180 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Removing lazyPersist file /LAZY_PERSIST/smallfile with no replicas. {code} After a namenode restart and before the DNs' registration, if {{LazyPersistFileScrubber}} runs it will delete the lazy persist file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8229) LAZY_PERSIST file gets deleted after NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521145#comment-14521145 ] surendra singh lilhore commented on HDFS-8229: -- Attached a new patch, please review. In the test case I am not using the {{getCorruptReplicaBlocksCount()}} API for counting corrupt blocks because of [HDFS-8296|https://issues.apache.org/jira/browse/HDFS-8296]. LAZY_PERSIST file gets deleted after NameNode restart. -- Key: HDFS-8229 URL: https://issues.apache.org/jira/browse/HDFS-8229 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Attachments: HDFS-8229.patch, HDFS-8229_1.patch, HDFS-8229_2.patch {code} 2015-04-20 10:26:55,180 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Removing lazyPersist file /LAZY_PERSIST/smallfile with no replicas. {code} After a namenode restart and before the DNs' registration, if {{LazyPersistFileScrubber}} runs it will delete the lazy persist file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir
[ https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-7770: Resolution: Fixed Fix Version/s: 2.7.1 2.8.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed this to trunk, branch-2, and branch-2.7. Thanks [~xyao] for the contribution. Need document for storage type label of data node storage locations under dfs.data.dir -- Key: HDFS-7770 URL: https://issues.apache.org/jira/browse/HDFS-7770 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.8.0, 2.7.1 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, HDFS-7770.02.patch HDFS-2832 enables support for heterogeneous storages in HDFS, which allows a DN to be a collection of storages with different types. However, I can't find documentation on how to label different storage types in the following two documents; I found the information in the design spec. It would be good to document this for admins and users of the related archival storage and storage policy features. http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml This JIRA is opened to add documentation for the new storage type labels. 1. Add an example under the ArchivalStorage.html#Configuration section:
{code}
<property>
  <name>dfs.data.dir</name>
  <value>[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
</property>
{code}
2. Add a short description of the [DISK/SSD/ARCHIVE/RAM_DISK] options in hdfs-default.xml#dfs.data.dir, and document DISK as the storage type used if no storage type is labeled in the data node storage location configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8297) Ability to online trigger data dir rescan for blocks
Hari Sekhon created HDFS-8297: - Summary: Ability to online trigger data dir rescan for blocks Key: HDFS-8297 URL: https://issues.apache.org/jira/browse/HDFS-8297 Project: Hadoop HDFS Issue Type: New Feature Components: datanode Affects Versions: 2.6.0 Environment: HDP 2.2 Reporter: Hari Sekhon Feature request to add functionality to trigger a data dir rescan for available blocks online, without having to restart the datanode. The motivation: when using HDFS storage tiering with an archive tier on a separate hyperscale storage device over the network (Hedvig in this case), the device may go away and then return due to, say, a network interruption or other temporary error. This leaves HDFS fsck declaring missing blocks that are clearly visible on the mount point of the node's archive directory. An online trigger for a data dir rescan for available blocks would avoid having to do a rolling restart of all datanodes across a cluster. I did try sending a kill -HUP to the datanode process (both the SecureDataNodeStarter parent and child) while tailing the log, hoping this might do it, but nothing happened in the log. Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip
[ https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521253#comment-14521253 ] Akira AJISAKA commented on HDFS-5574: - +1, I ran the failed tests locally and all the tests passed. Committing this. Remove buffer copy in BlockReader.skip -- Key: HDFS-5574 URL: https://issues.apache.org/jira/browse/HDFS-5574 Project: Hadoop HDFS Issue Type: Improvement Reporter: Binglin Chang Assignee: Binglin Chang Priority: Trivial Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch BlockReaderLocal.skip and RemoteBlockReader.skip uses a temp buffer to read data to this buffer, it is not necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
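Since several messages in this thread revolve around the temp-buffer copy being removed here, a minimal standalone sketch of the idea may help (plain java.io streams, not the actual Hadoop {{BlockReader}} API): the old pattern copies every skipped byte into a throwaway buffer, while a position-based skip simply advances the offset without copying.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SkipDemo {
    // Old pattern: "skip" by reading the bytes into a temp buffer and
    // discarding them -- every skipped byte is still copied once.
    static long skipWithCopy(InputStream in, long n) throws IOException {
        byte[] tmp = new byte[512];
        long remaining = n;
        while (remaining > 0) {
            int read = in.read(tmp, 0, (int) Math.min(tmp.length, remaining));
            if (read < 0) break; // end of stream
            remaining -= read;
        }
        return n - remaining; // bytes actually skipped
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[4096];
        InputStream copying = new ByteArrayInputStream(data);
        InputStream direct = new ByteArrayInputStream(data);
        long a = skipWithCopy(copying, 1000);
        long b = direct.skip(1000); // position advances, nothing is copied
        System.out.println(a + " " + b);
    }
}
```

Both calls report 1000 bytes skipped; the difference is that the second one never touches the skipped data, which is the kind of copy this patch eliminates.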
[jira] [Commented] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521306#comment-14521306 ] Hadoop QA commented on HDFS-8242: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 43s | Pre-patch HDFS-7285 compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 5 new or modified test files. | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. | | {color:green}+1{color} | javac | 7m 38s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 58s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 5m 41s | There were no new checkstyle issues. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 13s | The patch appears to introduce 9 new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 16s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 88m 18s | Tests failed in hadoop-hdfs. 
| | | | 135m 14s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | | Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time Unsynchronized access at DFSOutputStream.java:90% of time Unsynchronized access at DFSOutputStream.java:[line 142] | | | Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from an Exception, even though it is named as such At DataStreamer.java:from an Exception, even though it is named as such At DataStreamer.java:[lines 177-201] | | | Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() At StripedDataStreamer.java:org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() At StripedDataStreamer.java:[line 105] | | | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() At BlockInfoStriped.java:to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() At BlockInfoStriped.java:[line 208] | | | Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) Dereferenced at BlockInfoStripedUnderConstruction.java:arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] | | | Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema):in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes() At ErasureCodingZoneManager.java:[line 116] | | | Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath):in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new 
String(byte[]) At ErasureCodingZoneManager.java:[line 81] | | | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) At StripedBlockUtil.java:to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) At StripedBlockUtil.java:[line 85] | | | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.planReadPortions(int, int, long, int, int) At StripedBlockUtil.java:to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.planReadPortions(int, int, long, int, int) At StripedBlockUtil.java:[line 167] | | Failed unit tests | hadoop.hdfs.TestDFSClientRetries | | | hadoop.hdfs.TestClose | | | hadoop.hdfs.TestDFSOutputStream | | | hadoop.hdfs.TestCrcCorruption | | | hadoop.hdfs.TestFileLengthOnClusterRestart | | | hadoop.hdfs.TestQuota | | | hadoop.hdfs.TestMultiThreadedHflush | | Timed out tests | org.apache.hadoop.hdfs.TestDataTransferProtocol | | | org.apache.hadoop.hdfs.TestSetrepIncreasing | | | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12729426/HDFS-8242-HDFS-7285.04.patch | | Optional Tests
[jira] [Commented] (HDFS-7687) Change fsck to support EC files
[ https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521322#comment-14521322 ] Takanobu Asanuma commented on HDFS-7687: Thank you for the information. Change fsck to support EC files --- Key: HDFS-7687 URL: https://issues.apache.org/jira/browse/HDFS-7687 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo Nicholas Sze Assignee: Takanobu Asanuma We need to change fsck so that it can detect under replicated and corrupted EC files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir
[ https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521323#comment-14521323 ] Akira AJISAKA commented on HDFS-7770: - Thanks [~xyao] for updating the patch. LGTM, +1. bq. I think we can address that in a separate JIRA. Agree, let's create a jira for this. Need document for storage type label of data node storage locations under dfs.data.dir -- Key: HDFS-7770 URL: https://issues.apache.org/jira/browse/HDFS-7770 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, HDFS-7770.02.patch HDFS-2832 enables support for heterogeneous storages in HDFS, which allows a DN to manage a collection of storages with different types. However, I can't find documentation on how to label the different storage types in the following two documents; I found the information in the design spec. It would be good to document this for admins and users of the related archival storage and storage policy features. http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml This JIRA is opened to add documentation for the new storage type labels. 1. Add an example under ArchivalStorage.html#Configuration section:
{code}
<property>
  <name>dfs.data.dir</name>
  <value>[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
</property>
{code}
2. Add a short description of the [DISK/SSD/ARCHIVE/RAM_DISK] options in hdfs-default.xml#dfs.data.dir, and document DISK as the default storage type when no storage type is labeled in the data node storage location configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5574) Remove buffer copy in BlockReader.skip
[ https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-5574: Resolution: Fixed Fix Version/s: 2.8.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed this to trunk and branch-2. Thanks [~decster] for the contribution! Remove buffer copy in BlockReader.skip -- Key: HDFS-5574 URL: https://issues.apache.org/jira/browse/HDFS-5574 Project: Hadoop HDFS Issue Type: Improvement Reporter: Binglin Chang Assignee: Binglin Chang Priority: Trivial Fix For: 2.8.0 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch BlockReaderLocal.skip and RemoteBlockReader.skip uses a temp buffer to read data to this buffer, it is not necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7687) Change fsck to support EC files
[ https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takanobu Asanuma updated HDFS-7687: --- Attachment: HDFS-7687.1.patch I created an initial patch. The main changes in this patch are below. # I separated {{collect\[File|Block\]Summary}} into {{collectReplicated\[File|Block\]Summary}} and {{collectEC\[File|Block\]Summary}}. # I named or renamed some variables and outputs. For example, {{ReplicatedBlocks}} becomes {{ECBlockGroups}} in EC, and {{Replication}} or {{Replicas}} becomes {{ECBlocks}} in EC. # I added EC summaries to Result#toString. Would you please review this patch? I'm going to add some tests for this code. Change fsck to support EC files --- Key: HDFS-7687 URL: https://issues.apache.org/jira/browse/HDFS-7687 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo Nicholas Sze Assignee: Takanobu Asanuma Attachments: HDFS-7687.1.patch We need to change fsck so that it can detect under replicated and corrupted EC files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7578) NFS WRITE and COMMIT responses should always use the channel pipeline
[ https://issues.apache.org/jira/browse/HDFS-7578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521467#comment-14521467 ] Allen Wittenauer commented on HDFS-7578: bq. It's also strange that my comment triggered Jenkins. Is that expected with the new test script? Yup. That part of the pipeline is before test-patch.sh. It's always been that way. NFS WRITE and COMMIT responses should always use the channel pipeline - Key: HDFS-7578 URL: https://issues.apache.org/jira/browse/HDFS-7578 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.7.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-7578.001.patch, HDFS-7578.002.patch Write and Commit responses directly write data to the channel instead of propagating it to the next immediate handler in the channel pipeline. Not following Netty channel pipeline model could be problematic. We don't know whether it could cause any resource leak or performance issue especially the internal pipeline implementation keeps changing with newer Netty releases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8299) HDFS reporting missing blocks when they are actually present due to read-only filesystem
[ https://issues.apache.org/jira/browse/HDFS-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HDFS-8299: -- Description: Fsck shows missing blocks when the blocks can be found on a datanode's filesystem and the datanode has been restarted to try to get it to recognize that the blocks are indeed present and hence report them to the NameNode in a block report. Fsck output showing an example missing block:
{code}
/apps/hive/warehouse/custom_scrubbed.db/someTable/00_0: CORRUPT blockpool BP-120244285-ip-1417023863606 block blk_1075202330
 MISSING 1 blocks of total size 3260848 B
0. BP-120244285-ip-1417023863606:blk_1075202330_1484191 len=3260848 MISSING!
{code}
The block is definitely present on more than one datanode, however; here is the output from one of them that I restarted to try to get it to report the block to the NameNode:
{code}
# ll /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330*
-rw-r--r-- 1 hdfs 499 3260848 Apr 27 15:02 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330
-rw-r--r-- 1 hdfs 499 25483 Apr 27 15:02 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330_1484191.meta
{code}
It's worth noting that this is on HDFS tiered storage, with an archive tier going to a networked block device that may have become temporarily unavailable but is available now. See also feature request HDFS-8297 for an online rescan, to avoid having to go around restarting datanodes. It turns out from the datanode log (attached) that this is because the datanode fails to get a write lock on the filesystem. I think it would be better to still serve those blocks read-only, however, since the current behavior causes client-visible data unavailability when the data could in fact be read.
{code}
2015-04-30 14:11:08,235 WARN datanode.DataNode (DataNode.java:checkStorageLocations(2284)) - Invalid dfs.datanode.data.dir /archive1/dn :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not writable: /archive1/dn
	at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:193)
	at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
	at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:157)
	at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2239)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2281)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2263)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2155)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378)
	at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:78)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
{code}
Hari Sekhon http://www.linkedin.com/in/harisekhon HDFS reporting missing blocks when they are actually present due to read-only filesystem Key: HDFS-8299 URL: https://issues.apache.org/jira/browse/HDFS-8299 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Environment: HDP 2.2 Reporter: Hari Sekhon Priority: Critical Attachments: datanode.log -- This message was sent by Atlassian JIRA (v6.3.4#6332)
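For readers wondering why a read-only mount fails the DataNode's startup check even though the block files remain readable, here is a minimal, hypothetical sketch of a DiskChecker-style write probe (simplified; the real {{DiskChecker}} also verifies read and execute access, and this is not the actual Hadoop API):

```java
import java.io.File;
import java.io.IOException;

public class DirAccessDemo {
    // Simplified stand-in for a write-access check: try to create and remove
    // a probe file. On a read-only mount the create fails, so the whole data
    // dir is rejected -- even though existing block files could still be read.
    static boolean isWritable(File dir) {
        File probe = new File(dir, ".probe-" + System.nanoTime());
        try {
            boolean created = probe.createNewFile();
            if (created) {
                probe.delete(); // clean up the probe file
            }
            return created;
        } catch (IOException e) {
            return false; // e.g. EROFS on a read-only filesystem
        }
    }

    public static void main(String[] args) {
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        System.out.println("writable: " + isWritable(tmp));
    }
}
```

This is why the reporter suggests serving such blocks read-only instead of failing the directory outright: the probe tests writability, not readability.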
[jira] [Updated] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8242: --- Attachment: (was: HDFS-8242-HDFS-7285.05.patch) Erasure Coding: XML based end-to-end test for ECCli commands Key: HDFS-8242 URL: https://issues.apache.org/jira/browse/HDFS-8242 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch This JIRA to add test cases with CLI test f/w for the commands present in {{ECCli}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip
[ https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521407#comment-14521407 ] Hudson commented on HDFS-5574: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #170 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/170/]) HDFS-5574. Remove buffer copy in BlockReader.skip. Contributed by Binglin Chang. (aajisaka: rev e89fc53a1d264fde407dd2c36defab5241cd0b52) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderBase.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputChecker.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader2.java Remove buffer copy in BlockReader.skip -- Key: HDFS-5574 URL: https://issues.apache.org/jira/browse/HDFS-5574 Project: Hadoop HDFS Issue Type: Improvement Reporter: Binglin Chang Assignee: Binglin Chang Priority: Trivial Fix For: 2.8.0 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch BlockReaderLocal.skip and RemoteBlockReader.skip uses a temp buffer to read data to this buffer, it is not necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8242: --- Attachment: HDFS-8242-HDFS-7285.05.patch Erasure Coding: XML based end-to-end test for ECCli commands Key: HDFS-8242 URL: https://issues.apache.org/jira/browse/HDFS-8242 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch This JIRA to add test cases with CLI test f/w for the commands present in {{ECCli}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8269) getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime
[ https://issues.apache.org/jira/browse/HDFS-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521401#comment-14521401 ] Hudson commented on HDFS-8269: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #170 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/170/]) HDFS-8269. getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime. Contributed by Haohui Mai. (wheat9: rev 3dd6395bb2448e5b178a51c864e3c9a3d12e8bc9) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGetBlockLocations.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime - Key: HDFS-8269 URL: https://issues.apache.org/jira/browse/HDFS-8269 Project: Hadoop HDFS Issue Type: Bug Reporter: Yesha Vora Assignee: Haohui Mai Priority: Blocker Fix For: 2.7.1 Attachments: HDFS-8269.000.patch, HDFS-8269.001.patch, HDFS-8269.002.patch, HDFS-8269.003.patch When {{FSNamesystem#getBlockLocations}} updates the access time of the INode, it uses the path passed from the client, which generates incorrect edit log entries:
{noformat}
<RECORD>
  <OPCODE>OP_TIMES</OPCODE>
  <DATA>
    <TXID>5085</TXID>
    <LENGTH>0</LENGTH>
    <PATH>/.reserved/.inodes/18230</PATH>
    <MTIME>-1</MTIME>
    <ATIME>1429908236392</ATIME>
  </DATA>
</RECORD>
{noformat}
Note that the NN does not resolve the {{/.reserved}} path when processing the edit log, therefore it eventually leads to an NPE when loading the edit logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
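A hypothetical sketch of the fix's idea (the map and names below are illustrative stand-ins, not the real {{FSNamesystem}} internals): resolve a {{/.reserved/.inodes/<id>}} alias to the regular path before recording it, so the edit-log loader never sees a {{/.reserved}} path.

```java
import java.util.HashMap;
import java.util.Map;

public class ReservedPathDemo {
    // Illustrative inode-id -> path table; in the NameNode this lookup is
    // done against the in-memory namespace, not a plain map.
    static final Map<Long, String> INODE_TO_PATH = new HashMap<>();

    // Resolve a /.reserved/.inodes/<id> alias to the regular path; any other
    // path is returned unchanged.
    static String resolve(String path) {
        String prefix = "/.reserved/.inodes/";
        if (path.startsWith(prefix)) {
            long id = Long.parseLong(path.substring(prefix.length()));
            String real = INODE_TO_PATH.get(id);
            if (real != null) return real;
        }
        return path;
    }

    public static void main(String[] args) {
        // Hypothetical path for inode 18230 (the id from the bad edit-log
        // record above); the real target path is whatever the inode names.
        INODE_TO_PATH.put(18230L, "/user/alice/data.txt");
        System.out.println(resolve("/.reserved/.inodes/18230"));
        System.out.println(resolve("/user/alice/data.txt"));
    }
}
```

Logging the resolved path keeps OP_TIMES records replayable, which is exactly what the committed patch achieves in {{FSNamesystem#getBlockLocations}}.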
[jira] [Commented] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
[ https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521409#comment-14521409 ] Hudson commented on HDFS-8214: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #170 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/170/]) HDFS-8214. Secondary NN Web UI shows wrong date for Last Checkpoint. Contributed by Charles Lamb. (wang: rev aa22450442ebe39916a6fd460fe97e347945526d) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/dfs-dust.js * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/status.html * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNodeInfoMXBean.java Secondary NN Web UI shows wrong date for Last Checkpoint Key: HDFS-8214 URL: https://issues.apache.org/jira/browse/HDFS-8214 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, namenode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Fix For: 2.8.0 Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch, HDFS-8214.003.patch SecondaryNamenode is using Time.monotonicNow() to display Last Checkpoint in the web UI. This causes weird times, generally, just after the epoch, to be displayed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
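To see why a monotonic clock value renders as a bogus date, here is a self-contained comparison (standard JDK only; on typical JVMs the monotonic value, whose origin is arbitrary, formats as a date not long after the epoch, matching the "weird times" described in this issue):

```java
import java.util.Date;

public class ClockDemo {
    public static void main(String[] args) {
        // Wall-clock time: milliseconds since the Unix epoch -- valid as a date.
        long wall = System.currentTimeMillis();
        // Monotonic time: measures elapsed intervals from an arbitrary origin
        // (often process/OS start), so it is meaningless as a calendar date.
        long mono = System.nanoTime() / 1_000_000L;
        System.out.println("wall interpreted as a date: " + new Date(wall));
        System.out.println("mono interpreted as a date: " + new Date(mono));
        // Rule of thumb: monotonic time for measuring durations, wall-clock
        // time for timestamps that will be displayed -- as in this fix.
    }
}
```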
[jira] [Commented] (HDFS-8283) DataStreamer cleanup and some minor improvement
[ https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521408#comment-14521408 ] Hudson commented on HDFS-8283: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #170 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/170/]) HDFS-8283. DataStreamer cleanup and some minor improvement. Contributed by Tsz Wo Nicholas Sze. (jing9: rev 7947e5b53b9ac9524b535b0384c1c355b74723ff) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/MultipleIOException.java DataStreamer cleanup and some minor improvement --- Key: HDFS-8283 URL: https://issues.apache.org/jira/browse/HDFS-8283 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 2.8.0 Attachments: h8283_20150428.patch - When throwing an exception -* always set lastException -* always creating a new exception so that it has the new stack trace - Add LOG. - Add final to isAppend and favoredNodes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8242: --- Attachment: HDFS-8242-HDFS-7285.05.patch Erasure Coding: XML based end-to-end test for ECCli commands Key: HDFS-8242 URL: https://issues.apache.org/jira/browse/HDFS-8242 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch, HDFS-8242-HDFS-7285.05.patch This JIRA to add test cases with CLI test f/w for the commands present in {{ECCli}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8269) getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime
[ https://issues.apache.org/jira/browse/HDFS-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521446#comment-14521446 ] Hudson commented on HDFS-8269: -- FAILURE: Integrated in Hadoop-Yarn-trunk #913 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/913/]) HDFS-8269. getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime. Contributed by Haohui Mai. (wheat9: rev 3dd6395bb2448e5b178a51c864e3c9a3d12e8bc9) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGetBlockLocations.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime - Key: HDFS-8269 URL: https://issues.apache.org/jira/browse/HDFS-8269 Project: Hadoop HDFS Issue Type: Bug Reporter: Yesha Vora Assignee: Haohui Mai Priority: Blocker Fix For: 2.7.1 Attachments: HDFS-8269.000.patch, HDFS-8269.001.patch, HDFS-8269.002.patch, HDFS-8269.003.patch When {{FSNamesystem#getBlockLocations}} updates the access time of the INode, it uses the path passed from the client, which generates incorrect edit log entries:
{noformat}
<RECORD>
  <OPCODE>OP_TIMES</OPCODE>
  <DATA>
    <TXID>5085</TXID>
    <LENGTH>0</LENGTH>
    <PATH>/.reserved/.inodes/18230</PATH>
    <MTIME>-1</MTIME>
    <ATIME>1429908236392</ATIME>
  </DATA>
</RECORD>
{noformat}
Note that the NN does not resolve the {{/.reserved}} path when processing the edit log, therefore it eventually leads to an NPE when loading the edit logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
[ https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521455#comment-14521455 ] Hudson commented on HDFS-8214: -- FAILURE: Integrated in Hadoop-Yarn-trunk #913 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/913/]) HDFS-8214. Secondary NN Web UI shows wrong date for Last Checkpoint. Contributed by Charles Lamb. (wang: rev aa22450442ebe39916a6fd460fe97e347945526d) * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/dfs-dust.js * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/status.html * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNodeInfoMXBean.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java Secondary NN Web UI shows wrong date for Last Checkpoint Key: HDFS-8214 URL: https://issues.apache.org/jira/browse/HDFS-8214 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, namenode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Fix For: 2.8.0 Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch, HDFS-8214.003.patch SecondaryNamenode is using Time.monotonicNow() to display Last Checkpoint in the web UI. This causes weird times, generally, just after the epoch, to be displayed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip
[ https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521453#comment-14521453 ] Hudson commented on HDFS-5574: -- FAILURE: Integrated in Hadoop-Yarn-trunk #913 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/913/]) HDFS-5574. Remove buffer copy in BlockReader.skip. Contributed by Binglin Chang. (aajisaka: rev e89fc53a1d264fde407dd2c36defab5241cd0b52) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInputStream.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputChecker.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderBase.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader2.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java Remove buffer copy in BlockReader.skip -- Key: HDFS-5574 URL: https://issues.apache.org/jira/browse/HDFS-5574 Project: Hadoop HDFS Issue Type: Improvement Reporter: Binglin Chang Assignee: Binglin Chang Priority: Trivial Fix For: 2.8.0 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch BlockReaderLocal.skip and RemoteBlockReader.skip uses a temp buffer to read data to this buffer, it is not necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command
[ https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521460#comment-14521460 ] Kai Zheng commented on HDFS-8137: - Hi Uma, bq.We supposed to get schema values from ECSchemaManager, but right now I don't see a better way to get from ECScheaManeger, so I added an API to get from BlockCollection itself like isStriped API in it. {{ECSchemaManager}} might not be supposed to get a schema associated with a zone, dir/file, but {{ErasureCodingZoneManager}} may do. We could query the schema info from a zone using ErasureCodingZoneManager. I thought it's good to add the method {{getECSchema}} along with the existing method {{isStriped}}, as it's essential to erasure coded files. A quick look at the patch found it might need to align with some latest changes, regarding how to get schema from a zone/dir/xAttr, would you double check? Thanks. Sends the EC schema to DataNode as well in EC encoding/recovering command - Key: HDFS-8137 URL: https://issues.apache.org/jira/browse/HDFS-8137 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Uma Maheswara Rao G Attachments: HDFS-8137-0.patch Discussed with [~umamaheswararao] and [~vinayrpet], we should also send the EC schema to DataNode as well contained in the EC encoding/recovering command. The target DataNode will use it to guide the executing of the task. Another way would be, DataNode would just request schema actively thru a separate RPC call, and as an optimization consideration, DataNode may cache schemas to avoid repeatedly asking for the same schema twice. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
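The design point in this thread, and in HDFS-7859 above, is that zones reference centrally persisted schemas by name, so schema lookup for a file goes through the zone covering it. A toy model of that indirection, in which everything except the JIRA-mentioned concepts (schema registry, zone manager) is hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

public class EcZoneLookupSketch {
    // Hypothetical stand-in for the persisted schema registry (cf. ECSchemaManager):
    // schema name -> schema description.
    static final Map<String, String> SCHEMAS = new HashMap<>();
    // Hypothetical stand-in for the zone mapping (cf. ErasureCodingZoneManager):
    // zones store only the schema *name*, not the schema itself.
    static final Map<String, String> ZONES = new HashMap<>();

    static String schemaForPath(String path) {
        // Longest-prefix match: a file inherits the schema of its enclosing zone.
        String best = null;
        for (String zone : ZONES.keySet()) {
            if (path.startsWith(zone) && (best == null || zone.length() > best.length())) {
                best = zone;
            }
        }
        return best == null ? null : SCHEMAS.get(ZONES.get(best));
    }

    public static void main(String[] args) {
        SCHEMAS.put("RS-6-3", "ReedSolomon k=6 m=3");
        ZONES.put("/ec/", "RS-6-3");
        System.out.println(schemaForPath("/ec/warehouse/part-0")); // in the zone
        System.out.println(schemaForPath("/plain/file"));          // no zone -> null
    }
}
```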
[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip
[ https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521357#comment-14521357 ] Hudson commented on HDFS-5574: -- FAILURE: Integrated in Hadoop-trunk-Commit #7705 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7705/]) HDFS-5574. Remove buffer copy in BlockReader.skip. Contributed by Binglin Chang. (aajisaka: rev e89fc53a1d264fde407dd2c36defab5241cd0b52) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderBase.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputChecker.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader2.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java Remove buffer copy in BlockReader.skip -- Key: HDFS-5574 URL: https://issues.apache.org/jira/browse/HDFS-5574 Project: Hadoop HDFS Issue Type: Improvement Reporter: Binglin Chang Assignee: Binglin Chang Priority: Trivial Fix For: 2.8.0 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch BlockReaderLocal.skip and RemoteBlockReader.skip uses a temp buffer to read data to this buffer, it is not necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521376#comment-14521376 ] Hadoop QA commented on HDFS-7859: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 37s | Pre-patch HDFS-7285 compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 40s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:red}-1{color} | checkstyle | 7m 48s | The applied patch generated 10 additional checkstyle issues. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 11s | The patch appears to introduce 9 new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 15s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 239m 34s | Tests failed in hadoop-hdfs. 
| | | | 288m 5s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | | Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time. Unsynchronized access at DFSOutputStream.java:[line 142] | | | Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from an Exception, even though it is named as such. At DataStreamer.java:[lines 177-201] | | | Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock(). At StripedDataStreamer.java:[line 105] | | | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed(). At BlockInfoStriped.java:[line 208] | | | Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long). Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] | | | Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes(). At ErasureCodingZoneManager.java:[line 116] | | | Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new String(byte[]). At ErasureCodingZoneManager.java:[line 81] | | | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int). At StripedBlockUtil.java:[line 85] | | | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.planReadPortions(int, int, long, int, int). At StripedBlockUtil.java:[line 167] | | Failed unit tests | hadoop.hdfs.server.namenode.TestMetadataVersionOutput | | | hadoop.hdfs.TestDFSClientRetries | | | hadoop.hdfs.server.namenode.TestCheckpoint | | | hadoop.hdfs.TestDFSOutputStream | | | hadoop.hdfs.TestDFSRollback | | | hadoop.hdfs.server.namenode.TestCreateEditsLog | | | hadoop.hdfs.protocol.TestLayoutVersion | | | hadoop.hdfs.TestDFSFinalize | | | hadoop.hdfs.server.namenode.TestDeleteRace | | |
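Several of the FindBugs items above are the same pattern: "Result of integer multiplication cast to long". The multiplication happens in 32-bit arithmetic before widening, so it can silently overflow; the fix is to widen an operand first. The variable names below are illustrative, not taken from the patch:

```java
public class IntMulOverflow {
    public static void main(String[] args) {
        int cellSize = 1 << 20; // e.g. a 1 MiB striping cell
        int numCells = 4096;

        long wrong = cellSize * numCells;        // multiplies as int, overflows to 0, then widens
        long right = (long) cellSize * numCells; // widens first, then multiplies: 4294967296

        System.out.println(wrong);
        System.out.println(right);
    }
}
```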
[jira] [Updated] (HDFS-8299) HDFS reporting missing blocks when they are actually present due to read-only filesystem
[ https://issues.apache.org/jira/browse/HDFS-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HDFS-8299: -- Attachment: datanode.log HDFS reporting missing blocks when they are actually present due to read-only filesystem Key: HDFS-8299 URL: https://issues.apache.org/jira/browse/HDFS-8299 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Environment: Fsck shows missing blocks when the blocks can be found on a datanode's filesystem and the datanode has been restarted to try to get it to recognize that the blocks are indeed present and hence report them to the NameNode in a block report. Fsck output showing an example missing block: {code}/apps/hive/warehouse/custom_scrubbed.db/someTable/00_0: CORRUPT blockpool BP-120244285-ip-1417023863606 block blk_1075202330 MISSING 1 blocks of total size 3260848 B 0. BP-120244285-ip-1417023863606:blk_1075202330_1484191 len=3260848 MISSING!{code} The block is definitely present on more than one datanode however, here is the output from one of them that I restarted to try to get it to report the block to the NameNode: {code}# ll /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330* -rw-r--r-- 1 hdfs 499 3260848 Apr 27 15:02 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330 -rw-r--r-- 1 hdfs 499 25483 Apr 27 15:02 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330_1484191.meta{code} It's worth noting that this is on HDFS tiered storage on an archive tier going to a networked block device that may have become temporarily unavailable but is available now. See also feature request HDFS-8297 for online rescan to not have to go around restarting datanodes. It turns out in the datanode log (that I am attaching) this is because the datanode fails to get a write lock on the filesystem. 
I think it would be better to be able to read-only those blocks however, since this way causes client visible data unavailability when the data could in fact be read. {code}2015-04-30 14:11:08,235 WARN datanode.DataNode (DataNode.java:checkStorageLocations(2284)) - Invalid dfs.datanode.data.dir /archive1/dn : org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not writable: /archive1/dn at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:193) at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174) at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:157) at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2239) at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2281) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2263) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2155) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202) at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378) at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:78) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243) {code} Hari Sekhon http://www.linkedin.com/in/harisekhon Reporter: Hari Sekhon Priority: Critical Attachments: datanode.log -- This message was sent by Atlassian JIRA (v6.3.4#6332)
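The rejection in the stack trace above boils down to a directory writability probe: the volume fails the check, so all of its blocks go unreported. A minimal JDK-only analogue of that check (Hadoop's actual {{DiskChecker}} additionally probes by creating files, per {{checkAccessByFileMethods}} in the trace; the class here is illustrative):

```java
import java.io.File;
import java.io.IOException;

public class DirCheckSketch {
    // Mirrors the failure mode reported above: a readable but read-only
    // volume is rejected wholesale, even though its blocks could be served.
    static void checkDir(File dir) throws IOException {
        if (!dir.isDirectory()) throw new IOException("Not a directory: " + dir);
        if (!dir.canRead())     throw new IOException("Directory is not readable: " + dir);
        if (!dir.canWrite())    throw new IOException("Directory is not writable: " + dir);
    }

    public static void main(String[] args) throws IOException {
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        checkDir(tmp); // passes on a normally writable filesystem
        System.out.println("OK: " + tmp);
    }
}
```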
[jira] [Updated] (HDFS-8299) HDFS reporting missing blocks when they are actually present due to read-only filesystem
[ https://issues.apache.org/jira/browse/HDFS-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HDFS-8299: -- Environment: HDP 2.2 (was: Fsck shows missing blocks when the blocks can be found on a datanode's filesystem and the datanode has been restarted to try to get it to recognize that the blocks are indeed present and hence report them to the NameNode in a block report. Fsck output showing an example missing block: {code}/apps/hive/warehouse/custom_scrubbed.db/someTable/00_0: CORRUPT blockpool BP-120244285-ip-1417023863606 block blk_1075202330 MISSING 1 blocks of total size 3260848 B 0. BP-120244285-ip-1417023863606:blk_1075202330_1484191 len=3260848 MISSING!{code} The block is definitely present on more than one datanode however, here is the output from one of them that I restarted to try to get it to report the block to the NameNode: {code}# ll /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330* -rw-r--r-- 1 hdfs 499 3260848 Apr 27 15:02 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330 -rw-r--r-- 1 hdfs 499 25483 Apr 27 15:02 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330_1484191.meta{code} It's worth noting that this is on HDFS tiered storage on an archive tier going to a networked block device that may have become temporarily unavailable but is available now. See also feature request HDFS-8297 for online rescan to not have to go around restarting datanodes. It turns out in the datanode log (that I am attaching) this is because the datanode fails to get a write lock on the filesystem. I think it would be better to be able to read-only those blocks however, since this way causes client visible data unavailability when the data could in fact be read. 
{code}2015-04-30 14:11:08,235 WARN datanode.DataNode (DataNode.java:checkStorageLocations(2284)) - Invalid dfs.datanode.data.dir /archive1/dn : org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not writable: /archive1/dn at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:193) at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174) at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:157) at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2239) at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2281) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2263) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2155) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202) at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378) at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:78) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243) {code} Hari Sekhon http://www.linkedin.com/in/harisekhon) HDFS reporting missing blocks when they are actually present due to read-only filesystem Key: HDFS-8299 URL: https://issues.apache.org/jira/browse/HDFS-8299 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Environment: HDP 2.2 Reporter: Hari Sekhon Priority: Critical Attachments: datanode.log -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8299) HDFS reporting missing blocks when they are actually present due to read-only filesystem
Hari Sekhon created HDFS-8299: - Summary: HDFS reporting missing blocks when they are actually present due to read-only filesystem Key: HDFS-8299 URL: https://issues.apache.org/jira/browse/HDFS-8299 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Environment: Fsck shows missing blocks when the blocks can be found on a datanode's filesystem and the datanode has been restarted to try to get it to recognize that the blocks are indeed present and hence report them to the NameNode in a block report. Fsck output showing an example missing block: {code}/apps/hive/warehouse/custom_scrubbed.db/someTable/00_0: CORRUPT blockpool BP-120244285-ip-1417023863606 block blk_1075202330 MISSING 1 blocks of total size 3260848 B 0. BP-120244285-ip-1417023863606:blk_1075202330_1484191 len=3260848 MISSING!{code} The block is definitely present on more than one datanode however, here is the output from one of them that I restarted to try to get it to report the block to the NameNode: {code}# ll /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330* -rw-r--r-- 1 hdfs 499 3260848 Apr 27 15:02 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330 -rw-r--r-- 1 hdfs 499 25483 Apr 27 15:02 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330_1484191.meta{code} It's worth noting that this is on HDFS tiered storage on an archive tier going to a networked block device that may have become temporarily unavailable but is available now. See also feature request HDFS-8297 for online rescan to not have to go around restarting datanodes. It turns out in the datanode log (that I am attaching) this is because the datanode fails to get a write lock on the filesystem. I think it would be better to be able to read-only those blocks however, since this way causes client visible data unavailability when the data could in fact be read. 
{code}2015-04-30 14:11:08,235 WARN datanode.DataNode (DataNode.java:checkStorageLocations(2284)) - Invalid dfs.datanode.data.dir /archive1/dn : org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not writable: /archive1/dn at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:193) at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174) at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:157) at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2239) at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2281) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2263) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2155) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202) at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378) at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:78) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243) {code} Hari Sekhon http://www.linkedin.com/in/harisekhon Reporter: Hari Sekhon Priority: Critical Attachments: datanode.log -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir
[ https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521356#comment-14521356 ] Hudson commented on HDFS-7770: -- FAILURE: Integrated in Hadoop-trunk-Commit #7705 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7705/]) HDFS-7770. Need document for storage type label of data node storage locations under dfs.data.dir. Contributed by Xiaoyu Yao. (aajisaka: rev de9404f02f36bf9a1100c67f41db907d494bb9ed) * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Need document for storage type label of data node storage locations under dfs.data.dir -- Key: HDFS-7770 URL: https://issues.apache.org/jira/browse/HDFS-7770 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.8.0, 2.7.1 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, HDFS-7770.02.patch HDFS-2832 enables support for heterogeneous storages in HDFS, which allows DN as a collection of storages with different types. However, I can't find document on how to label different storage types from the following two documents. I found the information from the design spec. It will be good we document this for admins and users to use the related Archival storage and storage policy features. http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml This JIRA is opened to add document for the new storage type labels. 1. Add an example under ArchivalStorage.html#Configuration section:
{code}
<property>
  <name>dfs.data.dir</name>
  <value>[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
</property>
{code}
2. Add a short description of [DISK/SSD/ARCHIVE/RAM_DISK] options in hdfs-default.xml#dfs.data.dir and document DISK as storage type if no storage type is labeled in the data node storage location configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
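The [DISK]/[SSD]/[ARCHIVE]/[RAM_DISK] prefixes described above are plain labels in front of each storage-location URI, with DISK as the documented default when no label is present. A small illustrative parser for such entries (not the actual Hadoop {{StorageLocation}} code):

```java
public class StorageLocationParser {
    // Returns { storageType, locationUri } for one comma-separated entry.
    static String[] parse(String entry) {
        entry = entry.trim();
        String type = "DISK"; // documented default when no [TYPE] label is given
        if (entry.startsWith("[")) {
            int end = entry.indexOf(']');
            type = entry.substring(1, end).toUpperCase();
            entry = entry.substring(end + 1);
        }
        return new String[] { type, entry };
    }

    public static void main(String[] args) {
        String conf = "[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,"
                    + "[ARCHIVE]file:///hddata/dn/archive0,file:///hddata/dn/plain";
        for (String entry : conf.split(",")) {
            String[] parsed = parse(entry);
            System.out.println(parsed[0] + " -> " + parsed[1]);
        }
    }
}
```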
[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip
[ https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521397#comment-14521397 ] Hudson commented on HDFS-5574: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #179 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/179/]) HDFS-5574. Remove buffer copy in BlockReader.skip. Contributed by Binglin Chang. (aajisaka: rev e89fc53a1d264fde407dd2c36defab5241cd0b52) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader2.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputChecker.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderBase.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader.java Remove buffer copy in BlockReader.skip -- Key: HDFS-5574 URL: https://issues.apache.org/jira/browse/HDFS-5574 Project: Hadoop HDFS Issue Type: Improvement Reporter: Binglin Chang Assignee: Binglin Chang Priority: Trivial Fix For: 2.8.0 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch BlockReaderLocal.skip and RemoteBlockReader.skip uses a temp buffer to read data to this buffer, it is not necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8269) getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime
[ https://issues.apache.org/jira/browse/HDFS-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521391#comment-14521391 ] Hudson commented on HDFS-8269: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #179 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/179/]) HDFS-8269. getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime. Contributed by Haohui Mai. (wheat9: rev 3dd6395bb2448e5b178a51c864e3c9a3d12e8bc9) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGetBlockLocations.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime - Key: HDFS-8269 URL: https://issues.apache.org/jira/browse/HDFS-8269 Project: Hadoop HDFS Issue Type: Bug Reporter: Yesha Vora Assignee: Haohui Mai Priority: Blocker Fix For: 2.7.1 Attachments: HDFS-8269.000.patch, HDFS-8269.001.patch, HDFS-8269.002.patch, HDFS-8269.003.patch When {{FSNamesystem#getBlockLocations}} updates the access time of the INode, it uses the path passed from the client, which generates incorrect edit logs entries:
{noformat}
<RECORD>
  <OPCODE>OP_TIMES</OPCODE>
  <DATA>
    <TXID>5085</TXID>
    <LENGTH>0</LENGTH>
    <PATH>/.reserved/.inodes/18230</PATH>
    <MTIME>-1</MTIME>
    <ATIME>1429908236392</ATIME>
  </DATA>
</RECORD>
{noformat}
Note that the NN does not resolve the {{/.reserved}} path when processing the edit log, therefore it eventually leads to a NPE when loading the edit logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
[ https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521399#comment-14521399 ] Hudson commented on HDFS-8214: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #179 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/179/]) HDFS-8214. Secondary NN Web UI shows wrong date for Last Checkpoint. Contributed by Charles Lamb. (wang: rev aa22450442ebe39916a6fd460fe97e347945526d) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/status.html * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/dfs-dust.js * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNodeInfoMXBean.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java Secondary NN Web UI shows wrong date for Last Checkpoint Key: HDFS-8214 URL: https://issues.apache.org/jira/browse/HDFS-8214 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, namenode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Fix For: 2.8.0 Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch, HDFS-8214.003.patch SecondaryNamenode is using Time.monotonicNow() to display Last Checkpoint in the web UI. This causes weird times, generally, just after the epoch, to be displayed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8283) DataStreamer cleanup and some minor improvement
[ https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521398#comment-14521398 ] Hudson commented on HDFS-8283: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #179 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/179/]) HDFS-8283. DataStreamer cleanup and some minor improvement. Contributed by Tsz Wo Nicholas Sze. (jing9: rev 7947e5b53b9ac9524b535b0384c1c355b74723ff) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/MultipleIOException.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java DataStreamer cleanup and some minor improvement --- Key: HDFS-8283 URL: https://issues.apache.org/jira/browse/HDFS-8283 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 2.8.0 Attachments: h8283_20150428.patch - When throwing an exception -* always set lastException -* always creating a new exception so that it has the new stack trace - Add LOG. - Add final to isAppend and favoredNodes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir
[ https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521394#comment-14521394 ] Hudson commented on HDFS-7770: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #179 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/179/]) HDFS-7770. Need document for storage type label of data node storage locations under dfs.data.dir. Contributed by Xiaoyu Yao. (aajisaka: rev de9404f02f36bf9a1100c67f41db907d494bb9ed) * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md Need document for storage type label of data node storage locations under dfs.data.dir -- Key: HDFS-7770 URL: https://issues.apache.org/jira/browse/HDFS-7770 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.8.0, 2.7.1 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, HDFS-7770.02.patch HDFS-2832 enables support for heterogeneous storages in HDFS, which allows DN as a collection of storages with different types. However, I can't find document on how to label different storage types from the following two documents. I found the information from the design spec. It will be good we document this for admins and users to use the related Archival storage and storage policy features. http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml This JIRA is opened to add document for the new storage type labels. 1. Add an example under ArchivalStorage.html#Configuration section:
{code}
<property>
  <name>dfs.data.dir</name>
  <value>[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
</property>
{code}
2. Add a short description of [DISK/SSD/ARCHIVE/RAM_DISK] options in hdfs-default.xml#dfs.data.dir and document DISK as storage type if no storage type is labeled in the data node storage location configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521434#comment-14521434 ] Rakesh R commented on HDFS-8242: Attaching another patch fixing whitespace problem reported by jenkins Erasure Coding: XML based end-to-end test for ECCli commands Key: HDFS-8242 URL: https://issues.apache.org/jira/browse/HDFS-8242 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch, HDFS-8242-HDFS-7285.05.patch This JIRA to add test cases with CLI test f/w for the commands present in {{ECCli}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir
[ https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521404#comment-14521404 ] Hudson commented on HDFS-7770: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #170 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/170/]) HDFS-7770. Need document for storage type label of data node storage locations under dfs.data.dir. Contributed by Xiaoyu Yao. (aajisaka: rev de9404f02f36bf9a1100c67f41db907d494bb9ed) * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md Need document for storage type label of data node storage locations under dfs.data.dir -- Key: HDFS-7770 URL: https://issues.apache.org/jira/browse/HDFS-7770 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.8.0, 2.7.1 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, HDFS-7770.02.patch HDFS-2832 enables support for heterogeneous storages in HDFS, which allows a DN to manage a collection of storages with different types. However, I can't find documentation on how to label the different storage types in the following two documents; I found the information in the design spec. It would be good to document this for admins and users of the related Archival Storage and storage policy features. http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml This JIRA is opened to add documentation for the new storage type labels. 1. Add an example under the ArchivalStorage.html#Configuration section:
{code}
<property>
  <name>dfs.data.dir</name>
  <value>[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
</property>
{code}
2. Add a short description of the [DISK/SSD/ARCHIVE/RAM_DISK] options in hdfs-default.xml#dfs.data.dir and document DISK as the default storage type when no storage type is labeled in the data node storage location configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8276) LazyPersistFileScrubber should be disabled if scrubber interval configured zero
[ https://issues.apache.org/jira/browse/HDFS-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] surendra singh lilhore updated HDFS-8276: - Attachment: HDFS-8276_1.patch LazyPersistFileScrubber should be disabled if scrubber interval configured zero --- Key: HDFS-8276 URL: https://issues.apache.org/jira/browse/HDFS-8276 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Attachments: HDFS-8276.patch, HDFS-8276_1.patch bq. but I think it is simple enough to change the meaning of the value so that zero means 'never scrub'. Let me post an updated patch. As discussed in [HDFS-6929|https://issues.apache.org/jira/browse/HDFS-6929], the scrubber should be disabled if *dfs.namenode.lazypersist.file.scrub.interval.sec* is zero. Currently, NameNode startup fails if the interval is configured as zero:
{code}
2015-04-27 23:47:31,744 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.lang.IllegalArgumentException: dfs.namenode.lazypersist.file.scrub.interval.sec must be non-zero.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.init(FSNamesystem.java:828)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
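The proposed semantics — a zero interval disables the scrubber rather than failing NameNode startup — can be sketched as a small validation change. All names here are hypothetical stand-ins, not the actual FSNamesystem code:

```python
# Sketch of the proposed validation: interval > 0 schedules the scrubber,
# interval == 0 disables it, and only a negative value is rejected.
# Hypothetical helper for illustration; not Hadoop's FSNamesystem code.

def plan_scrubber(interval_sec):
    if interval_sec < 0:
        raise ValueError(
            "dfs.namenode.lazypersist.file.scrub.interval.sec must not be negative")
    if interval_sec == 0:
        return None  # scrubber disabled: never scrub lazy-persist files
    return {"interval_sec": interval_sec}  # schedule the scrubber thread

print(plan_scrubber(0))    # disabled, but the NameNode still starts
print(plan_scrubber(300))  # scrubber scheduled every 300 seconds
```

The key difference from the behavior quoted above is that zero no longer reaches the "must be non-zero" IllegalArgumentException path.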
[jira] [Commented] (HDFS-8276) LazyPersistFileScrubber should be disabled if scrubber interval configured zero
[ https://issues.apache.org/jira/browse/HDFS-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521416#comment-14521416 ] surendra singh lilhore commented on HDFS-8276: -- Thanks [~arpitagarwal] for the review. Attached a new patch with a test case added. Please review. LazyPersistFileScrubber should be disabled if scrubber interval configured zero --- Key: HDFS-8276 URL: https://issues.apache.org/jira/browse/HDFS-8276 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Attachments: HDFS-8276.patch, HDFS-8276_1.patch bq. but I think it is simple enough to change the meaning of the value so that zero means 'never scrub'. Let me post an updated patch. As discussed in [HDFS-6929|https://issues.apache.org/jira/browse/HDFS-6929], the scrubber should be disabled if *dfs.namenode.lazypersist.file.scrub.interval.sec* is zero. Currently, NameNode startup fails if the interval is configured as zero:
{code}
2015-04-27 23:47:31,744 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.lang.IllegalArgumentException: dfs.namenode.lazypersist.file.scrub.interval.sec must be non-zero.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.init(FSNamesystem.java:828)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8298) HA: NameNode should not shut down completely without quorum, doesn't recover from temporary failures
Hari Sekhon created HDFS-8298: - Summary: HA: NameNode should not shut down completely without quorum, doesn't recover from temporary failures Key: HDFS-8298 URL: https://issues.apache.org/jira/browse/HDFS-8298 Project: Hadoop HDFS Issue Type: Improvement Components: ha, HDFS, namenode, qjm Affects Versions: 2.6.0 Environment: HDP 2.2 Reporter: Hari Sekhon In an HDFS HA setup, if there is a temporary problem contacting the journal nodes (e.g. a network interruption), the NameNode shuts down entirely, when it should instead go into a standby mode so that it can stay online and retry achieving quorum later. If both NameNodes shut themselves off like this, then even after the temporary network outage is resolved the entire cluster remains offline indefinitely until operator intervention, whereas it could have self-repaired by re-contacting the journal nodes and re-achieving quorum.
{code}
2015-04-15 15:59:26,900 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: flush failed for required journal (JournalAndStream(mgr=QJM to [ip:8485, ip:8485, ip:8485], stream=QuorumOutputStream starting at txid 54270281))
java.io.IOException: Interrupted waiting 2ms for a quorum of nodes to respond.
        at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:134)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:639)
        at org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseManager.java:388)
        at java.lang.Thread.run(Thread.java:745)
2015-04-15 15:59:26,901 WARN client.QuorumJournalManager (QuorumOutputStream.java:abort(72)) - Aborting QuorumOutputStream starting at txid 54270281
2015-04-15 15:59:26,904 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2015-04-15 15:59:27,001 INFO namenode.NameNode (StringUtils.java:run(659)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at custom_scrubbed/ip
************************************************************/
{code}
Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
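The behavior the report asks for — retrying the quorum write instead of terminating the process — could be sketched as a bounded retry loop with backoff. This is purely illustrative of the idea (hypothetical function names; the real NameNode currently calls ExitUtil.terminate() when a required journal flush fails, as the log above shows):

```python
import time

# Illustrative sketch: retry a quorum flush with exponential backoff instead
# of exiting on the first timeout. Hypothetical helpers, not FSEditLog code.

def flush_with_retry(flush, attempts=3, backoff_sec=0.01):
    for attempt in range(attempts):
        try:
            return flush()
        except IOError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            time.sleep(backoff_sec * (2 ** attempt))  # back off, then retry

# Simulate a journal quorum that recovers after a brief outage.
calls = {"n": 0}
def flaky_flush():
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("Interrupted waiting for a quorum of nodes to respond.")
    return "flushed"

print(flush_with_retry(flaky_flush))  # succeeds on the third attempt
```

A production design would also need to fence writes and transition to standby while quorum is lost, which is more than a retry loop; the sketch only shows the "don't exit on the first transient failure" part.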
[jira] [Commented] (HDFS-8269) getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime
[ https://issues.apache.org/jira/browse/HDFS-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521381#comment-14521381 ] Hudson commented on HDFS-8269: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2111 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2111/]) HDFS-8269. getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime. Contributed by Haohui Mai. (wheat9: rev 3dd6395bb2448e5b178a51c864e3c9a3d12e8bc9) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGetBlockLocations.java getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime - Key: HDFS-8269 URL: https://issues.apache.org/jira/browse/HDFS-8269 Project: Hadoop HDFS Issue Type: Bug Reporter: Yesha Vora Assignee: Haohui Mai Priority: Blocker Fix For: 2.7.1 Attachments: HDFS-8269.000.patch, HDFS-8269.001.patch, HDFS-8269.002.patch, HDFS-8269.003.patch When {{FSNamesystem#getBlockLocations}} updates the access time of the INode, it uses the path passed from the client, which generates incorrect edit log entries:
{noformat}
<RECORD>
  <OPCODE>OP_TIMES</OPCODE>
  <DATA>
    <TXID>5085</TXID>
    <LENGTH>0</LENGTH>
    <PATH>/.reserved/.inodes/18230</PATH>
    <MTIME>-1</MTIME>
    <ATIME>1429908236392</ATIME>
  </DATA>
</RECORD>
{noformat}
Note that the NN does not resolve the {{/.reserved}} path when processing the edit log, so this eventually leads to an NPE when loading the edit logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
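The core of the fix is resolving a /.reserved/.inodes/&lt;id&gt; alias to the file's real path before recording the atime update, so the edit log never contains a reserved path. The idea can be sketched as follows (the inode table is a hypothetical stand-in for the NameNode's directory tree, not the FSDirectory API):

```python
# Sketch: resolve a /.reserved/.inodes/<id> path to the real path before
# logging, so the edit log never contains a reserved alias. The INODES map
# is a made-up stand-in for the NameNode's FSDirectory.

INODES = {18230: "/user/alice/data.txt"}  # hypothetical inode-id -> path

RESERVED_PREFIX = "/.reserved/.inodes/"

def resolve_path(path):
    if path.startswith(RESERVED_PREFIX):
        inode_id = int(path[len(RESERVED_PREFIX):].split("/")[0])
        return INODES[inode_id]
    return path

def log_set_times(path, mtime, atime):
    # Always log the resolved path, never the reserved alias.
    return {"op": "OP_TIMES", "path": resolve_path(path),
            "mtime": mtime, "atime": atime}

print(log_set_times("/.reserved/.inodes/18230", -1, 1429908236392))
```

With this resolution step, the OP_TIMES record shown above would carry the real path, which replays cleanly.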
[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir
[ https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521384#comment-14521384 ] Hudson commented on HDFS-7770: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2111 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2111/]) HDFS-7770. Need document for storage type label of data node storage locations under dfs.data.dir. Contributed by Xiaoyu Yao. (aajisaka: rev de9404f02f36bf9a1100c67f41db907d494bb9ed) * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Need document for storage type label of data node storage locations under dfs.data.dir -- Key: HDFS-7770 URL: https://issues.apache.org/jira/browse/HDFS-7770 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.8.0, 2.7.1 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, HDFS-7770.02.patch HDFS-2832 enables support for heterogeneous storages in HDFS, which allows a DN to manage a collection of storages with different types. However, I can't find documentation on how to label the different storage types in the following two documents; I found the information in the design spec. It would be good to document this for admins and users of the related Archival Storage and storage policy features. http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml This JIRA is opened to add documentation for the new storage type labels. 1. Add an example under the ArchivalStorage.html#Configuration section:
{code}
<property>
  <name>dfs.data.dir</name>
  <value>[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
</property>
{code}
2. Add a short description of the [DISK/SSD/ARCHIVE/RAM_DISK] options in hdfs-default.xml#dfs.data.dir and document DISK as the default storage type when no storage type is labeled in the data node storage location configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8283) DataStreamer cleanup and some minor improvement
[ https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521388#comment-14521388 ] Hudson commented on HDFS-8283: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2111 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2111/]) HDFS-8283. DataStreamer cleanup and some minor improvement. Contributed by Tsz Wo Nicholas Sze. (jing9: rev 7947e5b53b9ac9524b535b0384c1c355b74723ff) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/MultipleIOException.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java DataStreamer cleanup and some minor improvement --- Key: HDFS-8283 URL: https://issues.apache.org/jira/browse/HDFS-8283 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 2.8.0 Attachments: h8283_20150428.patch
- When throwing an exception
-* always set lastException
-* always create a new exception so that it has the new stack trace
- Add LOG.
- Add final to isAppend and favoredNodes
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip
[ https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521387#comment-14521387 ] Hudson commented on HDFS-5574: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2111 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2111/]) HDFS-5574. Remove buffer copy in BlockReader.skip. Contributed by Binglin Chang. (aajisaka: rev e89fc53a1d264fde407dd2c36defab5241cd0b52) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader2.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderBase.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputChecker.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInputStream.java Remove buffer copy in BlockReader.skip -- Key: HDFS-5574 URL: https://issues.apache.org/jira/browse/HDFS-5574 Project: Hadoop HDFS Issue Type: Improvement Reporter: Binglin Chang Assignee: Binglin Chang Priority: Trivial Fix For: 2.8.0 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch BlockReaderLocal.skip and RemoteBlockReader.skip use a temporary buffer to read data into, which is not necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
[ https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521389#comment-14521389 ] Hudson commented on HDFS-8214: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2111 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2111/]) HDFS-8214. Secondary NN Web UI shows wrong date for Last Checkpoint. Contributed by Charles Lamb. (wang: rev aa22450442ebe39916a6fd460fe97e347945526d) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNodeInfoMXBean.java * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/status.html * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/dfs-dust.js Secondary NN Web UI shows wrong date for Last Checkpoint Key: HDFS-8214 URL: https://issues.apache.org/jira/browse/HDFS-8214 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, namenode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Fix For: 2.8.0 Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch, HDFS-8214.003.patch SecondaryNameNode uses Time.monotonicNow() to display the Last Checkpoint time in the web UI. This causes weird times to be displayed, generally just after the epoch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
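The bug pattern here is easy to reproduce: a monotonic clock counts from an arbitrary origin (often process or system start), so formatting such a reading as a wall-clock timestamp yields dates just after the Unix epoch. A quick illustration with a made-up monotonic value:

```python
from datetime import datetime, timezone

# A monotonic reading is milliseconds since some arbitrary origin, e.g. boot.
# Treating it as milliseconds since the Unix epoch gives a date in early 1970 -
# the "weird time" the Secondary NN web UI displayed.
fake_monotonic_ms = 4 * 60 * 60 * 1000  # e.g. 4 hours of uptime

wrong = datetime.fromtimestamp(fake_monotonic_ms / 1000, tz=timezone.utc)
print(wrong.isoformat())  # 1970-01-01T04:00:00+00:00

# The fix is to record wall-clock time for display; in Hadoop that means
# Time.now() rather than Time.monotonicNow() for UI timestamps.
```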
[jira] [Commented] (HDFS-8276) LazyPersistFileScrubber should be disabled if scrubber interval configured zero
[ https://issues.apache.org/jira/browse/HDFS-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521414#comment-14521414 ] surendra singh lilhore commented on HDFS-8276: -- The failed test cases and findbugs warnings are not related to this JIRA. LazyPersistFileScrubber should be disabled if scrubber interval configured zero --- Key: HDFS-8276 URL: https://issues.apache.org/jira/browse/HDFS-8276 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Attachments: HDFS-8276.patch, HDFS-8276_1.patch bq. but I think it is simple enough to change the meaning of the value so that zero means 'never scrub'. Let me post an updated patch. As discussed in [HDFS-6929|https://issues.apache.org/jira/browse/HDFS-6929], the scrubber should be disabled if *dfs.namenode.lazypersist.file.scrub.interval.sec* is zero. Currently, NameNode startup fails if the interval is configured as zero:
{code}
2015-04-27 23:47:31,744 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.lang.IllegalArgumentException: dfs.namenode.lazypersist.file.scrub.interval.sec must be non-zero.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.init(FSNamesystem.java:828)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8298) HA: NameNode should not shut down completely without quorum, doesn't recover from temporary network outages
[ https://issues.apache.org/jira/browse/HDFS-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HDFS-8298: -- Summary: HA: NameNode should not shut down completely without quorum, doesn't recover from temporary network outages (was: HA: NameNode should not shut down completely without quorum, doesn't recover from temporary failures) HA: NameNode should not shut down completely without quorum, doesn't recover from temporary network outages --- Key: HDFS-8298 URL: https://issues.apache.org/jira/browse/HDFS-8298 Project: Hadoop HDFS Issue Type: Improvement Components: ha, HDFS, namenode, qjm Affects Versions: 2.6.0 Environment: HDP 2.2 Reporter: Hari Sekhon In an HDFS HA setup, if there is a temporary problem contacting the journal nodes (e.g. a network interruption), the NameNode shuts down entirely, when it should instead go into a standby mode so that it can stay online and retry achieving quorum later. If both NameNodes shut themselves off like this, then even after the temporary network outage is resolved the entire cluster remains offline indefinitely until operator intervention, whereas it could have self-repaired by re-contacting the journal nodes and re-achieving quorum.
{code}
2015-04-15 15:59:26,900 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: flush failed for required journal (JournalAndStream(mgr=QJM to [ip:8485, ip:8485, ip:8485], stream=QuorumOutputStream starting at txid 54270281))
java.io.IOException: Interrupted waiting 2ms for a quorum of nodes to respond.
        at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:134)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:639)
        at org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseManager.java:388)
        at java.lang.Thread.run(Thread.java:745)
2015-04-15 15:59:26,901 WARN client.QuorumJournalManager (QuorumOutputStream.java:abort(72)) - Aborting QuorumOutputStream starting at txid 54270281
2015-04-15 15:59:26,904 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2015-04-15 15:59:27,001 INFO namenode.NameNode (StringUtils.java:run(659)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at custom_scrubbed/ip
************************************************************/
{code}
Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir
[ https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521450#comment-14521450 ] Hudson commented on HDFS-7770: -- FAILURE: Integrated in Hadoop-Yarn-trunk #913 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/913/]) HDFS-7770. Need document for storage type label of data node storage locations under dfs.data.dir. Contributed by Xiaoyu Yao. (aajisaka: rev de9404f02f36bf9a1100c67f41db907d494bb9ed) * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml Need document for storage type label of data node storage locations under dfs.data.dir -- Key: HDFS-7770 URL: https://issues.apache.org/jira/browse/HDFS-7770 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.8.0, 2.7.1 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, HDFS-7770.02.patch HDFS-2832 enables support for heterogeneous storages in HDFS, which allows a DN to manage a collection of storages with different types. However, I can't find documentation on how to label the different storage types in the following two documents; I found the information in the design spec. It would be good to document this for admins and users of the related Archival Storage and storage policy features. http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml This JIRA is opened to add documentation for the new storage type labels. 1. Add an example under the ArchivalStorage.html#Configuration section:
{code}
<property>
  <name>dfs.data.dir</name>
  <value>[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
</property>
{code}
2. Add a short description of the [DISK/SSD/ARCHIVE/RAM_DISK] options in hdfs-default.xml#dfs.data.dir and document DISK as the default storage type when no storage type is labeled in the data node storage location configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8283) DataStreamer cleanup and some minor improvement
[ https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521454#comment-14521454 ] Hudson commented on HDFS-8283: -- FAILURE: Integrated in Hadoop-Yarn-trunk #913 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/913/]) HDFS-8283. DataStreamer cleanup and some minor improvement. Contributed by Tsz Wo Nicholas Sze. (jing9: rev 7947e5b53b9ac9524b535b0384c1c355b74723ff) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/MultipleIOException.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java DataStreamer cleanup and some minor improvement --- Key: HDFS-8283 URL: https://issues.apache.org/jira/browse/HDFS-8283 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 2.8.0 Attachments: h8283_20150428.patch
- When throwing an exception
-* always set lastException
-* always create a new exception so that it has the new stack trace
- Add LOG.
- Add final to isAppend and favoredNodes
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8277) Safemode enter fails when Standby NameNode is down
[ https://issues.apache.org/jira/browse/HDFS-8277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] surendra singh lilhore updated HDFS-8277: - Attachment: HDFS-8277_2.patch Safemode enter fails when Standby NameNode is down -- Key: HDFS-8277 URL: https://issues.apache.org/jira/browse/HDFS-8277 Project: Hadoop HDFS Issue Type: Bug Components: ha, HDFS, namenode Affects Versions: 2.6.0 Environment: HDP 2.2.0 Reporter: Hari Sekhon Assignee: surendra singh lilhore Priority: Minor Attachments: HDFS-8277.patch, HDFS-8277_1.patch, HDFS-8277_2.patch HDFS fails to enter safemode when the Standby NameNode is down (eg. due to AMBARI-10536). {code}hdfs dfsadmin -safemode enter safemode: Call From nn2/x.x.x.x to nn1:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused{code} This appears to be a bug in that it's not trying both NameNodes like the standard hdfs client code does, and is instead stopping after getting a connection refused from nn1 which is down. I verified normal hadoop fs writes and reads via cli did work at this time, using nn2. I happened to run this command as the hdfs user on nn2 which was the surviving Active NameNode. After I re-bootstrapped the Standby NN to fix it the command worked as expected again. Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8277) Safemode enter fails when Standby NameNode is down
[ https://issues.apache.org/jira/browse/HDFS-8277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521238#comment-14521238 ] surendra singh lilhore commented on HDFS-8277: -- Attached a new patch with a test case. Please review. Safemode enter fails when Standby NameNode is down -- Key: HDFS-8277 URL: https://issues.apache.org/jira/browse/HDFS-8277 Project: Hadoop HDFS Issue Type: Bug Components: ha, HDFS, namenode Affects Versions: 2.6.0 Environment: HDP 2.2.0 Reporter: Hari Sekhon Assignee: surendra singh lilhore Priority: Minor Attachments: HDFS-8277.patch, HDFS-8277_1.patch, HDFS-8277_2.patch HDFS fails to enter safemode when the Standby NameNode is down (e.g. due to AMBARI-10536).
{code}
hdfs dfsadmin -safemode enter
safemode: Call From nn2/x.x.x.x to nn1:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
{code}
This appears to be a bug in that it's not trying both NameNodes like the standard hdfs client code does, and is instead stopping after getting a connection refused from nn1, which is down. I verified normal hadoop fs writes and reads via cli did work at this time, using nn2. I happened to run this command as the hdfs user on nn2, which was the surviving Active NameNode. After I re-bootstrapped the Standby NN to fix it, the command worked as expected again. Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
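The expected client behavior — try each configured NameNode rather than stopping at the first connection refusal — can be sketched as follows. This is illustrative only (hypothetical RPC callable; real HA clients go through the configured failover proxy provider rather than a hand-rolled loop):

```python
# Sketch: iterate over both configured NameNodes instead of failing on the
# first ConnectionRefused, as dfsadmin -safemode enter did in this report.
# Hypothetical rpc callable; not the actual DFSAdmin/HA proxy code.

def safemode_enter(namenodes, rpc):
    errors = []
    for nn in namenodes:
        try:
            return rpc(nn)  # first reachable NameNode wins
        except ConnectionError as e:
            errors.append((nn, str(e)))
    raise ConnectionError("all NameNodes unreachable: %r" % errors)

def fake_rpc(nn):
    if nn == "nn1:8020":
        raise ConnectionError("Connection refused")  # nn1 is down
    return "Safe mode is ON on " + nn

print(safemode_enter(["nn1:8020", "nn2:8020"], fake_rpc))
```

With nn1 down, the command still succeeds against nn2 instead of aborting, which is the behavior the patch aims for.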
[jira] [Commented] (HDFS-219) Add md5sum facility in dfsshell
[ https://issues.apache.org/jira/browse/HDFS-219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521224#comment-14521224 ] Kengo Seki commented on HDFS-219: - Maybe a duplicate of HADOOP-9209? Add md5sum facility in dfsshell --- Key: HDFS-219 URL: https://issues.apache.org/jira/browse/HDFS-219 Project: Hadoop HDFS Issue Type: New Feature Reporter: zhangwei Labels: newbie I think it would be useful to add md5sum (or any other digest) to dfsshell, so that the facility can verify files on HDFS. It can confirm a file's integrity after copyFromLocal or copyToLocal. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
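The requested facility amounts to computing a digest on both sides of a copy and comparing the results. A local-side sketch with hashlib (illustrative; note that HDFS's built-in getFileChecksum uses a block-based MD5-of-MD5-of-CRC32 digest, not a plain md5 of the file bytes, so it cannot be compared directly against a local md5sum):

```python
import hashlib
import io

# Compute an md5 over a file-like stream in chunks, as an md5sum-style shell
# command would, so copyFromLocal/copyToLocal round-trips can be verified by
# comparing digests on each side.
def md5_of_stream(stream, chunk_size=8192):
    h = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()

print(md5_of_stream(io.BytesIO(b"hello")))  # 5d41402abc4b2a76b9719d911017c592
```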
[jira] [Commented] (HDFS-7810) Datanode registration process fails in hadoop 2.6
[ https://issues.apache.org/jira/browse/HDFS-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521516#comment-14521516 ] Vlad Frolov commented on HDFS-7810: --- It seems that I have hit the same issue here. I have a properly set up DNS server (bind9) with reverse DNS lookups; the `nslookup` and `host` utilities can resolve the IP into an FQDN, but the NameNode says
{code:log}
15/04/30 13:45:12 WARN blockmanagement.DatanodeManager: Unresolved datanode registration: hostname cannot be resolved (ip=10.250.10.11, hostname=10.250.10.11)
15/04/30 13:45:12 INFO ipc.Server: IPC Server handler 3 on 8020, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 10.250.10.11:35776 Call#68 Retry#0
org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved (ip=10.250.10.11, hostname=10.250.10.11): DatanodeRegistration(0.0.0.0, datanodeUuid=d5fe1cf5-09ac-4644-9f8a-8c4881e3c569, infoPort=50075, ipcPort=50020, storageInfo=lv=-56;cid=CID-e74a3224-300e-400b-ae92-bb7ae64cdf01;nsid=1242366503;c=0)
{code}
Datanode registration process fails in hadoop 2.6 -- Key: HDFS-7810 URL: https://issues.apache.org/jira/browse/HDFS-7810 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.6.0 Environment: ubuntu 12 Reporter: Biju Nair Labels: hadoop When a new DN is added to the cluster, the registration process fails. The following are the steps followed:
- Install and start a new DN
- Add an entry for the DN in the NN {{/etc/hosts}} file; the DN log shows that the registration process failed
- Tried to restart the DN, with the same result
Since all the DNs have multiple network interfaces, we are using the following {{hdfs-site.xml}} property, instead of listing all the {{dfs.datanode.xx.address}} properties.
{code:xml} property namedfs.datanode.dns.interface/name valueeth2/value /property {code} - Restarting the NN resolves the issue with registration which is not desired. - Adding the following {{dfs.datanode.xx.address}} properties seem to resolve DN registration process without NN restart. But this is a different behavior compared to *hadoop 2.2*. Is there a reason for the change? {code:xml} property namedfs.datanode.address/name value192.168.0.12:50010/value /property property namedfs.datanode.ipc.address/name value192.168.0.12:50020/value /property property namedfs.datanode.http.address/name value192.168.0.12:50075/value /property {code} *NN Log Error Entry* {quote} 2015-02-17 12:21:53,583 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 8020, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 192.168.100.13:37516 Call#1027 Retry#0 org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved (ip=192.168.100.13, hostname=192.168.100.13): DatanodeRegistration(0.0.0.0, datanodeUuid=bd23eb3c-a5b9-43e4-ad23-1683346564ac, infoPort=50075, ipcPort=50020, storageInfo=lv=-56;cid=CID-02099252-fbca-4bf2-b466-9a0ed67e53a3;nsid=2048643132;c=0) at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:887) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:5002) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1065) at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:92) at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:26378) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619) at 
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) 2015-02-17 12:21:58,607 WARN org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Unresolved datanode registration: hostname cannot be resolved (ip=192.168.100.13, hostname=192.168.100.13) {quote} *DN Log Error Entry* {quote} 2015-02-17 12:21:02,994 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block
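Both reports above show the same symptom: the NameNode rejects the registration because the datanode's hostname equals its raw IP, which is what Hadoop falls back to when reverse DNS fails. A minimal sketch of that comparison (`looksUnresolved` is a hypothetical helper for illustration, not the actual DatanodeManager code):

```java
// Sketch of the check the NameNode effectively performs on registration.
// When reverse DNS fails, the reported hostname falls back to the IP
// literal, so hostname == ip signals an unresolved datanode (assumption
// based on the "hostname cannot be resolved (ip=..., hostname=...)" log).
public class DnRegistrationCheck {
    static boolean looksUnresolved(String ip, String reportedHostname) {
        // An unresolved registration reports the raw IP as its hostname.
        return reportedHostname == null || reportedHostname.equals(ip);
    }

    public static void main(String[] args) {
        // Matches the log: "(ip=10.250.10.11, hostname=10.250.10.11)"
        assert looksUnresolved("10.250.10.11", "10.250.10.11");
        assert !looksUnresolved("10.250.10.11", "dn1.example.com");
        System.out.println("ok");
    }
}
```

One possible explanation (an assumption, not confirmed in the thread) for "restarting the NN resolves the issue" is that the NN JVM cached a negative DNS result from before the `/etc/hosts` entry was added; a restart clears that cache.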
[jira] [Commented] (HDFS-7348) Erasure Coding: striped block recovery
[ https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521517#comment-14521517 ] Kai Zheng commented on HDFS-7348: - Thanks [~hitliuyi], [~zhz] and [~libo-intel] for the great discussion! It looks like we have already converged on good plans. bq.Does it save CPU to decode in big chunks? Kai Zheng Could you advise? Sorry I just noticed this. Yes, you're right; as Yi also noted, allocating big native buffers lets the ISA-L coders perform much better. We have test data indicating the ISA-L coder works best with a chunk size of about 32MB. I agree it's good to decouple the sync-and-decode unit from the chunk/cell size in a schema and make it configurable. Decoding at the level of an entire block may not be a good idea, though, as it could exhaust DataNode memory and be unreliable. We should be able to enforce a memory-usage threshold for recovery tasks. Since some dedicated DNs have powerful CPU cores, it's good to distribute recovery work to them, so on such DNs there will very likely be more than one recovery task executing concurrently. Erasure Coding: striped block recovery -- Key: HDFS-7348 URL: https://issues.apache.org/jira/browse/HDFS-7348 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Kai Zheng Assignee: Yi Liu Attachments: ECWorker.java, HDFS-7348.001.patch This JIRA is to recover one or more missed striped blocks in the striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
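The memory-threshold idea in the comment above can be made concrete with back-of-envelope arithmetic: if each recovery task holds roughly one decode buffer per internal block of the group, the buffer size and a per-DN memory limit together bound the number of concurrent tasks. A hedged sketch (the names and the "one buffer per unit" model are assumptions for illustration, not the actual ErasureCodingWorker accounting):

```java
public class RecoveryBudget {
    // Rough model: one native decode buffer per internal block
    // (data units + parity units) of the striped group.
    static long perTaskBytes(int dataUnits, int parityUnits, long chunkBytes) {
        return (long) (dataUnits + parityUnits) * chunkBytes;
    }

    // How many recovery tasks fit under a per-DN memory threshold.
    static int maxConcurrentTasks(long memoryLimitBytes, long perTaskBytes) {
        return (int) (memoryLimitBytes / perTaskBytes);
    }

    public static void main(String[] args) {
        long chunk = 32L << 20;  // the ~32MB chunk size from the comment above
        long perTask = perTaskBytes(6, 3, chunk);  // RS(6,3): 9 * 32MB = 288MB
        assert perTask == 288L << 20;
        // Under a hypothetical 1GB recovery budget, 3 tasks can run at once.
        assert maxConcurrentTasks(1024L << 20, perTask) == 3;
        System.out.println("ok");
    }
}
```

This also illustrates why whole-block decoding is risky: with 128MB blocks instead of 32MB chunks, a single RS(6,3) task would already need over 1GB of buffers.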
[jira] [Commented] (HDFS-8277) Safemode enter fails when Standby NameNode is down
[ https://issues.apache.org/jira/browse/HDFS-8277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521549#comment-14521549 ] Hadoop QA commented on HDFS-8277: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 52s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | javac | 8m 15s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 11m 7s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 25s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 4m 57s | There were no new checkstyle issues. | | {color:green}+1{color} | install | 1m 56s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 41s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 37s | The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 44s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 166m 39s | Tests failed in hadoop-hdfs. 
| | | | 216m 18s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | | Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from an Exception, even though it is named as such At DataStreamer.java:from an Exception, even though it is named as such At DataStreamer.java:[lines 177-201] | | Failed unit tests | hadoop.hdfs.server.namenode.snapshot.TestSnapshot | | | hadoop.hdfs.TestMultiThreadedHflush | | | hadoop.hdfs.qjournal.TestSecureNNWithQJM | | | hadoop.hdfs.TestQuota | | | hadoop.hdfs.TestDFSClientRetries | | | hadoop.hdfs.TestFileLengthOnClusterRestart | | | hadoop.hdfs.server.namenode.TestDeleteRace | | | hadoop.hdfs.TestReplaceDatanodeOnFailure | | | hadoop.hdfs.tools.TestDFSAdminWithHA | | | hadoop.hdfs.TestClose | | Timed out tests | org.apache.hadoop.hdfs.TestDataTransferProtocol | | | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery | | | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | | | org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12729456/HDFS-8277_2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / e89fc53 | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/10476/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10476/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10476/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10476/console | This message was automatically generated. 
Safemode enter fails when Standby NameNode is down -- Key: HDFS-8277 URL: https://issues.apache.org/jira/browse/HDFS-8277 Project: Hadoop HDFS Issue Type: Bug Components: ha, HDFS, namenode Affects Versions: 2.6.0 Environment: HDP 2.2.0 Reporter: Hari Sekhon Assignee: surendra singh lilhore Priority: Minor Attachments: HDFS-8277.patch, HDFS-8277_1.patch, HDFS-8277_2.patch HDFS fails to enter safemode when the Standby NameNode is down (eg. due to AMBARI-10536). {code}hdfs dfsadmin -safemode enter safemode: Call From nn2/x.x.x.x to nn1:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused{code} This appears to be a bug in that it's not trying both NameNodes like the standard hdfs client code does, and is instead stopping after getting a connection refused from nn1 which is down. I verified normal hadoop fs writes and reads via cli did work at this time, using nn2. I happened to run this command as the hdfs user on nn2 which was the surviving Active NameNode. After I re-bootstrapped the Standby NN to fix it the command worked as expected again. Hari Sekhon http://www.linkedin.com/in/harisekhon -- This
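The bug report above says `dfsadmin -safemode enter` stops at the first connection refusal instead of trying both NameNodes the way the HA client proxy does. The intended behavior can be sketched generically (the `SafeModeCall` interface and the loop are illustrative, not the actual DFSAdmin/failover-proxy code):

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.List;

public class TryBothNameNodes {
    // Stand-in for "issue the safemode RPC against one NameNode".
    interface SafeModeCall {
        String enterSafeMode() throws IOException;
    }

    // Try each NameNode in turn; fail only once every target has refused.
    static String enterOnAnyNameNode(List<SafeModeCall> targets) throws IOException {
        IOException last = null;
        for (SafeModeCall t : targets) {
            try {
                return t.enterSafeMode();
            } catch (ConnectException e) {
                last = e;  // this NN may be down: fall through to the next
            }
        }
        throw last != null ? last : new IOException("no namenodes configured");
    }

    // Demo: nn1 is down (connection refused), nn2 is the surviving Active NN.
    static String demo() throws IOException {
        SafeModeCall nn1 = () -> { throw new ConnectException("Connection refused"); };
        SafeModeCall nn2 = () -> "Safe mode is ON";
        return enterOnAnyNameNode(List.of(nn1, nn2));
    }

    public static void main(String[] args) throws IOException {
        assert demo().equals("Safe mode is ON");
        System.out.println("ok");
    }
}
```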
[jira] [Commented] (HDFS-8229) LAZY_PERSIST file gets deleted after NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521577#comment-14521577 ] Hadoop QA commented on HDFS-8229: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 28s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | javac | 7m 25s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 33s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 5m 27s | The applied patch generated 1 additional checkstyle issues. | | {color:green}+1{color} | install | 1m 30s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 7s | The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 11s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 225m 54s | Tests failed in hadoop-hdfs. 
| | | | 271m 34s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | | Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from an Exception, even though it is named as such At DataStreamer.java:from an Exception, even though it is named as such At DataStreamer.java:[lines 177-201] | | Failed unit tests | hadoop.hdfs.server.namenode.TestDeleteRace | | | hadoop.hdfs.TestClose | | | hadoop.hdfs.TestDFSClientRetries | | | hadoop.hdfs.TestQuota | | | hadoop.hdfs.TestMultiThreadedHflush | | | hadoop.hdfs.TestDFSOutputStream | | | hadoop.hdfs.server.namenode.TestSaveNamespace | | | hadoop.hdfs.server.datanode.TestBlockRecovery | | | hadoop.cli.TestHDFSCLI | | | hadoop.hdfs.TestFileLengthOnClusterRestart | | | hadoop.hdfs.TestCrcCorruption | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation | | Timed out tests | org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache | | | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery | | | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | | | org.apache.hadoop.hdfs.TestDataTransferProtocol | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12729442/HDFS-8229_2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f5b3847 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/10475/artifact/patchprocess/checkstyle-result-diff.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/10475/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10475/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10475/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 
GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10475/console | This message was automatically generated. LAZY_PERSIST file gets deleted after NameNode restart. -- Key: HDFS-8229 URL: https://issues.apache.org/jira/browse/HDFS-8229 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Attachments: HDFS-8229.patch, HDFS-8229_1.patch, HDFS-8229_2.patch {code} 2015-04-20 10:26:55,180 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Removing lazyPersist file /LAZY_PERSIST/smallfile with no replicas. {code} If {{LazyPersistFileScrubber}} runs after a NameNode restart but before the DNs register, it will delete the lazy-persist file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
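The race described above suggests a simple gate: right after an NN restart, "zero replicas" only means the DataNodes have not reported in yet, so the scrubber should not treat it as authoritative. A hypothetical predicate illustrating that gate (names are illustrative; the actual fix in the attached patches may differ):

```java
public class LazyPersistScrubberGate {
    // Only delete a lazy-persist file for having no replicas once block
    // reports have been received; before DN registration, zero replicas
    // is expected and must not trigger deletion.
    static boolean safeToDelete(int reportedReplicas, boolean blockReportsReceived) {
        return blockReportsReceived && reportedReplicas == 0;
    }

    public static void main(String[] args) {
        // NN just restarted, DNs not yet registered: must NOT delete.
        assert !safeToDelete(0, false);
        // DNs have reported and the file truly has no replicas: scrub it.
        assert safeToDelete(0, true);
        // File has live replicas: never delete.
        assert !safeToDelete(3, true);
        System.out.println("ok");
    }
}
```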
[jira] [Commented] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command
[ https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521585#comment-14521585 ] Uma Maheswara Rao G commented on HDFS-8137: --- Thanks a lot for the review, Kai. Good catch. You are right, we are storing the schema in xattrs along with the zone. {quote} ECSchemaManager might not be supposed to get a schema associated with a zone, dir/file, but ErasureCodingZoneManager may do. {quote} By mistake I said ECSchemaManager. You're right, I should have said ErasureCodingZoneManager, as it has the related code I was talking about. I also added the getECSchema API in the namesystem itself, as we have already added some ECSchema-related APIs in FSNamesystem. For reusing the code from ErasureCodingZoneManager, keeping this new API in the namesystem gives us the flexibility we need; we cannot get the same flexibility from BlockCollection, since we cannot access FSDirectory details there. Please check whether the latest patch makes sense to you. Sends the EC schema to DataNode as well in EC encoding/recovering command - Key: HDFS-8137 URL: https://issues.apache.org/jira/browse/HDFS-8137 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Uma Maheswara Rao G Attachments: HDFS-8137-0.patch, HDFS-8137-1.patch Discussed with [~umamaheswararao] and [~vinayrpet], we should also send the EC schema to DataNode as well contained in the EC encoding/recovering command. The target DataNode will use it to guide the executing of the task. Another way would be, DataNode would just request schema actively thru a separate RPC call, and as an optimization consideration, DataNode may cache schemas to avoid repeatedly asking for the same schema twice. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command
[ https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-8137: -- Attachment: HDFS-8137-1.patch Sends the EC schema to DataNode as well in EC encoding/recovering command - Key: HDFS-8137 URL: https://issues.apache.org/jira/browse/HDFS-8137 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Uma Maheswara Rao G Attachments: HDFS-8137-0.patch, HDFS-8137-1.patch Discussed with [~umamaheswararao] and [~vinayrpet], we should also send the EC schema to DataNode as well contained in the EC encoding/recovering command. The target DataNode will use it to guide the executing of the task. Another way would be, DataNode would just request schema actively thru a separate RPC call, and as an optimization consideration, DataNode may cache schemas to avoid repeatedly asking for the same schema twice. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
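The caching alternative mentioned in the description — the DataNode fetching a schema once over RPC and reusing it for later commands — can be sketched with a `computeIfAbsent` cache. The `fetchSchemaFromNameNode` call and all names here are hypothetical stand-ins, not actual Hadoop APIs:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class DnSchemaCache {
    final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    final AtomicInteger rpcCalls = new AtomicInteger();  // counts simulated RPCs

    // Stand-in for the "ask the NameNode for the schema" RPC the
    // description proposes; returns a dummy schema payload.
    String fetchSchemaFromNameNode(String name) {
        rpcCalls.incrementAndGet();
        return name + ":RS(6,3),cell=64k";
    }

    // First lookup pays the RPC; later lookups for the same schema name
    // are served from the cache, avoiding repeated round trips.
    String getSchema(String name) {
        return cache.computeIfAbsent(name, this::fetchSchemaFromNameNode);
    }

    public static void main(String[] args) {
        DnSchemaCache dn = new DnSchemaCache();
        dn.getSchema("RS-6-3");
        dn.getSchema("RS-6-3");  // cached: no second RPC
        assert dn.rpcCalls.get() == 1;
        System.out.println("ok");
    }
}
```

Pushing the schema inside the encoding/recovering command (the approach the attached patch takes) avoids even the first round trip, at the cost of a larger command payload.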
[jira] [Updated] (HDFS-7949) WebImageViewer need support file size calculation with striped blocks
[ https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-7949: Attachment: HDFS-7949-HDFS-7285.08.patch Thanks Rakesh for taking a close look. I'm attaching a dup patch just to be extra careful, since the space calculation _could_ affect other tests. WebImageViewer need support file size calculation with striped blocks - Key: HDFS-7949 URL: https://issues.apache.org/jira/browse/HDFS-7949 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Hui Zheng Assignee: Rakesh R Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, HDFS-7949-003.patch, HDFS-7949-004.patch, HDFS-7949-005.patch, HDFS-7949-006.patch, HDFS-7949-007.patch, HDFS-7949-HDFS-7285.08.patch, HDFS-7949-HDFS-7285.08.patch The file size calculation should be changed when the blocks of the file are striped in WebImageViewer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
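For context on the size calculation being changed here: with striping, a file's logical length is the sum of the lengths of the data blocks in its block groups, and parity blocks must be excluded — unlike the contiguous layout, where every block contributes. A hedged illustration of the arithmetic (an RS(6,3) group with 64KB cells is assumed; this is not the WebImageViewer patch itself):

```java
public class StripedFileSize {
    // Logical file length contributed by one block group:
    // the sum of its data-block lengths; parity blocks are excluded.
    static long fileLength(long[] dataBlockLengths) {
        long total = 0;
        for (long len : dataBlockLengths) total += len;
        return total;
    }

    public static void main(String[] args) {
        long cell = 64 * 1024;
        // One RS(6,3) block group: two full cells, one half cell, and
        // three empty data blocks. The 3 parity blocks are not counted.
        long[] data = { cell, cell, cell / 2, 0, 0, 0 };
        assert fileLength(data) == 160 * 1024;
        System.out.println("ok");
    }
}
```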
[jira] [Commented] (HDFS-8229) LAZY_PERSIST file gets deleted after NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521860#comment-14521860 ] Arpit Agarwal commented on HDFS-8229: - +1 for the patch. Thanks for the updates [~surendrasingh]. I kicked off another pre-commit build since the previous results look wrong. LAZY_PERSIST file gets deleted after NameNode restart. -- Key: HDFS-8229 URL: https://issues.apache.org/jira/browse/HDFS-8229 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Attachments: HDFS-8229.patch, HDFS-8229_1.patch, HDFS-8229_2.patch {code} 2015-04-20 10:26:55,180 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Removing lazyPersist file /LAZY_PERSIST/smallfile with no replicas. {code} If {{LazyPersistFileScrubber}} runs after a NameNode restart but before the DNs register, it will delete the lazy-persist file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-8224) Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah reassigned HDFS-8224: Assignee: Rushabh S Shah Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error Key: HDFS-8224 URL: https://issues.apache.org/jira/browse/HDFS-8224 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Rushabh S Shah Assignee: Rushabh S Shah Fix For: 2.8.0 This happened in our 2.6 cluster. One of the blocks and its metadata file were corrupted. The disk was healthy in this case; only the block was corrupt. The NameNode tried to copy that block to another datanode, but the transfer failed with the following stack trace: 2015-04-20 01:04:04,421 [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN datanode.DataNode: DatanodeRegistration(a.b.c.d, datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, infoSecurePort=0, ipcPort=8020, storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to a1.b1.c1.d1:1004 got java.io.IOException: Could not create DataChecksum of type 0 with bytesPerChecksum 0 at org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125) at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175) at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140) at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102) at org.apache.hadoop.hdfs.server.datanode.BlockSender.init(BlockSender.java:287) at org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989) at java.lang.Thread.run(Thread.java:722) The following catch block in the DataTransfer#run method treats every IOException as a disk-error fault and runs the disk-error check:
{noformat}
catch (IOException ie) {
  LOG.warn(bpReg + ":Failed to transfer " + b + " to " + targets[0] + " got ", ie);
  // check if there are any disk problem
  checkDiskErrorAsync();
}
{noformat}
This block was never scanned by the BlockPoolSliceScanner; otherwise it would have been reported as a corrupt block. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
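A sketch of the direction the report implies: only kick off the asynchronous disk check for exceptions that plausibly indicate a disk fault, not for data-level failures such as the corrupt checksum header above. `looksLikeDiskError` is a hypothetical classifier for illustration, not the eventual fix:

```java
import java.io.FileNotFoundException;
import java.io.IOException;

public class DiskErrorClassifier {
    // Heuristic (assumed, not Hadoop's): missing block files or kernel
    // I/O errors suggest a bad disk; a corrupt metadata header is data
    // corruption on a healthy disk and should not trigger the disk check.
    static boolean looksLikeDiskError(IOException e) {
        if (e instanceof FileNotFoundException) return true;
        String msg = e.getMessage();
        return msg != null && msg.contains("Input/output error");
    }

    public static void main(String[] args) {
        // The exception from this report: corrupt meta file, healthy disk.
        IOException corruptMeta = new IOException(
            "Could not create DataChecksum of type 0 with bytesPerChecksum 0");
        assert !looksLikeDiskError(corruptMeta);
        assert looksLikeDiskError(new FileNotFoundException("blk_123.meta"));
        assert looksLikeDiskError(new IOException("Input/output error"));
        System.out.println("ok");
    }
}
```

With such a classifier, the catch block would call `checkDiskErrorAsync()` only when `looksLikeDiskError(ie)` holds, and simply log (or mark the block corrupt) otherwise.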