[jira] [Created] (HDFS-11709) StandbyCheckpointer should handle an non-existing legacyOivImageDir gracefully
Zhe Zhang created HDFS-11709:
--------------------------------

             Summary: StandbyCheckpointer should handle an non-existing legacyOivImageDir gracefully
                 Key: HDFS-11709
                 URL: https://issues.apache.org/jira/browse/HDFS-11709
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: ha, namenode
    Affects Versions: 2.6.1
            Reporter: Zhe Zhang
            Assignee: Erik Krogen
            Priority: Critical


In {{StandbyCheckpointer}}, if the legacy OIV directory is not properly created, or was deleted for some reason (e.g. mis-operation), all checkpoint ops will fail. Not only will the ANN stop receiving new fsimages, but the JNs will also fill up with edit log files and cause the NN to crash.
{code}
      // Save the legacy OIV image, if the output dir is defined.
      String outputDir = checkpointConf.getLegacyOivImageDir();
      if (outputDir != null && !outputDir.isEmpty()) {
        img.saveLegacyOIVImage(namesystem, outputDir, canceler);
      }
{code}

It doesn't make sense to let such an unimportant part (saving OIV) abort all checkpoints and cause an NN crash (and possibly data loss).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
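The behaviour HDFS-11709 asks for could be sketched roughly as follows. This is a standalone toy model, not the StandbyCheckpointer code and not the patch for this issue: the class LegacyOivSketch and both of its methods are hypothetical names, and the stand-in save only checks that the directory exists. The point it illustrates is that a failure in the optional OIV step is logged and swallowed so that the surrounding checkpoint can continue.

```java
import java.io.File;
import java.io.IOException;

/** Toy model of a checkpoint step that tolerates a missing legacy OIV directory. */
final class LegacyOivSketch {

  /** Hypothetical stand-in for the legacy OIV save; fails when the directory is absent. */
  static void saveLegacyOivImage(String outputDir) throws IOException {
    File dir = new File(outputDir);
    if (!dir.isDirectory()) {
      throw new IOException("Legacy OIV directory does not exist: " + outputDir);
    }
    // A real implementation would write the image here.
  }

  /**
   * Runs the optional OIV save and reports whether it succeeded.
   * A failure is logged and swallowed so the caller can continue the checkpoint.
   */
  static boolean trySaveLegacyOivImage(String outputDir) {
    if (outputDir == null || outputDir.isEmpty()) {
      return false; // Not configured: nothing to do.
    }
    try {
      saveLegacyOivImage(outputDir);
      return true;
    } catch (IOException e) {
      System.err.println("WARN: skipping legacy OIV image: " + e.getMessage());
      return false;
    }
  }

  public static void main(String[] args) {
    boolean saved = trySaveLegacyOivImage("/nonexistent/legacy-oiv-dir");
    System.out.println("legacy OIV saved: " + saved + "; checkpoint continues");
  }
}
```

Whether the real fix should skip the step, recreate the directory, or do something else is left open here; the sketch only shows the "do not abort the checkpoint" part.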
[jira] [Created] (HDFS-11708) positional read will fail if replicas moved to different DNs after stream is opened
Vinayakumar B created HDFS-11708:
------------------------------------

             Summary: positional read will fail if replicas moved to different DNs after stream is opened
                 Key: HDFS-11708
                 URL: https://issues.apache.org/jira/browse/HDFS-11708
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs-client
    Affects Versions: 2.7.3
            Reporter: Vinayakumar B
            Assignee: Vinayakumar B
            Priority: Critical


Scenario:
1. File was written to DN1, DN2 with RF=2
2. File stream opened to read and kept. Block Locations are [DN1,DN2]
3. One of the replicas (DN2) was moved to another datanode (DN3) due to datanode dead/balancing/etc.
4. Latest block locations in NameNode will be DN1 and DN3.
5. DN1 went down, but is not yet detected as dead in NameNode.
6. Client starts reading using positional read api "read(pos, buf[], offset, length)"
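The scenario above can be modelled with a small standalone sketch. It is a toy model under stated assumptions, not DFSInputStream code: the class StaleLocationSketch and its methods are hypothetical, a datanode is assumed to serve a read only if it is up and still holds the replica, and refetching locations once after all cached ones fail is just one conceivable remedy, not necessarily the fix adopted for HDFS-11708.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the scenario: cached locations go stale while the stream is open. */
final class StaleLocationSketch {
  // Datanodes that currently hold a replica, as the NameNode would report them.
  static final List<String> NAMENODE_LOCATIONS = Arrays.asList("DN1", "DN3");
  // Datanodes that are down (DN1 is down but not yet marked dead).
  static final Set<String> DOWN = new HashSet<>(Arrays.asList("DN1"));

  /** A read succeeds only on a live datanode that still holds the replica. */
  static boolean canServe(String dn) {
    return !DOWN.contains(dn) && NAMENODE_LOCATIONS.contains(dn);
  }

  /** Tries each location in order; returns the serving datanode or null. */
  static String readFrom(List<String> locations) {
    for (String dn : locations) {
      if (canServe(dn)) {
        return dn;
      }
    }
    return null;
  }

  /** Reads with the cached locations, refetching them once if all of them fail. */
  static String readWithRefresh(List<String> cached) {
    String dn = readFrom(cached);
    if (dn == null) {
      // Hypothetical refetch of the latest locations from the NameNode.
      dn = readFrom(new ArrayList<>(NAMENODE_LOCATIONS));
    }
    return dn;
  }

  public static void main(String[] args) {
    List<String> cached = Arrays.asList("DN1", "DN2"); // locations at open time
    System.out.println("cached only:  " + readFrom(cached));
    System.out.println("with refresh: " + readWithRefresh(cached));
  }
}
```

With only the locations cached at open time, neither DN1 (down) nor DN2 (replica moved away) can serve the read, so the first call returns null; after a refetch the read is served by DN3.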
[jira] [Reopened] (HDFS-8872) Reporting of missing blocks is different in fsck and namenode ui/metasave
     [ https://issues.apache.org/jira/browse/HDFS-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang reopened HDFS-8872:
-----------------------------

> Reporting of missing blocks is different in fsck and namenode ui/metasave
> --------------------------------------------------------------------------
>
>                 Key: HDFS-8872
>                 URL: https://issues.apache.org/jira/browse/HDFS-8872
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Rushabh S Shah
>            Assignee: Rushabh S Shah
>
> Namenode ui and metasave will not report a block as missing if the only
> replica is on a decommissioning/decommissioned node, while fsck will show it as
> MISSING.
> Since a decommissioned node can be formatted/removed anytime, we can actually
> lose the block.
> It's better to alert on namenode ui if the only copy is on a
> decommissioned/decommissioning node.
[jira] [Resolved] (HDFS-8872) Reporting of missing blocks is different in fsck and namenode ui/metasave
     [ https://issues.apache.org/jira/browse/HDFS-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang resolved HDFS-8872.
-----------------------------
    Resolution: Duplicate

> Reporting of missing blocks is different in fsck and namenode ui/metasave
> --------------------------------------------------------------------------
>
>                 Key: HDFS-8872
>                 URL: https://issues.apache.org/jira/browse/HDFS-8872
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Rushabh S Shah
>            Assignee: Rushabh S Shah
>
> Namenode ui and metasave will not report a block as missing if the only
> replica is on a decommissioning/decommissioned node, while fsck will show it as
> MISSING.
> Since a decommissioned node can be formatted/removed anytime, we can actually
> lose the block.
> It's better to alert on namenode ui if the only copy is on a
> decommissioned/decommissioning node.
Re: About 2.7.4 Release
> On Apr 25, 2017, at 12:35 AM, Akira Ajisaka wrote:
>
> > Maybe we should create a jira to track this?
>
> I think now either way (reopen or create) is fine.
>
> Release doc maker creates change logs by fetching information from JIRA, so
> reopening the tickets should be avoided when a release process is in progress.
>

Keep in mind that the release documentation is part of the build process. Users who are doing their own builds will have incomplete documentation if we keep re-opening JIRAs after a release. At one point, JIRA was configured to refuse re-opening after a release is cut. I'm not sure why it stopped doing that, but it might be time to see if we can re-enable that functionality.
[jira] [Created] (HDFS-11707) TestDirectoryScanner#testThrottling fails on OSX
Erik Krogen created HDFS-11707:
----------------------------------

             Summary: TestDirectoryScanner#testThrottling fails on OSX
                 Key: HDFS-11707
                 URL: https://issues.apache.org/jira/browse/HDFS-11707
             Project: Hadoop HDFS
          Issue Type: Test
          Components: test
    Affects Versions: 2.8.0
            Reporter: Erik Krogen
            Priority: Minor


In branch-2 and trunk, {{TestDirectoryScanner#testThrottling}} consistently fails on OS X (I'm running 10.11 specifically) with:
{code}
java.lang.AssertionError: Throttle is too permissive
{code}
It seems to work alright on Unix systems.
[jira] [Created] (HDFS-11706) Enable fallback to regular distcp when distcp failed with snapshot diff
Yongjun Zhang created HDFS-11706:
------------------------------------

             Summary: Enable fallback to regular distcp when distcp failed with snapshot diff
                 Key: HDFS-11706
                 URL: https://issues.apache.org/jira/browse/HDFS-11706
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Yongjun Zhang


When snapshot-based distcp (-diff) failed, it used to fall back to regular distcp. However, the fallback was disabled by HDFS-10313, for a couple of reasons:

# Safety. For example, if the user passed a wrong parameter to the command (especially the snapshot name), the sync step could fail.
# -diff doesn't allow the -delete option, which means that even if we fall back to regular distcp, distcp doesn't know whether -delete should be applied.

There are two possible approaches to solve this problem:

* introduce a new command line switch to tell the fallback run whether to enable -delete
* let the command line option parser remember whether -delete was passed initially. If -delete was passed, disable -delete when -diff is passed, then re-enable -delete when falling back.

This jira is to implement one of these approaches.

This applies to -rdiff too.
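The second approach listed above (remember -delete, suppress it while -diff is active, restore it for the fallback run) might look roughly like this. It is a minimal standalone sketch and not DistCp's actual option-handling code: the class DistCpFallbackSketch and its methods are hypothetical names introduced only for illustration.

```java
/** Toy model of the second approach: remember -delete, suppress it under -diff. */
final class DistCpFallbackSketch {
  private final boolean deleteRequested; // whether the user passed -delete
  private final boolean diffRequested;   // whether the user passed -diff (or -rdiff)

  DistCpFallbackSketch(boolean deleteRequested, boolean diffRequested) {
    this.deleteRequested = deleteRequested;
    this.diffRequested = diffRequested;
  }

  /** -delete in effect for the first attempt: suppressed while -diff is active. */
  boolean deleteForFirstAttempt() {
    return deleteRequested && !diffRequested;
  }

  /** -delete in effect for the fallback run: restored if originally passed. */
  boolean deleteForFallback() {
    return deleteRequested;
  }

  public static void main(String[] args) {
    DistCpFallbackSketch opts = new DistCpFallbackSketch(true, true);
    System.out.println("first attempt -delete: " + opts.deleteForFirstAttempt());
    System.out.println("fallback -delete:      " + opts.deleteForFallback());
  }
}
```

Keeping the user's original intent in a separate field, instead of overwriting it when -diff is parsed, is what lets the fallback run recover it later.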
[jira] [Created] (HDFS-11705) BUG: Inconsistent storagespace for directory
Wei-Chiu Chuang created HDFS-11705:
--------------------------------------

             Summary: BUG: Inconsistent storagespace for directory
                 Key: HDFS-11705
                 URL: https://issues.apache.org/jira/browse/HDFS-11705
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: snapshots
    Affects Versions: 3.0.0-alpha2
            Reporter: Wei-Chiu Chuang


I was running a test, TestRenameWithSnapshots.testDu, and found an error in the log:
{noformat}
2017-04-26 05:07:30,229 [IPC Server handler 7 on 52578] ERROR namenode.NameNode (DirectoryWithQuotaFeature.java:checkStoragespace(141)) - BUG: Inconsistent storagespace for directory /testDu. Cached = 6144 != Computed = 3072
{noformat}
The test completed without failure nonetheless.