[jira] [Commented] (HDFS-8402) Fsck exit codes are not reliable
[ https://issues.apache.org/jira/browse/HDFS-8402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15425311#comment-15425311 ]

Hadoop QA commented on HDFS-8402:
---------------------------------

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s{color} | {color:red} HDFS-8402 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12732951/HDFS-8402.patch |
| JIRA Issue | HDFS-8402 |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16458/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Fsck exit codes are not reliable
> --------------------------------
>
> Key: HDFS-8402
> URL: https://issues.apache.org/jira/browse/HDFS-8402
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.7.0
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Attachments: HDFS-8402.patch
>
>
> HDFS-6663 added the ability to check specific blocks. The exit code is
> non-deterministically based on the state (corrupt, healthy, etc) of the last
> displayed block's last storage location - instead of whether any of the
> checked blocks' storages are corrupt. Blocks with decommissioning or
> decommissioned nodes should not be flagged as an error.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8402) Fsck exit codes are not reliable
[ https://issues.apache.org/jira/browse/HDFS-8402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15058286#comment-15058286 ]

Kihwal Lee commented on HDFS-8402:
----------------------------------

The changes look good, but the patch no longer applies to trunk. Can you refresh the patch?
[jira] [Commented] (HDFS-8402) Fsck exit codes are not reliable
[ https://issues.apache.org/jira/browse/HDFS-8402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548469#comment-14548469 ]

Daryn Sharp commented on HDFS-8402:
-----------------------------------

The block check feature was only added in 2.7.0. I'll elaborate further on why the exit code for the new feature is completely broken and, I'd say, incompatible with POSIX standards and user expectations.

The existing path scan output has a last line of "The filesystem ... is (HEALTHY|CORRUPT)". The fsck client checks only the last line to determine the exit code. Quite fragile, but it is what it is. The new block scan does not emit a final line, which this patch adds.

The block scan feature naively relies on the final line of "Block replica ... is (HEALTHY|CORRUPT)". That means fsck returns:
# Success even if any of the 1..n-1 scanned blocks have corrupt replicas. It gets worse.
# Success even if the last block's 1..n-1 replicas are corrupt. It gets worse.
# Success even if the last block's last replica is CORRUPT. Fsck checks endsWith CORRUPT, which doesn't match "Block replica .. is CORRUPT Reason:blah".
# Summary: Always success even if blocks are corrupt. How useful is that?

There are two options for fixing the exit code. One is to pattern match all lines to determine a block scan's exit code, which makes fsck even more fragile and doesn't fix old clients. The other is to emit a final line similar to the path scan's, which allows both old and new fsck clients (compatible?) to return the correct exit code. I chose the latter because it's a 2.7 feature and it's doubtful anyone is relying on the last line, since it means nothing.

(The test failure isn't related)
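
The failure mode described in this comment can be illustrated with a minimal Java sketch. This is not the actual DFSck or NamenodeFsck code. The class FsckExitCodeSketch, its method names, the dn1..dnN node names, the "Block scan is ..." summary wording, and the choice to treat any unmatched last line as success are all assumptions made for illustration. Only the endsWith check on the last line and the trailing "Reason:blah" come from the discussion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the real DFSck/NamenodeFsck code.
class FsckExitCodeSketch {
    enum ReplicaState { HEALTHY, CORRUPT, DECOMMISSIONING, DECOMMISSIONED }

    // Client side: derive the exit code from the last output line only.
    // Assumption: a last line that does not end with CORRUPT means success.
    static int exitCodeFromLastLine(List<String> lines) {
        String last = lines.isEmpty() ? "" : lines.get(lines.size() - 1);
        return last.endsWith("CORRUPT") ? 1 : 0;
    }

    // Per-replica line. A corrupt replica carries a trailing reason, so the
    // line no longer ends with the word CORRUPT.
    static String replicaLine(int index, ReplicaState state) {
        String line = "Block replica on dn" + index + " is " + state;
        return state == ReplicaState.CORRUPT ? line + " Reason:blah" : line;
    }

    // Final summary line (hypothetical wording). Only CORRUPT replicas count
    // as an error; decommissioning and decommissioned replicas do not.
    static String summaryLine(List<ReplicaState> states) {
        boolean anyCorrupt = states.contains(ReplicaState.CORRUPT);
        return "Block scan is " + (anyCorrupt ? "CORRUPT" : "HEALTHY");
    }

    // Builds the scan output, optionally followed by the summary line.
    static List<String> scanOutput(List<ReplicaState> states, boolean withSummary) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < states.size(); i++) {
            lines.add(replicaLine(i + 1, states.get(i)));
        }
        if (withSummary) {
            lines.add(summaryLine(states));
        }
        return lines;
    }

    public static void main(String[] args) {
        List<ReplicaState> states = Arrays.asList(
            ReplicaState.CORRUPT, ReplicaState.HEALTHY, ReplicaState.CORRUPT);
        // Without a summary line the corrupt replicas go unnoticed: prints 0.
        System.out.println(exitCodeFromLastLine(scanOutput(states, false)));
        // With a summary line the same last-line check reports failure: prints 1.
        System.out.println(exitCodeFromLastLine(scanOutput(states, true)));
    }
}
```

In this sketch the client-side check is identical in both runs. Only the server output changes, which is why a final summary line would also help clients that are not upgraded.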
[jira] [Commented] (HDFS-8402) Fsck exit codes are not reliable
[ https://issues.apache.org/jira/browse/HDFS-8402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548474#comment-14548474 ]

Allen Wittenauer commented on HDFS-8402:
----------------------------------------

An incompatible change is an incompatible change, even if made with the best of intentions.

Given the complete disaster that branch-2 has turned into, whether it's an incompatible change or not is sort of irrelevant, imo.
[jira] [Commented] (HDFS-8402) Fsck exit codes are not reliable
[ https://issues.apache.org/jira/browse/HDFS-8402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548570#comment-14548570 ]

Daryn Sharp commented on HDFS-8402:
-----------------------------------

I'm unclear if you are blocking this patch? I feel it's a bit hardline to deem a bug fix for a new feature in 2.7.0 as incompatible. Would you prefer if I ignore the bug and just hack the test to make it pass again? If no, what are my options to move forward?
[jira] [Commented] (HDFS-8402) Fsck exit codes are not reliable
[ https://issues.apache.org/jira/browse/HDFS-8402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549120#comment-14549120 ]

Allen Wittenauer commented on HDFS-8402:
----------------------------------------

Nope, I'm not blocking the patch. I just want clarity in release notes, etc., in case someone stumbles upon this later because it broke stuff.

As a community, we've clearly thrown backward compatibility for ops under a huge bus, so it doesn't really matter whether this goes into branch-2 or not.
[jira] [Commented] (HDFS-8402) Fsck exit codes are not reliable
[ https://issues.apache.org/jira/browse/HDFS-8402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544666#comment-14544666 ]

Hadoop QA commented on HDFS-8402:
---------------------------------

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 14m 31s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 30s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 33s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 2m 15s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 3m 3s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native | 3m 16s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 166m 52s | Tests failed in hadoop-hdfs. |
| | | 209m 31s | |

|| Reason || Tests ||
| Failed unit tests | hadoop.tools.TestHdfsConfigFields |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12732951/HDFS-8402.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 15ccd96 |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10987/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10987/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10987/console |

This message was automatically generated.