[jira] [Commented] (HDFS-12070) Failed block recovery leaves files open indefinitely and at risk for data loss
[ https://issues.apache.org/jira/browse/HDFS-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377153#comment-16377153 ] Hudson commented on HDFS-12070: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13716 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13716/]) HDFS-12070. Failed block recovery leaves files open indefinitely and at (kihwal: rev 451265a83d8798624ae2a144bc58fa41db826704) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestLeaseRecovery.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockRecoveryWorker.java > Failed block recovery leaves files open indefinitely and at risk for data loss > -- > > Key: HDFS-12070 > URL: https://issues.apache.org/jira/browse/HDFS-12070 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0-alpha >Reporter: Daryn Sharp >Assignee: Kihwal Lee >Priority: Major > Attachments: HDFS-12070.0.patch, HDFS-12070.1.patch, lease.patch > > > Files will remain open indefinitely if block recovery fails which creates a > high risk of data loss. The replication monitor will not replicate these > blocks. > The NN provides the primary node a list of candidate nodes for recovery which > involves a 2-stage process. The primary node removes any candidates that > cannot init replica recovery (essentially alive and knows about the block) to > create a sync list. Stage 2 issues updates to the sync list – _but fails if > any node fails_ unlike the first stage. The NN should be informed of nodes > that did succeed. > Manual recovery will also fail until the problematic node is temporarily > stopped so a connection refused will induce the bad node to be pruned from > the candidates. Recovery succeeds, the lease is released, under replication > is fixed, and block is invalidated from the bad node. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12070) Failed block recovery leaves files open indefinitely and at risk for data loss
[ https://issues.apache.org/jira/browse/HDFS-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377109#comment-16377109 ]

Daryn Sharp commented on HDFS-12070:

+1 looks good
[jira] [Commented] (HDFS-12070) Failed block recovery leaves files open indefinitely and at risk for data loss
[ https://issues.apache.org/jira/browse/HDFS-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373799#comment-16373799 ]

genericqa commented on HDFS-12070:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 17m 22s | trunk passed |
| +1 | compile | 1m 5s | trunk passed |
| +1 | checkstyle | 0m 40s | trunk passed |
| +1 | mvnsite | 1m 7s | trunk passed |
| +1 | shadedclient | 11m 56s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 11s | trunk passed |
| +1 | javadoc | 0m 59s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 8s | the patch passed |
| +1 | compile | 0m 59s | the patch passed |
| +1 | javac | 1m 0s | the patch passed |
| -0 | checkstyle | 0m 36s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 31 unchanged - 0 fixed = 32 total (was 31) |
| +1 | mvnsite | 0m 59s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 34s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 9s | the patch passed |
| +1 | javadoc | 0m 56s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 127m 47s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 29s | The patch does not generate ASF License warnings. |
| | | 182m 2s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
| | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
| | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
| | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12070 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12911622/HDFS-12070.1.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 28523be52a12 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 95904f6 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/23156/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.t
[jira] [Commented] (HDFS-12070) Failed block recovery leaves files open indefinitely and at risk for data loss
[ https://issues.apache.org/jira/browse/HDFS-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373582#comment-16373582 ]

Kihwal Lee commented on HDFS-12070:

Attaching an updated patch that commits right away excluding stage-2 failed nodes.
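The behavior described for the updated patch (drop the nodes that fail in stage 2, then commit with the remaining ones and close) can be sketched roughly as below. This is only an illustration under stated assumptions: `StageTwoSyncSketch`, `CommitRequest`, and `syncReplicas` are hypothetical names, datanodes are plain strings, and the replica update is a caller-supplied predicate. It is not the code in `BlockRecoveryWorker`, and throwing when every node fails is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch; none of these names come from BlockRecoveryWorker. */
final class StageTwoSyncSketch {

  /** What the primary datanode would report to the NameNode. */
  static final class CommitRequest {
    final List<String> succeededNodes;
    final boolean closeFile;

    CommitRequest(List<String> succeededNodes, boolean closeFile) {
      this.succeededNodes = Collections.unmodifiableList(succeededNodes);
      this.closeFile = closeFile;
    }
  }

  /**
   * Stage 2: ask every node in the sync list to update its replica.
   * A node whose update fails is dropped instead of failing the whole recovery.
   */
  static CommitRequest syncReplicas(List<String> syncList, Predicate<String> updateReplica) {
    List<String> succeeded = new ArrayList<>();
    for (String node : syncList) {
      boolean updated;
      try {
        updated = updateReplica.test(node);
      } catch (RuntimeException e) {
        updated = false; // treat an exception like any other stage-2 failure
      }
      if (updated) {
        succeeded.add(node);
      }
    }
    if (succeeded.isEmpty()) {
      // Assumption of this sketch: with no good replica there is nothing to commit.
      throw new IllegalStateException("all nodes failed stage 2");
    }
    return new CommitRequest(succeeded, true); // commit and close right away
  }

  public static void main(String[] args) {
    CommitRequest req = syncReplicas(Arrays.asList("dn1", "dn2", "dn3"),
        node -> !node.equals("dn2")); // dn2 fails its update
    System.out.println(req.succeededNodes + " close=" + req.closeFile);
  }
}
```

The point of the design is that one bad node no longer blocks the commit: the NameNode hears about the nodes that did succeed, which is what the issue description asks for.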
[jira] [Commented] (HDFS-12070) Failed block recovery leaves files open indefinitely and at risk for data loss
[ https://issues.apache.org/jira/browse/HDFS-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373276#comment-16373276 ]

Kihwal Lee commented on HDFS-12070:

It would certainly be better if we could close the file right away. The design doc specified that one more stage be added for safety, as the state of a stage-2 failed replica is unknown. Let's examine whether any "unknown state" causes an issue after the replica is excluded from the commit/close.

If a stage-2 failure occurs before the gen stamp is updated, there is no issue. So we can safely assume that closing right away creates no problems for early stage-2 failures.

After a series of checks, the update proceeds as follows:
1) The in-memory gen stamp of the RUR replica is updated.
2) The meta file is renamed accordingly.
3) The data file is truncated if necessary.
4) The size is set in the in-memory RUR replica.
5) The replica is finalized (moved under the finalized dir, new ReplicaInfo created) and then added to the replica map.
6) The replica files are checked (the moves succeeded, but this double-checks).
7) A RECEIVED IBR is sent explicitly. This is asynchronous, fire-and-forget.

- A failure in 1) or right after 1) is not an issue. The replica stays as RUR until the DN restarts, so it won't be reported to the NN. After the DN restarts, it turns into RWR/RBW and the gen stamp reverts to what's on disk. It will get cleaned up.
- A failure between 2) and 4) can update the on-disk gen stamp to the committed one, but the replica stays as RUR.
- A failure between 5) and 6) might leave the on-disk data inconsistent and unusable.
  - If it fails before any rename/move, the files remain in the rbw directory and the replica stays RUR in memory. The rest is the same as the case above.
  - If only one rename fails, the in-memory state won't reflect the on-disk state. Since the replica stays as RUR, it won't be mixed into the normal block locations. Upon DN restart, the replica will not be loaded.
  - If it fails after successful renames, the replica will be loaded and reported as FINALIZED when the DN restarts, at which point it will be an excess replica. No effect on data consistency or durability.
- A failure in 7) causes temporary under-replication. Closing right away does not make it any worse than the retry approach.

In summary, it is safe to close right away.
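The seven-step sequence and the failure cases above can be walked through with a small simulation. This is a sketch under loud assumptions: `ReplicaUpdateSketch` and its fields are hypothetical, the replica is reduced to two gen stamps, a state, and an IBR flag, and "failing at step N" is modeled as aborting before step N takes effect. It does not model the partial-rename case or DN restart, and it is not the datanode's real update code.

```java
/** Hypothetical model of a replica under recovery; not HDFS code. */
final class ReplicaUpdateSketch {
  enum State { RUR, FINALIZED }

  State state = State.RUR;   // in-memory replica state
  long memoryGenStamp;       // gen stamp held in memory
  long diskGenStamp;         // gen stamp implied by the on-disk meta file name
  boolean ibrSent = false;   // whether the RECEIVED IBR went out

  ReplicaUpdateSketch(long genStamp) {
    this.memoryGenStamp = genStamp;
    this.diskGenStamp = genStamp;
  }

  /**
   * Runs steps 1..7 in order. failAtStep (1-7) aborts before that step
   * takes effect; 0 means no failure is injected.
   */
  void update(long recoveryId, int failAtStep) {
    for (int step = 1; step <= 7; step++) {
      if (step == failAtStep) {
        throw new IllegalStateException("simulated failure at step " + step);
      }
      switch (step) {
        case 1: memoryGenStamp = recoveryId; break; // update in-memory gen stamp
        case 2: diskGenStamp = recoveryId; break;   // rename meta file
        case 3: break;                              // truncate data file if needed
        case 4: break;                              // set size in memory
        case 5: state = State.FINALIZED; break;     // finalize, add to replica map
        case 6: break;                              // double-check replica files
        case 7: ibrSent = true; break;              // fire-and-forget RECEIVED IBR
        default: throw new AssertionError("unreachable");
      }
    }
  }

  /** Convenience: run an update and swallow the injected failure. */
  static ReplicaUpdateSketch runWithFailure(long oldGs, long recoveryId, int failAtStep) {
    ReplicaUpdateSketch r = new ReplicaUpdateSketch(oldGs);
    try {
      r.update(recoveryId, failAtStep);
    } catch (IllegalStateException expected) {
      // the replica is left in whatever state the completed steps produced
    }
    return r;
  }

  public static void main(String[] args) {
    for (int failAt = 0; failAt <= 7; failAt++) {
      ReplicaUpdateSketch r = runWithFailure(1001L, 1002L, failAt);
      System.out.println("failAt=" + failAt + " state=" + r.state
          + " memGS=" + r.memoryGenStamp + " diskGS=" + r.diskGenStamp
          + " ibrSent=" + r.ibrSent);
    }
  }
}
```

In this model a failure injected before step 5 always leaves the replica as RUR, matching the argument that such a replica is not mixed into the normal block locations, while a failure at step 7 leaves a finalized replica whose report is merely delayed.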
[jira] [Commented] (HDFS-12070) Failed block recovery leaves files open indefinitely and at risk for data loss
[ https://issues.apache.org/jira/browse/HDFS-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16370138#comment-16370138 ]

Daryn Sharp commented on HDFS-12070:

Back when I filed, I played around with a fix and didn't use close=false. I too read the append design. It reads as if the PD is supposed to obtain a new genstamp and retry, but I don't think a DN can do that. The reasoning for another round of commit sync wasn't explained. Perhaps it was due to the earlier implementation or concerns over concurrent commit syncs, but the recovery id feature should allow the NN to weed out prior commit syncs.

My concern is that the NN has claimed the lease during commit sync. Append, truncate, and non-overwrite creates will trigger an implicit commit sync. Normally it completes almost immediately, roughly within the heartbeat interval, and the client succeeds on retry. If another round of commit sync is required due to close=false, the client can only re-trigger commit sync after the soft lease period (5 mins). I don't think a client does or should retry for that long, which means the operation will fail unnecessarily. Also, it will take up to the hard lease period (1 hour) for the NN to fix the under-replication.

In either case (close=true/false), the NN has removed the failed DNs from the expected locations. Bad blocks should be invalidated if/when "failed" DNs block report in the wrong genstamp and/or size, so I think it's safe for the PD to ignore failed nodes and close?
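The remark that the recovery id lets the NN weed out prior commit syncs, and the difference between close=true and close=false, can be illustrated with a toy model. Everything here is an assumption made for illustration: `CommitSyncSketch` and its fields are hypothetical, the method only borrows the name `commitBlockSynchronization`, and the sketch assumes each commit carries the recovery id as its new generation stamp. It is not the NameNode implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical NameNode-side view of one block under recovery; not HDFS code. */
final class CommitSyncSketch {
  private long currentRecoveryId;
  private boolean fileClosed = false;
  private List<String> expectedLocations = new ArrayList<>();

  /** A new recovery attempt supersedes any earlier one. */
  void startRecovery(long recoveryId) {
    this.currentRecoveryId = recoveryId;
  }

  /**
   * Accepts a commit only if it belongs to the latest recovery attempt.
   * Returns false for a stale commit, which leaves all state untouched.
   */
  boolean commitBlockSynchronization(long newGenStamp, boolean closeFile,
      List<String> succeededNodes) {
    if (newGenStamp != currentRecoveryId) {
      return false; // commit from a prior recovery attempt: ignore it
    }
    // Failed nodes are simply absent from the reported list, so they drop
    // out of the expected locations whether or not the file is closed.
    expectedLocations = new ArrayList<>(succeededNodes);
    if (closeFile) {
      fileClosed = true; // close=false would leave the file open for another round
    }
    return true;
  }

  boolean isFileClosed() {
    return fileClosed;
  }

  List<String> getExpectedLocations() {
    return new ArrayList<>(expectedLocations);
  }

  public static void main(String[] args) {
    CommitSyncSketch nn = new CommitSyncSketch();
    nn.startRecovery(1002L);
    nn.startRecovery(1003L); // a later attempt replaces the earlier one
    System.out.println("stale accepted: "
        + nn.commitBlockSynchronization(1002L, true, Arrays.asList("dn1")));
    System.out.println("current accepted: "
        + nn.commitBlockSynchronization(1003L, true, Arrays.asList("dn1", "dn3")));
    System.out.println("closed=" + nn.isFileClosed()
        + " locations=" + nn.getExpectedLocations());
  }
}
```

With a check of this kind, a late commit from an older attempt cannot overwrite the result of a newer one, which is why concurrent commit syncs alone would not seem to justify an extra round.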
[jira] [Commented] (HDFS-12070) Failed block recovery leaves files open indefinitely and at risk for data loss
[ https://issues.apache.org/jira/browse/HDFS-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363210#comment-16363210 ]

Kihwal Lee commented on HDFS-12070:

{noformat}
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
[INFO] Tests run: 22, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 108.315 s - in org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
[INFO] Running org.apache.hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 80.609 s - in org.apache.hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits
[INFO] Running org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure
[WARNING] Tests run: 18, Failures: 0, Errors: 0, Skipped: 10, Time elapsed: 124.02 s - in org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure
[INFO] Running org.apache.hadoop.hdfs.web.TestWebHdfsTimeouts
[INFO] Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.039 s - in org.apache.hadoop.hdfs.web.TestWebHdfsTimeouts
[INFO] Running org.apache.hadoop.hdfs.TestMaintenanceState
[INFO] Tests run: 25, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 353.62 s - in org.apache.hadoop.hdfs.TestMaintenanceState
[INFO]
[INFO] Results:
[INFO]
[WARNING] Tests run: 87, Failures: 0, Errors: 0, Skipped: 10
{noformat}
[jira] [Commented] (HDFS-12070) Failed block recovery leaves files open indefinitely and at risk for data loss
[ https://issues.apache.org/jira/browse/HDFS-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363160#comment-16363160 ]

Kihwal Lee commented on HDFS-12070:

The failed (long) tests in precommit are all passing on my machine.
[jira] [Commented] (HDFS-12070) Failed block recovery leaves files open indefinitely and at risk for data loss
[ https://issues.apache.org/jira/browse/HDFS-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362937#comment-16362937 ]

genericqa commented on HDFS-12070:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 26s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 15m 24s | trunk passed |
| +1 | compile | 0m 53s | trunk passed |
| +1 | checkstyle | 0m 34s | trunk passed |
| +1 | mvnsite | 0m 56s | trunk passed |
| +1 | shadedclient | 10m 22s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 50s | trunk passed |
| +1 | javadoc | 0m 52s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 55s | the patch passed |
| +1 | compile | 0m 48s | the patch passed |
| +1 | javac | 0m 48s | the patch passed |
| -0 | checkstyle | 0m 32s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 31 unchanged - 0 fixed = 33 total (was 31) |
| +1 | mvnsite | 0m 53s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 23s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 52s | the patch passed |
| +1 | javadoc | 0m 54s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 133m 55s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 23s | The patch does not generate ASF License warnings. |
| | | 181m 51s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
| | hadoop.hdfs.web.TestWebHdfsTimeouts |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
| | hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits |
| | hadoop.hdfs.TestMaintenanceState |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12070 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910415/HDFS-12070.0.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 2e41f21fedf7 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c5e6e3d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/23049/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/23049/artifact
[jira] [Commented] (HDFS-12070) Failed block recovery leaves files open indefinitely and at risk for data loss
[ https://issues.apache.org/jira/browse/HDFS-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362616#comment-16362616 ]

Kihwal Lee commented on HDFS-12070:

Finally got around to adding a test case. From the test output, I see {{commitBlockSynchronization()}} being called with {{close=false}} instead of simply blowing up. After this, the next recovery attempt succeeds. Without the fix, the recovery never succeeds.
[jira] [Commented] (HDFS-12070) Failed block recovery leaves files open indefinitely and at risk for data loss
[ https://issues.apache.org/jira/browse/HDFS-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16321016#comment-16321016 ]

genericqa commented on HDFS-12070:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 26s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 18m 58s | trunk passed |
| +1 | compile | 1m 10s | trunk passed |
| +1 | checkstyle | 0m 43s | trunk passed |
| +1 | mvnsite | 1m 19s | trunk passed |
| +1 | shadedclient | 13m 4s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 10s | trunk passed |
| +1 | javadoc | 0m 57s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 12s | the patch passed |
| +1 | compile | 1m 2s | the patch passed |
| +1 | javac | 1m 2s | the patch passed |
| -0 | checkstyle | 0m 40s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 2 unchanged - 0 fixed = 3 total (was 2) |
| +1 | mvnsite | 1m 7s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 59s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 9s | the patch passed |
| +1 | javadoc | 1m 1s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 129m 53s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 22s | The patch does not generate ASF License warnings. |
| | | 188m 0s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
| | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
| | hadoop.hdfs.TestSafeModeWithStripedFileWithRandomECPolicy |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12070 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12905500/lease.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux b64fafa48951 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1a09da7 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apac
[jira] [Commented] (HDFS-12070) Failed block recovery leaves files open indefinitely and at risk for data loss
[ https://issues.apache.org/jira/browse/HDFS-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292994#comment-16292994 ]

Kihwal Lee commented on HDFS-12070:

To complete the history lesson, I traced down when {{closeFile}} was added to {{commitBlockSynchronization()}} and why no one is calling it with {{false}} anymore.

It turns out the {{closeFile}} argument has existed since the dawn of {{commitBlockSynchronization()}}. It was added by HADOOP-3310 to 0.18 in 2008. The old append depended on it. Even then, normal lease recovery would always call it with {{closeFile == true}}. There was a new {{ClientDatanodeProtocol}} method, {{recoverBlock()}}, which caused {{commitBlockSynchronization()}} to be called with {{closeFile == false}}. I guess this disappeared when the {{recoverBlock()}} client command was removed from the datanode. Today, a {{recoverLease()}} call to the namenode can be used instead.

It is really fortunate that the {{closeFile}} option was added initially and has survived for 9 years in spite of the lack of use.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12070) Failed block recovery leaves files open indefinitely and at risk for data loss
[ https://issues.apache.org/jira/browse/HDFS-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292929#comment-16292929 ]

Kihwal Lee commented on HDFS-12070:

bq. the PD needs to ... tell the namenode to exclude the failed node from the expected locations.

It appears that calling {{commitBlockSynchronization()}} with {{closeFile == false}} might do the trick. On the NN side, we could make it do block/lease recovery again soon. The older NNs will still work, but with a 1 hour delay until the retry.
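The idea in the comment above can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: {{OpenBlock}}, its fields and the two-argument method are hypothetical and heavily simplified, and they are not the real {{commitBlockSynchronization()}} signature or NameNode code. The sketch only shows the intended effect: a commit with {{closeFile == false}} narrows the expected locations to the nodes that succeeded, leaves the file open, and asks for an early recovery retry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: hypothetical types, not Hadoop classes. */
final class CommitWithoutCloseSketch {

  /** Hypothetical namenode-side state for the last block of an open file. */
  static final class OpenBlock {
    List<String> expectedLocations;
    boolean fileClosed = false;
    boolean recoveryRetryRequested = false;

    OpenBlock(List<String> locations) {
      this.expectedLocations = new ArrayList<>(locations);
    }

    /**
     * Loosely modelled on the closeFile flag discussed above.
     * The parameter list is an assumption, not the real signature.
     */
    void commitBlockSynchronization(boolean closeFile, List<String> newTargets) {
      // Either way, only the nodes that succeeded remain expected locations.
      expectedLocations = new ArrayList<>(newTargets);
      if (closeFile) {
        fileClosed = true;
      } else {
        // Proposed behaviour: keep the file open and retry recovery soon,
        // instead of waiting for the regular 1 hour retry.
        recoveryRetryRequested = true;
      }
    }
  }

  public static void main(String[] args) {
    OpenBlock block = new OpenBlock(Arrays.asList("dn1", "dn2", "dn3"));
    // dn2 failed in stage 2, so the primary datanode reports only dn1 and dn3.
    block.commitBlockSynchronization(false, Arrays.asList("dn1", "dn3"));
    System.out.println("locations=" + block.expectedLocations
        + " closed=" + block.fileClosed
        + " retry=" + block.recoveryRetryRequested);
  }
}
```

In this sketch the next recovery attempt would no longer include the failed node, which is the property the quoted design requirement asks for.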
[jira] [Commented] (HDFS-12070) Failed block recovery leaves files open indefinitely and at risk for data loss
[ https://issues.apache.org/jira/browse/HDFS-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292829#comment-16292829 ]

Kihwal Lee commented on HDFS-12070:

I have recently encountered two occurrences of this, both involving a faulty drive. In general, if a recovery attempt on a faulty node fails in stage 1, everything will be fine, as the node will be excluded in stage 2. If it fails only in stage 2, there will be perpetual failures and the lease won't be recovered. The presence of a faulty drive tends to cause this issue.

- Case 1) During the very first recovery attempt, the rename of the meta file succeeded during finalization in stage 2, but the data file rename failed. This caused perpetual failures in subsequent recovery attempts.
- Case 2) The drive containing a replica to be recovered failed and was "removed" hours before the first recovery attempt. But the recovery included the node, and it still managed to find the replica info and successfully complete stage 1. Stage 2 fails because the file system is read-only and files cannot be renamed. Hence perpetual recovery failures.

Fixing case 1 specifically is easy. The existing, limited-scope meta/data file existence check in stage 1 can be expanded to all replica states before proceeding. If the meta or the data file is missing, there is no point in declaring success in stage 1. The node will then be excluded in stage 2, so the recovery will succeed on the second attempt.

{{\_...\_}}

While looking at case 2, though, I realized that it is a much more complicated issue. First of all, why are we failing the entire recovery when a node fails in stage 2? Can't we simply exclude the failed node and commit? The following is the code in stage 2 that causes the recovery failure.

{code}
// If any of the data-nodes failed, the recovery fails, because
// we never know the actual state of the replica on failed data-nodes.
// The recovery should be started over.
if (!failedList.isEmpty()) {
  throw new IOException("Cannot recover " + block
      + ", the following datanodes failed: " + failedList);
}
{code}

This isn't in 0.20.205 or 1.x, but is present in 0.21 and later. It took some time to trace down (I had to go back to svn). It was added by HDFS-658 as a part of the "new" append feature. branch-1 had the "old" append. According to the design doc, the recovery should not end there.

{panel}
6.5. Block Recovery
(...)
c. Recover replicas that participated in length agreement in step b.iv. (Ed. stage 1)
d. PD (Ed. primary datanode) checks the result of c. If no DataNode succeeds, block recovery fails. *If some succeed and some fail, PD gets a new generation stamp from NN and repeats block recovery with the successful DataNodes.*
(...)
{panel}

The current recovery should fail, but the next recovery should be tried with only the successful nodes. This isn't the case (and causes perpetual failures), so we can call this *an incomplete implementation* of the design.

To fully conform to the design, the PD needs to be able to initiate a new recovery or tell the namenode to exclude the failed node from the expected locations. Alternatively, the PD can tell the failed node to reject further participation in recovery of the block, thus making it fail in stage 1. However, this might be less reliable as it involves a faulty node. To trigger an immediate retry of recovery (i.e. not 1 hour later), active notification from the PD to the NN will be necessary.
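The two ideas discussed in the analysis above (a stricter stage 1 check, and tolerating partial failure in stage 2 as the design doc describes) can be sketched in a few lines. This is a minimal simulation under stated assumptions, not the {{BlockRecoveryWorker}} code or the committed patch: {{Replica}}, {{Result}}, {{recover()}} and {{replica()}} are hypothetical names introduced only for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: hypothetical types, not Hadoop classes. */
final class PartialRecoverySketch {

  /** Hypothetical stand-in for one datanode's replica of the block. */
  interface Replica {
    String node();
    /** Stage 1 check: are both the meta file and the data file present? */
    boolean filesPresent();
    /** Stage 2: update the replica; throws if, for example, a rename fails. */
    void update() throws IOException;
  }

  /** What the primary datanode would report after stage 2. */
  static final class Result {
    final List<String> succeeded = new ArrayList<>();
    final List<String> failed = new ArrayList<>();
  }

  static Result recover(List<Replica> candidates) throws IOException {
    // Stage 1: keep only candidates whose replica files are all present.
    List<Replica> syncList = new ArrayList<>();
    for (Replica r : candidates) {
      if (r.filesPresent()) {
        syncList.add(r);
      }
    }

    // Stage 2: update every replica in the sync list, recording failures
    // instead of aborting the whole recovery on the first one.
    Result result = new Result();
    for (Replica r : syncList) {
      try {
        r.update();
        result.succeeded.add(r.node());
      } catch (IOException e) {
        result.failed.add(r.node());
      }
    }

    // In this sketch only a total failure fails the recovery.
    if (result.succeeded.isEmpty()) {
      throw new IOException(
          "Cannot recover block, no datanode succeeded; failed: " + result.failed);
    }
    return result;
  }

  /** Builds a fake replica for the demonstration below. */
  static Replica replica(final String node, final boolean filesPresent,
                         final boolean updateFails) {
    return new Replica() {
      @Override public String node() { return node; }
      @Override public boolean filesPresent() { return filesPresent; }
      @Override public void update() throws IOException {
        if (updateFails) {
          throw new IOException("rename failed on " + node);
        }
      }
    };
  }

  public static void main(String[] args) throws IOException {
    Result r = recover(Arrays.asList(
        replica("dn1", true, false),     // healthy
        replica("dn2", true, true),      // passes stage 1, fails stage 2
        replica("dn3", false, false)));  // rejected in stage 1
    System.out.println("succeeded=" + r.succeeded + " failed=" + r.failed);
  }
}
```

The {{failed}} list is what a primary datanode could pass along so that the next attempt runs with the successful nodes only. How that notification reaches the namenode is left open here.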