[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-11160:
--
    Fix Version/s: (was: 2.9.0)

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Environment: CDH5.7.4
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch,
> HDFS-11160.003.patch, HDFS-11160.004.patch, HDFS-11160.005.patch,
> HDFS-11160.006.patch, HDFS-11160.007.patch, HDFS-11160.008.patch,
> HDFS-11160.branch-2.patch, HDFS-11160.reproduce.patch
>
> Due to a race condition initially reported in HDFS-6804, VolumeScanner may
> erroneously detect good replicas as corrupt. This is serious because in some
> cases it results in data loss if all replicas are declared corrupt. This bug
> is especially prominent when there are a lot of append requests via
> HttpFs/WebHDFS.
> We are investigating an incident that caused a very high block corruption rate
> in a relatively small cluster. Initially, we thought HDFS-11056 was to blame.
> However, after applying HDFS-11056, we are still seeing VolumeScanner
> reporting corrupt replicas.
> It turns out that if a replica is being appended while VolumeScanner is
> scanning it, VolumeScanner may use the new checksum to compare against old
> data, causing checksum mismatch.
> I have a unit test to reproduce the error. Will attach later. A quick and
> simple fix is to hold FsDatasetImpl lock and read the checksum from disk.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11160:
---
    Fix Version/s: 2.8.0

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Environment: CDH5.7.4
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Fix For: 2.8.0, 2.9.0, 2.7.4, 3.0.0-alpha2
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch,
> HDFS-11160.003.patch, HDFS-11160.004.patch, HDFS-11160.005.patch,
> HDFS-11160.006.patch, HDFS-11160.007.patch, HDFS-11160.008.patch,
> HDFS-11160.branch-2.patch, HDFS-11160.reproduce.patch
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-11160:
--
    Fix Version/s: (was: 2.8.0)
                   2.9.0

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Environment: CDH5.7.4
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha2
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch,
> HDFS-11160.003.patch, HDFS-11160.004.patch, HDFS-11160.005.patch,
> HDFS-11160.006.patch, HDFS-11160.007.patch, HDFS-11160.008.patch,
> HDFS-11160.branch-2.patch, HDFS-11160.reproduce.patch
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11160:
---
    Release Note: Fixed a race condition that caused VolumeScanner to recognize a good replica as a bad one if the replica is also being written concurrently.

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Environment: CDH5.7.4
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch,
> HDFS-11160.003.patch, HDFS-11160.004.patch, HDFS-11160.005.patch,
> HDFS-11160.006.patch, HDFS-11160.007.patch, HDFS-11160.008.patch,
> HDFS-11160.branch-2.patch, HDFS-11160.reproduce.patch
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11160:
---
    Resolution: Fixed
    Hadoop Flags: Reviewed
    Fix Version/s: 3.0.0-alpha2
                   2.7.4
                   2.8.0
    Status: Resolved (was: Patch Available)

Committed the patch to branch-2.7, 2.8, branch-2 and trunk. Many thanks to [~kihwal], [~yzhangal] and [~xiaochen] for multiple rounds of reviews!

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Environment: CDH5.7.4
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch,
> HDFS-11160.003.patch, HDFS-11160.004.patch, HDFS-11160.005.patch,
> HDFS-11160.006.patch, HDFS-11160.007.patch, HDFS-11160.008.patch,
> HDFS-11160.branch-2.patch, HDFS-11160.reproduce.patch
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11160:
---
    Attachment: HDFS-11160.branch-2.patch

Attaching the branch-2 patch.

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Environment: CDH5.7.4
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch,
> HDFS-11160.003.patch, HDFS-11160.004.patch, HDFS-11160.005.patch,
> HDFS-11160.006.patch, HDFS-11160.007.patch, HDFS-11160.008.patch,
> HDFS-11160.branch-2.patch, HDFS-11160.reproduce.patch
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11160:
---
    Attachment: HDFS-11160.008.patch

Oops, my bad, I forgot to rebase. Here's v008.

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Environment: CDH5.7.4
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch,
> HDFS-11160.003.patch, HDFS-11160.004.patch, HDFS-11160.005.patch,
> HDFS-11160.006.patch, HDFS-11160.007.patch, HDFS-11160.008.patch,
> HDFS-11160.reproduce.patch
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11160:
---
    Attachment: (was: HDFS-11160.branch-2.patch)

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Environment: CDH5.7.4
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch,
> HDFS-11160.003.patch, HDFS-11160.004.patch, HDFS-11160.005.patch,
> HDFS-11160.006.patch, HDFS-11160.007.patch, HDFS-11160.reproduce.patch
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11160:
---
    Attachment: HDFS-11160.007.patch

Thanks [~xiaochen]. Yes, it does look like I can remove the redundancy in the test code. Submitting patch v007 for precommit check.

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Environment: CDH5.7.4
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch,
> HDFS-11160.003.patch, HDFS-11160.004.patch, HDFS-11160.005.patch,
> HDFS-11160.006.patch, HDFS-11160.007.patch, HDFS-11160.reproduce.patch
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11160:
---
    Attachment: HDFS-11160.006.patch

v006 patch: throw an IOException, instead of returning null, if the checksum cannot be read from the meta file.

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Environment: CDH5.7.4
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch,
> HDFS-11160.003.patch, HDFS-11160.004.patch, HDFS-11160.005.patch,
> HDFS-11160.006.patch, HDFS-11160.branch-2.patch, HDFS-11160.reproduce.patch
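The behaviour described for v006 (fail with an exception rather than hand back null) can be sketched roughly as follows. This is not the code from the patch: the class `MetaFileReader`, the method `readChecksum`, and its `offset`/`checksumSize` parameters are hypothetical names chosen for illustration, and the file layout is simplified to a flat byte array.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper, not part of Hadoop: illustrates "throw, don't return null".
final class MetaFileReader {
  private MetaFileReader() {}

  /**
   * Reads checksumSize bytes starting at offset from the given file.
   * Throws IOException when the file is missing or too short, so callers
   * never have to handle a null checksum.
   */
  static byte[] readChecksum(File metaFile, long offset, int checksumSize)
      throws IOException {
    // try-with-resources closes the file on every path, including errors.
    try (RandomAccessFile raf = new RandomAccessFile(metaFile, "r")) {
      if (raf.length() < offset + checksumSize) {
        throw new IOException("Meta file " + metaFile + " is too short: need "
            + (offset + checksumSize) + " bytes, found " + raf.length());
      }
      byte[] checksum = new byte[checksumSize];
      raf.seek(offset);
      raf.readFully(checksum);
      return checksum;
    }
  }
}
```

A caller can then treat any failure to read the checksum as an I/O error on the replica instead of silently comparing against a missing value.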
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11160:
---
    Attachment: HDFS-11160.005.patch

Submitting the v005 patch after rebase.

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Environment: CDH5.7.4
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch,
> HDFS-11160.003.patch, HDFS-11160.004.patch, HDFS-11160.005.patch,
> HDFS-11160.branch-2.patch, HDFS-11160.reproduce.patch
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11160:
---
    Attachment: HDFS-11160.004.patch

The timeout in TimeReplication was related to this patch (both trunk and branch-2). That test intentionally truncated and extended the raw block file. Updated the patch to make the DataNode handle this error better.

Also, I caught one potential bug in the code (actually, I committed the bug myself in HDFS-11056) where the DN would read the meta file without closing it.

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Environment: CDH5.7.4
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch,
> HDFS-11160.003.patch, HDFS-11160.004.patch, HDFS-11160.branch-2.patch,
> HDFS-11160.reproduce.patch
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11160:
---
    Attachment: HDFS-11160.branch-2.patch

Thanks [~kihwal] for the review! I am posting the branch-2 patch for precommit check.

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Environment: CDH5.7.4
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch,
> HDFS-11160.003.patch, HDFS-11160.branch-2.patch, HDFS-11160.reproduce.patch
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yongjun Zhang updated HDFS-11160:
-
    Attachment: HDFS-11160.003.patch

Hi [~jojochuang],

Thanks for your work here. I reviewed your patch. While the optimization discussion is still ongoing, I focused on the implementation.

I don't think BlockSender should be aware of FsVolumeImpl, because that looks like an abstraction violation. I changed the implementation to address this and uploaded patch rev 003. Basically, I think FinalizedReplica can have an API, similar to the one in the RBW replica, to get the last partial checksum. A possible optimization is to skip this when the visibleLength is at a chunk boundary (I have not added this change).

I have not gone through the test code yet.

Please take a look at what I changed; I hope it makes sense to you.

Thanks.

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Environment: CDH5.7.4
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch,
> HDFS-11160.003.patch, HDFS-11160.reproduce.patch
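The chunk-boundary optimization mentioned in the review comment comes down to simple arithmetic, sketched below. The class `ChunkMath` and both method names are hypothetical and exist only for illustration; `bytesPerChecksum` and `checksumSize` are taken as plain parameters, and the offset returned is relative to the start of the checksum area, ignoring any meta file header.

```java
// Hypothetical helper, not part of Hadoop: chunk arithmetic for a replica.
final class ChunkMath {
  private ChunkMath() {}

  /**
   * True when the replica ends in the middle of a checksum chunk, which is
   * the only case where the last checksum covers partial data.
   */
  static boolean hasPartialLastChunk(long visibleLength, int bytesPerChecksum) {
    if (visibleLength < 0 || bytesPerChecksum <= 0) {
      throw new IllegalArgumentException("invalid length or chunk size");
    }
    return visibleLength % bytesPerChecksum != 0;
  }

  /**
   * Offset of the last chunk's checksum, counted from the start of the
   * checksum area (any file header is NOT included in this sketch).
   */
  static long lastChecksumOffset(long visibleLength, int bytesPerChecksum,
      int checksumSize) {
    if (visibleLength <= 0 || bytesPerChecksum <= 0 || checksumSize <= 0) {
      throw new IllegalArgumentException("invalid length or sizes");
    }
    // Number of chunks, rounding the trailing partial chunk up to one chunk.
    long chunks = (visibleLength + bytesPerChecksum - 1) / bytesPerChecksum;
    return (chunks - 1) * checksumSize;
  }
}
```

With 512-byte chunks, a replica of 1024 bytes ends exactly on a boundary, so the extra read of the last checksum could be skipped; a replica of 1025 bytes has a one-byte partial chunk and would still need it.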
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11160:
---
    Description:
Due to a race condition initially reported in HDFS-6804, VolumeScanner may erroneously detect good replicas as corrupt. This is serious because in some cases it results in data loss if all replicas are declared corrupt. This bug is especially prominent when there are a lot of append requests via HttpFs/WebHDFS.

We are investigating an incident that caused a very high block corruption rate in a relatively small cluster. Initially, we thought HDFS-11056 was to blame. However, after applying HDFS-11056, we are still seeing VolumeScanner reporting corrupt replicas.

It turns out that if a replica is being appended while VolumeScanner is scanning it, VolumeScanner may use the new checksum to compare against old data, causing checksum mismatch.

I have a unit test to reproduce the error. Will attach later. A quick and simple fix is to hold FsDatasetImpl lock and read the checksum from disk.

  was:
Due to a race condition initially reported in HDFS-6804, VolumeScanner may erroneously detect good replicas as corrupt. This is serious because in some cases it results in data loss if all replicas are declared corrupt.

We are investigating an incident that caused a very high block corruption rate in a relatively small cluster. Initially, we thought HDFS-11056 was to blame. However, after applying HDFS-11056, we are still seeing VolumeScanner reporting corrupt replicas.

It turns out that if a replica is being appended while VolumeScanner is scanning it, VolumeScanner may use the new checksum to compare against old data, causing checksum mismatch.

I have a unit test to reproduce the error. Will attach later. A quick and simple fix is to hold FsDatasetImpl lock and read the checksum from disk.

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Environment: CDH5.7.4
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch,
> HDFS-11160.reproduce.patch
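The "quick and simple fix" in the description (read the checksum while holding the same lock the writer holds) can be illustrated with a toy model. This is only a sketch under loud assumptions: `LockedReplica`, `Snapshot`, and the private `lock` object are hypothetical stand-ins, the data lives in memory rather than on disk, and one CRC32 covers the whole replica instead of one checksum per chunk.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.CRC32;

// Toy model, not Hadoop code: data and checksum are only consistent
// with each other when both are read under the writer's lock.
final class LockedReplica {
  /** Immutable view of the replica taken at a single instant. */
  static final class Snapshot {
    final byte[] data;
    final long checksum;

    Snapshot(byte[] data, long checksum) {
      this.data = data;
      this.checksum = checksum;
    }
  }

  private final Object lock = new Object(); // stands in for the dataset lock
  private final ByteArrayOutputStream data = new ByteArrayOutputStream();
  private long checksum = crcOf(new byte[0]);

  static long crcOf(byte[] bytes) {
    CRC32 crc = new CRC32();
    crc.update(bytes, 0, bytes.length);
    return crc.getValue();
  }

  /** Writer path: data and checksum change together under the lock. */
  void append(byte[] more) {
    synchronized (lock) {
      data.write(more, 0, more.length);
      checksum = crcOf(data.toByteArray());
    }
  }

  /** Scanner path: take data and checksum under the same lock. */
  Snapshot snapshot() {
    synchronized (lock) {
      return new Snapshot(data.toByteArray(), checksum);
    }
  }

  /** True when the snapshot's data matches the snapshot's checksum. */
  static boolean verify(Snapshot s) {
    return crcOf(s.data) == s.checksum;
  }
}
```

If the scanner instead read the data first and the checksum later without the lock, an append landing in between would pair old data with a new checksum, which is the mismatch the issue describes.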
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11160:
---
Description:
Due to a race condition initially reported in HDFS-6804, VolumeScanner may erroneously detect good replicas as corrupt. This is serious because in some cases it results in data loss if all replicas are declared corrupt.

We are investigating an incident that caused a very high block corruption rate in a relatively small cluster. Initially, we thought HDFS-11056 was to blame. However, after applying HDFS-11056, we still see VolumeScanner reporting corrupt replicas.

It turns out that if a replica is being appended to while VolumeScanner is scanning it, VolumeScanner may compare the new checksum against the old data, causing a checksum mismatch.

I have a unit test that reproduces the error and will attach it later. A quick and simple fix is to hold the FsDatasetImpl lock and read the checksum from disk.

was:
Due to a race condition initially reported in HDFS-6804, VolumeScanner may erroneously detect good replicas as corrupt. This is serious because in some cases it results in data loss if all replicas are declared corrupt.

We are investigating an incident that caused a very high block corruption rate in a relatively small cluster. Initially, we thought HDFS-11056 was to blame. However, after applying HDFS-11056, we still see VolumeScanner reporting corrupt replicas.

It turns out that if a replica is being appended to while VolumeScanner is scanning it, VolumeScanner may compare the new checksum against the old data, causing a checksum mismatch.

I have a unit test that reproduces the error and will attach it later.

To fix it, I propose that a FinalizedReplica object should also have a lastChecksum field, like ReplicaBeingWritten, and that BlockSender should use the in-memory lastChecksum to verify the partial data in the last chunk on disk.

I am filing this jira to discuss a good fix for this issue.
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11160:
---
Attachment: HDFS-11160.002.patch

Submit 002 patch to fix test failures.
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11160:
---
Status: Patch Available (was: Open)
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11160:
---
Attachment: HDFS-11160.001.patch

Attaching my simple fix as the v001 patch.

In the v001 fix, the BlockSender constructor pre-loads the last partial checksum from the on-disk replica if it is a finalized replica. This is simpler than adding a new field to the FinalizedReplica class and maintaining its value throughout the lifetime of the replica, at the cost of potentially more disk access (because each BlockSender instantiation needs to reload the checksum, regardless of whether the replica has been updated).

I verified that the unit test passes with this simple fix and fails without it. Any comments are appreciated!
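The race discussed in this thread, and the general idea of capturing the last chunk's checksum at the same moment as the length it covers, can be sketched as a toy Java model. This is only an illustration under stated assumptions, not Hadoop code: `LastChunkSnapshot`, `Replica`, `Snapshot`, and all of their methods are hypothetical names, a plain object monitor stands in for the dataset lock, and `java.util.zip.CRC32` stands in for the HDFS checksum.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.CRC32;

// Toy model only: none of these names exist in Hadoop.
final class LastChunkSnapshot {

    /** Hypothetical stand-in for a replica's last partial chunk and its stored checksum. */
    static final class Replica {
        private final Object lock = new Object();   // plays the role of the dataset lock
        private byte[] lastChunk = new byte[0];
        private long lastChecksum = crc(lastChunk, 0);

        /** An append grows the partial chunk and rewrites its checksum together. */
        void append(byte[] more) {
            synchronized (lock) {
                byte[] grown = Arrays.copyOf(lastChunk, lastChunk.length + more.length);
                System.arraycopy(more, 0, grown, lastChunk.length, more.length);
                lastChunk = grown;
                lastChecksum = crc(grown, grown.length);
            }
        }

        /** Capture the visible length and the matching checksum in one critical section. */
        Snapshot snapshot() {
            synchronized (lock) {
                return new Snapshot(lastChunk.length, lastChecksum);
            }
        }

        long latestChecksum() {
            synchronized (lock) {
                return lastChecksum;
            }
        }

        /** Read the first {@code length} bytes of the chunk as they are now. */
        byte[] read(int length) {
            synchronized (lock) {
                return Arrays.copyOf(lastChunk, length);
            }
        }
    }

    /** Length and checksum that were valid together when the scan started. */
    static final class Snapshot {
        final int length;
        final long checksum;

        Snapshot(int length, long checksum) {
            this.length = length;
            this.checksum = checksum;
        }
    }

    static long crc(byte[] data, int length) {
        CRC32 c = new CRC32();
        c.update(data, 0, length);
        return c.getValue();
    }

    /** Racy check: old data length, but the checksum is fetched after the append. */
    static boolean racyVerify(Replica r, int oldLength) {
        byte[] data = r.read(oldLength);
        return crc(data, data.length) == r.latestChecksum();
    }

    /** Consistent check: data is verified against the checksum captured with its length. */
    static boolean snapshotVerify(Replica r, Snapshot s) {
        byte[] data = r.read(s.length);
        return crc(data, data.length) == s.checksum;
    }

    public static void main(String[] args) {
        Replica r = new Replica();
        r.append("old".getBytes(StandardCharsets.UTF_8));
        Snapshot s = r.snapshot();                          // scanner starts here
        r.append("new".getBytes(StandardCharsets.UTF_8));   // append lands mid-scan
        System.out.println("racy check passes:     " + racyVerify(r, s.length));
        System.out.println("snapshot check passes: " + snapshotVerify(r, s));
    }
}
```

In this model the racy check is expected to report a mismatch on perfectly good data, because the checksum it fetches covers six bytes while only three are read. The snapshot check compares the same three bytes against the checksum that was captured with that length. The trade-off mirrors the one raised in the thread: capturing the checksum up front costs an extra read per scan, but it needs no additional state on the replica object.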
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11160:
---
Summary: VolumeScanner reports write-in-progress replicas as corrupt incorrectly (was: VolumeScanner incorrectly reports good replicas as corrupt due to race condition)