[ https://issues.apache.org/jira/browse/HADOOP-731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wendy Chien updated HADOOP-731:
-------------------------------

    Attachment: hadoop-731-7.patch

Attached a patch which allows us to continue reading after getting a checksum 
error by modifying Checker.read to catch ChecksumExceptions thrown by 
verifySum.  

In Checker.read, if we get a ChecksumException, we seek to a new datanode for 
both the data stream and the checksum stream (this applies when using dfs; it 
is a no-op for other file systems).  If at least one of the datanodes is 
different from before, we retry the read.  

In DFSInputStream, added a new seek method which also requests a datanode other 
than the current node.
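
The retry behaviour described above can be sketched roughly as follows. This 
is only an illustration under stated assumptions, not the code in the attached 
patch: ChecksumMismatch, ReplicatedSource, CheckedReader, seekToNewSource and 
crc are hypothetical names, the "datanodes" are modelled as in-memory copies, 
and CRC32 stands in for whatever checksum the file system actually uses.

```java
import java.io.IOException;
import java.util.List;
import java.util.zip.CRC32;

/** Stand-in for a checksum failure (hypothetical; not Hadoop's ChecksumException). */
class ChecksumMismatch extends IOException {
    ChecksumMismatch(String message) { super(message); }
}

/** Hypothetical stream whose content is available from several copies. */
class ReplicatedSource<T> {
    private final List<T> copies;
    private int current = 0;

    ReplicatedSource(List<T> copies) { this.copies = copies; }

    T read() { return copies.get(current); }

    /** Switch to a copy other than the current one; returns false if none is left. */
    boolean seekToNewSource() {
        if (current + 1 >= copies.size()) return false;
        current++;
        return true;
    }
}

/** Reads data, verifies it against the checksum stream, and retries on mismatch. */
class CheckedReader {
    private final ReplicatedSource<byte[]> data;
    private final ReplicatedSource<Long> sums;

    CheckedReader(ReplicatedSource<byte[]> data, ReplicatedSource<Long> sums) {
        this.data = data;
        this.sums = sums;
    }

    static long crc(byte[] bytes) {
        CRC32 crc = new CRC32();
        crc.update(bytes, 0, bytes.length);
        return crc.getValue();
    }

    private static void verifySum(byte[] bytes, long expected) throws ChecksumMismatch {
        if (crc(bytes) != expected) throw new ChecksumMismatch("checksum error");
    }

    byte[] read() throws IOException {
        while (true) {
            byte[] bytes = data.read();
            try {
                verifySum(bytes, sums.read());
                return bytes;
            } catch (ChecksumMismatch e) {
                // Ask both streams for a different copy; retry only if at
                // least one of them actually moved, otherwise give up.
                boolean dataMoved = data.seekToNewSource();
                boolean sumsMoved = sums.seekToNewSource();
                if (!dataMoved && !sumsMoved) throw e;
            }
        }
    }
}
```

In this sketch the loop terminates because every retry advances at least one 
of the two streams to a copy it has not tried yet; once neither can move, the 
original checksum error is rethrown to the caller.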

 

> Sometimes when a dfs file is accessed and one copy has a checksum error the 
> I/O command fails, even if another copy is alright.
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-731
>                 URL: https://issues.apache.org/jira/browse/HADOOP-731
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.7.2
>            Reporter: Dick King
>         Assigned To: Wendy Chien
>         Attachments: hadoop-731-7.patch
>
>
> for a particular file [alas, the file no longer exists -- I had to progress]  
>     $dfs -cp foo bar        
> and
>     $dfs -get foo local
> failed on a checksum error.  The dfs browser's download function retrieved 
> the file, so either that function doesn't check, or more likely the download 
> function got a different copy.
> When a checksum fails on one copy of a file that is redundantly stored, I 
> would prefer that dfs try a different copy, mark the bad one as not existing 
> [which should induce a fresh copy being made from one of the good copies 
> eventually], and make the call continue to work and deliver bytes.
> Ideally, if all copies have checksum errors but it's possible to piece 
> together a good copy I would like that to be done.
> -dk

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
