[ https://issues.apache.org/jira/browse/HADOOP-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695439#action_12695439 ]

Hudson commented on HADOOP-5459:
--------------------------------

Integrated in Hadoop-trunk #796 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/796/])

> CRC errors not detected reading intermediate output into memory with 
> problematic length
> ---------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5459
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5459
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.20.0
>            Reporter: Chris Douglas
>            Assignee: Chris Douglas
>             Fix For: 0.20.0
>
>         Attachments: 5459-0.patch, 5459-1.patch
>
>
> It's possible that the expected, uncompressed length of the segment is less 
> than the available/decompressed data. This can happen in some worst cases for 
> compression, but it is exceedingly rare. It is also possible (though also 
> fantastically unlikely) for the data to deflate to a size greater than that 
> reported by the map. In either case, CRC errors will remain undetected because 
> IFileInputStream does not validate the checksum until the end of the stream, 
> and close() does not advance the stream to the end of the segment. The 
> (abbreviated) read loop fetching data in shuffleInMemory:
> {code}
> int n = input.read(shuffleData, 0, shuffleData.length);
> while (n > 0) { 
>   bytesRead += n;
>   n = input.read(shuffleData, bytesRead, 
>                  (shuffleData.length-bytesRead));
> } 
> {code}
> This loop reads only up to the expected length. Without reading the whole 
> segment, the checksum is not validated. IFileInputStream instances should 
> validate the checksum even when they are closed before the end of the segment.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
