[ https://issues.apache.org/jira/browse/HADOOP-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Douglas updated HADOOP-5459:
----------------------------------
Assignee: Chris Douglas
Status: Patch Available (was: Open)
> CRC errors not detected reading intermediate output into memory with
> problematic length
> ---------------------------------------------------------------------------------------
>
> Key: HADOOP-5459
> URL: https://issues.apache.org/jira/browse/HADOOP-5459
> Project: Hadoop Core
> Issue Type: Bug
> Affects Versions: 0.20.0
> Reporter: Chris Douglas
> Assignee: Chris Douglas
> Priority: Blocker
> Attachments: 5459-0.patch, 5459-1.patch
>
>
> It's possible for the expected, uncompressed length of the segment to be less
> than the decompressed data actually available. This can happen in some worst
> cases for compression, but it is exceedingly rare. It is also possible (though
> fantastically unlikely) for the data to deflate to a size greater than that
> reported by the map. CRC errors will remain undetected because IFileInputStream
> does not validate the checksum until the end of the stream, and close() does
> not advance the stream to the end of the segment. The (abbreviated) read loop
> fetching data in shuffleInMemory:
> {code}
> int n = input.read(shuffleData, 0, shuffleData.length);
> while (n > 0) {
>   bytesRead += n;
>   n = input.read(shuffleData, bytesRead,
>                  (shuffleData.length - bytesRead));
> }
> {code}
> will read only up to the expected length. Without reading the whole segment,
> the checksum is never validated. An IFileInputStream should validate the
> checksum whenever it is closed, even if the stream has not been read to its end.
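>
> As a rough sketch of one way to surface the error (not necessarily what the
> attached patches do): after filling the buffer, drain the stream to EOF so
> that IFileInputStream reaches the trailing checksum and can verify it. The
> scratch buffer below is purely illustrative.
> {code}
> // Hypothetical sketch: keep reading past the expected length until EOF so
> // the trailing checksum is actually verified; a corrupt segment should then
> // fail here instead of being accepted silently.
> byte[] scratch = new byte[4096];
> while (input.read(scratch, 0, scratch.length) != -1) {
>   // discard any unexpected trailing bytes
> }
> {code}
> The cleaner fix, as argued above, is for IFileInputStream itself to validate
> the checksum whenever it is closed, so every caller gets the check without
> draining the stream by hand.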
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.