[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055342#comment-17055342
 ] 

Billie Rinaldi commented on MAPREDUCE-7265:
-------------------------------------------

Patch 01 contains one way of addressing this issue. It effectively makes the 
check for whether a spill is needed happen one record earlier. That is, if the 
metadata for k/v pair _N_ will not fit in the buffer, it initiates the spill 
when k/v pair _N-1_ data is written. Another option might be to disallow spill 
percent 1.0 (although I am concerned about the spill initiation code in the 
collect method and am thinking that code path should be avoided).
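
To illustrate the idea of checking one record earlier, here is a minimal 
sketch of a toy model. It is not the code from the patch or from MapTask. The 
class name, the method name, and the simplified accounting (a flat byte count 
instead of a circular buffer, and every record assumed to fit) are all 
assumptions made for illustration. Only the 16-byte metadata size comes from 
the issue description.

```java
public class EarlySpillCheckSketch {
    // kv metadata size in bytes, as stated in the issue description
    static final int METASIZE = 16;

    /**
     * Hypothetical helper: returns the 0-based index of the record whose
     * write should initiate a spill, or -1 if no spill is needed.
     * Assumes every record (data plus metadata) fits in the buffer.
     */
    static int spillTriggerIndex(int capacity, int[] recordSizes) {
        int used = 0;
        for (int i = 0; i < recordSizes.length; i++) {
            // Each record consumes its data bytes plus one metadata entry.
            used += recordSizes[i] + METASIZE;
            int free = capacity - used;
            // Early check: if the NEXT record's metadata would not fit,
            // initiate the spill now, while writing record i.
            if (free < METASIZE) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Three 24-byte records use 3 * (24 + 16) = 120 of 128 bytes,
        // leaving 8 bytes, which is less than one metadata entry.
        System.out.println(spillTriggerIndex(128, new int[] {24, 24, 24, 24}));
    }
}
```

In this toy model the spill is initiated while record 2 is written, so record 
3 never has its metadata placed in space that does not exist.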

The patch has some tests that exercise different aspects of the issue, though I 
have not been able to reproduce everything I have seen with custom key/value 
types. In the attached patch, if the change to MapTask is removed, 
testTwoSpillsBytesWritable will crash the test run and you won't be able to see 
the results of the other tests. If that test is also commented out, the other 
tests will produce some ArrayIndexOutOfBoundsExceptions and some failures to 
verify expected map output file contents.

> Buffer corruption with spill percent 1.0
> ----------------------------------------
>
>                 Key: MAPREDUCE-7265
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7265
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Billie Rinaldi
>            Assignee: Billie Rinaldi
>            Priority: Minor
>         Attachments: MAPREDUCE-7265.01.patch
>
>
> I encountered a variety of issues on a cluster where the spill percent was 
> set to 1.0. Under some conditions, MapTask will not detect that its in-memory 
> spill buffer is already full and will keep collecting k/v pairs, causing 
> corruption of the buffer.
> I have been able to track at least some of the problems to a condition where 
> adding a key/value pair to the buffer fills the buffer with fewer than 16 
> bytes remaining (the kv metadata size). When this happens, the next metadata 
> index (kvindex) passes over the data index (bufindex), which causes some of 
> the index and length calculations to be incorrect in the collect and write 
> methods. It can allow data to keep being written to the buffer even though it 
> is already full, with data overwriting metadata in the buffer and vice versa. 
> I have seen this manifest as the NegativeArraySizeException reported in 
> MAPREDUCE-6907, as well as ArrayIndexOutOfBoundsException and EOFException.
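
The condition described above (a write that leaves fewer than 16 bytes free) 
can be sketched with simple arithmetic. This is an illustrative toy, not 
MapTask code. The class and method names are hypothetical, and the buffer is 
modeled as a flat byte budget shared by record data and metadata rather than 
the real circular layout with kvindex and bufindex.

```java
public class SpillBufferSketch {
    // kv metadata size in bytes, as stated in the issue description
    static final int METASIZE = 16;

    /** Bytes still free in a toy buffer shared by record data and metadata. */
    static int remaining(int capacity, int dataBytes, int records) {
        return capacity - dataBytes - records * METASIZE;
    }

    /**
     * True when the buffer is not overfull but has no room left for
     * another metadata entry, i.e. the state the issue describes.
     */
    static boolean cannotFitNextMetadata(int capacity, int dataBytes, int records) {
        int free = remaining(capacity, dataBytes, records);
        return free >= 0 && free < METASIZE;
    }

    public static void main(String[] args) {
        int capacity = 128;
        // Three records with 72 data bytes in total: 72 + 3 * 16 = 120 used.
        System.out.println(remaining(capacity, 72, 3));
        System.out.println(cannotFitNextMetadata(capacity, 72, 3));
    }
}
```

With 8 bytes free, the next 16-byte metadata entry would have to overlap 
record data, which corresponds to the metadata index passing over the data 
index in the description.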



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
