[
https://issues.apache.org/jira/browse/TEZ-3757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16047915#comment-16047915
]
Travis Woodruff commented on TEZ-3757:
--------------------------------------
[~rajesh.balamohan] - This is causing failures. I'm able to debug this on a
local run, and I get the following behavior in the {{SortSpan}} constructor:
SortSpan constructor parameters:
* source = {{java.nio.HeapByteBuffer (pos=0 lim=11534336 cap=11534336)}}
* maxItems = {{1048576}}
* perItem = {{2185}}
* comparator = ...
Tracing through the code:
{code}
capacity = source.remaining(); // capacity = 11534336
int metasize = METASIZE*maxItems; // metasize = 16777216
int dataSize = maxItems * perItem; // dataSize = -2003828736 ***OVERFLOW***
if(capacity < (metasize+dataSize)) { // Does not match because (11534336 > (16777216 + -2003828736))
  // try to allocate less meta space, because we have sample data
  metasize = METASIZE*(capacity/(perItem+METASIZE));
}
ByteBuffer reserved = source.duplicate(); // reserved = java.nio.HeapByteBuffer[pos=0 lim=11534336 cap=11534336]
reserved.mark();
LOG.info(outputContext.getDestinationVertexName() + ": " + "reserved.remaining()=" +
    reserved.remaining() + ", reserved.metasize=" + metasize);
// Log: reserved.remaining()=11534336, reserved.metasize=16777216
reserved.position(metasize); // IllegalArgumentException because metasize (16777216) > reserved.limit (11534336)
{code}
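The trace above can be reproduced outside Tez with a small standalone sketch. The class {{SpanSizing}} and its method names are hypothetical, and {{METASIZE = 16}} is an assumption inferred from 16777216 / 1048576. The second method shows one possible way to keep the capacity check meaningful (widening to long before multiplying); it is not necessarily the fix that should go into Tez.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Tez: mirrors the sizing arithmetic traced above.
final class SpanSizing {
    // Assumption: 16 bytes of metadata per item (16777216 / 1048576).
    static final int METASIZE = 16;

    // Sizing as traced above: the int product wraps, so the capacity check is skipped.
    static int metasizeWithIntMath(int capacity, int maxItems, int perItem) {
        int metasize = METASIZE * maxItems;
        int dataSize = maxItems * perItem; // wraps to a negative value here
        if (capacity < (metasize + dataSize)) {
            metasize = METASIZE * (capacity / (perItem + METASIZE));
        }
        return metasize;
    }

    // Same logic with the sizes widened to long before multiplying.
    static int metasizeWithLongMath(int capacity, int maxItems, int perItem) {
        long metasize = (long) METASIZE * maxItems;
        long dataSize = (long) maxItems * perItem;
        if (capacity < metasize + dataSize) {
            // try to allocate less meta space, because we have sample data
            metasize = (long) METASIZE * (capacity / (perItem + METASIZE));
        }
        return Math.toIntExact(metasize); // throws instead of silently truncating
    }

    public static void main(String[] args) {
        int capacity = 11534336, maxItems = 1048576, perItem = 2185;

        ByteBuffer reserved = ByteBuffer.allocate(capacity);
        int bad = metasizeWithIntMath(capacity, maxItems, perItem);
        try {
            reserved.position(bad); // bad (16777216) > limit (11534336)
        } catch (IllegalArgumentException e) {
            System.out.println("int math:  metasize=" + bad + " -> IllegalArgumentException");
        }

        int good = metasizeWithLongMath(capacity, maxItems, perItem);
        reserved.position(good); // fits within the buffer
        System.out.println("long math: metasize=" + good + " -> position OK");
    }
}
```

With the values from the local run, the int version returns 16777216 and the long version returns 83840 (16 * (11534336 / 2201)), which is within the buffer limit.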
> Integer overflow in PipelinedSorter
> -----------------------------------
>
> Key: TEZ-3757
> URL: https://issues.apache.org/jira/browse/TEZ-3757
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.8.4, 0.8.5
> Reporter: Travis Woodruff
>
> This code in {{PipelinedSorter.sort()}} passes {{(1024*1024)}} as maxItems to
> the {{SortSpan}} constructor:
> {code}
> //TODO: fix per item being passed.
> span = new SortSpan((ByteBuffer)buffers.get(bufferIndex).clear(),
> (1024*1024),
> perItem, ConfigUtils.getIntermediateOutputKeyComparator(this.conf));
> {code}
> {{SortSpan}}'s constructor then calculates {{dataSize}} as follows:
> {code}
> int dataSize = maxItems * perItem;
> {code}
> This means that if {{perItem}} is >= 2048, {{dataSize}} overflows, which
> (usually?) ends up causing the capacity check to not work correctly, which
> causes subsequent buffer operations to fail.
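The 2048 threshold in the description follows from 1024*1024 = 2^20 and 2048 = 2^11: the product is 2^31, one more than {{Integer.MAX_VALUE}}. A minimal sketch that checks this with {{Math.multiplyExact}} (the class and method names are hypothetical, not Tez code):

```java
// Hypothetical demo class, not part of Tez.
final class DataSizeOverflow {
    // Value passed as maxItems by PipelinedSorter.sort(), per the description above.
    static final int MAX_ITEMS = 1024 * 1024;

    // True if maxItems * perItem no longer fits in an int.
    static boolean overflows(int maxItems, int perItem) {
        try {
            Math.multiplyExact(maxItems, perItem);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    // Plain int multiplication, as in the SortSpan constructor; wraps silently.
    static int wrappedProduct(int maxItems, int perItem) {
        return maxItems * perItem;
    }

    public static void main(String[] args) {
        System.out.println("perItem=2047 overflows: " + overflows(MAX_ITEMS, 2047));
        System.out.println("perItem=2048 overflows: " + overflows(MAX_ITEMS, 2048));
        System.out.println("wrapped product at 2048: " + wrappedProduct(MAX_ITEMS, 2048));
    }
}
```

At exactly 2048 the wrapped product equals {{Integer.MIN_VALUE}}, so any capacity check that adds it to {{metasize}} compares against a large negative number.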