[ 
https://issues.apache.org/jira/browse/TEZ-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706253#comment-14706253
 ] 

Rajesh Balamohan commented on TEZ-2732:
---------------------------------------

Another place where a similar overflow can occur is in write(): bufindex + len 
can wrap around to a negative value. In that case, it ends up throwing the 
following exception:
{noformat}
java.lang.ArrayIndexOutOfBoundsException
        at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter$Buffer.write(DefaultSorter.java:648)
        at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter$Buffer.write(DefaultSorter.java:544)
        at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
        at org.apache.hadoop.io.WritableUtils.writeVLong(WritableUtils.java:273)
        at org.apache.hadoop.io.WritableUtils.writeVInt(WritableUtils.java:253)
        at org.apache.hadoop.io.Text.write(Text.java:330)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:98)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:82)
{noformat}
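
To illustrate the kind of wraparound described above, here is a minimal sketch. It is not the DefaultSorter code: the class, method names, and the write length are hypothetical, and only the buffer size and bufindex value are taken from the issue description. It shows how an int sum can go negative and slip past a bounds check, and how widening to long avoids that.

```java
// Hypothetical sketch; names here are illustrative and not taken from DefaultSorter.
final class WriteBoundsSketch {

    // Naive check: bufindex + len is evaluated in 32-bit int arithmetic,
    // so a large sum wraps to a negative value and the check passes wrongly.
    static boolean fitsNaive(int bufindex, int len, int capacity) {
        return bufindex + len <= capacity;
    }

    // Overflow-safe check: widen to long before adding.
    static boolean fitsSafe(int bufindex, int len, int capacity) {
        return (long) bufindex + len <= capacity;
    }

    public static void main(String[] args) {
        int capacity = 2146435072;   // kvbuffer.length from the issue (2047 MB)
        int bufindex = 2026133899;   // bufIndex from the issue's corner case
        int len = 200_000_000;       // assumed write length, chosen to force the wrap

        System.out.println("int sum is negative: " + (bufindex + len < 0));
        System.out.println("naive check passes:  " + fitsNaive(bufindex, len, capacity));
        System.out.println("safe check passes:   " + fitsSafe(bufindex, len, capacity));
    }
}
```

Math.addExact(int, int) is another option when an ArithmeticException is preferable to silently wrapping.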

> DefaultSorter throws ArrayIndex exceptions on 2047 Mb size sort buffers
> -----------------------------------------------------------------------
>
>                 Key: TEZ-2732
>                 URL: https://issues.apache.org/jira/browse/TEZ-2732
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>         Attachments: TEZ-2732.1.patch
>
>
> {noformat}
>   kvbuffer.length = 2146435072 (2047 MB)
>   Corner case: bufIndex=2026133899, kvbidx=523629312.
>   distkvi = mod - i + j = 2146435072 - 2026133899 + 523629312 = 643930485
>   newPos = (2026133899 + (max(.., min(643930485/2, 271128624)))) (This would overflow)
> {noformat}
> It would be good to restrict the maximum allowed sort buffer to 1800 MB instead 
> of 2047 MB.
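
The arithmetic in the quoted corner case can be replayed with a short sketch. This is only an illustration under stated assumptions: the class and method names are hypothetical, and the elided first argument of max(..) is left out, so the step is taken to be the min(...) term alone.

```java
// Hypothetical sketch replaying the numbers from the issue description.
final class NewPosOverflowSketch {

    // 32-bit addition, which wraps on overflow.
    static int newPosInt(int bufIndex, int step) {
        return bufIndex + step;
    }

    // Same addition widened to 64 bits, which does not wrap for these inputs.
    static long newPosLong(int bufIndex, int step) {
        return (long) bufIndex + step;
    }

    public static void main(String[] args) {
        int mod = 2146435072;        // kvbuffer.length (2047 MB)
        int bufIndex = 2026133899;
        int kvbidx = 523629312;

        int distkvi = mod - bufIndex + kvbidx;          // 643930485
        int step = Math.min(distkvi / 2, 271128624);    // 271128624 (max(..) omitted)

        System.out.println("distkvi     = " + distkvi);
        System.out.println("step        = " + step);
        System.out.println("int newPos  = " + newPosInt(bufIndex, step));   // negative
        System.out.println("long newPos = " + newPosLong(bufIndex, step));  // exceeds Integer.MAX_VALUE
    }
}
```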



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
