[ 
https://issues.apache.org/jira/browse/TEZ-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093815#comment-16093815
 ] 

Rohini Palaniswamy commented on TEZ-3212:
-----------------------------------------

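Based on 
https://stackoverflow.com/questions/3038392/do-java-arrays-have-a-maximum-size 
we might want to set Integer.MAX_VALUE - 8 as the max array size to support 
more JVMs, since some VMs reserve header words in an array and cannot allocate 
a full Integer.MAX_VALUE elements.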

bq. In such instances (i.e. 2 GB), should valBytes allocation be restricted only 
to current key/value length? (i.e. Math.min(currentKeyLength, MAX_ARRAY_LENGTH))
  I like this idea as it might save a lot of space. But instead of doing 
Math.min, we should throw an error if the data size exceeds the max array size 
limit. If we don't do that, we might silently lose data, i.e.

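{code}
// Fail loudly rather than truncate when the value cannot fit in one array.
if (currentValueLength > MAX_ARRAY_LENGTH) {
    throw new NegativeArraySizeException("Size of data " + currentValueLength
        + " is greater than java maximum byte array size");
}
// Double the buffer; if doubling overflows int, allocate the exact length.
int newLength = currentValueLength << 1;
if (newLength < 0) {
    newLength = currentValueLength;
}
{code}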
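
For illustration only, here is a self-contained sketch of the same sizing rule. The class name ValueBufferSizing and the method nextBufferLength are hypothetical and are not part of IFile or of the attached patch. It assumes MAX_ARRAY_LENGTH is Integer.MAX_VALUE - 8 as suggested above, and it uses IllegalArgumentException as a stand-in exception type. It differs from the snippet above in one respect: the doubling is done in long arithmetic, so a doubled length that is still positive but larger than MAX_ARRAY_LENGTH (values just under 1GB) also falls back to the exact length.

```java
final class ValueBufferSizing {
    // Assumption: conservative cap taken from the Stack Overflow thread above.
    static final int MAX_ARRAY_LENGTH = Integer.MAX_VALUE - 8;

    private ValueBufferSizing() {
    }

    /**
     * Returns the length to allocate for a value buffer that must hold
     * currentValueLength bytes: double the length when that fits under the
     * cap, otherwise exactly currentValueLength.
     */
    static int nextBufferLength(int currentValueLength) {
        if (currentValueLength < 0 || currentValueLength > MAX_ARRAY_LENGTH) {
            // Reject instead of clamping, so no data is dropped silently.
            throw new IllegalArgumentException("Size of data " + currentValueLength
                + " is outside the supported byte array size range");
        }
        // Compute in long so the doubling cannot wrap around to a negative int.
        long doubled = 2L * currentValueLength;
        if (doubled > MAX_ARRAY_LENGTH) {
            return currentValueLength;
        }
        return (int) doubled;
    }
}
```

A reader would then allocate something like new byte[ValueBufferSizing.nextBufferLength(currentValueLength)] when the existing buffer is too small. Whether the fallback should be the exact length or MAX_ARRAY_LENGTH is a design choice; the exact length is used here because it matches the space-saving intent discussed above.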

> IFile throws NegativeArraySizeException for value sizes between 1GB and 2GB
> ---------------------------------------------------------------------------
>
>                 Key: TEZ-3212
>                 URL: https://issues.apache.org/jira/browse/TEZ-3212
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: TEZ-3212.1.patch
>
>
> This is not a regression with respect to MR, just an issue that was 
> encountered with a job whose IFile record values (which can be up to 2GB in 
> size) can be successfully written but not successfully read.
> Failure while running task:java.lang.NegativeArraySizeException
>       at 
> org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.nextRawValue(IFile.java:765)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
