[ 
https://issues.apache.org/jira/browse/ARROW-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Pitrou updated ARROW-5259:
----------------------------------
    Summary: [Java] Add option for ValueVector to allocate buffers with actual 
size  (was: Add option for ValueVector to allocate buffers with actual size)

> [Java] Add option for ValueVector to allocate buffers with actual size
> ----------------------------------------------------------------------
>
>                 Key: ARROW-5259
>                 URL: https://issues.apache.org/jira/browse/ARROW-5259
>             Project: Apache Arrow
>          Issue Type: Wish
>          Components: Java
>            Reporter: Ji Liu
>            Assignee: Ji Liu
>            Priority: Minor
>
> Currently, _BaseValueVector#computeCombinedBufferSize_ calculates the 
> buffer size from _valueCount_ and _typeWidth_ and then allocates 
> memory for dataBuffer and validityBuffer. However, it typically allocates 
> more memory than is actually needed, because it calls 
> _BaseAllocator.nextPowerOfTwo(bufferSize)_.
> For example, an IntVector with valueCount = 1025 allocates buffers of size 
> 8192, which is almost double what is actually needed. So in some cases 
> there is enough memory for actual use, but an OOM is thrown once the 
> allocation is rounded up to the next power of 2. I think this problem is 
> avoidable.
> Is it feasible to add an option for ValueVector to allocate the actual buffer 
> size, rather than rounding it up to the next power of 2, to reduce memory 
> allocation?
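
The arithmetic behind the IntVector example above can be sketched as follows. This is a simplified, standalone model and not the Arrow implementation: the class BufferSizeSketch and its methods are hypothetical stand-ins, and it assumes the combined size is simply the data bytes plus one validity bit per value, rounded up to the next power of two. The real _computeCombinedBufferSize_ may apply additional alignment.

```java
// Hypothetical standalone sketch; does not depend on Apache Arrow.
class BufferSizeSketch {

    // Stand-in for the rounding the issue attributes to
    // BaseAllocator.nextPowerOfTwo: smallest power of two >= n.
    static long nextPowerOfTwo(long n) {
        if (n <= 1) {
            return 1;
        }
        long highest = Long.highestOneBit(n);
        return highest == n ? n : highest << 1;
    }

    // Assumed model of the exact requirement: data bytes plus
    // one validity bit per value, rounded up to whole bytes.
    static long exactSize(int valueCount, int typeWidth) {
        long dataBytes = (long) valueCount * typeWidth;
        long validityBytes = (valueCount + 7) / 8;
        return dataBytes + validityBytes;
    }

    public static void main(String[] args) {
        int valueCount = 1025; // value count from the issue's example
        int typeWidth = 4;     // an int is 4 bytes wide
        long exact = exactSize(valueCount, typeWidth);
        long rounded = nextPowerOfTwo(exact);
        System.out.println("exact=" + exact + " rounded=" + rounded);
        // prints "exact=4229 rounded=8192"
    }
}
```

Under these assumptions, 4229 bytes are needed but 8192 are reserved, which matches the near-doubling described in the issue. An option to skip the rounding would allocate close to the exact figure instead.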



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
