[
https://issues.apache.org/jira/browse/ARROW-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated ARROW-1547:
----------------------------------
Labels: pull-request-available (was: )
> [JAVA] Fix 8x memory over-allocation in BitVector
> -------------------------------------------------
>
> Key: ARROW-1547
> URL: https://issues.apache.org/jira/browse/ARROW-1547
> Project: Apache Arrow
> Issue Type: Bug
> Components: Java - Vectors
> Reporter: Siddharth Teotia
> Assignee: Siddharth Teotia
> Labels: pull-request-available
>
> Typically there are 3 ways of specifying the amount of memory needed for
> vectors.
> CASE (1) allocateNew() -- here the application doesn't really specify the
> size of memory or value count. Each vector type has a default value count
> (4096) and therefore a default size (in bytes) is used in such cases.
> For example, for a 4 byte fixed-width vector, we will allocate 16KB of memory
> for a call to allocateNew().
> CASE (2) setInitialCapacity(count) followed by allocateNew() - In this case
> also the application doesn't specify the value count or size in
> allocateNew(). However, the call to setInitialCapacity() dictates the amount
> of memory the subsequent call to allocateNew() will allocate.
> For example, we can do setInitialCapacity(1024) and the call to allocateNew()
> will allocate 4KB of memory for the 4 byte fixed-width vector.
> CASE (3) allocateNew(count) - The application is specific about requirements.
> For nullable vectors, the above calls also allocate memory for the validity
> vector.
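
The three cases above can be sketched with a toy model. This is only an
illustration of the sizing rules as described, not Arrow's implementation:
the class ToyFixedWidthVector and its allocatedBytes() accessor are
hypothetical, and only the method names setInitialCapacity() and
allocateNew() and the default value count of 4096 come from the description.

```java
// Hypothetical toy model of the three allocation cases; not Arrow code.
final class ToyFixedWidthVector {
    // Default value count stated in the issue description.
    static final int DEFAULT_VALUE_COUNT = 4096;

    private final int typeWidthBytes;
    private int initialCapacity = DEFAULT_VALUE_COUNT;
    private long allocatedBytes = 0;

    ToyFixedWidthVector(int typeWidthBytes) {
        this.typeWidthBytes = typeWidthBytes;
    }

    // CASE 2: remember the value count for a later allocateNew().
    void setInitialCapacity(int valueCount) {
        this.initialCapacity = valueCount;
    }

    // CASE 1 (no prior setInitialCapacity) or CASE 2: use the stored count.
    void allocateNew() {
        allocateNew(initialCapacity);
    }

    // CASE 3: the caller states the value count explicitly.
    void allocateNew(int valueCount) {
        allocatedBytes = (long) valueCount * typeWidthBytes;
    }

    long allocatedBytes() {
        return allocatedBytes;
    }

    public static void main(String[] args) {
        ToyFixedWidthVector v = new ToyFixedWidthVector(4);
        v.setInitialCapacity(1024);
        v.allocateNew();
        System.out.println(v.allocatedBytes()); // 4096 bytes, i.e. 4KB
    }
}
```

In this model CASE 2 and CASE 3 with the same count allocate the same
amount; they differ only in where the count is supplied.
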
> The problem is that BitVector uses a default memory size of 4096 bytes.
> In other words, we allocate a bit vector sized for a value count of 4096*8.
> In the default case (as explained above), the vector types have a value count
> of 4096, so the bit vector needs only 4096 bits (512 bytes), not 4096 bytes.
> This happens in CASE 1, where the application depends on the default memory
> allocation. In such cases, the buffer for the bit vector is 8x larger than
> actually needed.
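
The arithmetic behind the 8x figure can be checked with a short sketch.
The class and method names here (BitVectorSizing, bytesForBits,
overAllocationFactor) are made up for illustration and are not part of
Arrow; the two constants are the values quoted in the description.

```java
// Hypothetical helper illustrating the sizing arithmetic; not Arrow code.
final class BitVectorSizing {
    // Default value count for vectors, per the description.
    static final int DEFAULT_VALUE_COUNT = 4096;
    // Default bit vector size in bytes reported in the description.
    static final int REPORTED_DEFAULT_BYTES = 4096;

    // One bit per value, rounded up to whole bytes.
    static int bytesForBits(int valueCount) {
        return (valueCount + 7) / 8;
    }

    // Ratio of the reported default size to the size actually needed.
    static int overAllocationFactor() {
        return REPORTED_DEFAULT_BYTES / bytesForBits(DEFAULT_VALUE_COUNT);
    }

    public static void main(String[] args) {
        System.out.println(bytesForBits(DEFAULT_VALUE_COUNT)); // 512
        System.out.println(overAllocationFactor());            // 8
    }
}
```
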
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)