[ 
https://issues.apache.org/jira/browse/ARROW-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-1547:
----------------------------------
    Labels: pull-request-available  (was: )

> [JAVA] Fix 8x memory over-allocation in BitVector
> -------------------------------------------------
>
>                 Key: ARROW-1547
>                 URL: https://issues.apache.org/jira/browse/ARROW-1547
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Java - Vectors
>            Reporter: Siddharth Teotia
>            Assignee: Siddharth Teotia
>              Labels: pull-request-available
>
> Typically there are 3 ways of specifying the amount of memory needed for 
> vectors.
> CASE (1) allocateNew() -- here the application specifies neither the memory 
> size nor the value count. Each vector type has a default value count 
> (4096), so a default size (in bytes) is used in such cases.
> For example, for a 4-byte fixed-width vector, a call to allocateNew() 
> allocates 16KB of memory (4096 values * 4 bytes).
> CASE (2) setInitialCapacity(count) followed by allocateNew() - Here too the 
> application does not pass a value count or size to allocateNew(). However, 
> the preceding call to setInitialCapacity() determines how much memory the 
> subsequent call to allocateNew() will allocate.
> For example, after setInitialCapacity(1024), the call to allocateNew() 
> will allocate 4KB of memory for a 4-byte fixed-width vector.
> CASE (3) allocateNew(count) - The application states its requirement 
> explicitly.
> For nullable vectors, the above calls also allocate memory for the validity 
> vector.
> The problem is that BitVector uses a default memory size of 4096 bytes. 
> In other words, it allocates room for a value count of 4096*8 = 32768.
> In the default case (as explained above), the vector types have a value count 
> of 4096, so the bit vector needs only 4096 bits (512 bytes), not 4096 
> bytes.
> This happens in CASE 1, where the application relies on the default memory 
> allocation. In such cases, the bit vector's buffer is 8x larger than 
> actually needed.
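
The arithmetic in the description can be sketched in plain Java. This is only
an illustration and not Arrow code: the class ValidityBufferSizing, its
constants, and its helper methods are hypothetical names. The two figures it
relies on (a default value count of 4096 and a 4096-byte default for
BitVector) are taken from the report above.

```java
// Hypothetical helper, not part of Apache Arrow. It only reproduces the
// sizing arithmetic described in the issue.
final class ValidityBufferSizing {
    // Default value count per vector, as stated in the report.
    static final int DEFAULT_VALUE_COUNT = 4096;
    // Default BitVector size in bytes, as stated in the report.
    static final int REPORTED_BIT_VECTOR_DEFAULT_BYTES = 4096;

    // Bytes needed by the data buffer of a fixed-width vector.
    static int dataBytes(int valueCount, int typeWidthBytes) {
        return valueCount * typeWidthBytes;
    }

    // Bytes needed by a validity buffer: one bit per value, rounded up
    // to a whole byte.
    static int validityBytes(int valueCount) {
        return (valueCount + 7) / 8;
    }

    // Ratio between the reported default and what is actually needed.
    static int overAllocationFactor(int valueCount) {
        return REPORTED_BIT_VECTOR_DEFAULT_BYTES / validityBytes(valueCount);
    }

    public static void main(String[] args) {
        // CASE 2 example from the report: 1024 values of 4 bytes each.
        System.out.println("data bytes (1024 x 4): " + dataBytes(1024, 4));
        // CASE 1: validity bits for the default value count.
        System.out.println("validity bytes needed: "
                + validityBytes(DEFAULT_VALUE_COUNT));
        System.out.println("validity bytes allocated: "
                + REPORTED_BIT_VECTOR_DEFAULT_BYTES);
        System.out.println("over-allocation factor: "
                + overAllocationFactor(DEFAULT_VALUE_COUNT) + "x");
    }
}
```

Under these assumptions the sketch reports 512 bytes needed against 4096
bytes allocated for the validity buffer, which is the 8x factor named in the
issue title.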



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
