[ 
https://issues.apache.org/jira/browse/ARROW-5429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liya Fan updated ARROW-5429:
----------------------------
    Description: 
The current buffer allocation policy works like this (a sketch of this behavior follows the list):
 * If the requested buffer size is greater than or equal to the chunk size, the buffer size is used as is.
 * If the requested size is smaller than the chunk size, the buffer size is rounded up to the next power of 2.
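For illustration only, here is a minimal Java sketch of the current rounding behavior, assuming a 16 MB chunk size; the names CHUNK_SIZE and roundUnderCurrentPolicy are hypothetical and not part of the Arrow API:

{code:java}
public class CurrentPolicySketch {
  // Assumed chunk size for illustration; the real value comes from the allocator's configuration.
  static final long CHUNK_SIZE = 16L * 1024 * 1024;

  // Hypothetical helper mirroring the behavior described above, not Arrow's actual code.
  static long roundUnderCurrentPolicy(long requested) {
    if (requested >= CHUNK_SIZE) {
      return requested; // at or above the chunk size: the size is used as is
    }
    // below the chunk size: round up to the next power of 2
    long highestBit = Long.highestOneBit(requested);
    return (highestBit == requested) ? requested : highestBit << 1;
  }

  public static void main(String[] args) {
    // a 10 MB request is rounded up to 16 MB (16777216 bytes)
    System.out.println(roundUnderCurrentPolicy(10L * 1024 * 1024));
  }
}
{code}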

This policy can waste memory in some cases. For example, if we request a buffer of 10 MB, Arrow rounds the buffer size up to 16 MB. If we only need 10 MB, this wastes (16 - 10) / 10 = 60% of the requested size.

In this proposal, we provide an alternative policy: the buffer size is rounded up to a multiple of a fixed memory unit, such as 32 KB. This policy has two benefits (see the sketch after this list):
 # The wasted memory cannot exceed one memory unit (e.g. 32 KB), which is much smaller than the waste under the power-of-two policy.
 # This is the memory allocation policy adopted by some computation engines (e.g. Apache Flink).
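A minimal sketch of the proposed rounding, assuming a 32 KB unit; the names UNIT and roundUnderProposedPolicy are hypothetical:

{code:java}
public class ProposedPolicySketch {
  // Assumed rounding unit from the proposal; 32 KB is used here as an example.
  static final long UNIT = 32L * 1024;

  // Hypothetical helper: round the requested size up to the next multiple of UNIT.
  static long roundUnderProposedPolicy(long requested) {
    return ((requested + UNIT - 1) / UNIT) * UNIT;
  }

  public static void main(String[] args) {
    // A 10 MB request is already a multiple of 32 KB, so nothing is wasted,
    // compared with 6 MB of waste under the power-of-two policy.
    System.out.println(roundUnderProposedPolicy(10L * 1024 * 1024)); // 10485760 (10 MB)
  }
}
{code}

With this scheme, the per-allocation overhead is bounded by less than one unit, regardless of the requested size.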

> Provide alternative buffer allocation policy
> --------------------------------------------
>
>                 Key: ARROW-5429
>                 URL: https://issues.apache.org/jira/browse/ARROW-5429
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Java
>            Reporter: Liya Fan
>            Assignee: Liya Fan
>            Priority: Major
>


