[ 
https://issues.apache.org/jira/browse/ARROW-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Khamesra updated ARROW-5057:
-----------------------------------
    Description: 
[Class 
BaseValueVector|https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/BaseValueVector.java]
 's method allocFixedDataAndValidityBufs on line#162 allocates a buffer whose 
size is a power of 2. After that, it has code to release the extra buffer. To 
do that, it calculates the extra amount from the allocated size "bufferSize", 
but in my opinion it should use the original "valueCount" to find the extra 
buffer size. 

Here, I see a problem at line#162, where it's taking "bufferSize" to find the 
extra allocated bytes. It should be "valueCount*typeWidth + valueCount/8".

Here is an example. Let's take 1000 ints. Then,
 valueCount = 1000 ints
 typeWidth = 4 bytes
 validityBufferSize = 125 bytes
 valueBufferSize = 4000 bytes
 combinedSize(valueBufferSize + validityBufferSize) = 4128 bytes (4125 rounded 
up to a multiple of 8)
 combinedSizeWith2ThePowerSize = 8192 bytes; this will be "bufferSize" at 
line#152.

With the above calculation, this code should release 
(combinedSizeWith2ThePowerSize - combinedSize) = 4064 bytes, but this is not 
happening.

 

 

  was:
[Class 
BaseValueVector|https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/BaseValueVector.java]
 's method allocFixedDataAndValidityBufs on line#162 allocates a buffer in 
power of 2 size. After that, it has a code release extra buffer. For that, it 
calculates the extra buffer from allocated size "bufferSize" but in my opinion, 
it should take original "valueCount" to find the extra buffer size. 

Here, I see a problem in line#162, where its taking "bufferSize" to find the 
extra allocated bytes. It should be "valueCount*typeWidth + valueCount/8".

Here is an example for that. Let's take 1000 ints. Then,
valueCount = 1000 ints
typWidth = 4 bytes
validitiyBufferSize = 125 bytes
valueBufferSize = 4000 bytes
combinedSize(valueBufferSize + validityBufferSize) = 4128 bytes (multiple of 8)
combinedSizeWith2ThePowerSize = 8192 bytes, this will be "bufferSize" at 
line#152.

With the above calculation, this code should release 
(combinedSizeWith2ThePowerSize - combinedSize) = 4064 bytes. But, this is not 
happening.

 

 


> Java:  allocate new buffer code doesn't release extra allocated buffer 
> properly
> -------------------------------------------------------------------------------
>
>                 Key: ARROW-5057
>                 URL: https://issues.apache.org/jira/browse/ARROW-5057
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Java
>    Affects Versions: 0.12.1
>            Reporter: Hitesh Khamesra
>            Priority: Major
>
> [Class 
> BaseValueVector|https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/BaseValueVector.java]
>  's method allocFixedDataAndValidityBufs on line#162 allocates a buffer whose 
> size is a power of 2. After that, it has code to release the extra buffer. To 
> do that, it calculates the extra amount from the allocated size "bufferSize", 
> but in my opinion it should use the original "valueCount" to find the extra 
> buffer size. 
> Here, I see a problem at line#162, where it's taking "bufferSize" to find the 
> extra allocated bytes. It should be "valueCount*typeWidth + valueCount/8".
> Here is an example. Let's take 1000 ints. Then,
>  valueCount = 1000 ints
>  typeWidth = 4 bytes
>  validityBufferSize = 125 bytes
>  valueBufferSize = 4000 bytes
>  combinedSize(valueBufferSize + validityBufferSize) = 4128 bytes (4125 rounded 
> up to a multiple of 8)
>  combinedSizeWith2ThePowerSize = 8192 bytes; this will be "bufferSize" at 
> line#152.
> With the above calculation, this code should release 
> (combinedSizeWith2ThePowerSize - combinedSize) = 4064 bytes, but this is not 
> happening.
>  
>  
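
The arithmetic in the description can be checked with a short standalone 
sketch. This is not the code from BaseValueVector: the class name BufferSlack 
and every method in it are hypothetical, and it assumes that the combined size 
is rounded up to a multiple of 8 before the power-of-2 rounding, as the 
"(multiple of 8)" note in the example suggests.

```java
// Hypothetical helper, for illustration only; not part of Apache Arrow.
final class BufferSlack {

    // Round n up to the next multiple of 8 (assumed alignment).
    public static long roundUp8(long n) {
        return (n + 7) / 8 * 8;
    }

    // Smallest power of two that is >= n, for n >= 1.
    public static long nextPowerOfTwo(long n) {
        long highest = Long.highestOneBit(n);
        return highest == n ? n : highest << 1;
    }

    // Validity bitmap size: one bit per value, rounded up to whole bytes.
    public static long validityBytes(long valueCount) {
        return (valueCount + 7) / 8;
    }

    // Bytes actually needed for the data and validity buffers together.
    public static long combinedSize(long valueCount, long typeWidth) {
        return roundUp8(valueCount * typeWidth + validityBytes(valueCount));
    }

    // Bytes that the power-of-2 allocation holds beyond what is needed.
    public static long expectedSlack(long valueCount, long typeWidth) {
        long combined = combinedSize(valueCount, typeWidth);
        return nextPowerOfTwo(combined) - combined;
    }

    public static void main(String[] args) {
        long valueCount = 1000;
        long typeWidth = 4;
        long combined = combinedSize(valueCount, typeWidth);
        System.out.println("combinedSize  = " + combined);
        System.out.println("bufferSize    = " + nextPowerOfTwo(combined));
        System.out.println("expectedSlack = " + expectedSlack(valueCount, typeWidth));
    }
}
```

For 1000 ints this prints 4128, 8192 and 4064, matching the numbers in the 
example. The point of the sketch is that the slack has to be derived from 
"valueCount" and "typeWidth"; computing it from "bufferSize" alone cannot 
recover the 4128 bytes that are really needed.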



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
