[ 
https://issues.apache.org/jira/browse/ARROW-399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15723023#comment-15723023
 ] 

Julien Le Dem commented on ARROW-399:
-------------------------------------

We can change the behavior on the Java side and, at a minimum, not infer sizes 
that don't match the metadata.

As a separate discussion, though, we can pad buffers without changing their 
size: the metadata can still reflect the size of the buffer that we actually 
use, and we either leave unused space between buffers or make sure each buffer 
starts at an appropriately aligned address.
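
To make the padding idea concrete, here is a minimal sketch of such a layout. 
It is not Arrow code: the class PaddedLayoutSketch, the BufferEntry record and 
the layout() helper are hypothetical names, and the 64-byte alignment is taken 
from the issue description. Each buffer starts on an aligned address while the 
recorded length stays the number of bytes actually used.

```java
import java.util.ArrayList;
import java.util.List;

public class PaddedLayoutSketch {
    // Assumption: 64-byte alignment, as mentioned in the issue description.
    public static final int ALIGNMENT = 64;

    /** Hypothetical metadata entry: where a buffer starts and how many bytes it really uses. */
    public static final class BufferEntry {
        public final long offset;
        public final long length;

        public BufferEntry(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Rounds offset up to the next multiple of alignment (alignment must be a power of two). */
    public static long alignUp(long offset, int alignment) {
        long mask = (long) alignment - 1;
        return (offset + mask) & ~mask;
    }

    /** Places buffers back to back, aligning each start; lengths are recorded unpadded. */
    public static List<BufferEntry> layout(long[] usedLengths) {
        List<BufferEntry> entries = new ArrayList<>();
        long cursor = 0;
        for (long length : usedLengths) {
            long start = alignUp(cursor, ALIGNMENT);
            entries.add(new BufferEntry(start, length)); // metadata keeps the real size
            cursor = start + length;                     // the gap before the next start is padding
        }
        return entries;
    }

    public static void main(String[] args) {
        // Length-7 list vector: 1 byte of validity bits, (7 + 1) * 4 = 32 bytes of offsets.
        for (BufferEntry e : layout(new long[] {1, 32})) {
            System.out.println("offset=" + e.offset + " length=" + e.length);
        }
        // prints "offset=0 length=1" then "offset=64 length=32"
    }
}
```

With this kind of layout a reader never has to guess a size from the padded 
extent, because the padding lives between buffers rather than inside them.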

> [Java] ListVector.loadFieldBuffers ignores the ArrowFieldNode length metadata
> -----------------------------------------------------------------------------
>
>                 Key: ARROW-399
>                 URL: https://issues.apache.org/jira/browse/ARROW-399
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Java - Vectors
>            Reporter: Wes McKinney
>            Assignee: Julien Le Dem
>            Priority: Blocker
>         Attachments: list_error.json
>
>
> Discovered this during integration testing. Because Arrow-C++ writes buffers 
> padded to 64 bytes, they may appear larger to the Java library than they need 
> to be. In ListVector.loadFieldBuffers, the ArrowFieldNode is never used:
> {code:language=java}
>   @Override
>   public void loadFieldBuffers(ArrowFieldNode fieldNode, List<ArrowBuf> ownBuffers) {
>     BaseDataValueVector.load(getFieldInnerVectors(), ownBuffers);
>   }
> {code}
> The value count of the resulting ListVector is thus inferred from the size of 
> the offsets buffer. In the case of a length-7 vector in C++, the size of the 
> offsets buffer is exactly 64 bytes (padding for SIMD) -- Java infers from 64 
> bytes that the value count is 15 (64 / 4 - 1), and the integration test fails.
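
The arithmetic in the description can be reproduced with a small standalone 
sketch. It does not use the Arrow API: ValueCountSketch and its methods are 
hypothetical, and it only assumes 4-byte offsets as in the report. It contrasts 
inferring the value count from the padded buffer size with trusting the length 
from the metadata and using the buffer size purely as a sanity check.

```java
public class ValueCountSketch {
    // Assumption: list offsets are 4-byte integers, as in the issue description.
    public static final int OFFSET_WIDTH = 4;

    /** The problematic inference: derive the value count from the buffer size alone. */
    public static int inferredFromBufferSize(int offsetsBufferBytes) {
        return offsetsBufferBytes / OFFSET_WIDTH - 1;
    }

    /** Bytes of offsets actually needed for valueCount values (valueCount + 1 offsets). */
    public static int requiredOffsetsBytes(int valueCount) {
        return (valueCount + 1) * OFFSET_WIDTH;
    }

    /** Trusts the metadata length; the buffer only has to be large enough for it. */
    public static int checkedValueCount(int metadataLength, int offsetsBufferBytes) {
        int required = requiredOffsetsBytes(metadataLength);
        if (offsetsBufferBytes < required) {
            throw new IllegalArgumentException(
                "offsets buffer has " + offsetsBufferBytes + " bytes, need " + required);
        }
        return metadataLength;
    }

    public static void main(String[] args) {
        int paddedBytes = 64; // offsets buffer of a length-7 vector, padded to 64 bytes
        System.out.println(inferredFromBufferSize(paddedBytes)); // prints 15
        System.out.println(requiredOffsetsBytes(7));             // prints 32
        System.out.println(checkedValueCount(7, paddedBytes));   // prints 7
    }
}
```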



