Rong Ma created ARROW-8803:
------------------------------

             Summary: [Java] Row count should be set before loading buffers in VectorLoader
                 Key: ARROW-8803
                 URL: https://issues.apache.org/jira/browse/ARROW-8803
             Project: Apache Arrow
          Issue Type: Bug
          Components: Java
            Reporter: Rong Ma
             Fix For: 1.0.0

Hi guys! I'm new to the community, and I've been using Arrow for some time. In my use case, I need to read RecordBatches with *compressed* underlying buffers using Java's IPC API, and I'm currently blocked by VectorLoader's "load" method. In this method, the call

{quote}{{root.setRowCount(recordBatch.getLength());}}{quote}

not only sets the rowCount for the root, but also sets the valueCount for each vector the root holds, *which has already been set once while loading the buffers.*

I know this is not a bug in itself. But if I try to load compressed buffers, I get the following exception:

{quote}
java.lang.IndexOutOfBoundsException: index: 0, length: 512 (expected: range(0, 504))
 at io.netty.buffer.ArrowBuf.checkIndex(ArrowBuf.java:718)
 at io.netty.buffer.ArrowBuf.setBytes(ArrowBuf.java:965)
 at org.apache.arrow.vector.BaseFixedWidthVector.reAlloc(BaseFixedWidthVector.java:439)
 at org.apache.arrow.vector.BaseFixedWidthVector.setValueCount(BaseFixedWidthVector.java:708)
 at org.apache.arrow.vector.VectorSchemaRoot.setRowCount(VectorSchemaRoot.java:226)
 at org.apache.arrow.vector.VectorLoader.load(VectorLoader.java:61)
 at org.apache.arrow.vector.ipc.ArrowReader.loadRecordBatch(ArrowReader.java:205)
 at org.apache.arrow.vector.ipc.ArrowStreamReader.loadNextBatch(ArrowStreamReader.java:122)
{quote}

So I started to wonder whether it would make more sense to call root.setRowCount before loadBuffers. root.setRowCount also calls each vector's setValueCount, which I think is unnecessary at this point, since the vectors are already fully formed once loadBuffers has run.

An existing piece of code upstream is similar to this change: [link|https://github.com/apache/arrow/blob/ed1f771dccdde623ce85e212eccb2b573185c461/java/vector/src/main/java/org/apache/arrow/vector/ipc/JsonFileReader.java#L170-L178]
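
To make the ordering argument concrete, here is a minimal toy model in plain Java. It is only a sketch under stated assumptions: {{ToyVector}}, its doubling growth policy, and the two helper methods are hypothetical stand-ins that I made up for illustration, not Arrow classes or Arrow's actual reallocation logic. The sizes 504 and 512 are borrowed from the exception message above purely as example numbers. The model only shows that a "grow until the count fits" step touches a smaller-than-expected loaded buffer when it runs after the load, and leaves it alone when it runs before.

```java
// Toy model only: ToyVector is a hypothetical stand-in, not an Arrow class,
// and the doubling growth policy is an assumption made for illustration.
class RowCountOrderSketch {

    /** Hypothetical fixed-width vector holding 8-byte values. */
    static final class ToyVector {
        static final int TYPE_WIDTH = 8;
        private int capacityBytes = 0;
        private int valueCount = 0;
        // True while the buffer handed to loadBuffer is still the one in use.
        private boolean loadedBufferIntact = false;

        /** Adopts an externally supplied buffer as-is, the way a loader would. */
        void loadBuffer(int bufferBytes, int count) {
            capacityBytes = bufferBytes;
            valueCount = count;
            loadedBufferIntact = true;
        }

        /** Grows the buffer until it can hold count values, then records count. */
        void setValueCount(int count) {
            while ((long) count * TYPE_WIDTH > capacityBytes) {
                // Assumed growth policy: double, starting from one value.
                capacityBytes = Math.max(TYPE_WIDTH, capacityBytes * 2);
                // A reallocation replaces whatever buffer was loaded before.
                loadedBufferIntact = false;
            }
            valueCount = count;
        }

        boolean isLoadedBufferIntact() {
            return loadedBufferIntact;
        }
    }

    /** Order described in the report: load the buffer, then set the count. */
    static boolean loadThenSetRowCount(int bufferBytes, int rowCount) {
        ToyVector v = new ToyVector();
        v.loadBuffer(bufferBytes, rowCount);
        v.setValueCount(rowCount);
        return v.isLoadedBufferIntact();
    }

    /** Proposed order: set the count first, then load the buffer. */
    static boolean setRowCountThenLoad(int bufferBytes, int rowCount) {
        ToyVector v = new ToyVector();
        v.setValueCount(rowCount);
        v.loadBuffer(bufferBytes, rowCount);
        return v.isLoadedBufferIntact();
    }

    public static void main(String[] args) {
        int rowCount = 64;         // 64 values * 8 bytes = 512 bytes uncompressed
        int compressedBytes = 504; // smaller than 512, as a compressed buffer might be
        System.out.println("load, then setRowCount keeps loaded buffer: "
                + loadThenSetRowCount(compressedBytes, rowCount));
        System.out.println("setRowCount, then load keeps loaded buffer: "
                + setRowCountThenLoad(compressedBytes, rowCount));
    }
}
```

In this model the first order reports that the loaded buffer was replaced, while the second leaves it intact. Whether the real VectorLoader behaves the same way after reordering would of course need to be checked against the actual code.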