wotbrew opened a new issue, #38242:
URL: https://github.com/apache/arrow/issues/38242
### Describe the bug, including details regarding any error messages, version, and platform.
`DenseUnionVector.getBufferSizeFor` takes a count parameter. I would expect
that count to be the number of union elements you wish to account for.
However, the count is passed unchanged to `internalStruct.getBufferSizeFor`,
which I suspect is a bug.
Struct fields normally share the same `valueCount`, but this is unlikely to
hold for the legs of a union. If your legs have different lengths:
- for fixed-width vectors, the size of the leg contents is calculated
incorrectly
- for dynamic vectors like VarBinary, finding the data size may require
reading buffer contents, potentially causing an out-of-bounds dereference
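
To illustrate the mismatch, here is a minimal plain-Java sketch with no Arrow dependency. The class `DenseUnionLegLengths` and the method `legLengths` are hypothetical names used only for illustration; they are not Arrow API, and this is not a proposed fix. The sketch assumes that the length a leg needs is the highest offset recorded for that leg plus one, and compares it with the union-wide count.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Arrow: models per-leg lengths of a dense union. */
final class DenseUnionLegLengths {

    /**
     * For the first {@code count} union slots, returns how many values each
     * leg must hold: the highest offset seen for that leg, plus one.
     */
    static int[] legLengths(byte[] typeIds, int[] offsets, int count, int legs) {
        int[] lengths = new int[legs];
        for (int i = 0; i < count; i++) {
            int leg = typeIds[i];
            lengths[leg] = Math.max(lengths[leg], offsets[i] + 1);
        }
        return lengths;
    }

    public static void main(String[] args) {
        // Same shape as the repro, at small scale:
        // three ints in leg 0, then one binary value in leg 1.
        byte[] typeIds = {0, 0, 0, 1};
        int[] offsets = {0, 1, 2, 0};
        int unionCount = typeIds.length;

        int[] lengths = legLengths(typeIds, offsets, unionCount, 2);
        System.out.println(Arrays.toString(lengths)); // prints [3, 1]

        // Data bytes for the 32-bit int leg, sized two ways.
        System.out.println(lengths[0] * Integer.BYTES); // prints 12
        System.out.println(unionCount * Integer.BYTES); // prints 16

        // A variable-width leg holding n values has n + 1 offset entries.
        // Leg 1 holds one value, so it has entries 0 and 1; sizing it with
        // the union count (4) would refer to entry 4, which does not exist.
    }
}
```

In this toy case the union count over-sizes the fixed-width leg by one slot and points past the end of the variable-width leg's offsets, which matches the two failure modes listed above.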
Observed on `13.0.0` and `12.0.1`.
Repro:
```java
package xtdb.util;

import org.apache.arrow.memory.BoundsChecking;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.BaseValueVector;
import org.apache.arrow.vector.complex.DenseUnionVector;
import org.apache.arrow.vector.types.UnionMode;
import org.apache.arrow.vector.types.pojo.ArrowType;
import org.apache.arrow.vector.types.pojo.Field;
import org.apache.arrow.vector.types.pojo.FieldType;
import org.junit.jupiter.api.Test;

import java.util.Arrays;

import static org.junit.jupiter.api.Assertions.*;

public class DUVBufferSizeTest {

    @Test
    public void testBufferSize() {
        try (var allocator = new RootAllocator();
             var duv = new DenseUnionVector("duv", allocator,
                     FieldType.nullable(new ArrowType.Union(UnionMode.Dense, null)), null)) {

            // Two legs: a fixed-width int leg and a variable-width binary leg.
            var fields = Arrays.asList(
                    new Field("a", FieldType.notNullable(new ArrowType.Int(32, true)), null),
                    new Field("b", FieldType.notNullable(new ArrowType.Binary()), null)
            );
            duv.initializeChildrenFromFields(fields);

            byte atid = 0;
            byte btid = 1;
            var a = duv.getIntVector(atid);
            var b = duv.getVarBinaryVector(btid);

            // Fill leg "a" with one more value than the initial allocation.
            int ac = BaseValueVector.INITIAL_VALUE_ALLOCATION + 1;
            for (int i = 0; i < ac; i++) {
                a.setSafe(i, 1);
                duv.setTypeId(i, atid);
                duv.setOffset(i, i);
            }

            // Leg "b" holds a single empty value, so the legs differ in length.
            int bc = 1;
            for (int i = 0; i < bc; i++) {
                b.setSafe(i, new byte[0]);
                duv.setTypeId(i + ac, btid);
                duv.setOffset(i + ac, i);
            }

            duv.setValueCount(ac + bc);

            // Will not necessarily see an error unless bounds checking is on.
            assertTrue(BoundsChecking.BOUNDS_CHECKING_ENABLED);
            assertDoesNotThrow(duv::getBufferSize);
        }
    }
}
```
### Component(s)
Java