rdblue commented on a change in pull request #3533:
URL: https://github.com/apache/iceberg/pull/3533#discussion_r748886772



##########
File path: arrow/src/main/java/org/apache/iceberg/arrow/vectorized/VectorizedArrowReader.java
##########
@@ -407,35 +412,55 @@ public void setBatchSize(int batchSize) {
   }
 
   private static final class PositionVectorReader extends VectorizedArrowReader {
+    private final Field arrowField = ArrowSchemaUtil.convert(MetadataColumns.ROW_POSITION);
+    private final BufferAllocator bufferAllocator = ArrowAllocation.rootAllocator();
+    private final boolean setArrowValidityVector;
     private long rowStart;
+    private int batchSize;
+    private FieldVector vec;
     private NullabilityHolder nulls;
 
+    PositionVectorReader(boolean setArrowValidityVector) {
+      this.setArrowValidityVector = setArrowValidityVector;
+    }
+
     @Override
     public VectorHolder read(VectorHolder reuse, int numValsToRead) {
-      Field arrowField = ArrowSchemaUtil.convert(MetadataColumns.ROW_POSITION);
-      FieldVector vec = arrowField.createVector(ArrowAllocation.rootAllocator());
-
-      if (reuse != null) {
-        vec.setValueCount(0);
-        nulls.reset();
+      if (reuse == null) {
+        this.vec = newVector();
+        this.nulls = newNullabilityHolder();
       } else {
-        ((BigIntVector) vec).allocateNew(numValsToRead);
-        for (int i = 0; i < numValsToRead; i += 1) {
-          vec.getDataBuffer().setLong(i * Long.BYTES, rowStart + i);
-        }
-        for (int i = 0; i < numValsToRead; i += 1) {
-          BitVectorHelper.setBit(vec.getValidityBuffer(), i);
+        vec.setValueCount(0);

Review comment:
       There is no guarantee that the instance variable `vec` is the same one 
that backs the reused `VectorHolder`. The method contract is that this will 
fill the reused vector with data, not that it will reuse a vector if the holder 
is non-null. I think that the `else` case here should set the _local_ variable 
`vec` to `reuse.vector()`. Technically, we should probably use the same 
nullability holder as well, but I think that it is okay to return a new 
`VectorHolder` with the constant nullability holder instead.
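
To make the suggestion concrete, here is a minimal, self-contained sketch of the shape being proposed. It deliberately avoids the real Arrow and Iceberg types: `LongVector`, `Holder`, and `PositionReader` are hypothetical stand-ins for `BigIntVector`, `VectorHolder`, and `PositionVectorReader`, and advancing `rowStart` inside `read` is an assumption made so the example has something to show. The only point illustrated is that the `else` branch takes its vector from the holder that was passed in, rather than from a field cached by an earlier call.

```java
// Stand-ins for illustration only; these are NOT the Arrow or Iceberg classes.
final class PositionReaderSketch {

  /** Hypothetical stand-in for BigIntVector: a long buffer plus a value count. */
  static final class LongVector {
    private long[] data = new long[0];
    private int valueCount;

    // Simplified: only grows the buffer; the real allocateNew also releases old memory.
    void allocateNew(int capacity) {
      if (data.length < capacity) {
        data = new long[capacity];
      }
    }

    void set(int index, long value) {
      data[index] = value;
    }

    long get(int index) {
      return data[index];
    }

    void setValueCount(int count) {
      valueCount = count;
    }

    int getValueCount() {
      return valueCount;
    }
  }

  /** Hypothetical stand-in for VectorHolder. */
  static final class Holder {
    private final LongVector vector;

    Holder(LongVector vector) {
      this.vector = vector;
    }

    LongVector vector() {
      return vector;
    }
  }

  /** Hypothetical stand-in for PositionVectorReader. */
  static final class PositionReader {
    private long rowStart; // assumption: advanced by read() in this sketch

    Holder read(Holder reuse, int numValsToRead) {
      // Local variable: a fresh vector when there is nothing to reuse,
      // otherwise the vector that backs the holder the caller handed in.
      LongVector vec = (reuse == null) ? new LongVector() : reuse.vector();

      vec.setValueCount(0);
      vec.allocateNew(numValsToRead);
      for (int i = 0; i < numValsToRead; i += 1) {
        vec.set(i, rowStart + i);
      }
      vec.setValueCount(numValsToRead);
      rowStart += numValsToRead;

      // Returning a new holder around the filled vector is acceptable.
      return new Holder(vec);
    }
  }

  public static void main(String[] args) {
    PositionReader reader = new PositionReader();
    reader.read(null, 3);

    // The caller supplies a holder whose vector the reader has never seen.
    LongVector callerOwned = new LongVector();
    Holder result = reader.read(new Holder(callerOwned), 2);
    System.out.println("filled caller's vector: " + (result.vector() == callerOwned));
  }
}
```

With the field-based version in the diff, the second call would have written into whatever vector the reader created earlier and left the caller's vector untouched.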




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


