RussellSpitzer commented on a change in pull request #3557:
URL: https://github.com/apache/iceberg/pull/3557#discussion_r766041721
##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/ColumnarBatchReader.java
##########
@@ -78,63 +76,13 @@ public final ColumnarBatch read(ColumnarBatch reuse, int numRowsToRead) {
"Number of rows in the vector %s didn't match expected %s ",
numRowsInVector,
numRowsToRead);
- if (rowIdMapping == null) {
- arrowColumnVectors[i] = IcebergArrowColumnVector.forHolder(vectorHolders[i], numRowsInVector);
- } else {
- int[] rowIdMap = rowIdMapping.first();
- Integer numRows = rowIdMapping.second();
- arrowColumnVectors[i] = ColumnVectorWithFilter.forHolder(vectorHolders[i], rowIdMap, numRows);
- }
+ arrowColumnVectors[i] = batch.hasDeletes() ?
Review comment:
From what I can tell, the main reason we want to split this out is that
ColumnarBatchReader can't build this entire object in its constructor. So it's
probably fine to keep this as an inner class of ColumnarBatchReader. I just
want to keep as much state as possible in final fields that are set in the
constructor. Maybe that's a bit weird, so if you have other feelings let me
know. But basically I was hoping all of the more complicated state gets
created inside an immutable class.
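
To make the suggestion concrete, here is a rough sketch of the shape I have
in mind. It is not the Iceberg code: ReaderSketch, BatchLoader,
setDeletedPositions and liveRows are hypothetical names, and plain int arrays
stand in for the Arrow vectors and the row id mapping. The outer reader keeps
the state it only learns after construction, and every read creates a small
inner object whose fields are all final and computed in its constructor.

```java
import java.util.Arrays;

// Hypothetical stand-in for ColumnarBatchReader; names and types are assumptions.
class ReaderSketch {
  // Reader state that is only known after the reader has been constructed.
  private int[] deletedPositions = new int[0];

  void setDeletedPositions(int[] positions) {
    int[] sorted = positions.clone();
    Arrays.sort(sorted); // binarySearch below requires sorted input
    this.deletedPositions = sorted;
  }

  // Each read builds a fresh loader, so per-batch state never leaks between reads.
  int[] read(int numRowsToRead) {
    return new BatchLoader(numRowsToRead).liveRows();
  }

  // Inner class: it can see the reader's fields, but its own state is immutable.
  private final class BatchLoader {
    private final int numRowsToRead;
    private final int[] rowIdMapping; // null when the batch has no deletes

    BatchLoader(int numRowsToRead) {
      this.numRowsToRead = numRowsToRead;
      this.rowIdMapping = buildRowIdMapping(numRowsToRead);
    }

    boolean hasDeletes() {
      return rowIdMapping != null;
    }

    // Positions of the rows that survive the deletes, in order.
    int[] liveRows() {
      if (hasDeletes()) {
        return rowIdMapping.clone();
      }
      int[] all = new int[numRowsToRead];
      for (int row = 0; row < numRowsToRead; row++) {
        all[row] = row;
      }
      return all;
    }

    private int[] buildRowIdMapping(int numRows) {
      if (deletedPositions.length == 0) {
        return null;
      }
      int[] mapping = new int[numRows];
      int live = 0;
      for (int row = 0; row < numRows; row++) {
        if (Arrays.binarySearch(deletedPositions, row) < 0) {
          mapping[live++] = row;
        }
      }
      return Arrays.copyOf(mapping, live);
    }
  }
}
```

With this layout the call site only has to ask the loader whether it has
deletes, and nothing on the loader can change after it is built.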