[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #3557: Spark: Support vectorized reads with equality deletes

GitBox Thu, 09 Dec 2021 10:18:27 -0800


RussellSpitzer commented on a change in pull request #3557:
URL: https://github.com/apache/iceberg/pull/3557#discussion_r766041721




##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/ColumnarBatchReader.java
##########
@@ -78,63 +76,13 @@ public final ColumnarBatch read(ColumnarBatch reuse, int numRowsToRead) {
           "Number of rows in the vector %s didn't match expected %s ", numRowsInVector,
           numRowsToRead);
 
-      if (rowIdMapping == null) {
-        arrowColumnVectors[i] = IcebergArrowColumnVector.forHolder(vectorHolders[i], numRowsInVector);
-      } else {
-        int[] rowIdMap = rowIdMapping.first();
-        Integer numRows = rowIdMapping.second();
-        arrowColumnVectors[i] = ColumnVectorWithFilter.forHolder(vectorHolders[i], rowIdMap, numRows);
-      }
+      arrowColumnVectors[i] = batch.hasDeletes() ?

Review comment:
       From what I can tell, the main reason we want to split this out is that
ColumnarBatchReader can't build this entire object in its constructor, so it's
probably fine to keep this as an inner class of ColumnarBatchReader. I just
want to keep as much state as possible in final fields set in a constructor.
Maybe that's a bit weird, so if you have other feelings let me know. But
basically I was hoping all of the more complicated state gets created inside
an immutable class.
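
       A minimal sketch of that shape, to make the suggestion concrete. The
names here (SketchBatchReader, LoadedBatch, and their methods) are hypothetical
and are not taken from the PR or from Iceberg; the sketch only illustrates
per-batch state being computed once in the constructor of an immutable inner
class, since the outer reader cannot know it at construction time.

```java
import java.util.Arrays;

// Hypothetical outer reader; it only knows long-lived configuration up front.
final class SketchBatchReader {
  private final int columnCount;

  SketchBatchReader(int columnCount) {
    this.columnCount = columnCount;
  }

  // Hypothetical immutable inner class: all per-batch state is derived in the
  // constructor and stored in final fields, so nothing changes afterwards.
  final class LoadedBatch {
    private final int numRowsRead;
    private final int[] rowIdMapping; // positions of rows that survive deletes

    LoadedBatch(int numRowsToRead, boolean[] deleted) {
      if (deleted.length != numRowsToRead) {
        throw new IllegalArgumentException("deleted flags must cover every row");
      }
      int[] mapping = new int[numRowsToRead];
      int live = 0;
      for (int row = 0; row < numRowsToRead; row++) {
        if (!deleted[row]) {
          mapping[live++] = row;
        }
      }
      this.numRowsRead = numRowsToRead;
      this.rowIdMapping = Arrays.copyOf(mapping, live);
    }

    boolean hasDeletes() {
      return rowIdMapping.length < numRowsRead;
    }

    int numLiveRows() {
      return rowIdMapping.length;
    }

    // Defensive copy so callers cannot mutate the batch's internal state.
    int[] rowIdMapping() {
      return rowIdMapping.clone();
    }

    // The inner class can still read the outer reader's configuration.
    int columnCount() {
      return columnCount;
    }
  }
}
```

       With this layout the read path would create one such object per batch,
for example `reader.new LoadedBatch(numRowsToRead, deleted)`, and then only
query it.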




