flyrain commented on code in PR #4888:
URL: https://github.com/apache/iceberg/pull/4888#discussion_r900501091


##########
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/ColumnarBatchReader.java:
##########
@@ -209,7 +231,32 @@ void applyEqDelete() {
         rowId++;
       }
 
-      columnarBatch.setNumRows(currentRowId);
+      newColumnarBatch.setNumRows(currentRowId);
+      return currentRowId;
+    }
+
+    /**
+     * Convert the row id mapping array to the isDeleted array.
+     *
+     * @param numRowsInRowIdMapping the number of rows in the row id mapping array
+     */
+    void rowIdMappingToIsDeleted(int numRowsInRowIdMapping) {
+      if (isDeleted == null || rowIdMapping == null) {
+        return;
+      }
+
+      for (int i = 0; i < numRowsToRead; i++) {
+        isDeleted[i] = true;
+      }
+
+      for (int i = 0; i < numRowsInRowIdMapping; i++) {
+        isDeleted[rowIdMapping[i]] = false;
+      }
+
+      // reset the row id mapping array, so that it doesn't filter out the deleted rows
+      for (int i = 0; i < numRowsToRead; i++) {

Review Comment:
   We set rowIdMapping on the ColumnVectorWithFilter at line 133. That 
ColumnVectorWithFilter instance will throw an NPE if we later reset 
rowIdMapping to null.
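
   To make the conversion concrete, here is a minimal standalone sketch of the 
rowIdMapping-to-isDeleted logic from the diff above. The class name 
`RowIdMappingSketch` and the helper signature are hypothetical; only the 
two-pass fill-then-clear idea mirrors the PR code.

   ```java
   import java.util.Arrays;

   public class RowIdMappingSketch {

     /**
      * Hypothetical helper mirroring rowIdMappingToIsDeleted: rowIdMapping[i]
      * holds the original row index of the i-th surviving row, so any index
      * absent from the mapping corresponds to a deleted row.
      */
     static boolean[] toIsDeleted(int[] rowIdMapping, int numRowsInRowIdMapping, int numRowsToRead) {
       boolean[] isDeleted = new boolean[numRowsToRead];
       // First mark every row as deleted...
       Arrays.fill(isDeleted, true);
       // ...then clear the flag for each row the mapping kept.
       for (int i = 0; i < numRowsInRowIdMapping; i++) {
         isDeleted[rowIdMapping[i]] = false;
       }
       return isDeleted;
     }

     public static void main(String[] args) {
       // 5 rows read; rows 0, 2 and 4 survive, so rows 1 and 3 are deleted.
       int[] mapping = {0, 2, 4};
       System.out.println(Arrays.toString(toIsDeleted(mapping, mapping.length, 5)));
       // prints [false, true, false, true, false]
     }
   }
   ```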



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

