flyrain commented on code in PR #4683:
URL: https://github.com/apache/iceberg/pull/4683#discussion_r875368116
##########
core/src/main/java/org/apache/iceberg/deletes/Deletes.java:
##########
@@ -63,14 +65,14 @@ public static <T> CloseableIterable<T> filter(CloseableIterable<T> rows, Functio
     return equalityFilter.filter(rows);
   }
-  public static <T> CloseableIterable<T> filter(CloseableIterable<T> rows, Function<T, Long> rowToPosition,
-                                                PositionDeleteIndex deleteSet) {
-    if (deleteSet.isEmpty()) {
-      return rows;
-    }
-
-    PositionSetDeleteFilter<T> filter = new PositionSetDeleteFilter<>(rowToPosition, deleteSet);
-    return filter.filter(rows);
+  public static <T> CloseableIterable<T> markDeleted(CloseableIterable<T> rows, Predicate<T> isDeleted,
+                                                     Consumer<T> deleteMarker) {
+    return CloseableIterable.transform(rows, row -> {
+      if (isDeleted.test(row)) {
Review Comment:
My point is that for non-vectorized reads we inevitably have to check each
row. Vectorized reads are a different story, though. For example, we can
generate the isDelete boolean array directly from the position deletes.
Of course, there is no way to avoid the per-row check for equality deletes,
with either vectorized or non-vectorized reads.
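
To make the vectorized case concrete, here is a minimal sketch of building the
boolean array for one batch by walking the position deletes rather than testing
every row. The class and method names are hypothetical and are not part of
Iceberg; the sketch also assumes the delete positions are available as a
sorted array of distinct values, which may not match how the PR holds them.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not an Iceberg API.
final class PosDeleteMaskSketch {
  private PosDeleteMaskSketch() {
  }

  // Returns a mask for the rows [batchStart, batchStart + numRows).
  // Assumes sortedDeletedPositions is sorted ascending with no duplicates.
  static boolean[] isDeleted(long batchStart, int numRows, long[] sortedDeletedPositions) {
    boolean[] deleted = new boolean[numRows];
    long batchEnd = batchStart + numRows; // exclusive upper bound

    // Find the first delete position that is >= batchStart.
    int idx = Arrays.binarySearch(sortedDeletedPositions, batchStart);
    if (idx < 0) {
      idx = -idx - 1; // not found: convert to the insertion point
    }

    // Walk only the delete positions that fall inside this batch.
    for (int i = idx; i < sortedDeletedPositions.length && sortedDeletedPositions[i] < batchEnd; i++) {
      deleted[(int) (sortedDeletedPositions[i] - batchStart)] = true;
    }
    return deleted;
  }
}
```

The work here is proportional to the number of deletes that land in the batch
plus one binary search, so a batch with no position deletes costs no per-row
test. Equality deletes have no such shortcut, since each row's values have to
be compared against the delete set.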
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]