flyrain commented on code in PR #4683:
URL: https://github.com/apache/iceberg/pull/4683#discussion_r875368116
##########
core/src/main/java/org/apache/iceberg/deletes/Deletes.java:
##########
@@ -63,14 +65,14 @@ public static <T> CloseableIterable<T> filter(CloseableIterable<T> rows, Functio
     return equalityFilter.filter(rows);
   }
-  public static <T> CloseableIterable<T> filter(CloseableIterable<T> rows, Function<T, Long> rowToPosition,
-                                                PositionDeleteIndex deleteSet) {
-    if (deleteSet.isEmpty()) {
-      return rows;
-    }
-
-    PositionSetDeleteFilter<T> filter = new PositionSetDeleteFilter<>(rowToPosition, deleteSet);
-    return filter.filter(rows);
+  public static <T> CloseableIterable<T> markDeleted(CloseableIterable<T> rows, Predicate<T> isDeleted,
+                                                     Consumer<T> deleteMarker) {
+    return CloseableIterable.transform(rows, row -> {
+      if (isDeleted.test(row)) {
Review Comment:
My point is that a per-row check is unavoidable for non-vectorized reads.
Vectorized reads are a different story: for example, we can generate the
isDelete boolean array for a batch directly from the position deletes.
Of course, there is no way to avoid the per-row check for equality deletes,
with either vectorized or non-vectorized reads.
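
To make the distinction concrete, here is a minimal sketch, not Iceberg's actual implementation. The class and method names (`DeleteMarking`, `markPerRow`, `markBatch`) are hypothetical. It assumes the position deletes are available as a sorted array of distinct file positions and that a vectorized batch covers a contiguous range of positions.

```java
import java.util.Arrays;
import java.util.function.LongPredicate;

// Hypothetical helper, for illustration only; none of these names exist in Iceberg.
final class DeleteMarking {
  private DeleteMarking() {
  }

  // Non-vectorized style: the predicate is evaluated once per row.
  static boolean[] markPerRow(long[] rowPositions, LongPredicate isDeleted) {
    boolean[] flags = new boolean[rowPositions.length];
    for (int i = 0; i < rowPositions.length; i++) {
      flags[i] = isDeleted.test(rowPositions[i]);
    }
    return flags;
  }

  // Vectorized style: build the flag array for the batch covering positions
  // [batchStart, batchStart + batchSize) by walking only the deleted positions
  // that fall inside the batch. Assumes sortedDeletedPositions is sorted ascending.
  static boolean[] markBatch(long batchStart, int batchSize, long[] sortedDeletedPositions) {
    boolean[] flags = new boolean[batchSize];
    long batchEnd = batchStart + batchSize;

    // Find the first deleted position that is >= batchStart.
    int idx = Arrays.binarySearch(sortedDeletedPositions, batchStart);
    if (idx < 0) {
      idx = -idx - 1;
    }

    for (int i = idx; i < sortedDeletedPositions.length && sortedDeletedPositions[i] < batchEnd; i++) {
      flags[(int) (sortedDeletedPositions[i] - batchStart)] = true;
    }
    return flags;
  }
}
```

In this sketch, `markBatch` does work proportional to the number of deletes that land in the batch rather than to the batch size. Equality deletes have no such shortcut, since each row's column values have to be compared against the delete set.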