RussellSpitzer commented on code in PR #4738:
URL: https://github.com/apache/iceberg/pull/4738#discussion_r882202951
##########
core/src/main/java/org/apache/iceberg/actions/RewriteFileGroup.java:
##########
@@ -60,6 +61,18 @@ public Set<DataFile> rewrittenFiles() {
     return fileScans().stream().map(FileScanTask::file).collect(Collectors.toSet());
   }
+  public int rewrittenEqDeletes() {
+    return (int) fileScans().stream().flatMap(f -> f.deletes().stream())
+        .filter(d -> d.content().equals(FileContent.EQUALITY_DELETES))
+        .count();
+  }
+
+  public int rewrittenPosDeletes() {
+    return (int) fileScans().stream().flatMap(f -> f.deletes().stream())
+        .filter(d -> d.content().equals(FileContent.POSITION_DELETES))
+        .count();
+  }
Review Comment:
I think @singhpk234 is right here. The equivalent thing we do in the
rewrittenFiles code is collect into a set and then call size on the set,
so a file that shows up in several scan tasks is only counted once.
Unless we are just counting the raw number of references to delete files.
In that case I'm not sure what the "rewritten" part of the API name would mean.
The other thing that confuses me here is that these files are not actually
rewritten or deleted.
At the moment I think these would be better named
"referencedPositionalDeletes" or something like that.
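
To make the difference concrete, here is a minimal sketch of the two ways of counting. It deliberately uses small stand-in types (`Content`, `DeleteRef`, `ScanTask`, all hypothetical) rather than the real Iceberg classes, and it assumes a delete file can be identified by its path. It is only meant to show that a delete file shared by several scan tasks is counted several times by the stream-and-count approach, and once by the collect-to-set approach.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class DeleteFileCounts {
  // Stand-in for Iceberg's FileContent; hypothetical, for illustration only.
  enum Content { POSITION_DELETES, EQUALITY_DELETES }

  // Stand-in for a delete file, identified here by its path (an assumption).
  static final class DeleteRef {
    final String path;
    final Content content;

    DeleteRef(String path, Content content) {
      this.path = path;
      this.content = content;
    }
  }

  // Stand-in for a scan task: the delete files that apply to one data file.
  static final class ScanTask {
    final List<DeleteRef> deletes;

    ScanTask(DeleteRef... deletes) {
      this.deletes = Arrays.asList(deletes);
    }
  }

  // Counts every reference: a delete file shared by N tasks is counted N times.
  static int referencedDeletes(List<ScanTask> tasks, Content content) {
    return (int) tasks.stream()
        .flatMap(t -> t.deletes.stream())
        .filter(d -> d.content == content)
        .count();
  }

  // Counts each delete file once by collecting the paths into a set first.
  static int distinctDeletes(List<ScanTask> tasks, Content content) {
    return tasks.stream()
        .flatMap(t -> t.deletes.stream())
        .filter(d -> d.content == content)
        .map(d -> d.path)
        .collect(Collectors.toSet())
        .size();
  }

  // Two tasks that share one position delete file and one equality delete file.
  static List<ScanTask> sampleTasks() {
    DeleteRef pos1 = new DeleteRef("pos-1.parquet", Content.POSITION_DELETES);
    DeleteRef pos2 = new DeleteRef("pos-2.parquet", Content.POSITION_DELETES);
    DeleteRef eq1 = new DeleteRef("eq-1.parquet", Content.EQUALITY_DELETES);
    return Arrays.asList(new ScanTask(pos1, eq1), new ScanTask(pos1, pos2, eq1));
  }

  public static void main(String[] args) {
    List<ScanTask> tasks = sampleTasks();
    System.out.println("position delete references: "
        + referencedDeletes(tasks, Content.POSITION_DELETES)); // 3
    System.out.println("distinct position delete files: "
        + distinctDeletes(tasks, Content.POSITION_DELETES)); // 2
  }
}
```

With the sample above, the reference count for position deletes is 3 while the distinct count is 2, which is the gap between what the methods in the diff return and what a "files" count would suggest.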