szehon-ho commented on code in PR #4812:
URL: https://github.com/apache/iceberg/pull/4812#discussion_r922355758
##########
core/src/main/java/org/apache/iceberg/MetadataColumns.java:
##########
@@ -53,6 +53,8 @@ private MetadataColumns() {
public static final String DELETE_FILE_ROW_FIELD_NAME = "row";
public static final int DELETE_FILE_ROW_FIELD_ID = Integer.MAX_VALUE - 103;
public static final String DELETE_FILE_ROW_DOC = "Deleted row values";
+ public static final int POSITION_DELETE_TABLE_PARTITION_FIELD_ID =
Integer.MAX_VALUE - 104;
Review Comment:
Thanks for taking a look. This is getting messier than anticipated.
I spent a lot of time yesterday on the marker FileScanTask approach, and it
does get messy. Adding another constant column (spec_id) to this table is a
good example: not only is it another use of a metadata column to populate it,
but we also have to figure out how to build a residual expression without the
constant column. Otherwise the default file read code filters too aggressively
when there is a spec_id filter (the stats do not exist on the file and the
pruning code skips it). In short, adding any more constant columns to this
table gets messy.
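To make the residual problem concrete, here is a minimal sketch of the idea, assuming the filter is a plain AND of simple predicates. The `Predicate` class and `withoutConstantColumns` helper are hypothetical and are not the Iceberg expression API; `pos` is used only as an example of a column that is really stored in the file. Dropping a conjunct only widens the filter, so the spec_id predicate would still have to be evaluated somewhere else (for example against the constant value at planning time).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical toy model for illustration only, NOT the Iceberg expression API.
public class ResidualSketch {

  // One simple predicate such as "spec_id = 1", kept as strings for brevity.
  static final class Predicate {
    final String column;
    final String op;
    final String value;

    Predicate(String column, String op, String value) {
      this.column = column;
      this.op = op;
      this.value = value;
    }

    @Override
    public String toString() {
      return column + " " + op + " " + value;
    }
  }

  // Treats the filter as an AND of predicates and drops every predicate that
  // references a constant column, so the file reader never sees it.
  // Assumption: the dropped predicates are checked elsewhere by the caller.
  static List<Predicate> withoutConstantColumns(
      List<Predicate> conjuncts, Set<String> constantColumns) {
    List<Predicate> residual = new ArrayList<>();
    for (Predicate p : conjuncts) {
      if (!constantColumns.contains(p.column)) {
        residual.add(p);
      }
    }
    return residual;
  }

  public static void main(String[] args) {
    List<Predicate> filter = List.of(
        new Predicate("spec_id", "=", "1"),
        new Predicate("pos", ">", "100"));

    List<Predicate> residual =
        withoutConstantColumns(filter, Set.of("spec_id", "partition"));

    System.out.println(residual); // prints [pos > 100]
  }
}
```

A real implementation would need to handle OR and NOT as well, where simply dropping a predicate is not safe; this sketch deliberately covers only the conjunction case.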
It makes sense to explore using some kind of StaticDataTask. I think the
problem is that it currently would not do split(), nor any file residual
filtering on position-delete row values if there are filters on them, nor
vectorization, so we would be re-implementing these if we want them. But
maybe I am over-optimizing, and in the first cut we can live without those
and add them later. I can look at this approach and see if I hit any major
blockers.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]