rdsr opened a new issue #765: [Incremental Scan] Follow ups for #315
URL: https://github.com/apache/incubator-iceberg/issues/765
There are 3 follow-ups from #315 that need to be addressed:
Caching residual evaluators by spec ID
Comment from #315
> Minor for follow-up: it would be nice to cache residual evaluators by spec
> ID so that we don't create one per manifest.
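One way to read this suggestion is a map keyed by partition spec ID that builds each evaluator lazily and reuses it for every manifest written with that spec. The following is a minimal sketch under that assumption; `PerSpecCache` and the caller-supplied factory are hypothetical names for illustration, not Iceberg's actual classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntFunction;

// Hypothetical helper: caches one value (e.g. a residual evaluator) per spec ID.
final class PerSpecCache<T> {
  private final Map<Integer, T> bySpecId = new ConcurrentHashMap<>();
  private final IntFunction<T> factory;

  // factory is an assumed callback that builds the evaluator for a given spec ID.
  PerSpecCache(IntFunction<T> factory) {
    this.factory = factory;
  }

  // Returns the cached value for specId, building it on first use only.
  T get(int specId) {
    return bySpecId.computeIfAbsent(specId, factory::apply);
  }

  // Number of distinct spec IDs seen so far.
  int size() {
    return bySpecId.size();
  }
}
```

With this shape, planning code would call `get(manifestSpecId)` once per manifest, and the factory would run only once per distinct spec ID.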
Refinement and validation for `appendsBetween` API
Comment from #315
> Should this ensure that newFromSnapshot is an ancestor of newToSnapshot?
> This probably doesn't need to be done in this commit, but it would be a
> good follow-up to ensure that the range exists.
>
> Since this is a refinement, it may also be a good idea to make this a
> subset of the existing selected range. That is, both newFromSnapshotId and
> newToSnapshotId must be in the existing range of fromSnapshotId to toSnapshotId.
>
> Putting it another way, when I create a scan using appendsBetween(A,
> C).appendsBetween(B, D), what should the behavior be? I'd say that is
> concerning because D is outside the original range. Probably a good idea to
> fail instead.
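The two checks discussed above (the new range must exist, and it must sit inside the already selected range) could look roughly like the sketch below. It assumes snapshot lineage is available as a map from snapshot ID to parent ID and that the lineage has no cycles; `AppendRangeChecks`, `isAncestorOrSelf`, and `validateRefinement` are hypothetical names, not existing Iceberg methods.

```java
import java.util.Map;

// Hypothetical validation helpers for refining an appendsBetween range.
final class AppendRangeChecks {
  private AppendRangeChecks() {}

  // True if ancestorId equals snapshotId or is reachable by following parent links.
  // parentOf maps a snapshot ID to its parent ID; a root snapshot has no entry.
  static boolean isAncestorOrSelf(Map<Long, Long> parentOf, long ancestorId, long snapshotId) {
    Long current = snapshotId;
    while (current != null) {
      if (current == ancestorId) {
        return true;
      }
      current = parentOf.get(current);
    }
    return false;
  }

  // Fails unless (newFromId, newToId] is a real range inside (fromId, toId].
  static void validateRefinement(
      Map<Long, Long> parentOf, long fromId, long toId, long newFromId, long newToId) {
    check(newFromId != newToId && isAncestorOrSelf(parentOf, newFromId, newToId),
        "new from-snapshot is not an ancestor of new to-snapshot");
    check(isAncestorOrSelf(parentOf, fromId, newFromId),
        "new from-snapshot is before the existing range");
    check(isAncestorOrSelf(parentOf, newToId, toId),
        "new to-snapshot is after the existing range");
  }

  private static void check(boolean condition, String message) {
    if (!condition) {
      throw new IllegalArgumentException(message);
    }
  }
}
```

For a linear history A, B, C, D, refining (A, C) to (B, C) passes under this sketch, while refining (A, C) to (B, D) throws because D is not an ancestor of C.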
Use a single ManifestGroup for doing planning in a single run
Comment from #315
> As a follow-up, we can change the logic slightly to only require one
> ManifestGroup. As long as we read each manifest that was added in a selected
> snapshot only once and only select ADDED files with the right snapshot ID, we
> can do the planning in a single run.
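The single-pass idea in the comment above can be illustrated with simplified stand-in types. This is only a sketch: `Manifest`, `Entry`, `Status`, and `planAppendedFiles` are hypothetical models of the concepts named in the comment, not the real ManifestGroup API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of manifests and their entries.
final class SinglePassPlanning {
  enum Status { EXISTING, ADDED, DELETED }

  static final class Entry {
    final Status status;
    final long snapshotId;
    final String filePath;

    Entry(Status status, long snapshotId, String filePath) {
      this.status = status;
      this.snapshotId = snapshotId;
      this.filePath = filePath;
    }
  }

  static final class Manifest {
    final String path;
    final long addedSnapshotId;
    final List<Entry> entries;

    Manifest(String path, long addedSnapshotId, List<Entry> entries) {
      this.path = path;
      this.addedSnapshotId = addedSnapshotId;
      this.entries = entries;
    }
  }

  // One pass: read each manifest added in a selected snapshot once, and keep
  // only ADDED entries whose snapshot ID is in the selected set.
  static List<String> planAppendedFiles(List<Manifest> manifests, Set<Long> selectedSnapshotIds) {
    Set<String> seenManifests = new HashSet<>();
    List<String> files = new ArrayList<>();
    for (Manifest manifest : manifests) {
      boolean selected = selectedSnapshotIds.contains(manifest.addedSnapshotId);
      if (!selected || !seenManifests.add(manifest.path)) {
        continue; // not in range, or already read
      }
      for (Entry entry : manifest.entries) {
        if (entry.status == Status.ADDED && selectedSnapshotIds.contains(entry.snapshotId)) {
          files.add(entry.filePath);
        }
      }
    }
    return files;
  }
}
```

The set of seen manifest paths stands in for "read each manifest only once"; the entry filter stands in for "only select ADDED files with the right snapshot ID".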