szehon-ho opened a new issue #3211:
URL: https://github.com/apache/iceberg/issues/3211


   Some tables end up in a bad state with tens of thousands of 
ManifestFiles (imagine a table that is the target of a Flink ingest job, for 
example).  RewriteManifests, even with a very restrictive predicate on the 
manifests, then cannot finish in a reasonable time.
   
   This is because all Manifests have to be read in order to evaluate the 
Predicate.
   
   So I propose adding a rewriteIf(Expression filter) to the RewriteManifests 
API as an alternative to rewriteIf(Predicate<ManifestFile>).  The expression 
can then be pushed down into the ManifestGroup, reducing the number of 
ManifestFiles that have to be read in order to be compacted.
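
   To illustrate the kind of pruning a pushed-down expression would allow, 
here is a minimal, stdlib-only sketch.  It does not use the Iceberg API: 
ManifestSummary, mayMatch, and the single int partition range are hypothetical 
stand-ins for the partition summaries kept in the manifest list.  The point is 
only that a range filter can discard manifests from their summaries alone, 
before any manifest is opened.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ManifestPruningSketch {
    // Hypothetical stand-in for a manifest list entry: the manifest path plus
    // the min/max of one int partition field across the files it tracks.
    static final class ManifestSummary {
        final String path;
        final int lower;
        final int upper;

        ManifestSummary(String path, int lower, int upper) {
            this.path = path;
            this.lower = lower;
            this.upper = upper;
        }
    }

    // Keeps only manifests whose [lower, upper] range overlaps [from, to].
    // Manifests outside the range are skipped without being read.
    static List<ManifestSummary> mayMatch(List<ManifestSummary> manifests, int from, int to) {
        List<ManifestSummary> selected = new ArrayList<>();
        for (ManifestSummary m : manifests) {
            if (m.upper >= from && m.lower <= to) {
                selected.add(m);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<ManifestSummary> manifests = Arrays.asList(
            new ManifestSummary("m1.avro", 1, 10),
            new ManifestSummary("m2.avro", 11, 20),
            new ManifestSummary("m3.avro", 18, 30));

        // Only m2.avro and m3.avro can contain partition values in [19, 25].
        for (ManifestSummary m : mayMatch(manifests, 19, 25)) {
            System.out.println(m.path);
        }
    }
}
```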
   
   Also, it is more user-friendly.  Filtering ManifestFiles with the 
predicate using the ManifestFile.PartitionFieldSummary() is not user-friendly 
at all, as its upper and lower bounds are in byte format.
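
   As a rough illustration of why byte-format bounds are awkward, the sketch 
below shows the decoding a caller has to do by hand before a bound can be 
compared.  It is stdlib-only and assumes the bound belongs to an int partition 
field serialized as 4 little-endian bytes; encodeIntBound and decodeIntBound 
are hypothetical helpers, not Iceberg methods.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class BoundDecodingSketch {
    // Builds a bound the way this sketch assumes it is stored:
    // an int written as 4 little-endian bytes.
    static ByteBuffer encodeIntBound(int value) {
        // Absolute put, so the buffer position stays at 0.
        return ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(0, value);
    }

    // Decodes an int bound. duplicate() leaves the caller's buffer position
    // untouched; the byte order must be set explicitly on the duplicate.
    static int decodeIntBound(ByteBuffer bound) {
        return bound.duplicate().order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        ByteBuffer lowerBound = encodeIntBound(20210924);

        // A Predicate over manifests would need this decoding step, plus
        // knowledge of the field type, for every bound it inspects.
        int lower = decodeIntBound(lowerBound);
        System.out.println(lower >= 20210901);
    }
}
```

   With an Expression-based filter, the type lookup and decoding would be 
handled by the library instead of by each caller.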


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
