Zoltán Borók-Nagy created IMPALA-11591:
------------------------------------------

             Summary: Avoid calling planFiles() on Iceberg tables when there 
are no predicates
                 Key: IMPALA-11591
                 URL: https://issues.apache.org/jira/browse/IMPALA-11591
             Project: IMPALA
          Issue Type: Improvement
            Reporter: Zoltán Borók-Nagy


Currently we always invoke Iceberg's planFiles() API for creating Iceberg scans.

When there are no predicates (and no time travel) on the table, we could avoid 
that call because we already cache everything we need (schema, partition 
information, file descriptors).
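
The condition above can be sketched as a small decision function. This is only an illustration in Python, not Impala code: ScanRequest, can_plan_from_cache and choose_planning_path are hypothetical names, and the real planner would work on its own scan node and table objects.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ScanRequest:
    """Hypothetical description of a scan; not an actual Impala class."""
    predicates: List[str] = field(default_factory=list)
    # None means "read the current snapshot" (no time travel).
    time_travel_snapshot_id: Optional[int] = None


def can_plan_from_cache(req: ScanRequest) -> bool:
    """True when the scan has no predicates and no time travel."""
    return not req.predicates and req.time_travel_snapshot_id is None


def choose_planning_path(req: ScanRequest) -> str:
    """Return a label for the planning path this sketch would take."""
    if can_plan_from_cache(req):
        return "cached_file_descriptors"
    return "iceberg_plan_files"
```

With this sketch, a plain full-table scan takes the cached path, while any predicate or time-travel clause falls back to planFiles().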

We can also consider only pushing down predicates if at least one of the 
predicates refers to a partition column. Otherwise it's possible that the 
overhead of reading, decoding, and evaluating all the manifest files is too large.
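
A minimal sketch of that check, assuming each predicate has already been reduced to the set of column names it references. The function name and the input shape are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Set


def should_push_down(predicate_columns: Iterable[Set[str]],
                     partition_columns: Set[str]) -> bool:
    """Push predicates down only if at least one of them references
    a partition column.

    predicate_columns holds one set of referenced column names per
    predicate (an assumed, simplified representation).
    """
    return any(not cols.isdisjoint(partition_columns)
               for cols in predicate_columns)
```

For example, with partition column "dt", the predicates "dt = '2022-09-01' AND id > 5" would qualify for pushdown, while "id > 5" alone would not.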

I think the change should be fairly simple; we just need to take care of two things:
 * Store delete files separately, so we can still do the V2 scans from cache.
 * During time travel we also cache old file descriptors, so we need to 
separate them from the current snapshot's file descriptors.
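
One possible shape for such a cache is sketched below. It is an assumption-laden illustration, not the actual Impala data structure: FileDescriptor and IcebergFileCache here are hypothetical Python stand-ins that keep current data files, current delete files, and files only seen through time travel in three separate maps.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class FileDescriptor:
    """Hypothetical, simplified file descriptor."""
    path: str
    is_delete_file: bool = False


@dataclass
class IcebergFileCache:
    """Hypothetical cache that keeps three groups of files apart."""
    data_files: Dict[str, FileDescriptor] = field(default_factory=dict)
    delete_files: Dict[str, FileDescriptor] = field(default_factory=dict)
    # Files that belong only to older snapshots (loaded for time travel).
    old_files: Dict[str, FileDescriptor] = field(default_factory=dict)

    def add(self, fd: FileDescriptor, in_current_snapshot: bool) -> None:
        """File the descriptor under the group it belongs to."""
        if not in_current_snapshot:
            self.old_files[fd.path] = fd
        elif fd.is_delete_file:
            self.delete_files[fd.path] = fd
        else:
            self.data_files[fd.path] = fd

    def files_for_current_scan(
            self) -> Tuple[List[FileDescriptor], List[FileDescriptor]]:
        """Return (data files, delete files) of the current snapshot only."""
        return list(self.data_files.values()), list(self.delete_files.values())
```

Keeping the groups apart means a predicate-free V2 scan can be built from the first two maps alone, without old time-travel entries leaking into the result.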



