szehon-ho commented on issue #2326:
URL: https://github.com/apache/iceberg/issues/2326#issuecomment-798797152


   Nice, I think aggregate pushdown could make many queries in Iceberg faster 
overall, given how much metadata we have, if we are OK with answering queries 
from it.
   
   I think #2182 is interesting, but it may not be worth the cost if it adds 
a lot more time to each commit.
   
   @aokolnychyi  I tried running the equivalent query on the files table today, 
and it's much faster (~a minute vs. tens of minutes).  I even added a distinct 
at the end and it does not add much time.  It's a shame that users try the 
partitions table first rather than the files table.  I guess there's not much 
we can do unless we have this support?
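
   To illustrate what "equivalent query on the files table" means here, below is a minimal sketch in plain Python, not Iceberg or Spark code. It assumes each files-table row carries a `partition` struct and a `record_count`; the sample rows and the helper name `partitions_from_files` are hypothetical. It groups file rows by partition value to get per-partition totals, which is roughly the information the partitions table reports.

```python
from collections import defaultdict

# Hypothetical rows, shaped like a few columns of the files metadata table.
files = [
    {"file_path": "f1.parquet", "partition": {"part_field": "a"}, "record_count": 10},
    {"file_path": "f2.parquet", "partition": {"part_field": "a"}, "record_count": 5},
    {"file_path": "f3.parquet", "partition": {"part_field": "b"}, "record_count": 7},
]


def partitions_from_files(file_rows):
    """Group file rows by partition value and sum record and file counts."""
    totals = defaultdict(lambda: {"record_count": 0, "file_count": 0})
    for row in file_rows:
        # A partition struct becomes a hashable key of (field, value) pairs.
        key = tuple(sorted(row["partition"].items()))
        totals[key]["record_count"] += row["record_count"]
        totals[key]["file_count"] += 1
    return {key: dict(stats) for key, stats in totals.items()}


summary = partitions_from_files(files)
```

   The keys of `summary` are the distinct partition values, so adding a distinct on top is cheap once the file rows have been read.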
   
   By the way, as Russell pointed out to me, I was looking at making an 
improvement by adding predicate pushdown using ManifestEvaluator to filter out 
manifest files, since the manifest list has each manifest file's partition 
min/max.  If I understand correctly, it requires converting a filter on the 
partitions table (partition.part_field = x) to a ManifestGroup "partition 
filter" (part_field = x).  Now I wonder whether this functionality will be 
compatible with later rewriting the partitions table as a view of the files 
table (I guess at that point we would make the equivalent pushdown on the 
files table using the 'partition.x' field, which is not done there today 
either).
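
   The idea can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the ManifestEvaluator or ManifestGroup API: the dictionaries standing in for manifest-list entries, the `bounds` key, and the helper names are all hypothetical. It strips the `partition.` prefix from the filter column, then keeps only manifests whose min/max range for that field could contain the value.

```python
PREFIX = "partition."


def to_partition_filter(column, value):
    """Rewrite a filter on 'partition.part_field' into one on 'part_field'."""
    if not column.startswith(PREFIX):
        raise ValueError(f"not a partition column: {column}")
    return column[len(PREFIX):], value


def may_contain(manifest, field, value):
    """Return True unless the manifest's min/max rules the value out."""
    bounds = manifest["bounds"].get(field)
    if bounds is None:
        return True  # no stats for this field, so the manifest cannot be pruned
    lower, upper = bounds
    return lower <= value <= upper


def prune_manifests(manifests, column, value):
    """Apply an equality filter and return the paths of manifests still to read."""
    field, value = to_partition_filter(column, value)
    return [m["path"] for m in manifests if may_contain(m, field, value)]


# Hypothetical manifest-list entries with per-field (min, max) partition bounds.
manifests = [
    {"path": "m1.avro", "bounds": {"part_field": (1, 10)}},
    {"path": "m2.avro", "bounds": {"part_field": (11, 20)}},
    {"path": "m3.avro", "bounds": {}},
]

kept = prune_manifests(manifests, "partition.part_field", 12)
```

   Here `m1.avro` is skipped because 12 falls outside its range, while `m3.avro` is kept because it has no bounds to test against.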
   
   
   




