[GitHub] [iceberg] aokolnychyi edited a comment on issue #2326: Partition Table Performance

GitBox Fri, 12 Mar 2021 11:43:24 -0800


aokolnychyi edited a comment on issue #2326:
URL: https://github.com/apache/iceberg/issues/2326#issuecomment-797711579



   Well, it was expected that the partitions table would not perform well when 
a lot of metadata is present. I left 
[this](https://github.com/apache/iceberg/pull/655#discussion_r347950984) 
comment on the original PR.
   
   Doing an aggregate on top of the files metadata table in Spark should be much 
faster, though not instant. Have you tried that as a temporary solution, 
@szehon-ho? I am curious how the performance compares. You can tune the 
input split size for metadata tables to get reasonable parallelism.
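   
   A rough sketch of what that aggregate could look like from PySpark, not a 
tested recipe. The table name `db.events` and the helper names are hypothetical. 
The `partition` and `record_count` columns are assumed to be exposed by the 
`files` metadata table, and `read.split.metadata-target-size` is assumed to be 
the table property that controls the metadata split size in your Iceberg 
version, so please check both before relying on this.

```python
def partition_stats_query(table: str) -> str:
    """Build a SQL aggregate over the `files` metadata table of `table`.

    Assumes the metadata table exposes `partition` and `record_count`.
    """
    return (
        "SELECT partition, "
        "COUNT(*) AS file_count, "
        "SUM(record_count) AS record_count "
        f"FROM {table}.files "
        "GROUP BY partition"
    )


def metadata_split_size_ddl(table: str, size_bytes: int) -> str:
    """Build DDL that sets the metadata split size (assumed property name)."""
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES "
        f"('read.split.metadata-target-size' = '{size_bytes}')"
    )


def partition_stats(spark, table: str):
    """Run the aggregate on an existing SparkSession; returns a DataFrame."""
    return spark.sql(partition_stats_query(table))
```

   With a Spark session that has an Iceberg catalog configured, 
`partition_stats(spark, "db.events").show()` would print one row per partition. 
A smaller split size means more tasks reading manifests in parallel, at the 
cost of more scheduling overhead.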
   
   There is also an interesting proposal in #2182. Can we potentially benefit 
from it here?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
