RussellSpitzer commented on issue #3494: URL: https://github.com/apache/iceberg/issues/3494#issuecomment-1638755071
In Iceberg, partitions are really just tuples associated with file metadata, so the number of partitions doesn't matter except insofar as it determines file size. In general, you want files between 128 and 512 MB (or some multiple of your row group size). The cost of opening a file differs between file systems, and some perform better with smaller files while others do better with larger ones. Those partition tuples are used during queries, though, so I think you generally want the most granular partitioning you can get, so that you can support more types of queries. Like I mentioned above, you can get similar effects with sorting, although that tends to be a little more expensive at write time, or by doing sort optimizes.
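
To make the file-size point concrete, here is a rough back-of-the-envelope sketch. It is not Iceberg code: the helper names are hypothetical, and it assumes data is spread evenly with one file per partition by default, which real writers won't match exactly. It only shows how partition count drives average file size for a fixed amount of data.

```python
# Hypothetical helpers for illustration only; nothing here is an Iceberg API.
MB = 1024 * 1024


def avg_file_size_mb(total_bytes: int, num_partitions: int,
                     files_per_partition: int = 1) -> float:
    """Average file size in MB, assuming data is spread evenly (an assumption)."""
    if num_partitions <= 0 or files_per_partition <= 0:
        raise ValueError("partition and file counts must be positive")
    return total_bytes / (num_partitions * files_per_partition) / MB


def in_target_range(size_mb: float, low_mb: float = 128,
                    high_mb: float = 512) -> bool:
    """True if the size falls in the 128-512 MB range suggested above."""
    return low_mb <= size_mb <= high_mb


if __name__ == "__main__":
    total = 1024 * 1024 * MB  # 1 TiB of data, an arbitrary example size
    for parts in (100, 4096, 100_000):
        size = avg_file_size_mb(total, parts)
        print(f"{parts} partitions: ~{size:.1f} MB per file, "
              f"in range: {in_target_range(size)}")
```

Under these assumptions, going more granular is fine right up until the per-partition files drop below the target range. Past that point, sorting within coarser partitions is the alternative.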
