BsoBird commented on issue #7406: URL: https://github.com/apache/iceberg/issues/7406#issuecomment-1527025476
@bharos The main problem is that the more partitions a table has, the slower the writes become and the more likely the job is to fail. If you process the data with a non-ACID job, you may even lose data. So I don't want too many partitions. For example, I have a 10 TB table: without partitioning, rewriting the whole table takes 40 minutes, but with year+month partitioning a single rewrite takes 8-10 hours. In addition, if I don't redistribute the data before writing, the job fails immediately when it writes to a large number of partitions.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
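To illustrate the redistribution point in the comment above, here is a rough sketch of a simplified model. It is not Iceberg's or Spark's actual writer logic, and the function name `open_writers`, the task count, and the row count are all hypothetical. The model assumes that each write task needs one output file per distinct partition value it sees, and compares a round-robin row layout with one where rows are first grouped by partition key.

```python
import random
from collections import defaultdict


def open_writers(rows, num_tasks, redistribute):
    """Count (task, partition) pairs; each pair stands for one output file.

    Simplified, hypothetical model: not the real Iceberg writer.
    """
    seen = defaultdict(set)
    for i, partition in enumerate(rows):
        if redistribute:
            # Group rows by partition key so one task owns each partition.
            task = hash(partition) % num_tasks
        else:
            # No redistribution: rows land on tasks in arrival order.
            task = i % num_tasks
        seen[task].add(partition)
    return sum(len(parts) for parts in seen.values())


# Ten years of year+month partitions (assumed range, for illustration only).
partitions = [(year, month) for year in range(2014, 2024) for month in range(1, 13)]
rng = random.Random(0)
rows = [rng.choice(partitions) for _ in range(100_000)]

files_without = open_writers(rows, num_tasks=200, redistribute=False)
files_with = open_writers(rows, num_tasks=200, redistribute=True)
print(files_without, files_with)
```

In this model, redistributing by partition key means each partition is written by exactly one task, so the file count equals the number of distinct partitions. Without it, nearly every task touches nearly every partition, and the file count grows with tasks times partitions.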
