pvary commented on issue #2723: URL: https://github.com/apache/iceberg/issues/2723#issuecomment-867438599
I am really interested in everyone's thoughts on this.

My main concern with Iceberg is that it has an inherent bottleneck for writes. As mentioned before, a commit consists of 2 steps:
1. Write the snapshot JSON (its size increases with every commit)
2. Commit the write using the catalog

Even though we can write the snapshot JSON files concurrently, when there are concurrent writes we still have to throw away and rewrite the snapshot JSON for the 2nd commit, so the commit is effectively bottlenecked by the 2 steps above.

I like Iceberg a lot, but based on this, Hive ACID tables are still superior in use cases where there are multiple concurrent writes. Having the option to use the "change set approach" might alleviate this bottleneck.
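
To make the two steps concrete, here is a minimal sketch of the retry loop I have in mind. It is a toy model, not Iceberg code: the `Catalog` class, `write_metadata`, and `commit` are hypothetical names, and the catalog is assumed to offer an atomic compare-and-swap on a single metadata pointer. Two writers start from the same table state, and the one that loses the race has to redo step 1.

```python
import json


class Catalog:
    """Hypothetical in-memory catalog holding one pointer to the current metadata."""

    def __init__(self):
        self.current = {"version": 0, "snapshots": []}

    def compare_and_swap(self, expected_version, new_metadata):
        # Step 2: the commit succeeds only if nobody else committed in between.
        if self.current["version"] != expected_version:
            return False
        self.current = new_metadata
        return True


def write_metadata(base, snapshot_id):
    # Step 1: the new metadata repeats every earlier snapshot, so it grows per commit.
    new = {
        "version": base["version"] + 1,
        "snapshots": base["snapshots"] + [snapshot_id],
    }
    json.dumps(new)  # stand-in for serializing and uploading the JSON file
    return new


def commit(catalog, snapshot_id, stale_base=None):
    """Retry until the swap succeeds; return how many times metadata was written."""
    base = stale_base if stale_base is not None else catalog.current
    writes = 0
    while True:
        new = write_metadata(base, snapshot_id)
        writes += 1
        if catalog.compare_and_swap(base["version"], new):
            return writes
        # Lost the race: discard the file and rewrite it against the new base.
        base = catalog.current


catalog = Catalog()
shared_base = catalog.current  # both writers start from the same table state
writes_a = commit(catalog, "snap-a", shared_base)
writes_b = commit(catalog, "snap-b", shared_base)
print(writes_a, writes_b, catalog.current["snapshots"])
```

In this model the first writer needs one metadata write and the second needs two, and the wasted work only grows as the metadata gets larger and more writers compete.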