[GitHub] [iceberg] pvary commented on issue #2723: Support for low latency writes to iceberg table

GitBox Thu, 24 Jun 2021 01:17:26 -0700


pvary commented on issue #2723:
URL: https://github.com/apache/iceberg/issues/2723#issuecomment-867438599



   I am really interested in everyone's thoughts on this.
   My main concern with Iceberg is that we have an inherent bottleneck for
writes. As mentioned before, a commit consists of 2 steps:
   1. Write the snapshot JSON - its size increases with every commit
   2. Commit the write using the catalog
   
   Even though we can write the snapshot JSONs concurrently, if there are
concurrent writes we still have to throw away and rewrite the snapshot JSON for
the 2nd commit, so basically every commit is bottlenecked by the 2 steps above.
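
   To make the retry cost concrete, here is a minimal sketch of the two
steps under optimistic concurrency. It is a toy in-memory simulation, not
Iceberg's actual API: ToyCatalog, build_metadata, commit and the
"interfering" parameter are hypothetical names, and the competing writers
are simulated in-process rather than run as real threads.

```python
import json
from dataclasses import dataclass


@dataclass
class ToyCatalog:
    """Hypothetical in-memory catalog holding a single metadata pointer."""
    version: int = 0
    metadata: str = json.dumps({"snapshots": []})

    def compare_and_swap(self, expected_version: int, new_metadata: str) -> bool:
        # Step 2: the swap succeeds only if nobody committed since we read.
        if self.version != expected_version:
            return False
        self.version += 1
        self.metadata = new_metadata
        return True


def build_metadata(base_metadata: str, change: str) -> str:
    # Step 1: write a full JSON document listing every snapshot so far,
    # so the document grows with each commit.
    doc = json.loads(base_metadata)
    doc["snapshots"].append(change)
    return json.dumps(doc)


def commit(catalog: ToyCatalog, change: str, interfering=()) -> int:
    """Commit `change`, retrying on conflict; return the number of JSON writes."""
    pending = list(interfering)
    writes = 0
    while True:
        base_version, base_metadata = catalog.version, catalog.metadata
        candidate = build_metadata(base_metadata, change)
        writes += 1
        # Simulate another writer winning the race before our swap.
        if pending:
            other = pending.pop(0)
            catalog.compare_and_swap(
                catalog.version, build_metadata(catalog.metadata, other)
            )
        if catalog.compare_and_swap(base_version, candidate):
            return writes
        # Conflict: our JSON is stale, so it is thrown away and rebuilt.


catalog = ToyCatalog()
uncontended = commit(catalog, "a")
contended = commit(catalog, "b", interfering=["x", "y"])
```

   In this sketch the uncontended commit needs one JSON write, while the
commit that loses two races has to rebuild its JSON twice before the swap
goes through, and each rebuild serializes a longer snapshot list.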
   
   I like Iceberg a lot, but based on this, Hive ACID tables are still superior
in use cases with multiple concurrent writes. Having the option to use a
"change set approach" might alleviate this bottleneck.


