jackye1995 commented on issue #2723: URL: https://github.com/apache/iceberg/issues/2723#issuecomment-867133182
Adding responses from Slack:

@SreeramGarlapati: In our streaming write tests, we are seeing the metadata.json file grow to 3 MB within a couple of hours of streaming writes (at a rate of 100 events per second). At that point, each write takes several seconds.

@jackye1995: I think I replied on the dev mailing list about this; please let me know if that reply is missing anything. For streaming, my understanding is that Iceberg's expected streaming commit rate is about one commit every few seconds; 100 events per second sounds like too much for its usage. But maybe I am just underestimating its potential. Also, if you commit more often, you should expire old snapshots more often, especially in the streaming use case.

@SreeramGarlapati: One concern with expiring snapshots more often is that we would lose the ability to stream incrementally (like we do here). We want to be able to stream incrementally through the snapshots for at least ~1 day. As for your suggestion on commit rate: the rate we were using is 1 commit per second (and, as time progresses and each commit takes a bit longer, 1 commit every several seconds). It is just that each commit contains several events, which could span a couple of partitions.
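
To make the trade-off above concrete, here is a rough back-of-the-envelope sketch in Python. It is only a model, not Iceberg code: the function names `snapshots_retained` and `split_by_cutoff` are hypothetical, and it assumes one snapshot per commit at a fixed interval. It shows how many snapshots stay in table metadata for a given commit interval and retention window, and how a time-based cutoff would divide snapshots into those to expire and those to keep for incremental reads.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def snapshots_retained(commit_interval_s: float, retention_s: float) -> int:
    """Snapshots kept in metadata, assuming one snapshot per commit at a
    fixed interval and a fixed retention window (a simplifying assumption)."""
    if commit_interval_s <= 0:
        raise ValueError("commit_interval_s must be positive")
    return int(retention_s // commit_interval_s)


def split_by_cutoff(snapshot_ts_ms, now_ms, retention_ms):
    """Partition snapshot timestamps into (expire, keep) around the
    cutoff now_ms - retention_ms. Hypothetical helper, not an Iceberg API."""
    cutoff = now_ms - retention_ms
    expire = [t for t in snapshot_ts_ms if t < cutoff]
    keep = [t for t in snapshot_ts_ms if t >= cutoff]
    return expire, keep


if __name__ == "__main__":
    # 1 commit per second with a 1-day incremental-read window.
    print(snapshots_retained(1, SECONDS_PER_DAY))
    # 1 commit every 5 seconds with the same window.
    print(snapshots_retained(5, SECONDS_PER_DAY))
```

Under these assumptions, keeping one day of history at 1 commit per second means tens of thousands of live snapshots, while a commit every few seconds cuts that count proportionally. In Iceberg's Java API, the comparable time-based cutoff is the timestamp passed to `ExpireSnapshots.expireOlderThan`.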