Re: Micro-Batch Streaming

2019-05-07 Thread Anton Okolnychyi
> I'm reluctant to do this without an explicit call from the user or in a service. The problem is when to expire snapshots. Iceberg is called regularly to read and write tables. That might seem like a good time to expire snapshots, but it doesn't make sense for either one to have a side ef

Re: Micro-Batch Streaming

2019-05-06 Thread Ryan Blue
Replies inline. On Mon, May 6, 2019 at 3:01 PM Anton Okolnychyi wrote: > I am also wondering whether it makes sense to have a config that limits the number of snapshots we want to track. This config can be based on the number of snapshots (e.g. keep only 1 snapshot) or based on time (e.g

Re: Micro-Batch Streaming

2019-05-06 Thread Anton Okolnychyi
That’s good news, Ryan. Your observations are also aligned with some benchmarks I performed earlier. I am also wondering whether it makes sense to have a config that limits the number of snapshots we want to track. This config can be based on the number of snapshots (e.g. keep only 1 snapsho
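To make the proposal above concrete, here is a rough sketch of a retention rule that limits tracked snapshots either by count or by age. It is only an illustration of the idea, not Iceberg's API: `Snapshot`, `snapshots_to_expire`, `max_count`, and `max_age_ms` are hypothetical names, and always keeping the newest snapshot is an assumption made for the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical stand-in for a table snapshot; not an Iceberg class.
    snapshot_id: int
    timestamp_ms: int


def snapshots_to_expire(
    snapshots: List[Snapshot],
    now_ms: int,
    max_count: Optional[int] = None,
    max_age_ms: Optional[int] = None,
) -> List[Snapshot]:
    """Return the snapshots a count-based or time-based limit would drop.

    A snapshot is selected if it falls outside the newest `max_count`
    snapshots, or if it is older than `max_age_ms`. The newest snapshot
    is never selected (assumption: the current table state must survive).
    """
    # Newest first, so the list position doubles as "how many are newer".
    ordered = sorted(snapshots, key=lambda s: s.timestamp_ms, reverse=True)
    expired = []
    for position, snap in enumerate(ordered):
        if position == 0:
            continue  # always keep the current snapshot
        too_many = max_count is not None and position >= max_count
        too_old = max_age_ms is not None and now_ms - snap.timestamp_ms > max_age_ms
        if too_many or too_old:
            expired.append(snap)
    return expired


# Three commits, 10 minutes apart.
history = [Snapshot(1, 0), Snapshot(2, 600_000), Snapshot(3, 1_200_000)]
by_count = snapshots_to_expire(history, now_ms=1_200_000, max_count=1)
by_age = snapshots_to_expire(history, now_ms=1_200_000, max_age_ms=600_000)
```

With `max_count=1` everything except the latest commit is selected. With a 10-minute `max_age_ms` only the oldest commit is selected. Nothing is selected when neither limit is set.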

Re: Micro-Batch Streaming

2019-05-06 Thread Ryan Blue
We've been building pipelines that write to Iceberg tables from Flink. Right now, we have applications deployed across 3 AWS regions and have them committing every 10 minutes. We also have an application that monitors the tables and moves files from remote regions into the region where we run our H