Samrat002 commented on PR #28268: URL: https://github.com/apache/flink/pull/28268#issuecomment-4633923081
> > Yes, you are right. That was initially discussed. What we are observing at production scale is that users don't really set policies. Billions of MPUs accumulate, leading to high cost.
>
> Could you please elaborate on how storing subparts in the state is linked to the billing problem? Aren't aborted MPUs introducing all the same dangling S3 objects?
>
> My general question was more about why we store subparts as separate tail files on S3 to resume from. Are they as good as the inline Flink state in terms of data corruption risks?

My bad, I misunderstood and conflated two different things.

Two reasons I went with S3 objects over inlining in state:

1. Checkpoint cost. A tail can be up to the part size, which is 5 MiB or more and often larger. Inlining one per writer per checkpoint inflates the checkpoint payload that goes through the JM/state backend. At scale that's a real cost vs. a single S3 PUT.
2. Durability is the same. The state backend is usually S3 too, so a tail object gets the same 11 nines of durability either way. Keeping the bytes in state doesn't strengthen the guarantee, it just shifts where the bytes live.

On lifecycle: tail objects live under deterministic keys we own, and deletion is driven by our commit/abort/recovery path, not by a bucket lifecycle policy. So cleanup is as predictable as state GC, without the checkpoint-size hit.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
