infoverload commented on code in PR #516:
URL: https://github.com/apache/flink-web/pull/516#discussion_r864760968
##########
_posts/2022-04-01-tidying-snapshots-up.md:
##########
@@ -0,0 +1,213 @@
+---
+layout: post
+title: "Tidying up snapshots"
+date: 2022-04-01T00:00:00.000Z
+authors:
+- dwysakowicz:
+ name: "Dawid Wysakowicz"
+ twitter: "dwysakowicz"
+
+excerpt: TODO
+
+---
+
+{% toc %}
+
+Over the years, Flink has become a well established project in the data
streaming domain and a
+mature project requires a slight shift of priorities from thinking purely
about new features
+towards caring more about stability and operational simplicity. The Flink
community has tried to address
+some known friction points over the last couple of releases, which includes
improvements to the
+snapshotting process.
+
+Flink 1.13 was the first release we announced [unaligned
checkpoints]({{site.DOCS_BASE_URL}}flink-docs-release-1.15/docs/concepts/stateful-stream-processing/#unaligned-checkpointing)
to be production-ready and
+encourage people to use them if their jobs are backpressured to a point where
it causes issues for
+checkpoints. It was also the release where we [unified the binary format of
savepoints](/news/2021/05/03/release-1.13.0.html#switching-state-backend-with-savepoints)
across all
+different state backends, which allows for stateful switching of those. More
on that a bit later.
+
+The next release, 1.14 also brought additional improvements. As an alternative
and as a complement
+to unaligned checkpoints we introduced a feature, we called ["buffer
debloating"](/news/2021/09/29/release-1.14.0.html#buffer-debloating). It is
build
+around the concept of automatically adjusting the amount of in-flight data
that needs to be aligned
+while snapshotting. Another long-standing problem, we fixed, was that from
1.14 onwards it is
+possible to [continue checkpointing even if there are finished
tasks](/news/2021/09/29/release-1.14.0.html#checkpointing-and-bounded-streams)
in ones jobgraph.
Review Comment:
```suggestion
Flink has become a well-established data streaming engine, and a
mature project requires some shifting of priorities from thinking purely
about new features
towards improving stability and operational simplicity. In the last couple
of releases, the Flink community has tried to address
some known friction points, which include improvements to the
snapshotting process. Snapshotting takes a global, consistent image of the
state of a Flink job and is integral to fault tolerance and exactly-once
processing. Snapshots include savepoints and checkpoints.
This post will outline the journey of improving snapshotting in past
releases and the upcoming improvements in Flink 1.15, which include making it
possible to take savepoints in the native, state-backend-specific format as well
as clarifying snapshot ownership.
{% toc %}
# Past improvements to the snapshotting process
Flink 1.13 was the first release where we announced [unaligned
checkpoints]({{site.DOCS_BASE_URL}}flink-docs-release-1.15/docs/concepts/stateful-stream-processing/#unaligned-checkpointing)
to be production-ready. We
encouraged people to use them if their jobs are backpressured to a point
where it causes issues for
checkpoints. We also [unified the binary format of
savepoints](/news/2021/05/03/release-1.13.0.html#switching-state-backend-with-savepoints)
across all
different state backends, which enables stateful switching of state backends.
Flink 1.14 brought additional improvements. As an alternative and as a
complement
to unaligned checkpoints, we introduced a feature called ["buffer
debloating"](/news/2021/09/29/release-1.14.0.html#buffer-debloating). This is
built
around the concept of automatically adjusting the amount of in-flight data
that needs to be aligned
while snapshotting. We also fixed another long-standing problem and made it
possible to [continue checkpointing even if there are finished
tasks](/news/2021/09/29/release-1.14.0.html#checkpointing-and-bounded-streams)
in a JobGraph.
```