twalthr commented on a change in pull request #466:
URL: https://github.com/apache/flink-web/pull/466#discussion_r706001267
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+---
+layout: post
+title: "Apache Flink 1.14.0 Release Announcement"
+date: 2021-09-21T08:00:00.000Z
+categories: news
+authors:
+- joemoe:
+ name: "Johannes Moser"
+
+excerpt: The Apache Flink community is excited to announce the release of
+Flink 1.14.0! More than 200 contributors worked on over 1,000 issues to TODO.
+---
+
+Just a couple of days ago, the Apache Software Foundation announced its annual
+report, and Apache Flink being in the top 5 of all relevant categories is an
+outcome of the work the community has, yet again, put into 1.14.0. The
+consistency with which this project moves forward is remarkable. Once again,
+more than 200 contributors worked on over 1,000 issues.
+
+Apache Flink not only supports batch and stream processing, but has always
+pursued the goal of making them a unified experience. With Apache Flink 1.14.0,
+batch and stream processing have moved closer together. The first sources and
+sinks now provide a unified API (following FLIP-27 and FLIP-143), and a hybrid
+source has been introduced. The DataStream batch mode has been pushed into the
+Table API. Under the hood, checkpoints are now allowed even after tasks have
+finished, truly enabling mixed or bounded jobs. Existing features have been
+harmonized throughout all available APIs, from the DataStream API to the Table
+API and SQL, and vice versa. The DataStream batch mode is maturing after its
+initial release in 1.13.0.
+
+Fault tolerance is part of Flink’s nature, but there is always room to improve
+it. With the new buffer debloating option, the amount of buffered in-flight
+data can decrease significantly, which shrinks checkpoint sizes and reduces
+checkpointing times to a minimum.
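+
+Buffer debloating is controlled via configuration. A minimal `flink-conf.yaml`
+sketch could look like the following (the target value below is only an
+illustrative assumption, not a recommendation):
+
+```yaml
+# Enable automatic adjustment of network buffer sizes
+taskmanager.network.memory.buffer-debloat.enabled: true
+# Illustrative assumption: aim for roughly 1s of in-flight data
+taskmanager.network.memory.buffer-debloat.target: 1s
+```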
+
+That’s not all: there is a huge list of improvements and new additions
+throughout all components. We also had to say goodbye to some features that
+have been superseded in recent releases. We hope you like the new release and
+are eager to learn about your experience with it: which previously unsolved
+problems it solves, and what new use cases it unlocks for you.
+
+{% toc %}
+
+# Notable improvements for a Unified Batch and Stream Processing experience
+
+Apache Flink unlocks both batch and stream processing use cases, and there is
+a lot of traction for both. Initially, the two were rather separate, which is
+why their APIs were not really aligned in the first place and drifted further
+apart over time. As both APIs became stable, users started to combine them in
+their solutions: running a batch-style workload to initially process historic
+data and then switching into streaming mode to deal with live data is a
+natural pattern. But keeping the two APIs separate not only led to the
+mentioned differences but also to large gaps in the feature matrix, and it
+became quite confusing what worked with which API and what did not. About a
+year ago, the community started to unify the experience by treating batch as a
+special case of streaming. The notion of bounded and unbounded streams was
+introduced and initially released in the most recent Apache Flink release,
+1.13. This effort has now been continued, as Apache Flink not only wants to
+unlock use cases but also to make them a good user experience. The list below
+demonstrates the impact of this change. It is not only about the now
+deprecated DataSet API and the DataStream API; it also affects sources and
+sinks, checkpointing, the Table API, Flink SQL, and more.
+
+## Unified Source and Sink APIs
+
+Sources and sinks play a big role in unlocking both streaming and batch use
+cases. Quite a number of sources and sinks are currently included in Apache
+Flink, and some more are available as external packages, but it might be hard
+to find two that support the same set of features. This certainly applies to
+supporting bounded and unbounded streams, but there are also differences in
+what is exposed in the Table API and SQL and in what kind of checkpointing is
+supported. That’s why the community came up with FLIP-27 and FLIP-143. With
+Apache Flink 1.14, these FLIPs have for the first time been fully implemented
+for the Kafka source and sink.
+
+The changes to the sink mostly circle around committing behaviour, enabling
+all delivery guarantees and providing solid fault tolerance.
+
+The Kafka source has already been in good shape and is now also exposed in the
+Table API and SQL.
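+
+As an illustration of the unified sink API, a minimal sketch of writing a
+string stream to Kafka with the new `KafkaSink` (broker address and topic name
+are assumptions for the example) might look like this:
+
+```java
+KafkaSink<String> sink = KafkaSink.<String>builder()
+    // Assumed broker address for the example
+    .setBootstrapServers("localhost:9092")
+    .setRecordSerializer(
+        KafkaRecordSerializationSchema.builder()
+            .setTopic("output-topic") // hypothetical topic
+            .setValueSerializationSchema(new SimpleStringSchema())
+            .build())
+    // The unified committing behaviour enables all delivery guarantees
+    .setDeliverGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
+    .build();
+
+stream.sinkTo(sink);
+```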
+
+## Hybrid Sources
+
+Users are facing the problem of having more and more sources for their data,
+which requires them to unify that data in the first place. Until now, the only
+way to cover some of these use cases was to run two parallel Flink jobs or to
+implement the switch in a hacky way. This is not the user experience Apache
+Flink wants to provide. With Apache Flink 1.14, the hybrid source was
+introduced to provide a coherent experience when unifying heterogeneous data
+feeds into one homogeneous data stream. For example, you might read historic
+data from a file source and then switch over to a Kafka source to cover the
+streaming data.
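+
+The file-then-Kafka scenario above can be sketched with the new `HybridSource`
+builder (the file path, broker address, topic, and offset strategy are
+assumptions for the example):
+
+```java
+// Historic data from files (path is a hypothetical example)
+FileSource<String> fileSource = FileSource
+    .forRecordStreamFormat(new TextLineFormat(), new Path("/data/history"))
+    .build();
+
+// Live data from Kafka (broker and topic are assumptions)
+KafkaSource<String> kafkaSource = KafkaSource.<String>builder()
+    .setBootstrapServers("localhost:9092")
+    .setTopics("events")
+    .setStartingOffsets(OffsetsInitializer.earliest())
+    .setValueOnlyDeserializer(new SimpleStringSchema())
+    .build();
+
+// Read the files first, then switch over to the Kafka source
+HybridSource<String> hybridSource = HybridSource.builder(fileSource)
+    .addSource(kafkaSource)
+    .build();
+
+env.fromSource(hybridSource, WatermarkStrategy.noWatermarks(), "hybrid-source");
+```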
+
+## Aligning DataStream API, Table API and Flink SQL
+
+With the DataSet API being deprecated, the future of Flink will circle around
+the DataStream API.
Review comment:
On the ML we actually decided to have a dedicated blog post again, but
this didn't happen yet.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]