XComp commented on a change in pull request #466:
URL: https://github.com/apache/flink-web/pull/466#discussion_r706039421
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+---
+layout: post
+title: "Apache Flink 1.14.0 Release Announcement"
+date: 2021-09-21T08:00:00.000Z
+categories: news
+authors:
+- joemoe:
+ name: "Johannes Moser"
+
+excerpt: The Apache Flink community is excited to announce the release of Flink 1.14.0! Around xxx contributors worked on over xxxx issues to TODO.
+---
+
+Just a couple of days ago the Apache Software Foundation announced it’s annual report and Apache
Review comment:
```suggestion
Just a couple of days ago the Apache Software Foundation announced its annual report and Apache
```
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+Just a couple of days ago the Apache Software Foundation announced it’s annual report and Apache
+Flink being in the Top 5 in all relevant categories is just an outcome of the work of the
+community (that has been done yet again) for 1.14.0. The consistency how this project is moving
+forward is remarkable. Once again 200 plus contributors worked on over 1,000 issues.
+
+Apache Flink not only supports batch and stream processing, but has been always following the goal
+of making it a unified experience. With Apache Flink 1.14.0 batch and stream processing moved closer
+together. The first sinks and sources are now providing a unified API (following FLIP-27 and
+FLIP-143). Hybrid source has been introduced. The DataStream batch mode has been pushed to the
+TableAPI. Under the hood checkpoints are allowed even after tasks are finished truly enabling mixed
+or bounded jobs. Existing features have been haromised throughout all available APIs. From
+DataStream to TableAPI and SQL and vice versa. The DataStream batch mode is maturing after its
+initial release in 1.13.0.
+
+Fault tolerance is part of Flink’s nature, still it can never be enough. By using the new option of
+debloating the buffers the checkpoint size can decrease significantly and reduce checkpointing times
+to a minimum.
+
+That’s not all there is a huge list of improvements and new additions through out all components.
+Also we had to say goodbye to some features that have been superseded in recent releases. We hope
+you like the new release and we’d be eager to learn about your experience with it, which yet
+unsolved problems it solves, what new use-cases it unlocks for you.
+
+{% toc %}
+
+# Notable improvements for a Unified Batch and Stream Processing experience
+
+Apache Flink unlocks both batch and stream processing use cases. There is a lot of traction for
+both. Initially both features were rather separated, that’s why the API were not really aligned in
+the first place and also moved into apart over time. With both APIs being stable users started to
+combine them in their solutions. Having batch style workloads to initially process historic data and
+then switching into the streaming mode to deal with live data is something that makes sense. But
+having the two APIs separated not only lead to the mentioned differences but also to big white spots
+on the sparsity matrices and it became quite confusing what worked with which API and what not.
+About a year ago the community started to unify the experience by seeing batch as a special case of
+streaming. The notion of bounded and unbounded streams has been introduced and initially released in
+the most recent Apache Flink release 1.13. Now this effort has been continued as Apache Flink not
+only wants to unlock use cases bot also making it a good user experience. The list below
+demonstrates the impact of this change. It is not only about the now deprecated DataSet API and the
+DataStream API, it also affects sources and sinks, checkpointing, the Table API, Flink SQL,…
+
+## Unified Source and Sink APIs
+
+Sources and sinks play a big role to unlock both streaming and batch use cases. There’s quite a list
+of sources and sinks currently supported included in Apache Flink. There are also some external
+packages available. It might be hard to find two that support the same set of features. For sure
+this applies to supporting bounded and unbounded streams, but there are also differences on what is
+exposed in the Table API and SQL and what kind of checkpointing is supported. That’s why the
+community came up with FLINK-27 and FLIP-143. With Apache Flink 1.14. it is the first time they have
+been truly implemented the FLIPs for Kafka source and sink.
+
+The changes in the sink circle mostly around committing behaviour to enable all delivery guarantees
+to provide solid fault tolerance.
+
+The Kafka source has already been in good shape. This is now also exposed in the Table API and SQL.
+
+## Hybrid Sources
+
+User are facing the problem of having more and more sources for data which requires them to unify
+the data in the first place. Till now the only way to achieve some of the use cases was to have two
+parallel Flink jobs or to implement that in a hacky way. This is not the user experience Apache
+Flink wants to provide. With Apache Flink 1.14 hybrid sources was introduced to provide a coherent
Review comment:
```suggestion
Flink wants to provide. With Apache Flink 1.14 hybrid sources were introduced to provide a coherent
```
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+together. The first sinks and sources are now providing a unified API (following FLIP-27 and
Review comment:
```suggestion
together. The first sinks and sources are now providing a unified API (following [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface) and
```
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+Also we had to say goodbye to some features that have been superseded in recent releases. We hope
Review comment:
Could we list the features that were removed here?
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+community came up with FLINK-27 and FLIP-143. With Apache Flink 1.14. it is the first time they have
+been truly implemented the FLIPs for Kafka source and sink.
Review comment:
```suggestion
community came up with FLINK-27 and FLIP-143. With Apache Flink 1.14. it is the first time that these FLIPs have
been truly implemented for Kafka source and sink.
```
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+That’s not all there is a huge list of improvements and new additions through out all components.
Review comment:
```suggestion
That’s not all there is a huge list of improvements and new additions throughout all components.
```
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+community came up with FLINK-27 and FLIP-143. With Apache Flink 1.14. it is the first time they have
Review comment:
```suggestion
community came up with FLIP-27 and FLIP-143. With Apache Flink 1.14. it is the first time they have
```
Not sure whether we want to link the FLIPs here a second time ¯\_(ツ)_/¯
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+TableAPI. Under the hood checkpoints are allowed even after tasks are finished truly enabling mixed
+or bounded jobs. Existing features have been haromised throughout all available APIs. From
+DataStream to TableAPI and SQL and vice versa. The DataStream batch mode is maturing after its
Review comment:
```suggestion
Table API. Under the hood checkpoints are allowed even after tasks are finished truly enabling mixed
or bounded jobs. Existing features have been haromised throughout all available APIs. From
DataStream to Table API and SQL and vice versa. The DataStream batch mode is maturing after its
```
All the other occurrences of Table API are written with a space in this blog post.
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+FLIP-143). Hybrid source has been introduced. The DataStream batch mode has been pushed to the
Review comment:
```suggestion
[FLIP-143](https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API)). Hybrid source has been introduced. The DataStream batch mode has been pushed to the
```
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+## Hybrid Sources
+
+User are facing the problem of having more and more sources for data which requires them to unify
+the data in the first place. Till now the only way to achieve some of the use cases was to have two
+parallel Flink jobs or to implement that in a hacky way. This is not the user experience Apache
+Flink wants to provide. With Apache Flink 1.14 hybrid sources was introduced to provide a coherent
+experience in unifying heterogenous data feeds into one homogenous data stream. So you might have a
+file source to load historic data and then switch over to a Kafka source to cover the streaming
+data.
+
+## Aligning DataStream API, Table API and Flink SQL
+
+With the DataSet API being deprecated the future of Flink will circle around the DataStream API.
+With the traction of Flink growing the number of users that have more data science and business
+intelligence background than software engineering is growing. The entry point into Apache Flink is
+often Flink SQL and the Table API. At first it might be simple use cases but soon the requirements
+are growing and more and more features that are only exposed in the lower level APIs are needed. For
+Flink it is a goal to expose all functionalities on all levels as much as possible. As mentioned,
+this especially applies to the unified stream and batch processing experience. With 1.14 there are
+several improvements in that direction. Now the batch mode is exposed in the Table API pushing the
+feature from the lower level API to the higher level one. But there are also improvements in the
+other direction like allowing Table API Pipelines being used in the DataStream API. The new Kafka
+source has been, as mentioned, exposed in the Table API and Flink SQL. In the Python API there have
+also been some improvements by e.g. supporting UDF chaining in the Python DataStream API and general
+improvements regarding the DataStream and Table API.
+
+## Unified Checkpointing Experience
+
+One of the biggest effort, that is not really user facing, but will have a huge impact is unifying
+the checkpointing experience between bounded and unbounded data streams. This his been a huge amount
+of work too. Essentially this means allowing checkpoints after some tasks are finished. This is
+often the case in heterogeneous environments mixing bounded and unbounded streams (and sources) and
+has a huge impact in the exactly once delivery guarantee.
+
+# Notable improvements regarding production readiness
+
+The previous section was all about improving the user experience, but Apache Flink became what it is
+by enabling use cases in stream processing that have not been possible before. Some of the biggest
+data processing use cases in the industry are built on top of Apache Flink.
+
+## Buffer debloating
+
+Running huge use cases processing a lot of data in Apache Flink means long checkpointing times and
+therefor a rather poor experience when it comes to fault tolerance. With unaligned checkpoints a
+feature to target this experience was recently added to Apache Flink. One of the reasons for the
+slow checkpoints are the huge size of them, this is mostly caused by data that is in the buffers. In
+this release the community will expose a beta feature called buffer debloating. This is essentially
+trying to minimise the data that are used before and after a task. They have been of fixed size
+until this release. Now they changed towards self-optimising themselves reducing the buffer size
+significantly and therefor reducing the checkpointing time a lot.
+
+## Fine grained resource and network buffer management
+
+Apache Flink is a data processor. Even by acknowledging stream and batch based use cases within
+those two types there are uncountable options on how a data processing use case might look. To get
+the optimum out of Flink there are now more options on how to manage resources moving from coarse
+grained resource management to a fine grained one.
+
+There is also an improvement on how to do network buffer management overcoming some existing
+limitations. All in all this should require less network memory.
+
+## Connector metrics
+
+Moving use cases into production usually also means increasing observability. Connectors are the
+entry and the exit for a Flink job and it usually makes sense to monitor what is happening there
+closely. The telemetry data of a connector can also provide important pointers to narrow down
+problems or bottlenecks. In 1.14. there are default metrics for connectors introduced and also
+implemented for some of the connectors.
+
+# Other improvements
+
+## Building a connector ecosystem
+
+Metrics and API unification have not been the only thing that has been done when it comes to
+connectors. The Apache community will stress improving the connector system. In this release we
+added the Pulsar connectors as well as a testing framework.
Review comment:
A testing framework for the Pulsar connector? Or for connectors in
general?
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+---
+layout: post
+title: "Apache Flink 1.14.0 Release Announcement"
+date: 2021-09-21T08:00:00.000Z
+categories: news
+authors:
+- joemoe:
+ name: "Johannes Moser"
+
+excerpt: The Apache Flink community is excited to announce the release of
Flink 1.14.0! Around xxx contributors worked on over xxxx issues to TODO.
+---
+
+Just a couple of days ago the Apache Software Foundation announced its annual
+report, and Apache Flink being in the Top 5 in all relevant categories is just
+one outcome of the work the community has, yet again, put into 1.14.0. The
+consistency with which this project moves forward is remarkable. Once again,
+more than 200 contributors worked on over 1,000 issues.
+
+Apache Flink not only supports batch and stream processing, it has always
+followed the goal of making them a unified experience. With Apache Flink
+1.14.0, batch and stream processing moved closer together. The first sinks and
+sources now provide a unified API (following FLIP-27 and FLIP-143). A hybrid
+source has been introduced. The DataStream batch mode has been pushed up into
+the Table API. Under the hood, checkpoints are now allowed even after tasks
+have finished, truly enabling mixed or bounded jobs. Existing features have
+been harmonised throughout all available APIs, from DataStream to Table API
+and SQL and vice versa. The DataStream batch mode is maturing after its
+initial release in 1.13.0.
+
+Fault tolerance is part of Flink’s nature, and still it can never be good
+enough. By using the new option of debloating the buffers, the checkpoint size
+can decrease significantly, reducing checkpointing times to a minimum.
+
+That’s not all: there is a huge list of improvements and new additions
+throughout all components. We also had to say goodbye to some features that
+have been superseded in recent releases. We hope you like the new release, and
+we’d be eager to learn about your experience with it, which yet unsolved
+problems it solves, and what new use cases it unlocks for you.
+
+{% toc %}
+
+# Notable improvements for a Unified Batch and Stream Processing experience
+
+Apache Flink unlocks both batch and stream processing use cases, and there is
+a lot of traction for both. Initially the two were rather separate, which is
+why the APIs were not really aligned in the first place and also drifted apart
+over time. With both APIs being stable, users started to combine them in their
+solutions. Running batch-style workloads to initially process historic data
+and then switching into streaming mode to deal with live data is something
+that simply makes sense. But having the two APIs separated not only led to the
+mentioned differences, it also left big gaps in the feature matrix, and it
+became quite confusing what worked with which API and what did not. About a
+year ago the community started to unify the experience by treating batch as a
+special case of streaming. The notion of bounded and unbounded streams was
+introduced and initially released in the most recent Apache Flink release,
+1.13. Now this effort has been continued, as Apache Flink not only wants to
+unlock use cases but also to make them a good user experience. The list below
+demonstrates the impact of this change. It is not only about the now
+deprecated DataSet API and the DataStream API; it also affects sources and
+sinks, checkpointing, the Table API, Flink SQL, and more.
+
+## Unified Source and Sink APIs
+
+Sources and sinks play a big role in unlocking both streaming and batch use
+cases. There is quite a list of sources and sinks currently included in Apache
+Flink, and some external packages are available as well. It might be hard to
+find two that support the same set of features. This certainly applies to
+supporting bounded and unbounded streams, but there are also differences in
+what is exposed in the Table API and SQL and in what kind of checkpointing is
+supported. That’s why the community came up with FLIP-27 and FLIP-143. With
+Apache Flink 1.14, these FLIPs have been fully implemented for the first time,
+namely for the Kafka source and sink.
+
+The changes in the sink mostly circle around committing behaviour, enabling
+all delivery guarantees to provide solid fault tolerance.
+
+The Kafka source was already in good shape; it is now also exposed in the
+Table API and SQL.
+
+## Hybrid Sources
+
+Users are facing the problem of having more and more sources of data, which
+requires them to unify that data in the first place. Until now, the only way
+to achieve some of these use cases was to run two parallel Flink jobs or to
+implement the switch-over in a hacky way. This is not the user experience
+Apache Flink wants to provide. With Apache Flink 1.14, the hybrid source was
+introduced to provide a coherent experience for unifying heterogeneous data
+feeds into one homogeneous data stream. So you might have a file source to
+load historic data and then switch over to a Kafka source to cover the
+streaming data.
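+
+Sketched with the new unified source interfaces (the path, broker, and topic
+names here are placeholders), the file-then-Kafka scenario could look roughly
+like this:
+
+```java
+// Bounded source for the historic data.
+FileSource<String> fileSource = FileSource
+        .forRecordStreamFormat(new TextLineFormat(), new Path("/data/history"))
+        .build();
+
+// Unbounded source that takes over for the live data.
+KafkaSource<String> kafkaSource = KafkaSource.<String>builder()
+        .setBootstrapServers("broker:9092")
+        .setTopics("events")
+        .setStartingOffsets(OffsetsInitializer.earliest())
+        .setValueOnlyDeserializer(new SimpleStringSchema())
+        .build();
+
+// Read the file first, then switch over to Kafka.
+HybridSource<String> hybridSource = HybridSource.builder(fileSource)
+        .addSource(kafkaSource)
+        .build();
+
+env.fromSource(hybridSource, WatermarkStrategy.noWatermarks(), "hybrid-source");
+```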
+
+## Aligning DataStream API, Table API and Flink SQL
+
+With the DataSet API being deprecated, the future of Flink will circle around
+the DataStream API. As Flink’s traction grows, so does the number of users
+with a data science or business intelligence background rather than a software
+engineering one. Their entry point into Apache Flink is often Flink SQL and
+the Table API. These might be simple use cases at first, but soon the
+requirements grow and more and more features that are only exposed in the
+lower-level APIs are needed. It is a goal for Flink to expose all
+functionality on all levels as much as possible. As mentioned, this especially
+applies to the unified stream and batch processing experience. With 1.14 there
+are several improvements in that direction. The batch mode is now exposed in
+the Table API, pushing the feature from the lower-level API up to the
+higher-level one. But there are also improvements in the other direction, like
+allowing Table API pipelines to be used in the DataStream API. The new Kafka
+source has, as mentioned, been exposed in the Table API and Flink SQL. In the
+Python API there have also been some improvements, e.g. support for UDF
+chaining in the Python DataStream API and general improvements regarding the
+DataStream and Table APIs.
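+
+Two of these directions in a nutshell (a sketch; the query and table name are
+made up):
+
+```java
+// Batch mode, now reachable directly from the Table API:
+TableEnvironment batchEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());
+
+// And the other direction: consuming a Table API pipeline from the DataStream API.
+StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);
+Table result = tableEnv.sqlQuery("SELECT UPPER(word) AS word FROM source_table");
+DataStream<Row> resultStream = tableEnv.toDataStream(result);
+```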
+
+## Unified Checkpointing Experience
+
+One of the biggest efforts, one that is not really user facing but will have
+a huge impact, is unifying the checkpointing experience between bounded and
+unbounded data streams. This has been a huge amount of work too. Essentially,
+it means allowing checkpoints to be taken even after some tasks have finished.
+This is often the case in heterogeneous environments mixing bounded and
+unbounded streams (and sources), and it has a huge impact on the exactly-once
+delivery guarantee.
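+
+In 1.14 this behaviour is opt-in; a configuration sketch:
+
+```yaml
+# Keep taking checkpoints after some tasks (e.g. a bounded source that has
+# read all its data) are already finished (opt-in in 1.14).
+execution.checkpointing.checkpoints-after-tasks-finish.enabled: true
+```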
+
+# Notable improvements regarding production readiness
+
+The previous section was all about improving the user experience, but Apache
+Flink became what it is by enabling use cases in stream processing that had
+not been possible before. Some of the biggest data processing use cases in the
+industry are built on top of Apache Flink.
+
+## Buffer debloating
+
+Running huge use cases processing a lot of data in Apache Flink means long
checkpointing times and
+therefor a rather poor experience when it comes to fault tolerance. With
unaligned checkpoints a
+feature to target this experience was recently added to Apache Flink. One of
the reasons for the
+slow checkpoints are the huge size of them, this is mostly caused by data that
is in the buffers. In
Review comment:
```suggestion
slow checkpoints are the huge size of them which is mostly caused by
buffered data. In
```
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+## Bigger, faster, stronger SQL
+
+Since the SQL interface was added, it has gained a lot of traction. It is
+obvious that this will be an important touch point for users in the short, mid
+and long term. Implementing a SQL-based interface not only opens up a lot of
+opportunities but also generates a lot of requests for features of that
+comprehensive language, features that are not always easy to follow up on,
+especially as the language was defined for databases and not for data
+processors. Besides unifying the APIs and exposing connectors there, the
+contributors also improved the SQL Client, added table-valued functions, and
+much more.
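+
+Window table-valued functions, for example, make windowing a matter of plain
+SQL. A sketch against a hypothetical `orders` table with a timestamp column
+`ts` and a numeric column `amount`:
+
+```sql
+-- Ten-minute tumbling windows via the TUMBLE table-valued function.
+SELECT window_start, window_end, SUM(amount) AS total
+FROM TABLE(
+  TUMBLE(TABLE orders, DESCRIPTOR(ts), INTERVAL '10' MINUTES))
+GROUP BY window_start, window_end;
+```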
+
+## Python
+
+The Python API also got more observability features and optimisations.
+
+# Goodbye to the legacy planner
+
+Maintaining an open source project also means eventually saying goodbye to
+some beloved features. This is important to keep a project healthy and to
+prune outgrowths that cause maintainability issues. Of course, this also means
+that some users might need to change their implementation, and in some cases
+things that used to be possible might not be possible anymore. Apache Flink
+always follows the proper
+lifecycle. Having a look the roadmap provides a lot of transparency to set the
expectation of users
Review comment:
```suggestion
lifecycle. Having a look at the roadmap provides a lot of transparency to
set the expectation of users
```
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+## Python
+
+The python API also got more observability features and optimisations.
Review comment:
```suggestion
The Python API also got more observability features and optimisations.
```
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+## Hybrid Sources
+
+User are facing the problem of having more and more sources for data which
requires them to unify
Review comment:
```suggestion
Users are facing the problem of having more and more sources for data which
requires them to unify
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]