XComp commented on a change in pull request #466:
URL: https://github.com/apache/flink-web/pull/466#discussion_r706039421
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+---
+layout: post
+title: "Apache Flink 1.14.0 Release Announcement"
+date: 2021-09-21T08:00:00.000Z
+categories: news
+authors:
+- joemoe:
+ name: "Johannes Moser"
+
+excerpt: The Apache Flink community is excited to announce the release of Flink 1.14.0! Around xxx contributors worked on over xxxx issues to TODO.
+---
+
+Just a couple of days ago the Apache Software Foundation announced it’s annual report and Apache
Review comment:
```suggestion
Just a couple of days ago the Apache Software Foundation announced its annual report and Apache
```
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+Just a couple of days ago the Apache Software Foundation announced it’s annual report and Apache
+Flink being in the Top 5 in all relevant categories is just an outcome of the work of the
+community (that has been done yet again) for 1.14.0. The consistency how this project is moving
+forward is remarkable. Once again 200 plus contributors worked on over 1,000 issues.
+
+Apache Flink not only supports batch and stream processing, but has been always following the goal
+of making it a unified experience. With Apache Flink 1.14.0 batch and stream processing moved closer
+together. The first sinks and sources are now providing a unified API (following FLIP-27 and
+FLIP-143). Hybrid source has been introduced. The DataStream batch mode has been pushed to the
+TableAPI. Under the hood checkpoints are allowed even after tasks are finished truly enabling mixed
+or bounded jobs. Existing features have been haromised throughout all available APIs. From
+DataStream to TableAPI and SQL and vice versa. The DataStream batch mode is maturing after its
+initial release in 1.13.0.
+
+Fault tolerance is part of Flink’s nature, still it can never be enough. By using the new option of
+debloating the buffers the checkpoint size can decrease significantly and reduce checkpointing times
+to a minimum.
+
+That’s not all there is a huge list of improvements and new additions through out all components.
+Also we had to say goodbye to some features that have been superseded in recent releases. We hope
+you like the new release and we’d be eager to learn about your experience with it, which yet
+unsolved problems it solves, what new use-cases it unlocks for you.
+
+{% toc %}
+
+# Notable improvements for a Unified Batch and Stream Processing experience
+
+Apache Flink unlocks both batch and stream processing use cases. There is a lot of traction for
+both. Initially both features were rather separated, that’s why the API were not really aligned in
+the first place and also moved into apart over time. With both APIs being stable users started to
+combine them in their solutions. Having batch style workloads to initially process historic data and
+then switching into the streaming mode to deal with live data is something that makes sense. But
+having the two APIs separated not only lead to the mentioned differences but also to big white spots
+on the sparsity matrices and it became quite confusing what worked with which API and what not.
+About a year ago the community started to unify the experience by seeing batch as a special case of
+streaming. The notion of bounded and unbounded streams has been introduced and initially released in
+the most recent Apache Flink release 1.13. Now this effort has been continued as Apache Flink not
+only wants to unlock use cases bot also making it a good user experience. The list below
+demonstrates the impact of this change. It is not only about the now deprecated DataSet API and the
+DataStream API, it also affects sources and sinks, checkpointing, the Table API, Flink SQL,…
+
+## Unified Source and Sink APIs
+
+Sources and sinks play a big role to unlock both streaming and batch use cases. There’s quite a list
+of sources and sinks currently supported included in Apache Flink. There are also some external
+packages available. It might be hard to find two that support the same set of features. For sure
+this applies to supporting bounded and unbounded streams, but there are also differences on what is
+exposed in the Table API and SQL and what kind of checkpointing is supported. That’s why the
+community came up with FLINK-27 and FLIP-143. With Apache Flink 1.14. it is the first time they have
+been truly implemented the FLIPs for Kafka source and sink.
+
+The changes in the sink circle mostly around committing behaviour to enable all delivery guarantees
+to provide solid fault tolerance.
+
+The Kafka source has already been in good shape. This is now also exposed in the Table API and SQL.
+
+## Hybrid Sources
+
+User are facing the problem of having more and more sources for data which requires them to unify
+the data in the first place. Till now the only way to achieve some of the use cases was to have two
+parallel Flink jobs or to implement that in a hacky way. This is not the user experience Apache
+Flink wants to provide. With Apache Flink 1.14 hybrid sources was introduced to provide a coherent
Review comment:
```suggestion
Flink wants to provide. With Apache Flink 1.14 hybrid sources were introduced to provide a coherent
```
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+together. The first sinks and sources are now providing a unified API (following FLIP-27 and
Review comment:
```suggestion
together. The first sinks and sources are now providing a unified API (following [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface) and
```
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+Also we had to say goodbye to some features that have been superseded in recent releases. We hope
Review comment:
Could we list the features that were removed here?
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+community came up with FLINK-27 and FLIP-143. With Apache Flink 1.14. it is the first time they have
+been truly implemented the FLIPs for Kafka source and sink.
Review comment:
```suggestion
community came up with FLINK-27 and FLIP-143. With Apache Flink 1.14. it is the first time that these FLIPs have
been truly implemented for Kafka source and sink.
```
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+That’s not all there is a huge list of improvements and new additions through out all components.
Review comment:
```suggestion
That’s not all there is a huge list of improvements and new additions throughout all components.
```
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+community came up with FLINK-27 and FLIP-143. With Apache Flink 1.14. it is the first time they have
Review comment:
```suggestion
community came up with FLIP-27 and FLIP-143. With Apache Flink 1.14. it is the first time they have
```
Not sure whether we want to link the FLIPs here a second time ¯\_(ツ)_/¯
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+TableAPI. Under the hood checkpoints are allowed even after tasks are finished truly enabling mixed
+or bounded jobs. Existing features have been haromised throughout all available APIs. From
+DataStream to TableAPI and SQL and vice versa. The DataStream batch mode is maturing after its
Review comment:
```suggestion
Table API. Under the hood checkpoints are allowed even after tasks are finished truly enabling mixed
or bounded jobs. Existing features have been haromised throughout all available APIs. From
DataStream to Table API and SQL and vice versa. The DataStream batch mode is maturing after its
```
All the other occurrences of Table API are written with a space in this blog post.
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+FLIP-143). Hybrid source has been introduced. The DataStream batch mode has been pushed to the
Review comment:
```suggestion
[FLIP-143](https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API)). Hybrid source has been introduced. The DataStream batch mode has been pushed to the
```
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+## Hybrid Sources
+
+User are facing the problem of having more and more sources for data which requires them to unify
+the data in the first place. Till now the only way to achieve some of the use cases was to have two
+parallel Flink jobs or to implement that in a hacky way. This is not the user experience Apache
+Flink wants to provide. With Apache Flink 1.14 hybrid sources was introduced to provide a coherent
+experience in unifying heterogenous data feeds into one homogenous data stream. So you might have a
+file source to load historic data and then switch over to a Kafka source to cover the streaming
+data.
+
+## Aligning DataStream API, Table API and Flink SQL
+
+With the DataSet API being deprecated the future of Flink will circle around the DataStream API.
+With the traction of Flink growing the number of users that have more data science and business
+intelligence background than software engineering is growing. The entry point into Apache Flink is
+often Flink SQL and the Table API. At first it might be simple use cases but soon the requirements
+are growing and more and more features that are only exposed in the lower level APIs are needed. For
+Flink it is a goal to expose all functionalities on all levels as much as possible. As mentioned,
+this especially applies to the unified stream and batch processing experience. With 1.14 there are
+several improvements in that direction. Now the batch mode is exposed in the Table API pushing the
+feature from the lower level API to the higher level one. But there are also improvements in the
+other direction like allowing Table API Pipelines being used in the DataStream API. The new Kafka
+source has been, as mentioned, exposed in the Table API and Flink SQL. In the Python API there have
+also been some improvements by e.g. supporting UDF chaining in the Python DataStream API and general
+improvements regarding the DataStream and Table API.
+
+## Unified Checkpointing Experience
+
+One of the biggest effort, that is not really user facing, but will have a huge impact is unifying
+the checkpointing experience between bounded and unbounded data streams. This his been a huge amount
+of work too. Essentially this means allowing checkpoints after some tasks are finished. This is
+often the case in heterogeneous environments mixing bounded and unbounded streams (and sources) and
+has a huge impact in the exactly once delivery guarantee.
+
+# Notable improvements regarding production readiness
+
+The previous section was all about improving the user experience, but Apache Flink became what it is
+by enabling use cases in stream processing that have not been possible before. Some of the biggest
+data processing use cases in the industry are built on top of Apache Flink.
+
+## Buffer debloating
+
+Running huge use cases processing a lot of data in Apache Flink means long checkpointing times and
+therefor a rather poor experience when it comes to fault tolerance. With unaligned checkpoints a
+feature to target this experience was recently added to Apache Flink. One of the reasons for the
+slow checkpoints are the huge size of them, this is mostly caused by data that is in the buffers. In
+this release the community will expose a beta feature called buffer debloating. This is essentially
+trying to minimise the data that are used before and after a task. They have been of fixed size
+until this release. Now they changed towards self-optimising themselves reducing the buffer size
+significantly and therefor reducing the checkpointing time a lot.
+
+## Fine grained resource and network buffer management
+
+Apache Flink is a data processor. Even by acknowledging stream and batch based use cases within
+those two types there are uncountable options on how a data processing use case might look. To get
+the optimum out of Flink there are now more options on how to manage resources moving from coarse
+grained resource management to a fine grained one.
+
+There is also an improvement on how to do network buffer management overcoming some existing
+limitations. All in all this should require less network memory.
+
+## Connector metrics
+
+Moving use cases into production usually also means increasing observability. Connectors are the
+entry and the exit for a Flink job and it usually makes sense to monitor what is happening there
+closely. The telemetry data of a connector can also provide important pointers to narrow down
+problems or bottlenecks. In 1.14. there are default metrics for connectors introduced and also
+implemented for some of the connectors.
+
+# Other improvements
+
+## Building a connector ecosystem
+
+Metrics and API unification have not been the only thing that has been done when it comes to
+connectors. The Apache community will stress improving the connector system. In this release we
+added the Pulsar connectors as well as a testing framework.
Review comment:
A testing framework for the Pulsar connector? Or for connectors in
general?
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+---
+layout: post
+title: "Apache Flink 1.14.0 Release Announcement"
+date: 2021-09-21T08:00:00.000Z
+categories: news
+authors:
+- joemoe:
+ name: "Johannes Moser"
+
+excerpt: The Apache Flink community is excited to announce the release of
Flink 1.14.0! Around xxx contributors worked on over xxxx issues to TODO.
+---
+
+Just a couple of days ago the Apache Software Foundation announced its annual
+report, and Apache Flink being in the Top 5 in all relevant categories is just
+one outcome of the work the community has, yet again, put into 1.14.0. The
+consistency with which this project moves forward is remarkable. Once again,
+more than 200 contributors worked on over 1,000 issues.
+
+Apache Flink not only supports batch and stream processing, it has always
+followed the goal of making them a unified experience. With Apache Flink
+1.14.0, batch and stream processing moved closer together. The first sinks and
+sources now provide a unified API (following FLIP-27 and FLIP-143). A hybrid
+source has been introduced. The DataStream batch mode has been pushed up into
+the Table API. Under the hood, checkpoints are now allowed even after tasks
+have finished, truly enabling mixed or bounded jobs. Existing features have
+been harmonised throughout all available APIs, from DataStream to Table API
+and SQL and vice versa. The DataStream batch mode is maturing after its
+initial release in 1.13.0.
+
+Fault tolerance is part of Flink’s nature, and still it can never be good
+enough. By using the new option of debloating the buffers, the checkpoint size
+can decrease significantly, reducing checkpointing times to a minimum.
+
+That’s not all: there is a huge list of improvements and new additions
+throughout all components. We also had to say goodbye to some features that
+have been superseded in recent releases. We hope you like the new release, and
+we’d be eager to learn about your experience with it, which yet unsolved
+problems it solves, and what new use cases it unlocks for you.
+
+{% toc %}
+
+# Notable improvements for a Unified Batch and Stream Processing experience
+
+Apache Flink unlocks both batch and stream processing use cases, and there is
+a lot of traction for both. Initially the two were rather separate, which is
+why the APIs were not really aligned in the first place and also drifted apart
+over time. With both APIs being stable, users started to combine them in their
+solutions. Running batch-style workloads to initially process historic data
+and then switching into streaming mode to deal with live data is something
+that simply makes sense. But having the two APIs separated not only led to the
+mentioned differences, it also left big gaps in the feature matrix, and it
+became quite confusing what worked with which API and what did not. About a
+year ago the community started to unify the experience by treating batch as a
+special case of streaming. The notion of bounded and unbounded streams was
+introduced and initially released in the most recent Apache Flink release,
+1.13. Now this effort has been continued, as Apache Flink not only wants to
+unlock use cases but also to make them a good user experience. The list below
+demonstrates the impact of this change. It is not only about the now
+deprecated DataSet API and the DataStream API; it also affects sources and
+sinks, checkpointing, the Table API, Flink SQL, and more.
+
+## Unified Source and Sink APIs
+
+Sources and sinks play a big role in unlocking both streaming and batch use
+cases. There is quite a list of sources and sinks currently included in Apache
+Flink, and some external packages are available as well. It might be hard to
+find two that support the same set of features. This certainly applies to
+supporting bounded and unbounded streams, but there are also differences in
+what is exposed in the Table API and SQL and in what kind of checkpointing is
+supported. That’s why the community came up with FLIP-27 and FLIP-143. With
+Apache Flink 1.14, these FLIPs have been fully implemented for the first time,
+namely for the Kafka source and sink.
+
+The changes in the sink mostly circle around committing behaviour, enabling
+all delivery guarantees to provide solid fault tolerance.
+
+The Kafka source was already in good shape; it is now also exposed in the
+Table API and SQL.
+
+## Hybrid Sources
+
+Users are facing the problem of having more and more sources of data, which
+requires them to unify that data in the first place. Until now, the only way
+to achieve some of these use cases was to run two parallel Flink jobs or to
+implement the switch-over in a hacky way. This is not the user experience
+Apache Flink wants to provide. With Apache Flink 1.14, the hybrid source was
+introduced to provide a coherent experience for unifying heterogeneous data
+feeds into one homogeneous data stream. So you might have a file source to
+load historic data and then switch over to a Kafka source to cover the
+streaming data.
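+
+Sketched with the new unified source interfaces (the path, broker, and topic
+names here are placeholders), the file-then-Kafka scenario could look roughly
+like this:
+
+```java
+// Bounded source for the historic data.
+FileSource<String> fileSource = FileSource
+        .forRecordStreamFormat(new TextLineFormat(), new Path("/data/history"))
+        .build();
+
+// Unbounded source that takes over for the live data.
+KafkaSource<String> kafkaSource = KafkaSource.<String>builder()
+        .setBootstrapServers("broker:9092")
+        .setTopics("events")
+        .setStartingOffsets(OffsetsInitializer.earliest())
+        .setValueOnlyDeserializer(new SimpleStringSchema())
+        .build();
+
+// Read the file first, then switch over to Kafka.
+HybridSource<String> hybridSource = HybridSource.builder(fileSource)
+        .addSource(kafkaSource)
+        .build();
+
+env.fromSource(hybridSource, WatermarkStrategy.noWatermarks(), "hybrid-source");
+```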
+
+## Aligning DataStream API, Table API and Flink SQL
+
+With the DataSet API being deprecated, the future of Flink will circle around
+the DataStream API. As Flink’s traction grows, so does the number of users
+with a data science or business intelligence background rather than a software
+engineering one. Their entry point into Apache Flink is often Flink SQL and
+the Table API. These might be simple use cases at first, but soon the
+requirements grow and more and more features that are only exposed in the
+lower-level APIs are needed. It is a goal for Flink to expose all
+functionality on all levels as much as possible. As mentioned, this especially
+applies to the unified stream and batch processing experience. With 1.14 there
+are several improvements in that direction. The batch mode is now exposed in
+the Table API, pushing the feature from the lower-level API up to the
+higher-level one. But there are also improvements in the other direction, like
+allowing Table API pipelines to be used in the DataStream API. The new Kafka
+source has, as mentioned, been exposed in the Table API and Flink SQL. In the
+Python API there have also been some improvements, e.g. support for UDF
+chaining in the Python DataStream API and general improvements regarding the
+DataStream and Table APIs.
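+
+Two of these directions in a nutshell (a sketch; the query and table name are
+made up):
+
+```java
+// Batch mode, now reachable directly from the Table API:
+TableEnvironment batchEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());
+
+// And the other direction: consuming a Table API pipeline from the DataStream API.
+StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);
+Table result = tableEnv.sqlQuery("SELECT UPPER(word) AS word FROM source_table");
+DataStream<Row> resultStream = tableEnv.toDataStream(result);
+```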
+
+## Unified Checkpointing Experience
+
+One of the biggest efforts, one that is not really user facing but will have
+a huge impact, is unifying the checkpointing experience between bounded and
+unbounded data streams. This has been a huge amount of work too. Essentially,
+it means allowing checkpoints to be taken even after some tasks have finished.
+This is often the case in heterogeneous environments mixing bounded and
+unbounded streams (and sources), and it has a huge impact on the exactly-once
+delivery guarantee.
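+
+In 1.14 this behaviour is opt-in; a configuration sketch:
+
+```yaml
+# Keep taking checkpoints after some tasks (e.g. a bounded source that has
+# read all its data) are already finished (opt-in in 1.14).
+execution.checkpointing.checkpoints-after-tasks-finish.enabled: true
+```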
+
+# Notable improvements regarding production readiness
+
+The previous section was all about improving the user experience, but Apache
+Flink became what it is by enabling use cases in stream processing that had
+not been possible before. Some of the biggest data processing use cases in the
+industry are built on top of Apache Flink.
+
+## Buffer debloating
+
+Running huge use cases processing a lot of data in Apache Flink means long
checkpointing times and
+therefor a rather poor experience when it comes to fault tolerance. With
unaligned checkpoints a
+feature to target this experience was recently added to Apache Flink. One of
the reasons for the
+slow checkpoints are the huge size of them, this is mostly caused by data that
is in the buffers. In
Review comment:
```suggestion
slow checkpoints are the huge size of them which is mostly caused by
buffered data. In
```
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+## Bigger, faster, stronger SQL
+
+Since the SQL interface was added, it has gained a lot of traction. It is
+obvious that this will be an important touch point for users in the short, mid
+and long term. Implementing a SQL-based interface not only opens up a lot of
+opportunities but also generates a lot of requests for features of that
+comprehensive language, features that are not always easy to follow up on,
+especially as the language was defined for databases and not for data
+processors. Besides unifying the APIs and exposing connectors there, the
+contributors also improved the SQL Client, added table-valued functions, and
+much more.
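+
+Window table-valued functions, for example, make windowing a matter of plain
+SQL. A sketch against a hypothetical `orders` table with a timestamp column
+`ts` and a numeric column `amount`:
+
+```sql
+-- Ten-minute tumbling windows via the TUMBLE table-valued function.
+SELECT window_start, window_end, SUM(amount) AS total
+FROM TABLE(
+  TUMBLE(TABLE orders, DESCRIPTOR(ts), INTERVAL '10' MINUTES))
+GROUP BY window_start, window_end;
+```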
+
+## Python
+
+The Python API also got more observability features and optimisations.
+
+# Goodbye to the legacy planner
+
+Maintaining an open source project also means eventually saying goodbye to
+some beloved features. This is important to keep a project healthy and to
+prune outgrowths that cause maintainability issues. Of course, this also means
+that some users might need to change their implementation, and in some cases
+things that used to be possible might not be possible anymore. Apache Flink
+always follows the proper
+lifecycle. Having a look the roadmap provides a lot of transparency to set the
expectation of users
Review comment:
```suggestion
lifecycle. Having a look at the roadmap provides a lot of transparency to
set the expectation of users
```
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+## Python
+
+The python API also got more observability features and optimisations.
Review comment:
```suggestion
The Python API also got more observability features and optimisations.
```
##########
File path: _posts/2021-09-21-release-1.14.0.md
##########
@@ -0,0 +1,182 @@
+## Hybrid Sources
+
+User are facing the problem of having more and more sources for data which
requires them to unify
Review comment:
```suggestion
Users are facing the problem of having more and more sources for data which
requires them to unify
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]