This is an automated email from the ASF dual-hosted git repository.
chesnay pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/flink-web.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 104f831 Rebuild website
104f831 is described below
commit 104f831ad72a80653f20f84842e4e71ddbd4db06
Author: Chesnay Schepler <[email protected]>
AuthorDate: Thu Jan 20 09:42:14 2022 +0100
Rebuild website
---
content/2022/01/20/pravega-connector-101.html | 504 ++++++++++++++++++++++++++
content/blog/feed.xml | 354 +++++++++++++-----
content/blog/index.html | 36 +-
content/blog/page10/index.html | 38 +-
content/blog/page11/index.html | 40 +-
content/blog/page12/index.html | 38 +-
content/blog/page13/index.html | 37 +-
content/blog/page14/index.html | 39 +-
content/blog/page15/index.html | 38 +-
content/blog/page16/index.html | 37 +-
content/blog/page17/index.html | 39 +-
content/blog/page18/index.html | 25 ++
content/blog/page2/index.html | 36 +-
content/blog/page3/index.html | 38 +-
content/blog/page4/index.html | 38 +-
content/blog/page5/index.html | 36 +-
content/blog/page6/index.html | 36 +-
content/blog/page7/index.html | 36 +-
content/blog/page8/index.html | 38 +-
content/blog/page9/index.html | 38 +-
content/index.html | 8 +-
content/zh/index.html | 8 +-
22 files changed, 1195 insertions(+), 342 deletions(-)
diff --git a/content/2022/01/20/pravega-connector-101.html
b/content/2022/01/20/pravega-connector-101.html
new file mode 100644
index 0000000..54ca238
--- /dev/null
+++ b/content/2022/01/20/pravega-connector-101.html
@@ -0,0 +1,504 @@
+<!DOCTYPE html>
+<html lang="en">
+ <head>
+ <meta charset="utf-8">
+ <meta http-equiv="X-UA-Compatible" content="IE=edge">
+ <meta name="viewport" content="width=device-width, initial-scale=1">
+ <!-- The above 3 meta tags *must* come first in the head; any other head
content must come *after* these tags -->
+ <title>Apache Flink: Pravega Flink Connector 101</title>
+ <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon">
+ <link rel="icon" href="/favicon.ico" type="image/x-icon">
+
+ <!-- Bootstrap -->
+ <link rel="stylesheet" href="/css/bootstrap.min.css">
+ <link rel="stylesheet" href="/css/flink.css">
+ <link rel="stylesheet" href="/css/syntax.css">
+
+ <!-- Blog RSS feed -->
+ <link href="/blog/feed.xml" rel="alternate" type="application/rss+xml"
title="Apache Flink Blog: RSS feed" />
+
+ <!-- jQuery (necessary for Bootstrap's JavaScript plugins) -->
+ <!-- We need to load Jquery in the header for custom google analytics
event tracking-->
+ <script src="/js/jquery.min.js"></script>
+
+ <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media
queries -->
+ <!-- WARNING: Respond.js doesn't work if you view the page via file:// -->
+ <!--[if lt IE 9]>
+ <script
src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
+ <script
src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
+ <![endif]-->
+ </head>
+ <body>
+
+
+ <!-- Main content. -->
+ <div class="container">
+ <div class="row">
+
+
+ <div id="sidebar" class="col-sm-3">
+
+
+<!-- Top navbar. -->
+ <nav class="navbar navbar-default">
+ <!-- The logo. -->
+ <div class="navbar-header">
+ <button type="button" class="navbar-toggle collapsed"
data-toggle="collapse" data-target="#bs-example-navbar-collapse-1">
+ <span class="icon-bar"></span>
+ <span class="icon-bar"></span>
+ <span class="icon-bar"></span>
+ </button>
+ <div class="navbar-logo">
+ <a href="/">
+ <img alt="Apache Flink" src="/img/flink-header-logo.svg"
width="147px" height="73px">
+ </a>
+ </div>
+ </div><!-- /.navbar-header -->
+
+ <!-- The navigation links. -->
+ <div class="collapse navbar-collapse"
id="bs-example-navbar-collapse-1">
+ <ul class="nav navbar-nav navbar-main">
+
+ <!-- First menu section explains visitors what Flink is -->
+
+ <!-- What is Stream Processing? -->
+ <!--
+ <li><a href="/streamprocessing1.html">What is Stream
Processing?</a></li>
+ -->
+
+ <!-- What is Flink? -->
+ <li><a href="/flink-architecture.html">What is Apache
Flink?</a></li>
+
+
+
+ <!-- What is Stateful Functions? -->
+
+ <li><a href="/stateful-functions.html">What is Stateful
Functions?</a></li>
+
+ <!-- Use cases -->
+ <li><a href="/usecases.html">Use Cases</a></li>
+
+ <!-- Powered by -->
+ <li><a href="/poweredby.html">Powered By</a></li>
+
+
+
+ <!-- Second menu section aims to support Flink users -->
+
+ <!-- Downloads -->
+ <li><a href="/downloads.html">Downloads</a></li>
+
+ <!-- Getting Started -->
+ <li class="dropdown">
+ <a class="dropdown-toggle" data-toggle="dropdown"
href="#">Getting Started<span class="caret"></span></a>
+ <ul class="dropdown-menu">
+ <li><a
href="https://nightlies.apache.org/flink/flink-docs-release-1.14//docs/try-flink/local_installation/"
target="_blank">With Flink <small><span class="glyphicon
glyphicon-new-window"></span></small></a></li>
+ <li><a
href="https://nightlies.apache.org/flink/flink-statefun-docs-release-3.1/getting-started/project-setup.html"
target="_blank">With Flink Stateful Functions <small><span class="glyphicon
glyphicon-new-window"></span></small></a></li>
+ <li><a href="/training.html">Training Course</a></li>
+ </ul>
+ </li>
+
+ <!-- Documentation -->
+ <li class="dropdown">
+ <a class="dropdown-toggle" data-toggle="dropdown"
href="#">Documentation<span class="caret"></span></a>
+ <ul class="dropdown-menu">
+ <li><a
href="https://nightlies.apache.org/flink/flink-docs-release-1.14"
target="_blank">Flink 1.14 (Latest stable release) <small><span
class="glyphicon glyphicon-new-window"></span></small></a></li>
+ <li><a
href="https://nightlies.apache.org/flink/flink-docs-master"
target="_blank">Flink Master (Latest Snapshot) <small><span class="glyphicon
glyphicon-new-window"></span></small></a></li>
+ <li><a
href="https://nightlies.apache.org/flink/flink-statefun-docs-release-3.1"
target="_blank">Flink Stateful Functions 3.1 (Latest stable release)
<small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
+ <li><a
href="https://nightlies.apache.org/flink/flink-statefun-docs-master"
target="_blank">Flink Stateful Functions Master (Latest Snapshot) <small><span
class="glyphicon glyphicon-new-window"></span></small></a></li>
+ </ul>
+ </li>
+
+ <!-- getting help -->
+ <li><a href="/gettinghelp.html">Getting Help</a></li>
+
+ <!-- Blog -->
+ <li><a href="/blog/"><b>Flink Blog</b></a></li>
+
+
+ <!-- Flink-packages -->
+ <li>
+ <a href="https://flink-packages.org"
target="_blank">flink-packages.org <small><span class="glyphicon
glyphicon-new-window"></span></small></a>
+ </li>
+
+
+ <!-- Third menu section aim to support community and contributors
-->
+
+ <!-- Community -->
+ <li><a href="/community.html">Community & Project Info</a></li>
+
+ <!-- Roadmap -->
+ <li><a href="/roadmap.html">Roadmap</a></li>
+
+ <!-- Contribute -->
+ <li><a href="/contributing/how-to-contribute.html">How to
Contribute</a></li>
+
+
+ <!-- GitHub -->
+ <li>
+ <a href="https://github.com/apache/flink" target="_blank">Flink
on GitHub <small><span class="glyphicon
glyphicon-new-window"></span></small></a>
+ </li>
+
+
+
+ <!-- Language Switcher -->
+ <li>
+
+
+ <a href="/zh/2022/01/20/pravega-connector-101.html">中文版</a>
+
+
+ </li>
+
+ </ul>
+
+ <style>
+ .smalllinks:link {
+ display: inline-block !important; background: none; padding-top:
0px; padding-bottom: 0px; padding-right: 0px; min-width: 75px;
+ }
+ </style>
+
+ <ul class="nav navbar-nav navbar-bottom">
+ <hr />
+
+ <!-- Twitter -->
+ <li><a href="https://twitter.com/apacheflink"
target="_blank">@ApacheFlink <small><span class="glyphicon
glyphicon-new-window"></span></small></a></li>
+
+ <!-- Visualizer -->
+ <li class=" hidden-md hidden-sm"><a href="/visualizer/"
target="_blank">Plan Visualizer <small><span class="glyphicon
glyphicon-new-window"></span></small></a></li>
+
+ <li >
+ <a href="/security.html">Flink Security</a>
+ </li>
+
+ <hr />
+
+ <li><a href="https://apache.org" target="_blank">Apache Software
Foundation <small><span class="glyphicon
glyphicon-new-window"></span></small></a></li>
+
+ <li>
+
+ <a class="smalllinks" href="https://www.apache.org/licenses/"
target="_blank">License</a> <small><span class="glyphicon
glyphicon-new-window"></span></small>
+
+ <a class="smalllinks" href="https://www.apache.org/security/"
target="_blank">Security</a> <small><span class="glyphicon
glyphicon-new-window"></span></small>
+
+ <a class="smalllinks"
href="https://www.apache.org/foundation/sponsorship.html"
target="_blank">Donate</a> <small><span class="glyphicon
glyphicon-new-window"></span></small>
+
+ <a class="smalllinks"
href="https://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a>
<small><span class="glyphicon glyphicon-new-window"></span></small>
+ </li>
+
+ </ul>
+ </div><!-- /.navbar-collapse -->
+ </nav>
+
+ </div>
+ <div class="col-sm-9">
+ <div class="row-fluid">
+ <div class="col-sm-12">
+ <div class="row">
+ <h1>Pravega Flink Connector 101</h1>
+ <p><i></i></p>
+
+ <article>
+ <p>20 Jan 2022 Yumin Zhou (Brian) (<a
href="https://twitter.com/crazy__zhou">@crazy__zhou</a>)</p>
+
+<p><a href="https://cncf.pravega.io/">Pravega</a>, which is now a CNCF sandbox
project, is a cloud-native storage system based on abstractions for both batch
and streaming data consumption. Pravega streams (a new storage abstraction) are
durable, consistent, and elastic, while natively supporting long-term data
retention. In comparison, <a href="https://flink.apache.org/">Apache Flink</a>
is a popular real-time computing engine that provides unified batch and stream
processing. Flink pro [...]
+
+<p>That’s also the main reason why Pravega has chosen to use Flink as the
first integrated execution engine among the various distributed computing
engines on the market. With the help of Flink, users can use flexible APIs for
windowing, complex event processing (CEP), or table abstractions to process
streaming data easily and enrich the data being stored. Since its inception in
2016, Pravega has established communication with Flink PMC members and
developed the connector together.</p>
+
+<p>In 2017, the Pravega Flink connector module started to move out of the
Pravega main repository and has been maintained in a new separate <a
href="https://github.com/pravega/flink-connectors">repository</a> since then.
During years of development, many features have been implemented, including:</p>
+
+<ul>
+ <li>exactly-once processing guarantees for both Reader and Writer,
supporting end-to-end exactly-once processing pipelines</li>
+ <li>seamless integration with Flink’s checkpoints and savepoints</li>
+ <li>parallel Readers and Writers supporting high throughput and low latency
processing</li>
+ <li>support for Batch, Streaming, and Table API to access Pravega
Streams</li>
+</ul>
+
+<p>These key features make streaming pipeline applications easier to develop
without worrying about performance and correctness which are the common pain
points for many streaming use cases.</p>
+
+<p>In this blog post, we will discuss how to use this connector to read and
write Pravega streams with the Flink DataStream API.</p>
+
+<div class="page-toc">
+<ul id="markdown-toc">
+ <li><a href="#basic-usages" id="markdown-toc-basic-usages">Basic usages</a>
<ul>
+ <li><a href="#dependency"
id="markdown-toc-dependency">Dependency</a></li>
+ <li><a href="#api-introduction" id="markdown-toc-api-introduction">API
introduction</a> <ul>
+ <li><a href="#configurations"
id="markdown-toc-configurations">Configurations</a></li>
+ <li><a href="#serializationdeserialization"
id="markdown-toc-serializationdeserialization">Serialization/Deserialization</a></li>
+ <li><a href="#flinkpravegareader"
id="markdown-toc-flinkpravegareader"><code>FlinkPravegaReader</code></a></li>
+ <li><a href="#flinkpravegawriter"
id="markdown-toc-flinkpravegawriter"><code>FlinkPravegaWriter</code></a></li>
+ </ul>
+ </li>
+ </ul>
+ </li>
+ <li><a href="#internals-of-reader-and-writer"
id="markdown-toc-internals-of-reader-and-writer">Internals of reader and
writer</a> <ul>
+ <li><a href="#checkpoint-integration"
id="markdown-toc-checkpoint-integration">Checkpoint integration</a></li>
+ <li><a href="#end-to-end-exactly-once-semantics"
id="markdown-toc-end-to-end-exactly-once-semantics">End-to-end exactly-once
semantics</a></li>
+ </ul>
+ </li>
+ <li><a href="#summary" id="markdown-toc-summary">Summary</a></li>
+ <li><a href="#future-plans" id="markdown-toc-future-plans">Future
plans</a></li>
+</ul>
+
+</div>
+
+<h1 id="basic-usages">Basic usages</h1>
+
+<h2 id="dependency">Dependency</h2>
+<p>To use this connector in your application, add the dependency to your
project:</p>
+
+<div class="highlight"><pre><code class="language-xml"><span
class="nt"><dependency></span>
+ <span class="nt"><groupId></span>io.pravega<span
class="nt"></groupId></span>
+ <span
class="nt"><artifactId></span>pravega-connectors-flink-1.13_2.12<span
class="nt"></artifactId></span>
+ <span class="nt"><version></span>0.10.1<span
class="nt"></version></span>
+<span class="nt"></dependency></span></code></pre></div>
+
+<p>In the above example,</p>
+
+<p><code>1.13</code> is the Flink major version which is put in the middle of
the artifact name. The Pravega Flink connector maintains compatibility for the
<em>three</em> most recent major versions of Flink.</p>
+
+<p><code>0.10.1</code> is the version that aligns with the Pravega version.</p>
+
+<p>You can find the latest release with a support matrix on the <a
href="https://github.com/pravega/flink-connectors/releases">GitHub Releases
page</a>.</p>
+
+<h2 id="api-introduction">API introduction</h2>
+
+<h3 id="configurations">Configurations</h3>
+
+<p>The connector provides a common top-level object <code>PravegaConfig</code>
for Pravega connection configurations. The config object automatically
configures itself from <em>environment variables</em>, <em>system
properties</em> and <em>program arguments</em>.</p>
+
+<p>The basic controller URI and the default scope can be set like this:</p>
+
+<table>
+ <thead>
+ <tr>
+ <th>Setting</th>
+ <th>Environment Variable /<br />System Property /<br />Program
Argument</th>
+ <th>Default Value</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td>Controller URI</td>
+ <td><code>PRAVEGA_CONTROLLER_URI</code><br
/><code>pravega.controller.uri</code><br /><code>--controller</code></td>
+ <td><code>tcp://localhost:9090</code></td>
+ </tr>
+ <tr>
+ <td>Default Scope</td>
+ <td><code>PRAVEGA_SCOPE</code><br /><code>pravega.scope</code><br
/><code>--scope</code></td>
+ <td>-</td>
+ </tr>
+ </tbody>
+</table>
+
+<p>The recommended way to create an instance of <code>PravegaConfig</code> is
the following:</p>
+
+<div class="highlight"><pre><code class="language-java"><span class="c1">//
From default environment</span>
+<span class="n">PravegaConfig</span> <span class="n">config</span> <span
class="o">=</span> <span class="n">PravegaConfig</span><span
class="o">.</span><span class="na">fromDefaults</span><span class="o">();</span>
+
+<span class="c1">// From program arguments</span>
+<span class="n">ParameterTool</span> <span class="n">params</span> <span
class="o">=</span> <span class="n">ParameterTool</span><span
class="o">.</span><span class="na">fromArgs</span><span class="o">(</span><span
class="n">args</span><span class="o">);</span>
+<span class="n">PravegaConfig</span> <span class="n">config</span> <span
class="o">=</span> <span class="n">PravegaConfig</span><span
class="o">.</span><span class="na">fromParams</span><span
class="o">(</span><span class="n">params</span><span class="o">);</span>
+
+<span class="c1">// From user specification</span>
+<span class="n">PravegaConfig</span> <span class="n">config</span> <span
class="o">=</span> <span class="n">PravegaConfig</span><span
class="o">.</span><span class="na">fromDefaults</span><span class="o">()</span>
+ <span class="o">.</span><span class="na">withControllerURI</span><span
class="o">(</span><span class="s">"tcp://..."</span><span
class="o">)</span>
+ <span class="o">.</span><span class="na">withDefaultScope</span><span
class="o">(</span><span class="s">"SCOPE-NAME"</span><span
class="o">)</span>
+ <span class="o">.</span><span class="na">withCredentials</span><span
class="o">(</span><span class="n">credentials</span><span class="o">)</span>
+ <span class="o">.</span><span
class="na">withHostnameValidation</span><span class="o">(</span><span
class="kc">false</span><span class="o">);</span></code></pre></div>
+
+<h3 id="serializationdeserialization">Serialization/Deserialization</h3>
+
+<p>Pravega has defined <a
href="http://pravega.io/docs/latest/javadoc/clients/io/pravega/client/stream/Serializer.html"><code>io.pravega.client.stream.Serializer</code></a>
for the serialization/deserialization, while Flink has also defined standard
interfaces for the purpose.</p>
+
+<ul>
+ <li><a
href="https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/api/common/serialization/SerializationSchema.html"><code>org.apache.flink.api.common.serialization.SerializationSchema</code></a></li>
+ <li><a
href="https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/api/common/serialization/DeserializationSchema.html"><code>org.apache.flink.api.common.serialization.DeserializationSchema</code></a></li>
+</ul>
+
+<p>For interoperability with other pravega client applications, we have
built-in adapters <code>PravegaSerializationSchema</code> and
<code>PravegaDeserializationSchema</code> to support processing Pravega stream
data produced by a non-Flink application.</p>
+
+<p>Here is the adapter for Pravega Java serializer:</p>
+
+<div class="highlight"><pre><code class="language-java"><span
class="kn">import</span> <span
class="nn">io.pravega.client.stream.impl.JavaSerializer</span><span
class="o">;</span>
+<span class="o">...</span>
+<span class="n">DeserializationSchema</span><span class="o"><</span><span
class="n">MyEvent</span><span class="o">></span> <span
class="n">adapter</span> <span class="o">=</span> <span class="k">new</span>
<span class="n">PravegaDeserializationSchema</span><span
class="o"><>(</span>
+ <span class="n">MyEvent</span><span class="o">.</span><span
class="na">class</span><span class="o">,</span> <span class="k">new</span>
<span class="n">JavaSerializer</span><span class="o"><</span><span
class="n">MyEvent</span><span class="o">>());</span></code></pre></div>
+
+<h3 id="flinkpravegareader"><code>FlinkPravegaReader</code></h3>
+
+<p><code>FlinkPravegaReader</code> is a Flink <code>SourceFunction</code>
implementation which supports parallel reads from one or more Pravega streams.
Internally, it initiates a Pravega reader group and creates Pravega
<code>EventStreamReader</code> instances to read the data from the stream(s).
It provides a builder-style API to construct, and can allow streamcuts to mark
the start and end of the read.</p>
+
+<p>You can use it like this:</p>
+
+<div class="highlight"><pre><code class="language-java"><span
class="kd">final</span> <span class="n">StreamExecutionEnvironment</span> <span
class="n">env</span> <span class="o">=</span> <span
class="n">StreamExecutionEnvironment</span><span class="o">.</span><span
class="na">getExecutionEnvironment</span><span class="o">();</span>
+
+<span class="c1">// Enable Flink checkpoint to make state fault tolerant</span>
+<span class="n">env</span><span class="o">.</span><span
class="na">enableCheckpointing</span><span class="o">(</span><span
class="mi">60000</span><span class="o">);</span>
+
+<span class="c1">// Define the Pravega configuration</span>
+<span class="n">ParameterTool</span> <span class="n">params</span> <span
class="o">=</span> <span class="n">ParameterTool</span><span
class="o">.</span><span class="na">fromArgs</span><span class="o">(</span><span
class="n">args</span><span class="o">);</span>
+<span class="n">PravegaConfig</span> <span class="n">config</span> <span
class="o">=</span> <span class="n">PravegaConfig</span><span
class="o">.</span><span class="na">fromParams</span><span
class="o">(</span><span class="n">params</span><span class="o">);</span>
+
+<span class="c1">// Define the event deserializer</span>
+<span class="n">DeserializationSchema</span><span class="o"><</span><span
class="n">MyClass</span><span class="o">></span> <span
class="n">deserializer</span> <span class="o">=</span> <span
class="o">...</span>
+
+<span class="c1">// Define the data stream</span>
+<span class="n">FlinkPravegaReader</span><span class="o"><</span><span
class="n">MyClass</span><span class="o">></span> <span
class="n">pravegaSource</span> <span class="o">=</span> <span
class="n">FlinkPravegaReader</span><span class="o">.<</span><span
class="n">MyClass</span><span class="o">></span><span
class="n">builder</span><span class="o">()</span>
+ <span class="o">.</span><span class="na">forStream</span><span
class="o">(...)</span>
+ <span class="o">.</span><span class="na">withPravegaConfig</span><span
class="o">(</span><span class="n">config</span><span class="o">)</span>
+ <span class="o">.</span><span
class="na">withDeserializationSchema</span><span class="o">(</span><span
class="n">deserializer</span><span class="o">)</span>
+ <span class="o">.</span><span class="na">build</span><span
class="o">();</span>
+<span class="n">DataStream</span><span class="o"><</span><span
class="n">MyClass</span><span class="o">></span> <span
class="n">stream</span> <span class="o">=</span> <span
class="n">env</span><span class="o">.</span><span
class="na">addSource</span><span class="o">(</span><span
class="n">pravegaSource</span><span class="o">)</span>
+ <span class="o">.</span><span class="na">setParallelism</span><span
class="o">(</span><span class="mi">4</span><span class="o">)</span>
+ <span class="o">.</span><span class="na">uid</span><span
class="o">(</span><span class="s">"pravega-source"</span><span
class="o">);</span></code></pre></div>
+
+<h3 id="flinkpravegawriter"><code>FlinkPravegaWriter</code></h3>
+
+<p><code>FlinkPravegaWriter</code> is a Flink <code>SinkFunction</code>
implementation which supports parallel writes to Pravega streams.</p>
+
+<p>It supports three writer modes that relate to guarantees about the
persistence of events emitted by the sink to a Pravega Stream:</p>
+
+<ol>
+ <li><strong>Best-effort</strong> - Any write failures will be ignored and
there could be data loss.</li>
+ <li><strong>At-least-once</strong>(default) - All events are persisted in
Pravega. Duplicate events are possible, due to retries or in case of failure
and subsequent recovery.</li>
+ <li><strong>Exactly-once</strong> - All events are persisted in Pravega
using a transactional approach integrated with the Flink checkpointing
feature.</li>
+</ol>
+
+<p>Internally, it will initiate several Pravega <code>EventStreamWriter</code>
or <code>TransactionalEventStreamWriter</code> (depends on the writer mode)
instances to write data to the stream. It provides a builder-style API to
construct.</p>
+
+<p>A basic usage looks like this:</p>
+
+<div class="highlight"><pre><code class="language-java"><span
class="n">StreamExecutionEnvironment</span> <span class="n">env</span> <span
class="o">=</span> <span class="n">StreamExecutionEnvironment</span><span
class="o">.</span><span class="na">getExecutionEnvironment</span><span
class="o">();</span>
+
+<span class="c1">// Define the Pravega configuration</span>
+<span class="n">PravegaConfig</span> <span class="n">config</span> <span
class="o">=</span> <span class="n">PravegaConfig</span><span
class="o">.</span><span class="na">fromParams</span><span
class="o">(</span><span class="n">params</span><span class="o">);</span>
+
+<span class="c1">// Define the event serializer</span>
+<span class="n">SerializationSchema</span><span class="o"><</span><span
class="n">MyClass</span><span class="o">></span> <span
class="n">serializer</span> <span class="o">=</span> <span class="o">...</span>
+
+<span class="c1">// Define the event router for selecting the Routing
Key</span>
+<span class="n">PravegaEventRouter</span><span class="o"><</span><span
class="n">MyClass</span><span class="o">></span> <span
class="n">router</span> <span class="o">=</span> <span class="o">...</span>
+
+<span class="c1">// Define the sink function</span>
+<span class="n">FlinkPravegaWriter</span><span class="o"><</span><span
class="n">MyClass</span><span class="o">></span> <span
class="n">pravegaSink</span> <span class="o">=</span> <span
class="n">FlinkPravegaWriter</span><span class="o">.<</span><span
class="n">MyClass</span><span class="o">></span><span
class="n">builder</span><span class="o">()</span>
+ <span class="o">.</span><span class="na">forStream</span><span
class="o">(...)</span>
+ <span class="o">.</span><span class="na">withPravegaConfig</span><span
class="o">(</span><span class="n">config</span><span class="o">)</span>
+ <span class="o">.</span><span
class="na">withSerializationSchema</span><span class="o">(</span><span
class="n">serializer</span><span class="o">)</span>
+ <span class="o">.</span><span class="na">withEventRouter</span><span
class="o">(</span><span class="n">router</span><span class="o">)</span>
+ <span class="o">.</span><span class="na">withWriterMode</span><span
class="o">(</span><span class="n">EXACTLY_ONCE</span><span class="o">)</span>
+ <span class="o">.</span><span class="na">build</span><span
class="o">();</span>
+
+<span class="n">DataStream</span><span class="o"><</span><span
class="n">MyClass</span><span class="o">></span> <span
class="n">stream</span> <span class="o">=</span> <span class="o">...</span>
+<span class="n">stream</span><span class="o">.</span><span
class="na">addSink</span><span class="o">(</span><span
class="n">pravegaSink</span><span class="o">)</span>
+ <span class="o">.</span><span class="na">setParallelism</span><span
class="o">(</span><span class="mi">4</span><span class="o">)</span>
+ <span class="o">.</span><span class="na">uid</span><span
class="o">(</span><span class="s">"pravega-sink"</span><span
class="o">);</span></code></pre></div>
+
+<p>You can see some more examples <a
href="https://github.com/pravega/pravega-samples">here</a>.</p>
+
+<h1 id="internals-of-reader-and-writer">Internals of reader and writer</h1>
+
+<h2 id="checkpoint-integration">Checkpoint integration</h2>
+
+<p>Flink has periodic checkpoints based on the Chandy-Lamport algorithm to
make state in Flink fault-tolerant. By allowing state and the corresponding
stream positions to be recovered, the application is given the same semantics
as a failure-free execution.</p>
+
+<p>Pravega also has its own Checkpoint concept which is to create a consistent
“point in time” persistence of the state of each Reader in the Reader Group, by
using a specialized Event (<em>Checkpoint Event</em>) to signal each Reader to
preserve its state. Once a Checkpoint has been completed, the application can
use the Checkpoint to reset all the Readers in the Reader Group to the known
consistent state represented by the Checkpoint.</p>
+
+<p>This means that our end-to-end recovery story is not like other messaging
systems such as Kafka, which uses a more coupled method and persists its offset
in the Flink task state and lets Flink do the coordination. Flink delegates the
Pravega source recovery completely to the Pravega server and uses only a
lightweight hook to connect. We collaborated with the Flink community and added
a new interface <code>ExternallyInducedSource</code> (<a
href="https://issues.apache.org/jira/browse/F [...]
+
+<p>The checkpoint mechanism works as a two-step process:</p>
+
+<ul>
+ <li>
+ <p>The <a
href="https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/runtime/checkpoint/MasterTriggerRestoreHook.html">master
hook</a> handler from the JobManager initiates the <a
href="https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/runtime/checkpoint/MasterTriggerRestoreHook.html#triggerCheckpoint-long-long-java.util.concurrent.Executor-"><code>triggerCheckpoint</code></a>
request to the <code>ReaderCheckpointHook</code> [...]
+ </li>
+ <li>
+ <p>A <code>Checkpoint</code> event will be sent by Pravega as part of the
data stream flow and, upon receiving the event, the
<code>FlinkPravegaReader</code> will initiate a <a
href="https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/checkpoint/ExternallyInducedSource.java#L73"><code>triggerCheckpoint</code></a>
request to effectively let Flink continue and complete the checkpoint
process.</p>
+ </li>
+</ul>
+
+<h2 id="end-to-end-exactly-once-semantics">End-to-end exactly-once
semantics</h2>
+
+<p>In the early years of big data processing, results from real-time stream
processing were always considered inaccurate/approximate/speculative. However,
this correctness is extremely important for some use cases and in some
industries such as finance.</p>
+
+<p>This constraint stems mainly from two issues:</p>
+
+<ul>
+ <li>unordered data source in event time</li>
+ <li>end-to-end exactly-once semantics guarantee</li>
+</ul>
+
+<p>During recent years of development, watermarking has been introduced as a
tradeoff between correctness and latency, which is now considered a good
solution for unordered data sources in event time.</p>
+
+<p>The guarantee of end-to-end exactly-once semantics is more tricky. When we
say “exactly-once semantics”, what we mean is that each incoming event affects
the final results exactly once. Even in the event of a machine or software
failure, there is no duplicate data and no data that goes unprocessed. This is
quite difficult because of the demands of message acknowledgment and recovery
during such fast processing and is also why some early distributed streaming
engines like Storm(without [...]
+
+<p>Flink is one of the first streaming systems that was able to provide
exactly-once semantics due to its delicate <a
href="https://www.ververica.com/blog/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink">checkpoint
mechanism</a>. But to make it work end-to-end, the final stage needs to apply
the semantic to external message system sinks that support commits and
rollbacks.</p>
+
+<p>To work around this problem, Pravega introduced <a
href="https://cncf.pravega.io/docs/latest/transactions/">transactional
writes</a>. A Pravega transaction allows an application to prepare a set of
events that can be written “all at once” to a Stream. This allows an
application to “commit” a bunch of events atomically. When writes are
idempotent, it is possible to implement end-to-end exactly-once pipelines
together with Flink.</p>
+
+<p>To build such an end-to-end solution requires coordination between Flink
and the Pravega sink, which is still challenging. A common approach for
coordinating commits and rollbacks in a distributed system is the two-phase
commit protocol. We used this protocol and, together with the Flink community,
implemented the sink function in a two-phase commit way coordinated with Flink
checkpoints.</p>
+
+<p>The Flink community then extracted the common logic from the two-phase
commit protocol and provided a general interface
<code>TwoPhaseCommitSinkFunction</code> (<a
href="https://issues.apache.org/jira/browse/FLINK-7210">FLINK-7210</a>) to make
it possible to build end-to-end exactly-once applications with other message
systems that have transaction support. This includes Apache Kafka versions 0.11
and above. There is an official Flink <a
href="https://flink.apache.org/features/2018/03 [...]
+
+<h1 id="summary">Summary</h1>
+<p>The Pravega Flink connector enables Pravega to connect to Flink and allows
Pravega to act as a key data store in a streaming pipeline. Both projects share
a common design philosophy and can integrate well with each other. Pravega has
its own concept of checkpointing and has implemented transactional writes to
support end-to-end exactly-once guarantees.</p>
+
+<h1 id="future-plans">Future plans</h1>
+
+<p><code>FlinkPravegaInputFormat</code> and
<code>FlinkPravegaOutputFormat</code> are now provided to support batch reads
and writes in Flink, but these are under the legacy DataSet API. Since Flink is
now making efforts to unify batch and streaming, it is improving its APIs and
providing new interfaces for the <a
href="https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface">source</a>
and <a href="https://cwiki.apache.org/confluence/display/FLINK/FLIP-143 [...]
+
+<p>We will also put more effort into SQL / Table API support in order to
provide a better user experience since it is simpler to understand and even
more powerful to use in some cases.</p>
+
+<p><strong>Note:</strong> the original blog post can be found <a
href="https://cncf.pravega.io/blog/2021/11/01/pravega-flink-connector-101/">here</a>.</p>
+
+ </article>
+ </div>
+
+ <div class="row">
+ <div id="disqus_thread"></div>
+ <script type="text/javascript">
+ /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE
* * */
+ var disqus_shortname = 'stratosphere-eu'; // required: replace example
with your forum shortname
+
+ /* * * DON'T EDIT BELOW THIS LINE * * */
+ (function() {
+ var dsq = document.createElement('script'); dsq.type =
'text/javascript'; dsq.async = true;
+ dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
+ (document.getElementsByTagName('head')[0] ||
document.getElementsByTagName('body')[0]).appendChild(dsq);
+ })();
+ </script>
+ </div>
+ </div>
+</div>
+ </div>
+ </div>
+
+ <hr />
+
+ <div class="row">
+ <div class="footer text-center col-sm-12">
+ <p>Copyright © 2014-2021 <a href="http://apache.org">The Apache
Software Foundation</a>. All Rights Reserved.</p>
+ <p>Apache Flink, Flink®, Apache®, the squirrel logo, and the Apache
feather logo are either registered trademarks or trademarks of The Apache
Software Foundation.</p>
+ <p><a href="/privacy-policy.html">Privacy Policy</a> · <a
href="/blog/feed.xml">RSS feed</a></p>
+ </div>
+ </div>
+ </div><!-- /.container -->
+
+ <!-- Include all compiled plugins (below), or include individual files as
needed -->
+ <script src="/js/jquery.matchHeight-min.js"></script>
+ <script src="/js/bootstrap.min.js"></script>
+ <script src="/js/codetabs.js"></script>
+ <script src="/js/stickysidebar.js"></script>
+
+ <!-- Google Analytics -->
+ <script>
+
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+ (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new
Date();a=s.createElement(o),
+
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
+
+ ga('create', 'UA-52545728-1', 'auto');
+ ga('send', 'pageview');
+ </script>
+ </body>
+</html>
diff --git a/content/blog/feed.xml b/content/blog/feed.xml
index 24fd2a2..de45911 100644
--- a/content/blog/feed.xml
+++ b/content/blog/feed.xml
@@ -7,6 +7,263 @@
<atom:link href="https://flink.apache.org/blog/feed.xml" rel="self"
type="application/rss+xml" />
<item>
+<title>Pravega Flink Connector 101</title>
+<description><p><a
href="https://cncf.pravega.io/">Pravega</a>, which is now a
CNCF sandbox project, is a cloud-native storage system based on abstractions
for both batch and streaming data consumption. Pravega streams (a new storage
abstraction) are durable, consistent, and elastic, while natively supporting
long-term data retention. In comparison, <a
href="https://flink.apache.org/">Apache Flink</a> is a
popular real-time computing engi [...]
+
+<p>That’s also the main reason why Pravega has chosen to use Flink as
the first integrated execution engine among the various distributed computing
engines on the market. With the help of Flink, users can use flexible APIs for
windowing, complex event processing (CEP), or table abstractions to process
streaming data easily and enrich the data being stored. Since its inception in
2016, Pravega has established communication with Flink PMC members and
developed the connector together. [...]
+
+<p>In 2017, the Pravega Flink connector module started to move out of
the Pravega main repository and has been maintained in a new separate <a
href="https://github.com/pravega/flink-connectors">repository</a>
since then. During years of development, many features have been implemented,
including:</p>
+
+<ul>
+ <li>exactly-once processing guarantees for both Reader and Writer,
supporting end-to-end exactly-once processing pipelines</li>
+ <li>seamless integration with Flink’s checkpoints and
savepoints</li>
+ <li>parallel Readers and Writers supporting high throughput and low
latency processing</li>
+ <li>support for Batch, Streaming, and Table API to access Pravega
Streams</li>
+</ul>
+
+<p>These key features make streaming pipeline applications easier to
develop without worrying about performance and correctness which are the common
pain points for many streaming use cases.</p>
+
+<p>In this blog post, we will discuss how to use this connector to read
and write Pravega streams with the Flink DataStream API.</p>
+
+<div class="page-toc">
+<ul id="markdown-toc">
+ <li><a href="#basic-usages"
id="markdown-toc-basic-usages">Basic usages</a> <ul>
+ <li><a href="#dependency"
id="markdown-toc-dependency">Dependency</a></li>
+ <li><a href="#api-introduction"
id="markdown-toc-api-introduction">API introduction</a>
<ul>
+ <li><a href="#configurations"
id="markdown-toc-configurations">Configurations</a></li>
+ <li><a href="#serializationdeserialization"
id="markdown-toc-serializationdeserialization">Serialization/Deserialization</a></li>
+ <li><a href="#flinkpravegareader"
id="markdown-toc-flinkpravegareader"><code>FlinkPravegaReader</code></a></li>
+ <li><a href="#flinkpravegawriter"
id="markdown-toc-flinkpravegawriter"><code>FlinkPravegaWriter</code></a></li>
+ </ul>
+ </li>
+ </ul>
+ </li>
+ <li><a href="#internals-of-reader-and-writer"
id="markdown-toc-internals-of-reader-and-writer">Internals of
reader and writer</a> <ul>
+ <li><a href="#checkpoint-integration"
id="markdown-toc-checkpoint-integration">Checkpoint
integration</a></li>
+ <li><a href="#end-to-end-exactly-once-semantics"
id="markdown-toc-end-to-end-exactly-once-semantics">End-to-end
exactly-once semantics</a></li>
+ </ul>
+ </li>
+ <li><a href="#summary"
id="markdown-toc-summary">Summary</a></li>
+ <li><a href="#future-plans"
id="markdown-toc-future-plans">Future plans</a></li>
+</ul>
+
+</div>
+
+<h1 id="basic-usages">Basic usages</h1>
+
+<h2 id="dependency">Dependency</h2>
+<p>To use this connector in your application, add the dependency to your
project:</p>
+
+<div class="highlight"><pre><code
class="language-xml"><span
class="nt">&lt;dependency&gt;</span>
+ <span
class="nt">&lt;groupId&gt;</span>io.pravega<span
class="nt">&lt;/groupId&gt;</span>
+ <span
class="nt">&lt;artifactId&gt;</span>pravega-connectors-flink-1.13_2.12<span
class="nt">&lt;/artifactId&gt;</span>
+ <span
class="nt">&lt;version&gt;</span>0.10.1<span
class="nt">&lt;/version&gt;</span>
+<span
class="nt">&lt;/dependency&gt;</span></code></pre></div>
+
+<p>In the above example,</p>
+
+<p><code>1.13</code> is the Flink major version which is put
in the middle of the artifact name. The Pravega Flink connector maintains
compatibility for the <em>three</em> most recent major versions of
Flink.</p>
+
+<p><code>0.10.1</code> is the version that aligns with the
Pravega version.</p>
+
+<p>You can find the latest release with a support matrix on the <a
href="https://github.com/pravega/flink-connectors/releases">GitHub
Releases page</a>.</p>
+
+<h2 id="api-introduction">API introduction</h2>
+
+<h3 id="configurations">Configurations</h3>
+
+<p>The connector provides a common top-level object
<code>PravegaConfig</code> for Pravega connection configurations.
The config object automatically configures itself from <em>environment
variables</em>, <em>system properties</em> and
<em>program arguments</em>.</p>
+
+<p>The basic controller URI and the default scope can be set like
this:</p>
+
+<table>
+ <thead>
+ <tr>
+ <th>Setting</th>
+ <th>Environment Variable /<br />System Property /<br
/>Program Argument</th>
+ <th>Default Value</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td>Controller URI</td>
+ <td><code>PRAVEGA_CONTROLLER_URI</code><br
/><code>pravega.controller.uri</code><br
/><code>--controller</code></td>
+ <td><code>tcp://localhost:9090</code></td>
+ </tr>
+ <tr>
+ <td>Default Scope</td>
+ <td><code>PRAVEGA_SCOPE</code><br
/><code>pravega.scope</code><br
/><code>--scope</code></td>
+ <td>-</td>
+ </tr>
+ </tbody>
+</table>
+
+<p>The recommended way to create an instance of
<code>PravegaConfig</code> is the following:</p>
+
+<div class="highlight"><pre><code
class="language-java"><span class="c1">// From
default environment</span>
+<span class="n">PravegaConfig</span> <span
class="n">config</span> <span
class="o">=</span> <span
class="n">PravegaConfig</span><span
class="o">.</span><span
class="na">fromDefaults</span><span
class="o">();</span>
+
+<span class="c1">// From program arguments</span>
+<span class="n">ParameterTool</span> <span
class="n">params</span> <span
class="o">=</span> <span
class="n">ParameterTool</span><span
class="o">.</span><span
class="na">fromArgs</span><span
class="o">(</span><span
class="n">args</span><span
class="o">);</span>
+<span class="n">PravegaConfig</span> <span
class="n">config</span> <span
class="o">=</span> <span
class="n">PravegaConfig</span><span
class="o">.</span><span
class="na">fromParams</span><span
class="o">(</span><span
class="n">params</span><span
class="o">);</span>
+
+<span class="c1">// From user specification</span>
+<span class="n">PravegaConfig</span> <span
class="n">config</span> <span
class="o">=</span> <span
class="n">PravegaConfig</span><span
class="o">.</span><span
class="na">fromDefaults</span><span
class="o">()</span>
+ <span class="o">.</span><span
class="na">withControllerURI</span><span
class="o">(</span><span
class="s">&quot;tcp://...&quot;</span><span
class="o">)</span>
+ <span class="o">.</span><span
class="na">withDefaultScope</span><span
class="o">(</span><span
class="s">&quot;SCOPE-NAME&quot;</span><span
class="o">)</span>
+ <span class="o">.</span><span
class="na">withCredentials</span><span
class="o">(</span><span
class="n">credentials</span><span
class="o">)</span>
+ <span class="o">.</span><span
class="na">withHostnameValidation</span><span
class="o">(</span><span
class="kc">false</span><span
class="o">);</span></code></pre></div>
+
+<h3
id="serializationdeserialization">Serialization/Deserialization</h3>
+
+<p>Pravega has defined <a
href="http://pravega.io/docs/latest/javadoc/clients/io/pravega/client/stream/Serializer.html"><code>io.pravega.client.stream.Serializer</code></a>
for the serialization/deserialization, while Flink has also defined standard
interfaces for the purpose.</p>
+
+<ul>
+ <li><a
href="https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/api/common/serialization/SerializationSchema.html"><code>org.apache.flink.api.common.serialization.SerializationSchema</code></a></li>
+ <li><a
href="https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/api/common/serialization/DeserializationSchema.html"><code>org.apache.flink.api.common.serialization.DeserializationSchema</code></a></li>
+</ul>
+
+<p>For interoperability with other pravega client applications, we have
built-in adapters <code>PravegaSerializationSchema</code> and
<code>PravegaDeserializationSchema</code> to support processing
Pravega stream data produced by a non-Flink application.</p>
+
+<p>Here is the adapter for Pravega Java serializer:</p>
+
+<div class="highlight"><pre><code
class="language-java"><span
class="kn">import</span> <span
class="nn">io.pravega.client.stream.impl.JavaSerializer</span><span
class="o">;</span>
+<span class="o">...</span>
+<span class="n">DeserializationSchema</span><span
class="o">&lt;</span><span
class="n">MyEvent</span><span
class="o">&gt;</span> <span
class="n">adapter</span> <span
class="o">=</span> <span
class="k">new</span> <span
class="n">PravegaDeserializationSchema</span><span
class="o">&lt;& [...]
+ <span class="n">MyEvent</span><span
class="o">.</span><span
class="na">class</span><span
class="o">,</span> <span
class="k">new</span> <span
class="n">JavaSerializer</span><span
class="o">&lt;</span><span
class="n">MyEvent</span><span
class="o">&gt;());</span></code></pre& [...]
+
+<h3
id="flinkpravegareader"><code>FlinkPravegaReader</code></h3>
+
+<p><code>FlinkPravegaReader</code> is a Flink
<code>SourceFunction</code> implementation which supports parallel
reads from one or more Pravega streams. Internally, it initiates a Pravega
reader group and creates Pravega <code>EventStreamReader</code>
instances to read the data from the stream(s). It provides a builder-style API
to construct, and can allow streamcuts to mark the start and end of the
read.</p>
+
+<p>You can use it like this:</p>
+
+<div class="highlight"><pre><code
class="language-java"><span
class="kd">final</span> <span
class="n">StreamExecutionEnvironment</span> <span
class="n">env</span> <span
class="o">=</span> <span
class="n">StreamExecutionEnvironment</span><span
class="o">.</span><span
class="na">getExecutionEnvironment</ [...]
+
+<span class="c1">// Enable Flink checkpoint to make state
fault tolerant</span>
+<span class="n">env</span><span
class="o">.</span><span
class="na">enableCheckpointing</span><span
class="o">(</span><span
class="mi">60000</span><span
class="o">);</span>
+
+<span class="c1">// Define the Pravega
configuration</span>
+<span class="n">ParameterTool</span> <span
class="n">params</span> <span
class="o">=</span> <span
class="n">ParameterTool</span><span
class="o">.</span><span
class="na">fromArgs</span><span
class="o">(</span><span
class="n">args</span><span
class="o">);</span>
+<span class="n">PravegaConfig</span> <span
class="n">config</span> <span
class="o">=</span> <span
class="n">PravegaConfig</span><span
class="o">.</span><span
class="na">fromParams</span><span
class="o">(</span><span
class="n">params</span><span
class="o">);</span>
+
+<span class="c1">// Define the event deserializer</span>
+<span class="n">DeserializationSchema</span><span
class="o">&lt;</span><span
class="n">MyClass</span><span
class="o">&gt;</span> <span
class="n">deserializer</span> <span
class="o">=</span> <span
class="o">...</span>
+
+<span class="c1">// Define the data stream</span>
+<span class="n">FlinkPravegaReader</span><span
class="o">&lt;</span><span
class="n">MyClass</span><span
class="o">&gt;</span> <span
class="n">pravegaSource</span> <span
class="o">=</span> <span
class="n">FlinkPravegaReader</span><span
class="o">.&lt;</span><span
class="n">MyClass</spa [...]
+ <span class="o">.</span><span
class="na">forStream</span><span
class="o">(...)</span>
+ <span class="o">.</span><span
class="na">withPravegaConfig</span><span
class="o">(</span><span
class="n">config</span><span
class="o">)</span>
+ <span class="o">.</span><span
class="na">withDeserializationSchema</span><span
class="o">(</span><span
class="n">deserializer</span><span
class="o">)</span>
+ <span class="o">.</span><span
class="na">build</span><span
class="o">();</span>
+<span class="n">DataStream</span><span
class="o">&lt;</span><span
class="n">MyClass</span><span
class="o">&gt;</span> <span
class="n">stream</span> <span
class="o">=</span> <span
class="n">env</span><span
class="o">.</span><span
class="na">addSource</span><span class="o"&g
[...]
+ <span class="o">.</span><span
class="na">setParallelism</span><span
class="o">(</span><span
class="mi">4</span><span
class="o">)</span>
+ <span class="o">.</span><span
class="na">uid</span><span
class="o">(</span><span
class="s">&quot;pravega-source&quot;</span><span
class="o">);</span></code></pre></div>
+
+<h3
id="flinkpravegawriter"><code>FlinkPravegaWriter</code></h3>
+
+<p><code>FlinkPravegaWriter</code> is a Flink
<code>SinkFunction</code> implementation which supports parallel
writes to Pravega streams.</p>
+
+<p>It supports three writer modes that relate to guarantees about the
persistence of events emitted by the sink to a Pravega Stream:</p>
+
+<ol>
+ <li><strong>Best-effort</strong> - Any write failures will
be ignored and there could be data loss.</li>
+ <li><strong>At-least-once</strong>(default) - All events
are persisted in Pravega. Duplicate events are possible, due to retries or in
case of failure and subsequent recovery.</li>
+ <li><strong>Exactly-once</strong> - All events are
persisted in Pravega using a transactional approach integrated with the Flink
checkpointing feature.</li>
+</ol>
+
+<p>Internally, it will initiate several Pravega
<code>EventStreamWriter</code> or
<code>TransactionalEventStreamWriter</code> (depends on the writer
mode) instances to write data to the stream. It provides a builder-style API to
construct.</p>
+
+<p>A basic usage looks like this:</p>
+
+<div class="highlight"><pre><code
class="language-java"><span
class="n">StreamExecutionEnvironment</span> <span
class="n">env</span> <span
class="o">=</span> <span
class="n">StreamExecutionEnvironment</span><span
class="o">.</span><span
class="na">getExecutionEnvironment</span><span
class="o">();</span>
+
+<span class="c1">// Define the Pravega
configuration</span>
+<span class="n">PravegaConfig</span> <span
class="n">config</span> <span
class="o">=</span> <span
class="n">PravegaConfig</span><span
class="o">.</span><span
class="na">fromParams</span><span
class="o">(</span><span
class="n">params</span><span
class="o">);</span>
+
+<span class="c1">// Define the event serializer</span>
+<span class="n">SerializationSchema</span><span
class="o">&lt;</span><span
class="n">MyClass</span><span
class="o">&gt;</span> <span
class="n">serializer</span> <span
class="o">=</span> <span
class="o">...</span>
+
+<span class="c1">// Define the event router for selecting the
Routing Key</span>
+<span class="n">PravegaEventRouter</span><span
class="o">&lt;</span><span
class="n">MyClass</span><span
class="o">&gt;</span> <span
class="n">router</span> <span
class="o">=</span> <span
class="o">...</span>
+
+<span class="c1">// Define the sink function</span>
+<span class="n">FlinkPravegaWriter</span><span
class="o">&lt;</span><span
class="n">MyClass</span><span
class="o">&gt;</span> <span
class="n">pravegaSink</span> <span
class="o">=</span> <span
class="n">FlinkPravegaWriter</span><span
class="o">.&lt;</span><span
class="n">MyClass</span& [...]
+ <span class="o">.</span><span
class="na">forStream</span><span
class="o">(...)</span>
+ <span class="o">.</span><span
class="na">withPravegaConfig</span><span
class="o">(</span><span
class="n">config</span><span
class="o">)</span>
+ <span class="o">.</span><span
class="na">withSerializationSchema</span><span
class="o">(</span><span
class="n">serializer</span><span
class="o">)</span>
+ <span class="o">.</span><span
class="na">withEventRouter</span><span
class="o">(</span><span
class="n">router</span><span
class="o">)</span>
+ <span class="o">.</span><span
class="na">withWriterMode</span><span
class="o">(</span><span
class="n">EXACTLY_ONCE</span><span
class="o">)</span>
+ <span class="o">.</span><span
class="na">build</span><span
class="o">();</span>
+
+<span class="n">DataStream</span><span
class="o">&lt;</span><span
class="n">MyClass</span><span
class="o">&gt;</span> <span
class="n">stream</span> <span
class="o">=</span> <span
class="o">...</span>
+<span class="n">stream</span><span
class="o">.</span><span
class="na">addSink</span><span
class="o">(</span><span
class="n">pravegaSink</span><span
class="o">)</span>
+ <span class="o">.</span><span
class="na">setParallelism</span><span
class="o">(</span><span
class="mi">4</span><span
class="o">)</span>
+ <span class="o">.</span><span
class="na">uid</span><span
class="o">(</span><span
class="s">&quot;pravega-sink&quot;</span><span
class="o">);</span></code></pre></div>
+
+<p>You can see some more examples <a
href="https://github.com/pravega/pravega-samples">here</a>.</p>
+
+<h1 id="internals-of-reader-and-writer">Internals of reader
and writer</h1>
+
+<h2 id="checkpoint-integration">Checkpoint
integration</h2>
+
+<p>Flink has periodic checkpoints based on the Chandy-Lamport algorithm
to make state in Flink fault-tolerant. By allowing state and the corresponding
stream positions to be recovered, the application is given the same semantics
as a failure-free execution.</p>
+
+<p>Pravega also has its own Checkpoint concept which is to create a
consistent “point in time” persistence of the state of each Reader in the
Reader Group, by using a specialized Event (<em>Checkpoint
Event</em>) to signal each Reader to preserve its state. Once a
Checkpoint has been completed, the application can use the Checkpoint to reset
all the Readers in the Reader Group to the known consistent state represented
by the Checkpoint.</p>
+
+<p>This means that our end-to-end recovery story is not like other
messaging systems such as Kafka, which uses a more coupled method and persists
its offset in the Flink task state and lets Flink do the coordination. Flink
delegates the Pravega source recovery completely to the Pravega server and uses
only a lightweight hook to connect. We collaborated with the Flink community
and added a new interface <code>ExternallyInducedSource</code>
(<a href="https://issue [...]
+
+<p>The checkpoint mechanism works as a two-step process:</p>
+
+<ul>
+ <li>
+ <p>The <a
href="https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/runtime/checkpoint/MasterTriggerRestoreHook.html">master
hook</a> handler from the JobManager initiates the <a
href="https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/runtime/checkpoint/MasterTriggerRestoreHook.html#triggerCheckpoint-long-long-java.util.concurrent.Executor-"><code>triggerCheckpoint</code&g
[...]
+ </li>
+ <li>
+ <p>A <code>Checkpoint</code> event will be sent by
Pravega as part of the data stream flow and, upon receiving the event, the
<code>FlinkPravegaReader</code> will initiate a <a
href="https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/checkpoint/ExternallyInducedSource.java#L73"><code>triggerCheckpoint</code></a>
request to effectively let Flink continue and compl [...]
+ </li>
+</ul>
+
+<h2 id="end-to-end-exactly-once-semantics">End-to-end
exactly-once semantics</h2>
+
+<p>In the early years of big data processing, results from real-time
stream processing were always considered inaccurate/approximate/speculative.
However, this correctness is extremely important for some use cases and in some
industries such as finance.</p>
+
+<p>This constraint stems mainly from two issues:</p>
+
+<ul>
+ <li>unordered data source in event time</li>
+ <li>end-to-end exactly-once semantics guarantee</li>
+</ul>
+
+<p>During recent years of development, watermarking has been introduced
as a tradeoff between correctness and latency, which is now considered a good
solution for unordered data sources in event time.</p>
+
+<p>The guarantee of end-to-end exactly-once semantics is more tricky.
When we say “exactly-once semantics”, what we mean is that each incoming event
affects the final results exactly once. Even in the event of a machine or
software failure, there is no duplicate data and no data that goes unprocessed.
This is quite difficult because of the demands of message acknowledgment and
recovery during such fast processing and is also why some early distributed
streaming engines like Storm(w [...]
+
+<p>Flink is one of the first streaming systems that was able to provide
exactly-once semantics due to its delicate <a
href="https://www.ververica.com/blog/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink">checkpoint
mechanism</a>. But to make it work end-to-end, the final stage needs to
apply the semantic to external message system sinks that support commits and
rollbacks.</p>
+
+<p>To work around this problem, Pravega introduced <a
href="https://cncf.pravega.io/docs/latest/transactions/">transactional
writes</a>. A Pravega transaction allows an application to prepare a set
of events that can be written “all at once” to a Stream. This allows an
application to “commit” a bunch of events atomically. When writes are
idempotent, it is possible to implement end-to-end exactly-once pipelines
together with Flink.</p>
+
+<p>To build such an end-to-end solution requires coordination between
Flink and the Pravega sink, which is still challenging. A common approach for
coordinating commits and rollbacks in a distributed system is the two-phase
commit protocol. We used this protocol and, together with the Flink community,
implemented the sink function in a two-phase commit way coordinated with Flink
checkpoints.</p>
+
+<p>The Flink community then extracted the common logic from the
two-phase commit protocol and provided a general interface
<code>TwoPhaseCommitSinkFunction</code> (<a
href="https://issues.apache.org/jira/browse/FLINK-7210">FLINK-7210</a>)
to make it possible to build end-to-end exactly-once applications with other
message systems that have transaction support. This includes Apache Kafka
versions 0.11 and above. There is an official Flink <a href [...]
+
+<h1 id="summary">Summary</h1>
+<p>The Pravega Flink connector enables Pravega to connect to Flink and
allows Pravega to act as a key data store in a streaming pipeline. Both
projects share a common design philosophy and can integrate well with each
other. Pravega has its own concept of checkpointing and has implemented
transactional writes to support end-to-end exactly-once guarantees.</p>
+
+<h1 id="future-plans">Future plans</h1>
+
+<p><code>FlinkPravegaInputFormat</code> and
<code>FlinkPravegaOutputFormat</code> are now provided to support
batch reads and writes in Flink, but these are under the legacy DataSet API.
Since Flink is now making efforts to unify batch and streaming, it is improving
its APIs and providing new interfaces for the <a
href="https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface">source</a>
and <a href=&quo [...]
+
+<p>We will also put more effort into SQL / Table API support in order to
provide a better user experience since it is simpler to understand and even
more powerful to use in some cases.</p>
+
+<p><strong>Note:</strong> the original blog post can be
found <a
href="https://cncf.pravega.io/blog/2021/11/01/pravega-flink-connector-101/">here</a>.</p>
+</description>
+<pubDate>Thu, 20 Jan 2022 01:00:00 +0100</pubDate>
+<link>https://flink.apache.org/2022/01/20/pravega-connector-101.html</link>
+<guid isPermaLink="true">/2022/01/20/pravega-connector-101.html</guid>
+</item>
+
+<item>
<title>Apache Flink 1.14.3 Release Announcement</title>
<description><p>The Apache Flink community released the second bugfix
version of the Apache Flink 1.14 series.
The first bugfix release was 1.14.2, being an emergency release due to an
Apache Log4j Zero Day (CVE-2021-44228). Flink 1.14.1 was abandoned.
@@ -20026,102 +20283,5 @@ for a full reference of Flink’s metrics
system.</p>
<guid isPermaLink="true">/news/2019/02/25/monitoring-best-practices.html</guid>
</item>
-<item>
-<title>Apache Flink 1.6.4 Released</title>
-<description><p>The Apache Flink community released the fourth bugfix
version of the Apache Flink 1.6 series.</p>
-
-<p>This release includes more than 25 fixes and minor improvements for
Flink 1.6.3. The list below includes a detailed list of all fixes.</p>
-
-<p>We highly recommend all users to upgrade to Flink 1.6.4.</p>
-
-<p>Updated Maven dependencies:</p>
-
-<div class="highlight"><pre><code
class="language-xml"><span
class="nt">&lt;dependency&gt;</span>
- <span
class="nt">&lt;groupId&gt;</span>org.apache.flink<span
class="nt">&lt;/groupId&gt;</span>
- <span
class="nt">&lt;artifactId&gt;</span>flink-java<span
class="nt">&lt;/artifactId&gt;</span>
- <span
class="nt">&lt;version&gt;</span>1.6.4<span
class="nt">&lt;/version&gt;</span>
-<span class="nt">&lt;/dependency&gt;</span>
-<span class="nt">&lt;dependency&gt;</span>
- <span
class="nt">&lt;groupId&gt;</span>org.apache.flink<span
class="nt">&lt;/groupId&gt;</span>
- <span
class="nt">&lt;artifactId&gt;</span>flink-streaming-java_2.11<span
class="nt">&lt;/artifactId&gt;</span>
- <span
class="nt">&lt;version&gt;</span>1.6.4<span
class="nt">&lt;/version&gt;</span>
-<span class="nt">&lt;/dependency&gt;</span>
-<span class="nt">&lt;dependency&gt;</span>
- <span
class="nt">&lt;groupId&gt;</span>org.apache.flink<span
class="nt">&lt;/groupId&gt;</span>
- <span
class="nt">&lt;artifactId&gt;</span>flink-clients_2.11<span
class="nt">&lt;/artifactId&gt;</span>
- <span
class="nt">&lt;version&gt;</span>1.6.4<span
class="nt">&lt;/version&gt;</span>
-<span
class="nt">&lt;/dependency&gt;</span></code></pre></div>
-
-<p>You can find the binaries on the updated <a
href="/downloads.html">Downloads page</a>.</p>
-
-<p>List of resolved issues:</p>
-
-<h2> Bug
-</h2>
-<ul>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-10721">FLINK-10721</a>]
- Kafka discovery-loop exceptions may be swallowed
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-10761">FLINK-10761</a>]
- MetricGroup#getAllVariables can deadlock
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-10774">FLINK-10774</a>]
- connection leak when partition discovery is disabled and open throws
exception
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-10848">FLINK-10848</a>]
- Flink&#39;s Yarn ResourceManager can allocate too many excess
containers
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11022">FLINK-11022</a>]
- Update LICENSE and NOTICE files for older releases
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11071">FLINK-11071</a>]
- Dynamic proxy classes cannot be resolved when deserializing job graph
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11084">FLINK-11084</a>]
- Incorrect ouput after two consecutive split and select
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11119">FLINK-11119</a>]
- Incorrect Scala example for Table Function
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11134">FLINK-11134</a>]
- Invalid REST API request should not log the full exception in Flink
logs
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11151">FLINK-11151</a>]
- FileUploadHandler stops working if the upload directory is removed
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11173">FLINK-11173</a>]
- Proctime attribute validation throws an incorrect exception message
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11224">FLINK-11224</a>]
- Log is missing in scala-shell
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11232">FLINK-11232</a>]
- Empty Start Time of sub-task on web dashboard
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11234">FLINK-11234</a>]
- ExternalTableCatalogBuilder unable to build a batch-only table
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11235">FLINK-11235</a>]
- Elasticsearch connector leaks threads if no connection could be
established
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11251">FLINK-11251</a>]
- Incompatible metric name on prometheus reporter
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11389">FLINK-11389</a>]
- Incorrectly use job information when call
getSerializedTaskInformation in class TaskDeploymentDescriptor
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11584">FLINK-11584</a>]
- ConfigDocsCompletenessITCase fails DescriptionBuilder#linebreak() is
used
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11585">FLINK-11585</a>]
- Prefix matching in ConfigDocsGenerator can result in wrong
assignments
-</li>
-</ul>
-
-<h2> Improvement
-</h2>
-<ul>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-10910">FLINK-10910</a>]
- Harden Kubernetes e2e test
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11079">FLINK-11079</a>]
- Skip deployment for flnk-storm-examples
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11207">FLINK-11207</a>]
- Update Apache commons-compress from 1.4.1 to 1.18
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11262">FLINK-11262</a>]
- Bump jython-standalone to 2.7.1
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11289">FLINK-11289</a>]
- Rework example module structure to account for licensing
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11304">FLINK-11304</a>]
- Typo in time attributes doc
-</li>
-<li>[<a
href="https://issues.apache.org/jira/browse/FLINK-11469">FLINK-11469</a>]
- fix Tuning Checkpoints and Large State doc
-</li>
-</ul>
-</description>
-<pubDate>Mon, 25 Feb 2019 01:00:00 +0100</pubDate>
-<link>https://flink.apache.org/news/2019/02/25/release-1.6.4.html</link>
-<guid isPermaLink="true">/news/2019/02/25/release-1.6.4.html</guid>
-</item>
-
</channel>
</rss>
diff --git a/content/blog/index.html b/content/blog/index.html
index 92d6f2d..9c1226f 100644
--- a/content/blog/index.html
+++ b/content/blog/index.html
@@ -201,6 +201,19 @@
<!-- Blog posts -->
<article>
+ <h2 class="blog-title"><a
href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector
101</a></h2>
+
+ <p>20 Jan 2022
+ Yumin Zhou (Brian) (<a
href="https://twitter.com/crazy__zhou">@crazy__zhou</a>)</p>
+
+ <p>A brief introduction to the Pravega Flink Connector</p>
+
+ <p><a href="/2022/01/20/pravega-connector-101.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
+ <article>
<h2 class="blog-title"><a
href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release
Announcement</a></h2>
<p>17 Jan 2022
@@ -321,19 +334,6 @@
<hr>
- <article>
- <h2 class="blog-title"><a
href="/2021/10/26/sort-shuffle-part1.html">Sort-Based Blocking Shuffle
Implementation in Flink - Part One</a></h2>
-
- <p>26 Oct 2021
- Yingjie Cao (Kevin) & Daisy Tsang </p>
-
- <p>Flink has implemented the sort-based blocking shuffle (FLIP-148) for
batch data processing. In this blog post, we will take a close look at the
design & implementation details and see what we can gain from it.</p>
-
- <p><a href="/2021/10/26/sort-shuffle-part1.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -366,6 +366,16 @@
<ul id="markdown-toc">
+ <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3
Release Announcement</a></li>
diff --git a/content/blog/page10/index.html b/content/blog/page10/index.html
index 880bc61..4c25b09 100644
--- a/content/blog/page10/index.html
+++ b/content/blog/page10/index.html
@@ -201,6 +201,19 @@
<!-- Blog posts -->
<article>
+ <h2 class="blog-title"><a href="/2019/06/26/broadcast-state.html">A
Practical Guide to Broadcast State in Apache Flink</a></h2>
+
+ <p>26 Jun 2019
+ Fabian Hueske (<a href="https://twitter.com/fhueske">@fhueske</a>)</p>
+
+ <p>Apache Flink has multiple types of operator state, one of which is
called Broadcast State. In this post, we explain what Broadcast State is, and
show an example of how it can be applied to an application that evaluates
dynamic patterns on an event stream.</p>
+
+ <p><a href="/2019/06/26/broadcast-state.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
+ <article>
<h2 class="blog-title"><a href="/2019/06/05/flink-network-stack.html">A
Deep-Dive into Flink's Network Stack</a></h2>
<p>05 Jun 2019
@@ -325,21 +338,6 @@ for more details.</p>
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2019/02/25/release-1.6.4.html">Apache Flink 1.6.4 Released</a></h2>
-
- <p>25 Feb 2019
- </p>
-
- <p><p>The Apache Flink community released the fourth bugfix version of
the Apache Flink 1.6 series.</p>
-
-</p>
-
- <p><a href="/news/2019/02/25/release-1.6.4.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -372,6 +370,16 @@ for more details.</p>
<ul id="markdown-toc">
+ <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3
Release Announcement</a></li>
diff --git a/content/blog/page11/index.html b/content/blog/page11/index.html
index 05703d1..50cf1a2 100644
--- a/content/blog/page11/index.html
+++ b/content/blog/page11/index.html
@@ -201,6 +201,21 @@
<!-- Blog posts -->
<article>
+ <h2 class="blog-title"><a
href="/news/2019/02/25/release-1.6.4.html">Apache Flink 1.6.4 Released</a></h2>
+
+ <p>25 Feb 2019
+ </p>
+
+ <p><p>The Apache Flink community released the fourth bugfix version of
the Apache Flink 1.6 series.</p>
+
+</p>
+
+ <p><a href="/news/2019/02/25/release-1.6.4.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
+ <article>
<h2 class="blog-title"><a
href="/news/2019/02/15/release-1.7.2.html">Apache Flink 1.7.2 Released</a></h2>
<p>15 Feb 2019
@@ -335,21 +350,6 @@ Please check the <a
href="https://issues.apache.org/jira/secure/ReleaseNote.jspa
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2018/09/20/release-1.5.4.html">Apache Flink 1.5.4 Released</a></h2>
-
- <p>20 Sep 2018
- </p>
-
- <p><p>The Apache Flink community released the fourth bugfix version of
the Apache Flink 1.5 series.</p>
-
-</p>
-
- <p><a href="/news/2018/09/20/release-1.5.4.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -382,6 +382,16 @@ Please check the <a
href="https://issues.apache.org/jira/secure/ReleaseNote.jspa
<ul id="markdown-toc">
+ <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3
Release Announcement</a></li>
diff --git a/content/blog/page12/index.html b/content/blog/page12/index.html
index 7ce300f..1049ee7 100644
--- a/content/blog/page12/index.html
+++ b/content/blog/page12/index.html
@@ -201,6 +201,21 @@
<!-- Blog posts -->
<article>
+ <h2 class="blog-title"><a
href="/news/2018/09/20/release-1.5.4.html">Apache Flink 1.5.4 Released</a></h2>
+
+ <p>20 Sep 2018
+ </p>
+
+ <p><p>The Apache Flink community released the fourth bugfix version of
the Apache Flink 1.5 series.</p>
+
+</p>
+
+ <p><a href="/news/2018/09/20/release-1.5.4.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
+ <article>
<h2 class="blog-title"><a
href="/news/2018/08/21/release-1.5.3.html">Apache Flink 1.5.3 Released</a></h2>
<p>21 Aug 2018
@@ -333,19 +348,6 @@
<hr>
- <article>
- <h2 class="blog-title"><a
href="/features/2018/01/30/incremental-checkpointing.html">Managing Large State
in Apache Flink: An Intro to Incremental Checkpointing</a></h2>
-
- <p>30 Jan 2018
- Stefan Ricther (<a
href="https://twitter.com/StefanRRicther">@StefanRRicther</a>) & Chris Ward
(<a href="https://twitter.com/chrischinch">@chrischinch</a>)</p>
-
- <p>Flink 1.3.0 introduced incremental checkpointing, making it possible
for applications with large state to generate checkpoints more efficiently.</p>
-
- <p><a
href="/features/2018/01/30/incremental-checkpointing.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -378,6 +380,16 @@
<ul id="markdown-toc">
+ <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3
Release Announcement</a></li>
diff --git a/content/blog/page13/index.html b/content/blog/page13/index.html
index 547f587..a6c6c11 100644
--- a/content/blog/page13/index.html
+++ b/content/blog/page13/index.html
@@ -201,6 +201,19 @@
<!-- Blog posts -->
<article>
+ <h2 class="blog-title"><a
href="/features/2018/01/30/incremental-checkpointing.html">Managing Large State
in Apache Flink: An Intro to Incremental Checkpointing</a></h2>
+
+ <p>30 Jan 2018
+ Stefan Ricther (<a
href="https://twitter.com/StefanRRicther">@StefanRRicther</a>) & Chris Ward
(<a href="https://twitter.com/chrischinch">@chrischinch</a>)</p>
+
+ <p>Flink 1.3.0 introduced incremental checkpointing, making it possible
for applications with large state to generate checkpoints more efficiently.</p>
+
+ <p><a
href="/features/2018/01/30/incremental-checkpointing.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
+ <article>
<h2 class="blog-title"><a
href="/news/2017/12/21/2017-year-in-review.html">Apache Flink in 2017: Year in
Review</a></h2>
<p>21 Dec 2017
@@ -336,20 +349,6 @@ what’s coming in Flink 1.4.0 as well as a preview of what
the Flink community
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2017/04/04/dynamic-tables.html">Continuous Queries on Dynamic
Tables</a></h2>
-
- <p>04 Apr 2017 by Fabian Hueske, Shaoxuan Wang, and Xiaowei Jiang
- </p>
-
- <p><p>Flink's relational APIs, the Table API and SQL, are unified APIs
for stream and batch processing, meaning that a query produces the same result
when being evaluated on streaming or static data.</p>
-<p>In this blog post we discuss the future of these APIs and introduce the
concept of Dynamic Tables. Dynamic tables will significantly expand the scope
of the Table API and SQL on streams and enable many more advanced use cases. We
discuss how streams and dynamic tables relate to each other and explain the
semantics of continuously evaluating queries on dynamic tables.</p></p>
-
- <p><a href="/news/2017/04/04/dynamic-tables.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -382,6 +381,16 @@ what’s coming in Flink 1.4.0 as well as a preview of what
the Flink community
<ul id="markdown-toc">
+ <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3
Release Announcement</a></li>
diff --git a/content/blog/page14/index.html b/content/blog/page14/index.html
index 7cfd096..4d8bac0 100644
--- a/content/blog/page14/index.html
+++ b/content/blog/page14/index.html
@@ -201,6 +201,20 @@
<!-- Blog posts -->
<article>
+ <h2 class="blog-title"><a
href="/news/2017/04/04/dynamic-tables.html">Continuous Queries on Dynamic
Tables</a></h2>
+
+ <p>04 Apr 2017 by Fabian Hueske, Shaoxuan Wang, and Xiaowei Jiang
+ </p>
+
+ <p><p>Flink's relational APIs, the Table API and SQL, are unified APIs
for stream and batch processing, meaning that a query produces the same result
when being evaluated on streaming or static data.</p>
+<p>In this blog post we discuss the future of these APIs and introduce the
concept of Dynamic Tables. Dynamic tables will significantly expand the scope
of the Table API and SQL on streams and enable many more advanced use cases. We
discuss how streams and dynamic tables relate to each other and explain the
semantics of continuously evaluating queries on dynamic tables.</p></p>
+
+ <p><a href="/news/2017/04/04/dynamic-tables.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
+ <article>
<h2 class="blog-title"><a
href="/news/2017/03/29/table-sql-api-update.html">From Streams to Tables and
Back Again: An Update on Flink's Table & SQL API</a></h2>
<p>29 Mar 2017 by Timo Walther (<a
href="https://twitter.com/">@twalthr</a>)
@@ -329,21 +343,6 @@
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2016/08/08/release-1.1.0.html">Announcing Apache Flink
1.1.0</a></h2>
-
- <p>08 Aug 2016
- </p>
-
- <p><div class="alert alert-success"><strong>Important</strong>: The
Maven artifacts published with version 1.1.0 on Maven central have a Hadoop
dependency issue. It is highly recommended to use <strong>1.1.1</strong> or
<strong>1.1.1-hadoop1</strong> as the Flink version.</div>
-
-</p>
-
- <p><a href="/news/2016/08/08/release-1.1.0.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -376,6 +375,16 @@
<ul id="markdown-toc">
+ <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3
Release Announcement</a></li>
diff --git a/content/blog/page15/index.html b/content/blog/page15/index.html
index 7b1d8dc..58a760a 100644
--- a/content/blog/page15/index.html
+++ b/content/blog/page15/index.html
@@ -201,6 +201,21 @@
<!-- Blog posts -->
<article>
+ <h2 class="blog-title"><a
href="/news/2016/08/08/release-1.1.0.html">Announcing Apache Flink
1.1.0</a></h2>
+
+ <p>08 Aug 2016
+ </p>
+
+ <p><div class="alert alert-success"><strong>Important</strong>: The
Maven artifacts published with version 1.1.0 on Maven central have a Hadoop
dependency issue. It is highly recommended to use <strong>1.1.1</strong> or
<strong>1.1.1-hadoop1</strong> as the Flink version.</div>
+
+</p>
+
+ <p><a href="/news/2016/08/08/release-1.1.0.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
+ <article>
<h2 class="blog-title"><a href="/news/2016/05/24/stream-sql.html">Stream
Processing for Everyone with SQL and Apache Flink</a></h2>
<p>24 May 2016 by Fabian Hueske (<a
href="https://twitter.com/">@fhueske</a>)
@@ -330,19 +345,6 @@
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2015/12/11/storm-compatibility.html">Storm Compatibility in Apache
Flink: How to run existing Storm topologies on Flink</a></h2>
-
- <p>11 Dec 2015 by Matthias J. Sax (<a
href="https://twitter.com/">@MatthiasJSax</a>)
- </p>
-
- <p>In this blog post, we describe Flink's compatibility package for <a
href="https://storm.apache.org">Apache Storm</a> that allows to embed Spouts
(sources) and Bolts (operators) in a regular Flink streaming job. Furthermore,
the compatibility package provides a Storm compatible API in order to execute
whole Storm topologies with (almost) no code adaption.</p>
-
- <p><a href="/news/2015/12/11/storm-compatibility.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -375,6 +377,16 @@
<ul id="markdown-toc">
+ <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3
Release Announcement</a></li>
diff --git a/content/blog/page16/index.html b/content/blog/page16/index.html
index ea1de5d..bdec7b2 100644
--- a/content/blog/page16/index.html
+++ b/content/blog/page16/index.html
@@ -201,6 +201,19 @@
<!-- Blog posts -->
<article>
+ <h2 class="blog-title"><a
href="/news/2015/12/11/storm-compatibility.html">Storm Compatibility in Apache
Flink: How to run existing Storm topologies on Flink</a></h2>
+
+ <p>11 Dec 2015 by Matthias J. Sax (<a
href="https://twitter.com/">@MatthiasJSax</a>)
+ </p>
+
+ <p>In this blog post, we describe Flink's compatibility package for <a
href="https://storm.apache.org">Apache Storm</a> that allows to embed Spouts
(sources) and Bolts (operators) in a regular Flink streaming job. Furthermore,
the compatibility package provides a Storm compatible API in order to execute
whole Storm topologies with (almost) no code adaption.</p>
+
+ <p><a href="/news/2015/12/11/storm-compatibility.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
+ <article>
<h2 class="blog-title"><a
href="/news/2015/12/04/Introducing-windows.html">Introducing Stream Windows in
Apache Flink</a></h2>
<p>04 Dec 2015 by Fabian Hueske (<a
href="https://twitter.com/">@fhueske</a>)
@@ -338,20 +351,6 @@ vertex-centric or gather-sum-apply to Flink dataflows.</p>
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2015/05/11/Juggling-with-Bits-and-Bytes.html">Juggling with Bits
and Bytes</a></h2>
-
- <p>11 May 2015 by Fabian Hüske (<a
href="https://twitter.com/">@fhueske</a>)
- </p>
-
- <p><p>Nowadays, a lot of open-source systems for analyzing large data
sets are implemented in Java or other JVM-based programming languages. The most
well-known example is Apache Hadoop, but also newer frameworks such as Apache
Spark, Apache Drill, and also Apache Flink run on JVMs. A common challenge that
JVM-based data analysis engines face is to store large amounts of data in
memory - both for caching and for efficient processing such as sorting and
joining of data. Managing the [...]
-<p>In this blog post we discuss how Apache Flink manages memory, talk about
its custom data de/serialization stack, and show how it operates on binary
data.</p></p>
-
- <p><a href="/news/2015/05/11/Juggling-with-Bits-and-Bytes.html">Continue
reading »</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -384,6 +383,16 @@ vertex-centric or gather-sum-apply to Flink dataflows.</p>
<ul id="markdown-toc">
+ <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3
Release Announcement</a></li>
diff --git a/content/blog/page17/index.html b/content/blog/page17/index.html
index e8411dc..aaf0bac 100644
--- a/content/blog/page17/index.html
+++ b/content/blog/page17/index.html
@@ -201,6 +201,20 @@
<!-- Blog posts -->
<article>
+ <h2 class="blog-title"><a
href="/news/2015/05/11/Juggling-with-Bits-and-Bytes.html">Juggling with Bits
and Bytes</a></h2>
+
+ <p>11 May 2015 by Fabian Hüske (<a
href="https://twitter.com/">@fhueske</a>)
+ </p>
+
+ <p><p>Nowadays, a lot of open-source systems for analyzing large data
sets are implemented in Java or other JVM-based programming languages. The most
well-known example is Apache Hadoop, but also newer frameworks such as Apache
Spark, Apache Drill, and also Apache Flink run on JVMs. A common challenge that
JVM-based data analysis engines face is to store large amounts of data in
memory - both for caching and for efficient processing such as sorting and
joining of data. Managing the [...]
+<p>In this blog post we discuss how Apache Flink manages memory, talk about
its custom data de/serialization stack, and show how it operates on binary
data.</p></p>
+
+ <p><a href="/news/2015/05/11/Juggling-with-Bits-and-Bytes.html">Continue
reading »</a></p>
+ </article>
+
+ <hr>
+
+ <article>
<h2 class="blog-title"><a
href="/news/2015/04/13/release-0.9.0-milestone1.html">Announcing Flink
0.9.0-milestone1 preview release</a></h2>
<p>13 Apr 2015
@@ -345,21 +359,6 @@ and offers a new API including definition of flexible
windows.</p>
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2014/11/04/release-0.7.0.html">Apache Flink 0.7.0 available</a></h2>
-
- <p>04 Nov 2014
- </p>
-
- <p><p>We are pleased to announce the availability of Flink 0.7.0. This
release includes new user-facing features as well as performance and bug fixes,
brings the Scala and Java APIs in sync, and introduces Flink Streaming. A total
of 34 people have contributed to this release, a big thanks to all of them!</p>
-
-</p>
-
- <p><a href="/news/2014/11/04/release-0.7.0.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -392,6 +391,16 @@ and offers a new API including definition of flexible
windows.</p>
<ul id="markdown-toc">
+ <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3
Release Announcement</a></li>
diff --git a/content/blog/page18/index.html b/content/blog/page18/index.html
index 870d5b4..427ce18 100644
--- a/content/blog/page18/index.html
+++ b/content/blog/page18/index.html
@@ -201,6 +201,21 @@
<!-- Blog posts -->
<article>
+ <h2 class="blog-title"><a
href="/news/2014/11/04/release-0.7.0.html">Apache Flink 0.7.0 available</a></h2>
+
+ <p>04 Nov 2014
+ </p>
+
+ <p><p>We are pleased to announce the availability of Flink 0.7.0. This
release includes new user-facing features as well as performance and bug fixes,
brings the Scala and Java APIs in sync, and introduces Flink Streaming. A total
of 34 people have contributed to this release, a big thanks to all of them!</p>
+
+</p>
+
+ <p><a href="/news/2014/11/04/release-0.7.0.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
+ <article>
<h2 class="blog-title"><a
href="/news/2014/10/03/upcoming_events.html">Upcoming Events</a></h2>
<p>03 Oct 2014
@@ -280,6 +295,16 @@ academic and open source project that Flink originates
from.</p>
<ul id="markdown-toc">
+ <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3
Release Announcement</a></li>
diff --git a/content/blog/page2/index.html b/content/blog/page2/index.html
index 407187c..cf244b9 100644
--- a/content/blog/page2/index.html
+++ b/content/blog/page2/index.html
@@ -201,6 +201,19 @@
<!-- Blog posts -->
<article>
+ <h2 class="blog-title"><a
href="/2021/10/26/sort-shuffle-part1.html">Sort-Based Blocking Shuffle
Implementation in Flink - Part One</a></h2>
+
+ <p>26 Oct 2021
+ Yingjie Cao (Kevin) & Daisy Tsang </p>
+
+ <p>Flink has implemented the sort-based blocking shuffle (FLIP-148) for
batch data processing. In this blog post, we will take a close look at the
design & implementation details and see what we can gain from it.</p>
+
+ <p><a href="/2021/10/26/sort-shuffle-part1.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
+ <article>
<h2 class="blog-title"><a
href="/news/2021/10/19/release-1.13.3.html">Apache Flink 1.13.3
Released</a></h2>
<p>19 Oct 2021
@@ -341,19 +354,6 @@ This new release brings various improvements to the
StateFun runtime, a leaner w
<hr>
- <article>
- <h2 class="blog-title"><a href="/2021/07/07/backpressure.html">How to
identify the source of backpressure?</a></h2>
-
- <p>07 Jul 2021
- Piotr Nowojski (<a
href="https://twitter.com/PiotrNowojski">@PiotrNowojski</a>)</p>
-
- <p>Apache Flink 1.13 introduced a couple of important changes in the
area of backpressure monitoring and performance analysis of Flink Jobs. This
blog post aims to introduce those changes and explain how to use them.</p>
-
- <p><a href="/2021/07/07/backpressure.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -386,6 +386,16 @@ This new release brings various improvements to the
StateFun runtime, a leaner w
<ul id="markdown-toc">
+ <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3
Release Announcement</a></li>
diff --git a/content/blog/page3/index.html b/content/blog/page3/index.html
index 59dc953..fd2b672 100644
--- a/content/blog/page3/index.html
+++ b/content/blog/page3/index.html
@@ -201,6 +201,19 @@
<!-- Blog posts -->
<article>
+ <h2 class="blog-title"><a href="/2021/07/07/backpressure.html">How to
identify the source of backpressure?</a></h2>
+
+ <p>07 Jul 2021
+ Piotr Nowojski (<a
href="https://twitter.com/PiotrNowojski">@PiotrNowojski</a>)</p>
+
+ <p>Apache Flink 1.13 introduced a couple of important changes in the
area of backpressure monitoring and performance analysis of Flink Jobs. This
blog post aims to introduce those changes and explain how to use them.</p>
+
+ <p><a href="/2021/07/07/backpressure.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
+ <article>
<h2 class="blog-title"><a
href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1
Released</a></h2>
<p>28 May 2021
@@ -329,21 +342,6 @@ to develop scalable, consistent, and elastic distributed
applications.</p>
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2021/01/29/release-1.10.3.html">Apache Flink 1.10.3
Released</a></h2>
-
- <p>29 Jan 2021
- Xintong Song </p>
-
- <p><p>The Apache Flink community released the third bugfix version of
the Apache Flink 1.10 series.</p>
-
-</p>
-
- <p><a href="/news/2021/01/29/release-1.10.3.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -376,6 +374,16 @@ to develop scalable, consistent, and elastic distributed
applications.</p>
<ul id="markdown-toc">
+ <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3
Release Announcement</a></li>
diff --git a/content/blog/page4/index.html b/content/blog/page4/index.html
index 40b62c6..0d55ab1 100644
--- a/content/blog/page4/index.html
+++ b/content/blog/page4/index.html
@@ -201,6 +201,21 @@
<!-- Blog posts -->
<article>
+ <h2 class="blog-title"><a
href="/news/2021/01/29/release-1.10.3.html">Apache Flink 1.10.3
Released</a></h2>
+
+ <p>29 Jan 2021
+ Xintong Song </p>
+
+ <p><p>The Apache Flink community released the third bugfix version of
the Apache Flink 1.10 series.</p>
+
+</p>
+
+ <p><a href="/news/2021/01/29/release-1.10.3.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
+ <article>
<h2 class="blog-title"><a
href="/news/2021/01/19/release-1.12.1.html">Apache Flink 1.12.1
Released</a></h2>
<p>19 Jan 2021
@@ -325,19 +340,6 @@
<hr>
- <article>
- <h2 class="blog-title"><a
href="/2020/10/15/from-aligned-to-unaligned-checkpoints-part-1.html">From
Aligned to Unaligned Checkpoints - Part 1: Checkpoints, Alignment, and
Backpressure</a></h2>
-
- <p>15 Oct 2020
- Arvid Heise & Stephan Ewen </p>
-
- <p>Apache Flink’s checkpoint-based fault tolerance mechanism is one of
its defining features. Because of that design, Flink unifies batch and stream
processing, can easily scale to both very small and extremely large scenarios
and provides support for many operational features. In this post we recap the
original checkpointing process in Flink, its core properties and issues under
backpressure.</p>
-
- <p><a
href="/2020/10/15/from-aligned-to-unaligned-checkpoints-part-1.html">Continue
reading »</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -370,6 +372,16 @@
<ul id="markdown-toc">
+ <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3
Release Announcement</a></li>
diff --git a/content/blog/page5/index.html b/content/blog/page5/index.html
index 626159e..d6e2af0 100644
--- a/content/blog/page5/index.html
+++ b/content/blog/page5/index.html
@@ -201,6 +201,19 @@
<!-- Blog posts -->
<article>
+ <h2 class="blog-title"><a
href="/2020/10/15/from-aligned-to-unaligned-checkpoints-part-1.html">From
Aligned to Unaligned Checkpoints - Part 1: Checkpoints, Alignment, and
Backpressure</a></h2>
+
+ <p>15 Oct 2020
+ Arvid Heise & Stephan Ewen </p>
+
+ <p>Apache Flink’s checkpoint-based fault tolerance mechanism is one of
its defining features. Because of that design, Flink unifies batch and stream
processing, can easily scale to both very small and extremely large scenarios
and provides support for many operational features. In this post we recap the
original checkpointing process in Flink, its core properties and issues under
backpressure.</p>
+
+ <p><a
href="/2020/10/15/from-aligned-to-unaligned-checkpoints-part-1.html">Continue
reading »</a></p>
+ </article>
+
+ <hr>
+
+ <article>
<h2 class="blog-title"><a
href="/news/2020/10/13/stateful-serverless-internals.html">Stateful Functions
Internals: Behind the scenes of Stateful Serverless</a></h2>
<p>13 Oct 2020
@@ -329,19 +342,6 @@ as well as increased observability for operational
purposes.</p>
<hr>
- <article>
- <h2 class="blog-title"><a
href="/2020/08/04/pyflink-pandas-udf-support-flink.html">PyFlink: The
integration of Pandas into PyFlink</a></h2>
-
- <p>04 Aug 2020
- Jincheng Sun (<a
href="https://twitter.com/sunjincheng121">@sunjincheng121</a>) & Markos
Sfikas (<a href="https://twitter.com/MarkSfik">@MarkSfik</a>)</p>
-
- <p>The Apache Flink community put some great effort into integrating
Pandas with PyFlink in the latest Flink version 1.11. Some of the added
features include support for Pandas UDF and the conversion between Pandas
DataFrame and Table. In this article, we will introduce how these
functionalities work and how to use them with a step-by-step example.</p>
-
- <p><a href="/2020/08/04/pyflink-pandas-udf-support-flink.html">Continue
reading »</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -374,6 +374,16 @@ as well as increased observability for operational
purposes.</p>
<ul id="markdown-toc">
+ <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3
Release Announcement</a></li>
diff --git a/content/blog/page6/index.html b/content/blog/page6/index.html
index 3098079..af18872 100644
--- a/content/blog/page6/index.html
+++ b/content/blog/page6/index.html
@@ -201,6 +201,19 @@
<!-- Blog posts -->
<article>
+ <h2 class="blog-title"><a
href="/2020/08/04/pyflink-pandas-udf-support-flink.html">PyFlink: The
integration of Pandas into PyFlink</a></h2>
+
+ <p>04 Aug 2020
+ Jincheng Sun (<a
href="https://twitter.com/sunjincheng121">@sunjincheng121</a>) & Markos
Sfikas (<a href="https://twitter.com/MarkSfik">@MarkSfik</a>)</p>
+
+ <p>The Apache Flink community put some great effort into integrating
Pandas with PyFlink in the latest Flink version 1.11. Some of the added
features include support for Pandas UDF and the conversion between Pandas
DataFrame and Table. In this article, we will introduce how these
functionalities work and how to use them with a step-by-step example.</p>
+
+ <p><a href="/2020/08/04/pyflink-pandas-udf-support-flink.html">Continue
reading »</a></p>
+ </article>
+
+ <hr>
+
+ <article>
<h2 class="blog-title"><a
href="/news/2020/07/30/demo-fraud-detection-3.html">Advanced Flink Application
Patterns Vol.3: Custom Window Processing</a></h2>
<p>30 Jul 2020
@@ -335,19 +348,6 @@ and provide a tutorial for running Streaming ETL with
Flink on Zeppelin.</p>
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2020/06/11/community-update.html">Flink Community Update -
June'20</a></h2>
-
- <p>11 Jun 2020
- Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
-
- <p>And suddenly it’s June. The previous month has been calm on the
surface, but quite hectic underneath — the final testing phase for Flink 1.11
is moving at full speed, Stateful Functions 2.1 is out in the wild and Flink
has made it into Google Season of Docs 2020.</p>
-
- <p><a href="/news/2020/06/11/community-update.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -380,6 +380,16 @@ and provide a tutorial for running Streaming ETL with
Flink on Zeppelin.</p>
<ul id="markdown-toc">
+ <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3
Release Announcement</a></li>
diff --git a/content/blog/page7/index.html b/content/blog/page7/index.html
index 24a11ad..96d9aa4 100644
--- a/content/blog/page7/index.html
+++ b/content/blog/page7/index.html
@@ -201,6 +201,19 @@
<!-- Blog posts -->
<article>
+ <h2 class="blog-title"><a
href="/news/2020/06/11/community-update.html">Flink Community Update -
June'20</a></h2>
+
+ <p>11 Jun 2020
+ Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
+
+ <p>And suddenly it’s June. The previous month has been calm on the
surface, but quite hectic underneath — the final testing phase for Flink 1.11
is moving at full speed, Stateful Functions 2.1 is out in the wild and Flink
has made it into Google Season of Docs 2020.</p>
+
+ <p><a href="/news/2020/06/11/community-update.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
+ <article>
<h2 class="blog-title"><a
href="/news/2020/06/09/release-statefun-2.1.0.html">Stateful Functions 2.1.0
Release Announcement</a></h2>
<p>09 Jun 2020
@@ -326,19 +339,6 @@ This release marks a big milestone: Stateful Functions 2.0
is not only an API up
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2020/04/01/community-update.html">Flink Community Update -
April'20</a></h2>
-
- <p>01 Apr 2020
- Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
-
- <p>While things slow down around us, the Apache Flink community is
privileged to remain as active as ever. This blogpost combs through the past
few months to give you an update on the state of things in Flink — from core
releases to Stateful Functions; from some good old community stats to a new
development blog.</p>
-
- <p><a href="/news/2020/04/01/community-update.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -371,6 +371,16 @@ This release marks a big milestone: Stateful Functions 2.0
is not only an API up
<ul id="markdown-toc">
+ <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3
Release Announcement</a></li>
diff --git a/content/blog/page8/index.html b/content/blog/page8/index.html
index 39bfecd..2229ac9 100644
--- a/content/blog/page8/index.html
+++ b/content/blog/page8/index.html
@@ -201,6 +201,19 @@
<!-- Blog posts -->
<article>
+ <h2 class="blog-title"><a
href="/news/2020/04/01/community-update.html">Flink Community Update -
April'20</a></h2>
+
+ <p>01 Apr 2020
+ Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
+
+ <p>While things slow down around us, the Apache Flink community is
privileged to remain as active as ever. This blogpost combs through the past
few months to give you an update on the state of things in Flink — from core
releases to Stateful Functions; from some good old community stats to a new
development blog.</p>
+
+ <p><a href="/news/2020/04/01/community-update.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
+ <article>
<h2 class="blog-title"><a
href="/features/2020/03/27/flink-for-data-warehouse.html">Flink as Unified
Engine for Modern Data Warehousing: Production-Ready Hive Integration</a></h2>
<p>27 Mar 2020
@@ -323,21 +336,6 @@
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2019/12/11/release-1.8.3.html">Apache Flink 1.8.3 Released</a></h2>
-
- <p>11 Dec 2019
- Hequn Cheng </p>
-
- <p><p>The Apache Flink community released the third bugfix version of
the Apache Flink 1.8 series.</p>
-
-</p>
-
- <p><a href="/news/2019/12/11/release-1.8.3.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -370,6 +368,16 @@
<ul id="markdown-toc">
+ <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3
Release Announcement</a></li>
diff --git a/content/blog/page9/index.html b/content/blog/page9/index.html
index fbf0c30..869cf9b 100644
--- a/content/blog/page9/index.html
+++ b/content/blog/page9/index.html
@@ -201,6 +201,21 @@
<!-- Blog posts -->
<article>
+ <h2 class="blog-title"><a
href="/news/2019/12/11/release-1.8.3.html">Apache Flink 1.8.3 Released</a></h2>
+
+ <p>11 Dec 2019
+ Hequn Cheng </p>
+
+ <p><p>The Apache Flink community released the third bugfix version of
the Apache Flink 1.8 series.</p>
+
+</p>
+
+ <p><a href="/news/2019/12/11/release-1.8.3.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
+ <article>
<h2 class="blog-title"><a
href="/news/2019/12/09/flink-kubernetes-kudo.html">Running Apache Flink on
Kubernetes with KUDO</a></h2>
<p>09 Dec 2019
@@ -326,19 +341,6 @@
<hr>
- <article>
- <h2 class="blog-title"><a href="/2019/06/26/broadcast-state.html">A
Practical Guide to Broadcast State in Apache Flink</a></h2>
-
- <p>26 Jun 2019
- Fabian Hueske (<a href="https://twitter.com/fhueske">@fhueske</a>)</p>
-
- <p>Apache Flink has multiple types of operator state, one of which is
called Broadcast State. In this post, we explain what Broadcast State is, and
show an example of how it can be applied to an application that evaluates
dynamic patterns on an event stream.</p>
-
- <p><a href="/2019/06/26/broadcast-state.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -371,6 +373,16 @@
<ul id="markdown-toc">
+ <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3
Release Announcement</a></li>
diff --git a/content/index.html b/content/index.html
index 20e1220..61e6e81 100644
--- a/content/index.html
+++ b/content/index.html
@@ -365,6 +365,9 @@
<dl>
+ <dt> <a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></dt>
+ <dd>A brief introduction to the Pravega Flink Connector</dd>
+
<dt> <a href="/news/2022/01/17/release-1.14.3.html">Apache Flink
1.14.3 Release Announcement</a></dt>
<dd>The Apache Flink community released the second bugfix version of
the Apache Flink 1.14 series.</dd>
@@ -376,11 +379,6 @@
<dt> <a href="/2022/01/04/scheduler-performance-part-one.html">How We
Improved Scheduler Performance for Large-scale Jobs - Part One</a></dt>
<dd>To improve the performance of the scheduler for large-scale jobs,
several optimizations were introduced in Flink 1.13 and 1.14. In this blog post
we'll take a look at them.</dd>
-
- <dt> <a href="/news/2021/12/22/log4j-statefun-release.html">Apache
Flink StateFun Log4j emergency release</a></dt>
- <dd><p>The Apache Flink community has released an emergency bugfix
version of Apache Flink Stateful Function 3.1.1.</p>
-
-</dd>
</dl>
diff --git a/content/zh/index.html b/content/zh/index.html
index ad65b66..4e8ecc1 100644
--- a/content/zh/index.html
+++ b/content/zh/index.html
@@ -362,6 +362,9 @@
<dl>
+ <dt> <a href="/2022/01/20/pravega-connector-101.html">Pravega Flink
Connector 101</a></dt>
+ <dd>A brief introduction to the Pravega Flink Connector</dd>
+
<dt> <a href="/news/2022/01/17/release-1.14.3.html">Apache Flink
1.14.3 Release Announcement</a></dt>
<dd>The Apache Flink community released the second bugfix version of
the Apache Flink 1.14 series.</dd>
@@ -373,11 +376,6 @@
<dt> <a href="/2022/01/04/scheduler-performance-part-one.html">How We
Improved Scheduler Performance for Large-scale Jobs - Part One</a></dt>
<dd>To improve the performance of the scheduler for large-scale jobs,
several optimizations were introduced in Flink 1.13 and 1.14. In this blog post
we'll take a look at them.</dd>
-
- <dt> <a href="/news/2021/12/22/log4j-statefun-release.html">Apache
Flink StateFun Log4j emergency release</a></dt>
- <dd><p>The Apache Flink community has released an emergency bugfix
version of Apache Flink Stateful Function 3.1.1.</p>
-
-</dd>
</dl>