This is an automated email from the ASF dual-hosted git repository.
rmetzger pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/flink-web.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 8950f518e Rebuild website
8950f518e is described below
commit 8950f518ec585256934ea04bb224280cdc26f038
Author: Hong Liang Teoh <[email protected]>
AuthorDate: Fri Nov 25 09:10:07 2022 +0000
Rebuild website
---
.../11/25/async-sink-rate-limiting-strategy.html | 470 +++++++++++++++++++++
content/blog/feed.xml | 320 ++++++++------
content/blog/index.html | 36 +-
content/blog/page10/index.html | 40 +-
content/blog/page11/index.html | 38 +-
content/blog/page12/index.html | 36 +-
content/blog/page13/index.html | 38 +-
content/blog/page14/index.html | 40 +-
content/blog/page15/index.html | 38 +-
content/blog/page16/index.html | 36 +-
content/blog/page17/index.html | 37 +-
content/blog/page18/index.html | 38 +-
content/blog/page19/index.html | 44 +-
content/blog/page2/index.html | 36 +-
content/blog/page20/index.html | 45 +-
content/blog/page21/index.html | 25 ++
content/blog/page3/index.html | 36 +-
content/blog/page4/index.html | 38 +-
content/blog/page5/index.html | 40 +-
content/blog/page6/index.html | 40 +-
content/blog/page7/index.html | 38 +-
content/blog/page8/index.html | 36 +-
content/blog/page9/index.html | 38 +-
content/index.html | 8 +-
content/zh/index.html | 8 +-
25 files changed, 1178 insertions(+), 421 deletions(-)
diff --git a/content/2022/11/25/async-sink-rate-limiting-strategy.html
b/content/2022/11/25/async-sink-rate-limiting-strategy.html
new file mode 100644
index 000000000..833918676
--- /dev/null
+++ b/content/2022/11/25/async-sink-rate-limiting-strategy.html
@@ -0,0 +1,470 @@
+<!DOCTYPE html>
+<html lang="en">
+ <head>
+ <meta charset="utf-8">
+ <meta http-equiv="X-UA-Compatible" content="IE=edge">
+ <meta name="viewport" content="width=device-width, initial-scale=1">
+ <!-- The above 3 meta tags *must* come first in the head; any other head
content must come *after* these tags -->
+ <title>Apache Flink: Optimising the throughput of async sinks using a
custom RateLimitingStrategy</title>
+ <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon">
+ <link rel="icon" href="/favicon.ico" type="image/x-icon">
+
+ <!-- Bootstrap -->
+ <link rel="stylesheet" href="/css/bootstrap.min.css">
+ <link rel="stylesheet" href="/css/flink.css">
+ <link rel="stylesheet" href="/css/syntax.css">
+
+ <!-- Blog RSS feed -->
+ <link href="/blog/feed.xml" rel="alternate" type="application/rss+xml"
title="Apache Flink Blog: RSS feed" />
+
+ <!-- jQuery (necessary for Bootstrap's JavaScript plugins) -->
+ <!-- We need to load Jquery in the header for custom google analytics
event tracking-->
+ <script src="/js/jquery.min.js"></script>
+
+ <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media
queries -->
+ <!-- WARNING: Respond.js doesn't work if you view the page via file:// -->
+ <!--[if lt IE 9]>
+ <script
src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
+ <script
src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
+ <![endif]-->
+ <!-- Matomo -->
+ <script>
+ var _paq = window._paq = window._paq || [];
+ /* tracker methods like "setCustomDimension" should be called before
"trackPageView" */
+ /* We explicitly disable cookie tracking to avoid privacy issues */
+ _paq.push(['disableCookies']);
+ /* Measure a visit to flink.apache.org and nightlies.apache.org/flink as
the same visit */
+ _paq.push(["setDomains",
["*.flink.apache.org","*.nightlies.apache.org/flink"]]);
+ _paq.push(['trackPageView']);
+ _paq.push(['enableLinkTracking']);
+ (function() {
+ var u="//matomo.privacy.apache.org/";
+ _paq.push(['setTrackerUrl', u+'matomo.php']);
+ _paq.push(['setSiteId', '1']);
+ var d=document, g=d.createElement('script'),
s=d.getElementsByTagName('script')[0];
+ g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s);
+ })();
+ </script>
+ <!-- End Matomo Code -->
+ </head>
+ <body>
+
+
+ <!-- Main content. -->
+ <div class="container">
+ <div class="row">
+
+
+ <div id="sidebar" class="col-sm-3">
+
+
+<!-- Top navbar. -->
+ <nav class="navbar navbar-default">
+ <!-- The logo. -->
+ <div class="navbar-header">
+ <button type="button" class="navbar-toggle collapsed"
data-toggle="collapse" data-target="#bs-example-navbar-collapse-1">
+ <span class="icon-bar"></span>
+ <span class="icon-bar"></span>
+ <span class="icon-bar"></span>
+ </button>
+ <div class="navbar-logo">
+ <a href="/">
+ <img alt="Apache Flink" src="/img/flink-header-logo.svg"
width="147px" height="73px">
+ </a>
+ </div>
+ </div><!-- /.navbar-header -->
+
+ <!-- The navigation links. -->
+ <div class="collapse navbar-collapse"
id="bs-example-navbar-collapse-1">
+ <ul class="nav navbar-nav navbar-main">
+
+ <!-- First menu section explains visitors what Flink is -->
+
+ <!-- What is Stream Processing? -->
+ <!--
+ <li><a href="/streamprocessing1.html">What is Stream
Processing?</a></li>
+ -->
+
+ <!-- What is Flink? -->
+ <li><a href="/flink-architecture.html">What is Apache
Flink?</a></li>
+
+
+
+ <!-- Stateful Functions? -->
+
+ <li><a
href="https://nightlies.apache.org/flink/flink-statefun-docs-stable/">What is
Stateful Functions?</a></li>
+
+ <!-- Flink ML? -->
+
+ <li><a
href="https://nightlies.apache.org/flink/flink-ml-docs-stable/">What is Flink
ML?</a></li>
+
+ <!-- Flink Kubernetes Operator? -->
+
+ <li><a
href="https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-stable/">What
is the Flink Kubernetes Operator?</a></li>
+
+ <!-- Flink Table Store? -->
+
+ <li><a
href="https://nightlies.apache.org/flink/flink-table-store-docs-stable/">What
is Flink Table Store?</a></li>
+
+ <!-- Use cases -->
+ <li><a href="/usecases.html">Use Cases</a></li>
+
+ <!-- Powered by -->
+ <li><a href="/poweredby.html">Powered By</a></li>
+
+
+
+ <!-- Second menu section aims to support Flink users -->
+
+ <!-- Downloads -->
+ <li><a href="/downloads.html">Downloads</a></li>
+
+ <!-- Getting Started -->
+ <li class="dropdown">
+ <a class="dropdown-toggle" data-toggle="dropdown"
href="#">Getting Started<span class="caret"></span></a>
+ <ul class="dropdown-menu">
+ <li><a
href="https://nightlies.apache.org/flink/flink-docs-release-1.16//docs/try-flink/local_installation/"
target="_blank">With Flink <small><span class="glyphicon
glyphicon-new-window"></span></small></a></li>
+ <li><a
href="https://nightlies.apache.org/flink/flink-statefun-docs-release-3.2/getting-started/project-setup.html"
target="_blank">With Flink Stateful Functions <small><span class="glyphicon
glyphicon-new-window"></span></small></a></li>
+ <li><a
href="https://nightlies.apache.org/flink/flink-ml-docs-release-2.1/try-flink-ml/quick-start.html"
target="_blank">With Flink ML <small><span class="glyphicon
glyphicon-new-window"></span></small></a></li>
+ <li><a
href="https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.2/try-flink-kubernetes-operator/quick-start.html"
target="_blank">With Flink Kubernetes Operator <small><span class="glyphicon
glyphicon-new-window"></span></small></a></li>
+ <li><a
href="https://nightlies.apache.org/flink/flink-table-store-docs-release-0.2/try-table-store/quick-start.html"
target="_blank">With Flink Table Store <small><span class="glyphicon
glyphicon-new-window"></span></small></a></li>
+ <li><a href="/training.html">Training Course</a></li>
+ </ul>
+ </li>
+
+ <!-- Documentation -->
+ <li class="dropdown">
+ <a class="dropdown-toggle" data-toggle="dropdown"
href="#">Documentation<span class="caret"></span></a>
+ <ul class="dropdown-menu">
+ <li><a
href="https://nightlies.apache.org/flink/flink-docs-release-1.16"
target="_blank">Flink 1.16 (Latest stable release) <small><span
class="glyphicon glyphicon-new-window"></span></small></a></li>
+ <li><a
href="https://nightlies.apache.org/flink/flink-docs-master"
target="_blank">Flink Master (Latest Snapshot) <small><span class="glyphicon
glyphicon-new-window"></span></small></a></li>
+ <li><a
href="https://nightlies.apache.org/flink/flink-statefun-docs-release-3.2"
target="_blank">Flink Stateful Functions 3.2 (Latest stable release)
<small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
+ <li><a
href="https://nightlies.apache.org/flink/flink-statefun-docs-master"
target="_blank">Flink Stateful Functions Master (Latest Snapshot) <small><span
class="glyphicon glyphicon-new-window"></span></small></a></li>
+ <li><a
href="https://nightlies.apache.org/flink/flink-ml-docs-release-2.1"
target="_blank">Flink ML 2.1 (Latest stable release) <small><span
class="glyphicon glyphicon-new-window"></span></small></a></li>
+ <li><a
href="https://nightlies.apache.org/flink/flink-ml-docs-master"
target="_blank">Flink ML Master (Latest Snapshot) <small><span class="glyphicon
glyphicon-new-window"></span></small></a></li>
+ <li><a
href="https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.2"
target="_blank">Flink Kubernetes Operator 1.2 (Latest stable release)
<small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
+ <li><a
href="https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main"
target="_blank">Flink Kubernetes Operator Main (Latest Snapshot) <small><span
class="glyphicon glyphicon-new-window"></span></small></a></li>
+ <li><a
href="https://nightlies.apache.org/flink/flink-table-store-docs-release-0.2"
target="_blank">Flink Table Store 0.2 (Latest stable release) <small><span
class="glyphicon glyphicon-new-window"></span></small></a></li>
+ <li><a
href="https://nightlies.apache.org/flink/flink-table-store-docs-master"
target="_blank">Flink Table Store Master (Latest Snapshot) <small><span
class="glyphicon glyphicon-new-window"></span></small></a></li>
+ </ul>
+ </li>
+
+ <!-- getting help -->
+ <li><a href="/gettinghelp.html">Getting Help</a></li>
+
+ <!-- Blog -->
+ <li><a href="/blog/"><b>Flink Blog</b></a></li>
+
+
+ <!-- Flink-packages -->
+ <li>
+ <a href="https://flink-packages.org"
target="_blank">flink-packages.org <small><span class="glyphicon
glyphicon-new-window"></span></small></a>
+ </li>
+
+
+ <!-- Third menu section aim to support community and contributors
-->
+
+ <!-- Community -->
+ <li><a href="/community.html">Community & Project Info</a></li>
+
+ <!-- Roadmap -->
+ <li><a href="/roadmap.html">Roadmap</a></li>
+
+ <!-- Contribute -->
+ <li><a href="/contributing/how-to-contribute.html">How to
Contribute</a></li>
+
+
+ <!-- GitHub -->
+ <li>
+ <a href="https://github.com/apache/flink" target="_blank">Flink
on GitHub <small><span class="glyphicon
glyphicon-new-window"></span></small></a>
+ </li>
+
+
+
+ <!-- Language Switcher -->
+ <li>
+
+
+ <a
href="/zh/2022/11/25/async-sink-rate-limiting-strategy.html">中文版</a>
+
+
+ </li>
+
+ </ul>
+
+ <style>
+ .smalllinks:link {
+ display: inline-block !important; background: none; padding-top:
0px; padding-bottom: 0px; padding-right: 0px; min-width: 75px;
+ }
+ </style>
+
+ <ul class="nav navbar-nav navbar-bottom">
+ <hr />
+
+ <!-- Twitter -->
+ <li><a href="https://twitter.com/apacheflink"
target="_blank">@ApacheFlink <small><span class="glyphicon
glyphicon-new-window"></span></small></a></li>
+
+ <!-- Visualizer -->
+ <li class=" hidden-md hidden-sm"><a href="/visualizer/"
target="_blank">Plan Visualizer <small><span class="glyphicon
glyphicon-new-window"></span></small></a></li>
+
+ <li >
+ <a href="/security.html">Flink Security</a>
+ </li>
+
+ <hr />
+
+ <li><a href="https://apache.org" target="_blank">Apache Software
Foundation <small><span class="glyphicon
glyphicon-new-window"></span></small></a></li>
+
+ <li>
+
+ <a class="smalllinks" href="https://www.apache.org/licenses/"
target="_blank">License</a> <small><span class="glyphicon
glyphicon-new-window"></span></small>
+
+ <a class="smalllinks" href="https://www.apache.org/security/"
target="_blank">Security</a> <small><span class="glyphicon
glyphicon-new-window"></span></small>
+
+ <a class="smalllinks"
href="https://www.apache.org/foundation/sponsorship.html"
target="_blank">Donate</a> <small><span class="glyphicon
glyphicon-new-window"></span></small>
+
+ <a class="smalllinks"
href="https://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a>
<small><span class="glyphicon glyphicon-new-window"></span></small>
+ </li>
+
+ </ul>
+ </div><!-- /.navbar-collapse -->
+ </nav>
+
+ </div>
+ <div class="col-sm-9">
+ <div class="row-fluid">
+ <div class="col-sm-12">
+ <div class="row">
+ <h1>Optimising the throughput of async sinks using a custom
RateLimitingStrategy</h1>
+ <p><i></i></p>
+
+ <article>
+ <p>25 Nov 2022 Hong Liang Teoh </p>
+
+<h2 id="introduction">Introduction</h2>
+
+<p>When designing a Flink data processing job, one of the key concerns is
maximising job throughput. Sink throughput is a crucial factor because it can
determine the entire job’s throughput. We generally want the highest possible
write rate in the sink without overloading the destination. However, since the
factors impacting a destination’s performance are variable over the job’s
lifetime, the sink needs to adjust its write rate dynamically. Depending on the
sink’s destination, it helps [...]
+
+<p><strong>This post explains how you can optimise sink throughput by
configuring a custom RateLimitingStrategy on a connector that builds on
the</strong> <a
href="https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink"><strong>AsyncSinkBase
(FLIP-171)</strong></a><strong>.</strong> In the sections below, we cover the
design logic behind the AsyncSinkBase and the RateLimitingStrategy, then we
take you through two example implementations of rate limiting strategies, spec
[...]
+
+<h3 id="background-of-the-asyncsinkbase">Background of the AsyncSinkBase</h3>
+
+<p>When implementing the AsyncSinkBase, our goal was to simplify building new async sinks to custom destinations by providing common async sink functionality used with at-least-once processing. This has allowed users to more easily write sinks to custom destinations, such as Amazon Kinesis Data Streams and Amazon Kinesis Firehose. An additional async sink to Amazon DynamoDB (<a href="https://cwiki.apache.org/confluence/display/FLINK/FLIP-252%3A+Amazon+DynamoDB+Sink+Connector">FLIP-252</a [...]
+
+<p>The AsyncSinkBase provides the core implementation which handles the
mechanics of async requests and responses. This includes retrying failed
messages, deciding when to flush records to the destination, and persisting
un-flushed records to state during checkpointing. In order to increase
throughput, the async sink also dynamically adjusts the request rate depending
on the destination’s responses. Read more about this in our <a
href="https://flink.apache.org/2022/05/06/async-sink-base. [...]
+
+<h3 id="configuring-the-asyncsinkbase">Configuring the AsyncSinkBase</h3>
+
+<p>When designing the AsyncSinkBase, we wanted users to be able to tune their
custom connector implementations based on their use case and needs, without
having to understand the low-level workings of the base sink itself.</p>
+
+<p>So, as part of our initial implementation in Flink 1.15, we exposed
configurations such as <code>maxBatchSize</code>,
<code>maxInFlightRequests</code>, <code>maxBufferedRequests</code>,
<code>maxBatchSizeInBytes</code>, <code>maxTimeInBufferMS</code> and
<code>maxRecordSizeInBytes</code> so that users can adapt the flushing and
writing behaviour of the sink.</p>
+
+<p>In Flink 1.16, we have further extended this configurability to the
RateLimitingStrategy used by the AsyncSinkBase (<a
href="https://cwiki.apache.org/confluence/display/FLINK/FLIP-242%3A+Introduce+configurable+RateLimitingStrategy+for+Async+Sink">FLIP-242</a>).
With this change, users can now customise how the AsyncSinkBase dynamically
adjusts the request rate in real-time to optimise throughput whilst mitigating
back pressure. Example customisations include changing the mathematical [...]
+
+<h2 id="rationale-behind-the-ratelimitingstrategy-interface">Rationale behind
the RateLimitingStrategy interface</h2>
+
+<div class="highlight"><pre><code class="language-java"><span
class="kd">public</span> <span class="kd">interface</span> <span
class="nc">RateLimitingStrategy</span> <span class="o">{</span>
+
+ <span class="c1">// Information provided to the RateLimitingStrategy</span>
+ <span class="kt">void</span> <span
class="nf">registerInFlightRequest</span><span class="o">(</span><span
class="n">RequestInfo</span> <span class="n">requestInfo</span><span
class="o">);</span>
+ <span class="kt">void</span> <span
class="nf">registerCompletedRequest</span><span class="o">(</span><span
class="n">ResultInfo</span> <span class="n">resultInfo</span><span
class="o">);</span>
+
+ <span class="c1">// Controls offered to the RateLimitingStrategy</span>
+ <span class="kt">boolean</span> <span class="nf">shouldBlock</span><span
class="o">(</span><span class="n">RequestInfo</span> <span
class="n">requestInfo</span><span class="o">);</span>
+ <span class="kt">int</span> <span class="nf">getMaxBatchSize</span><span
class="o">();</span>
+
+<span class="o">}</span></code></pre></div>
+
+<p>There are two core ideas behind the RateLimitingStrategy interface:</p>
+
+<ul>
+  <li><strong>Information methods:</strong> We need methods to provide the RateLimitingStrategy with sufficient information to track the rate of requests or the rate of sent messages (each request can comprise multiple messages).</li>
+ <li><strong>Control methods:</strong> We also need methods to allow the
RateLimitingStrategy to control the sink’s request rate.</li>
+</ul>
+
+<p>These are the types of methods we see in the RateLimitingStrategy interface. With <code>registerInFlightRequest()</code> and <code>registerCompletedRequest()</code>, the RateLimitingStrategy has sufficient information to track the number of in-flight requests and messages, as well as the rate of these requests.</p>
+
+<p>With <code>shouldBlock()</code>, the RateLimitingStrategy can decide to
postpone new requests until a specified condition is met (e.g. current
in-flight requests must not exceed a given number). This allows the
RateLimitingStrategy to control the rate of requests to the destination. It can
decide to increase throughput or to increase backpressure in the Flink job
graph.</p>
+
+<p>With <code>getMaxBatchSize()</code>, the RateLimitingStrategy can
dynamically adjust the number of messages packaged into a single request. This
can be useful to optimise sink throughput if the request size affects the
destination’s performance.</p>
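
<p>To make the interplay of these methods concrete, the following is a minimal, self-contained sketch of a strategy that simply caps concurrent requests and uses a fixed batch size. Note that the <code>RequestInfo</code> and <code>ResultInfo</code> interfaces below are simplified stand-ins for Flink’s actual types, and <code>FixedCapRateLimitingStrategy</code> is a hypothetical example rather than part of Flink:</p>

```java
// Simplified stand-ins for Flink's RequestInfo/ResultInfo, for illustration only.
interface RequestInfo { int getBatchSize(); }
interface ResultInfo { int getBatchSize(); int getFailedMessages(); }

// Hypothetical strategy: block once a fixed number of requests is in flight,
// and always hand the sink the same maximum batch size.
public class FixedCapRateLimitingStrategy {
    private final int maxInFlightRequests;
    private final int maxBatchSize;
    private int currentInFlightRequests;

    public FixedCapRateLimitingStrategy(int maxInFlightRequests, int maxBatchSize) {
        this.maxInFlightRequests = maxInFlightRequests;
        this.maxBatchSize = maxBatchSize;
    }

    // Information methods: the sink tells the strategy what it is doing.
    public void registerInFlightRequest(RequestInfo requestInfo) {
        currentInFlightRequests++;
    }

    public void registerCompletedRequest(ResultInfo resultInfo) {
        currentInFlightRequests = Math.max(0, currentInFlightRequests - 1);
    }

    // Control methods: the strategy tells the sink how to behave.
    public boolean shouldBlock(RequestInfo requestInfo) {
        return currentInFlightRequests >= maxInFlightRequests;
    }

    public int getMaxBatchSize() {
        return maxBatchSize;
    }
}
```

<p>Even this toy version shows the division of labour: the information methods only update counters, while the control methods turn those counters into a block/proceed decision.</p>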
+
+<h2 id="implementing-a-custom-ratelimitingstrategy">Implementing a custom
RateLimitingStrategy</h2>
+
+<h3 id="example-1--congestioncontrolratelimitingstrategy">[Example 1]
CongestionControlRateLimitingStrategy</h3>
+
+<p>The AsyncSinkBase comes pre-packaged with the
CongestionControlRateLimitingStrategy. In this section, we explore its
implementation.</p>
+
+<p>This strategy is modelled after <a
href="https://en.wikipedia.org/wiki/TCP_congestion_control">TCP congestion
control</a>, and aims to discover a destination’s highest possible request
rate. It achieves this by increasing the request rate until the sink is
throttled by the destination, at which point it will reduce the request
rate.</p>
+
+<p>In this RateLimitingStrategy, we want to dynamically adjust the request
rate by:</p>
+
+<ul>
+ <li>Setting a maximum number of in-flight requests at any time</li>
+ <li>Setting a maximum number of in-flight messages at any time (each request
can comprise multiple messages)</li>
+ <li>Increasing the maximum number of in-flight messages after each
successful request, to maximise the request rate</li>
+ <li>Decreasing the maximum number of in-flight messages after an
unsuccessful request, to prevent overloading the destination</li>
+ <li>Independently keeping track of the maximum number of in-flight messages
if there are multiple sink subtasks</li>
+</ul>
+
+<p>This strategy means we will start with a low request rate (slow start), but aggressively increase it until the destination throttles us, which allows us to discover the highest possible request rate. It will also adjust the request rate if the conditions of the destination change (e.g. another client starts writing to the same destination). This strategy works well if destinations implement traffic shaping and throttle once the bandwidth limit is reached (e.g. Amazon Kinesis Data St [...]
+
+<p>First, we implement the information methods to keep track of the number of
in-flight requests and in-flight messages.</p>
+
+<div class="highlight"><pre><code class="language-java"><span
class="kd">public</span> <span class="kd">class</span> <span
class="nc">CongestionControlRateLimitingStrategy</span> <span
class="kd">implements</span> <span class="n">RateLimitingStrategy</span> <span
class="o">{</span>
+ <span class="c1">// ...</span>
+ <span class="nd">@Override</span>
+ <span class="kd">public</span> <span class="kt">void</span> <span
class="nf">registerInFlightRequest</span><span class="o">(</span><span
class="n">RequestInfo</span> <span class="n">requestInfo</span><span
class="o">)</span> <span class="o">{</span>
+ <span class="n">currentInFlightRequests</span><span
class="o">++;</span>
+ <span class="n">currentInFlightMessages</span> <span
class="o">+=</span> <span class="n">requestInfo</span><span
class="o">.</span><span class="na">getBatchSize</span><span class="o">();</span>
+ <span class="o">}</span>
+
+ <span class="nd">@Override</span>
+ <span class="kd">public</span> <span class="kt">void</span> <span
class="nf">registerCompletedRequest</span><span class="o">(</span><span
class="n">ResultInfo</span> <span class="n">resultInfo</span><span
class="o">)</span> <span class="o">{</span>
+ <span class="n">currentInFlightRequests</span> <span
class="o">=</span> <span class="n">Math</span><span class="o">.</span><span
class="na">max</span><span class="o">(</span><span class="mi">0</span><span
class="o">,</span> <span class="n">currentInFlightRequests</span> <span
class="o">-</span> <span class="mi">1</span><span class="o">);</span>
+ <span class="n">currentInFlightMessages</span> <span
class="o">=</span> <span class="n">Math</span><span class="o">.</span><span
class="na">max</span><span class="o">(</span><span class="mi">0</span><span
class="o">,</span> <span class="n">currentInFlightMessages</span> <span
class="o">-</span> <span class="n">resultInfo</span><span
class="o">.</span><span class="na">getBatchSize</span><span
class="o">());</span>
+
+ <span class="k">if</span> <span class="o">(</span><span
class="n">resultInfo</span><span class="o">.</span><span
class="na">getFailedMessages</span><span class="o">()</span> <span
class="o">></span> <span class="mi">0</span><span class="o">)</span> <span
class="o">{</span>
+ <span class="n">maxInFlightMessages</span> <span
class="o">=</span> <span class="n">scalingStrategy</span><span
class="o">.</span><span class="na">scaleDown</span><span
class="o">(</span><span class="n">maxInFlightMessages</span><span
class="o">);</span>
+ <span class="o">}</span> <span class="k">else</span> <span
class="o">{</span>
+ <span class="n">maxInFlightMessages</span> <span
class="o">=</span> <span class="n">scalingStrategy</span><span
class="o">.</span><span class="na">scaleUp</span><span class="o">(</span><span
class="n">maxInFlightMessages</span><span class="o">);</span>
+ <span class="o">}</span>
+ <span class="o">}</span>
+ <span class="c1">// ...</span>
+<span class="o">}</span></code></pre></div>
+
+<p>Then we implement the control methods to dynamically adjust the request
rate.</p>
+
+<p>We keep a current value for maxInFlightMessages and maxInFlightRequests,
and postpone all new requests if maxInFlightRequests or maxInFlightMessages
have been reached.</p>
+
+<p>Every time a request completes, the CongestionControlRateLimitingStrategy
will check if there are any failed messages in the response. If there are, it
will decrease maxInFlightMessages. If there are no failed messages, it will
increase maxInFlightMessages. This gives us indirect control of the rate of
messages being written to the destination.</p>
+
+<p>Side note: The default CongestionControlRateLimitingStrategy uses an
Additive Increase / Multiplicative Decrease (AIMD) scaling strategy. This is
also used in TCP congestion control to avoid overloading the destination by
increasing write rate slowly but backing off quickly if throttled.</p>
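
<p>To illustrate the AIMD idea in isolation, here is a self-contained sketch. Flink ships its own AIMD scaling strategy; the class below is a simplified, hypothetical version for illustration only:</p>

```java
// Hypothetical AIMD sketch: grow additively on success, cut multiplicatively on failure.
public class AimdScalingStrategy {
    private final int increaseStep;       // additive increase applied on success
    private final double decreaseFactor;  // multiplicative decrease applied on failure
    private final int upperBound;         // never scale above this limit

    public AimdScalingStrategy(int increaseStep, double decreaseFactor, int upperBound) {
        this.increaseStep = increaseStep;
        this.decreaseFactor = decreaseFactor;
        this.upperBound = upperBound;
    }

    // Called after a fully successful request: increase the cap slowly.
    public int scaleUp(int maxInFlightMessages) {
        return Math.min(upperBound, maxInFlightMessages + increaseStep);
    }

    // Called after a request with failed messages: back off quickly.
    public int scaleDown(int maxInFlightMessages) {
        return Math.max(1, (int) (maxInFlightMessages * decreaseFactor));
    }
}
```

<p>With, say, an increase step of 10 and a decrease factor of 0.5, each successful request raises the in-flight message cap by 10 (up to the upper bound), while each throttled request halves it, mirroring TCP congestion control.</p>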
+
+<div class="highlight"><pre><code class="language-java"><span
class="kd">public</span> <span class="kd">class</span> <span
class="nc">CongestionControlRateLimitingStrategy</span> <span
class="kd">implements</span> <span class="n">RateLimitingStrategy</span> <span
class="o">{</span>
+ <span class="c1">// ...</span>
+ <span class="nd">@Override</span>
+ <span class="kd">public</span> <span class="kt">void</span> <span
class="nf">registerCompletedRequest</span><span class="o">(</span><span
class="n">ResultInfo</span> <span class="n">resultInfo</span><span
class="o">)</span> <span class="o">{</span>
+ <span class="c1">// ...</span>
+ <span class="k">if</span> <span class="o">(</span><span
class="n">resultInfo</span><span class="o">.</span><span
class="na">getFailedMessages</span><span class="o">()</span> <span
class="o">></span> <span class="mi">0</span><span class="o">)</span> <span
class="o">{</span>
+ <span class="n">maxInFlightMessages</span> <span
class="o">=</span> <span class="n">scalingStrategy</span><span
class="o">.</span><span class="na">scaleDown</span><span
class="o">(</span><span class="n">maxInFlightMessages</span><span
class="o">);</span>
+ <span class="o">}</span> <span class="k">else</span> <span
class="o">{</span>
+ <span class="n">maxInFlightMessages</span> <span
class="o">=</span> <span class="n">scalingStrategy</span><span
class="o">.</span><span class="na">scaleUp</span><span class="o">(</span><span
class="n">maxInFlightMessages</span><span class="o">);</span>
+ <span class="o">}</span>
+ <span class="o">}</span>
+
+ <span class="kd">public</span> <span class="kt">boolean</span> <span
class="nf">shouldBlock</span><span class="o">(</span><span
class="n">RequestInfo</span> <span class="n">requestInfo</span><span
class="o">)</span> <span class="o">{</span>
+ <span class="k">return</span> <span
class="n">currentInFlightRequests</span> <span class="o">>=</span> <span
class="n">maxInFlightRequests</span>
+ <span class="o">||</span> <span class="o">(</span><span
class="n">currentInFlightMessages</span> <span class="o">+</span> <span
class="n">requestInfo</span><span class="o">.</span><span
class="na">getBatchSize</span><span class="o">()</span> <span
class="o">></span> <span class="n">maxInFlightMessages</span><span
class="o">);</span>
+ <span class="o">}</span>
+ <span class="c1">// ...</span>
+<span class="o">}</span></code></pre></div>
+
+<h3 id="example-2-tokenbucketratelimitingstrategy">[Example 2]
TokenBucketRateLimitingStrategy</h3>
+
+<p>The CongestionControlRateLimitingStrategy is rather aggressive, and relies on the destination having robust server-side rate limiting. If the destination does not, we can implement a client-side rate limiting strategy instead.</p>
+
+<p>As an example, we can look at the <a
href="https://en.wikipedia.org/wiki/Token_bucket">token bucket rate limiting
strategy</a>. This strategy allows us to set the exact rate of the sink (e.g.
requests per second, messages per second). If the limits are set correctly, we
will avoid overloading the destination altogether.</p>
+
+<p>In this strategy, we want to do the following:</p>
+
+<ul>
+ <li>Implement a TokenBucket that has a given initial number of tokens (e.g.
10). These tokens refill at a given rate (e.g. 1 token per second).</li>
+ <li>When preparing an async request, we check if the token bucket has
sufficient tokens. If not, we postpone the request.</li>
+</ul>
+
+<p>Let’s look at an example implementation:</p>
+
+<div class="highlight"><pre><code class="language-java"><span
class="kd">public</span> <span class="kd">class</span> <span
class="nc">TokenBucketRateLimitingStrategy</span> <span
class="kd">implements</span> <span class="n">RateLimitingStrategy</span> <span
class="o">{</span>
+
+ <span class="kd">private</span> <span class="kd">final</span> <span
class="n">Bucket</span> <span class="n">bucket</span><span class="o">;</span>
+
+ <span class="kd">public</span> <span
class="nf">TokenBucketRateLimitingStrategy</span><span class="o">()</span>
<span class="o">{</span>
+ <span class="n">Refill</span> <span class="n">refill</span> <span
class="o">=</span> <span class="n">Refill</span><span class="o">.</span><span
class="na">intervally</span><span class="o">(</span><span
class="mi">1</span><span class="o">,</span> <span
class="n">Duration</span><span class="o">.</span><span
class="na">ofSeconds</span><span class="o">(</span><span
class="mi">1</span><span class="o">));</span>
+ <span class="n">Bandwidth</span> <span class="n">limit</span> <span
class="o">=</span> <span class="n">Bandwidth</span><span
class="o">.</span><span class="na">classic</span><span class="o">(</span><span
class="mi">10</span><span class="o">,</span> <span class="n">refill</span><span
class="o">);</span>
+ <span class="k">this</span><span class="o">.</span><span
class="na">bucket</span> <span class="o">=</span> <span
class="n">Bucket4j</span><span class="o">.</span><span
class="na">builder</span><span class="o">()</span>
+ <span class="o">.</span><span class="na">addLimit</span><span
class="o">(</span><span class="n">limit</span><span class="o">)</span>
+ <span class="o">.</span><span class="na">build</span><span
class="o">();</span>
+ <span class="o">}</span>
+
+ <span class="c1">// ... (information methods not needed)</span>
+
+ <span class="nd">@Override</span>
+ <span class="kd">public</span> <span class="kt">boolean</span> <span
class="nf">shouldBlock</span><span class="o">(</span><span
class="n">RequestInfo</span> <span class="n">requestInfo</span><span
class="o">)</span> <span class="o">{</span>
+        <span class="c1">// Block when the bucket cannot immediately supply enough tokens</span>
+        <span class="k">return</span> <span class="o">!</span><span class="n">bucket</span><span class="o">.</span><span class="na">tryConsume</span><span class="o">(</span><span class="n">requestInfo</span><span class="o">.</span><span class="na">getBatchSize</span><span class="o">());</span>
+    <span class="o">}</span>
+
+<span class="o">}</span></code></pre></div>
+
+<p>In the above example, we use the <a
href="https://github.com/bucket4j/bucket4j">Bucket4j</a> library’s Token Bucket
implementation. We also map 1 message to 1 token. Since our token bucket has a
size of 10 tokens and a refill rate of 1 token per second, we can be sure that
we will not exceed a burst of 10 messages, and will also not exceed a constant
rate of 1 message per second.</p>
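
<p>If pulling in a dependency is undesirable, the same behaviour can be approximated with a small hand-rolled bucket. The class below is a hypothetical, single-threaded sketch (not Bucket4j and not part of Flink); a production version would also need to consider thread safety:</p>

```java
// Minimal single-threaded token bucket, for illustration only.
public class SimpleTokenBucket {
    private final long capacity;          // maximum burst size, in tokens
    private final double refillPerSecond; // steady-state refill rate
    private double tokens;                // current token count
    private long lastRefillNanos;

    public SimpleTokenBucket(long capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity;           // start full, allowing an initial burst
        this.lastRefillNanos = System.nanoTime();
    }

    // Consumes n tokens and returns true if they are available; otherwise false.
    public boolean tryConsume(long n) {
        refill();
        if (tokens >= n) {
            tokens -= n;
            return true;
        }
        return false;
    }

    // Top up tokens proportionally to the time elapsed, capped at capacity.
    private void refill() {
        long now = System.nanoTime();
        double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond);
        lastRefillNanos = now;
    }
}
```

<p>With a capacity of 10 and a refill rate of 1 token per second, this reproduces the burst and steady-state limits described above.</p>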
+
+<p>This would be useful if we know that our destination will fail ungracefully when a rate of 1 message per second is exceeded, or if we intentionally want to limit our sink’s throughput to provide higher bandwidth for other clients writing to the same destination.</p>
+
+<h2 id="specifying-a-custom-ratelimitingstrategy">Specifying a custom
RateLimitingStrategy</h2>
+
+<p>To use a custom RateLimitingStrategy, we have to specify it in the AsyncSinkWriterConfiguration, which is passed into the constructor of the AsyncSinkWriter. For example:</p>
+
+<div class="highlight"><pre><code class="language-java"><span
class="kd">class</span> <span class="nc">MyCustomSinkWriter</span><span
class="o"><</span><span class="n">InputT</span><span class="o">></span>
<span class="kd">extends</span> <span class="n">AsyncSinkWriter</span><span
class="o"><</span><span class="n">InputT</span><span class="o">,</span>
<span class="n">MyCustomRequestEntry</span><span class="o">></span> <span
class="o">{</span>
+
+ <span class="n">MyCustomSinkWriter</span><span class="o">(</span>
+ <span class="n">ElementConverter</span><span
class="o"><</span><span class="n">InputT</span><span class="o">,</span>
<span class="n">MyCustomRequestEntry</span><span class="o">></span> <span
class="n">elementConverter</span><span class="o">,</span>
+ <span class="n">Sink</span><span class="o">.</span><span
class="na">InitContext</span> <span class="n">context</span><span
class="o">,</span>
+ <span class="n">Collection</span><span class="o"><</span><span
class="n">BufferedRequestState</span><span class="o"><</span><span
class="n">MyCustomRequestEntry</span><span class="o">>></span> <span
class="n">states</span><span class="o">)</span> <span class="o">{</span>
+ <span class="kd">super</span><span class="o">(</span>
+ <span class="n">elementConverter</span><span class="o">,</span>
+ <span class="n">context</span><span class="o">,</span>
+ <span class="n">AsyncSinkWriterConfiguration</span><span
class="o">.</span><span class="na">builder</span><span class="o">()</span>
+ <span class="c1">// ...</span>
+ <span class="o">.</span><span
class="na">setRateLimitingStrategy</span><span class="o">(</span><span
class="k">new</span> <span
class="nf">TokenBucketRateLimitingStrategy</span><span class="o">())</span>
+ <span class="o">.</span><span class="na">build</span><span
class="o">(),</span>
+ <span class="n">states</span><span class="o">);</span>
+ <span class="o">}</span>
+
+<span class="o">}</span></code></pre></div>
+
+<h2 id="summary">Summary</h2>
+
+<p>From Apache Flink 1.16, we can customise the RateLimitingStrategy used to
dynamically adjust the behaviour of the Async Sink at runtime. This allows
users to tune their connector implementations based on specific use cases and
needs, without having to understand the base sink’s low-level workings.</p>
+
+<p>We hope this extension will be useful for you. If you have any feedback,
feel free to reach out!</p>
+
+
+ </article>
+ </div>
+
+ <div class="row">
+ <div id="disqus_thread"></div>
+ <script type="text/javascript">
+ /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE
* * */
+ var disqus_shortname = 'stratosphere-eu'; // required: replace example
with your forum shortname
+
+ /* * * DON'T EDIT BELOW THIS LINE * * */
+ (function() {
+ var dsq = document.createElement('script'); dsq.type =
'text/javascript'; dsq.async = true;
+ dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
+ (document.getElementsByTagName('head')[0] ||
document.getElementsByTagName('body')[0]).appendChild(dsq);
+ })();
+ </script>
+ </div>
+ </div>
+</div>
+ </div>
+ </div>
+
+ <hr />
+
+ <div class="row">
+ <div class="footer text-center col-sm-12">
+ <p>Copyright © 2014-2022 <a href="http://apache.org">The Apache
Software Foundation</a>. All Rights Reserved.</p>
+ <p>Apache Flink, Flink®, Apache®, the squirrel logo, and the Apache
feather logo are either registered trademarks or trademarks of The Apache
Software Foundation.</p>
+ <p><a
href="https://privacy.apache.org/policies/privacy-policy-public.html">Privacy
Policy</a> · <a href="/blog/feed.xml">RSS feed</a></p>
+ </div>
+ </div>
+ </div><!-- /.container -->
+
+ <!-- Include all compiled plugins (below), or include individual files as
needed -->
+ <script src="/js/jquery.matchHeight-min.js"></script>
+ <script src="/js/bootstrap.min.js"></script>
+ <script src="/js/codetabs.js"></script>
+ <script src="/js/stickysidebar.js"></script>
+ </body>
+</html>
diff --git a/content/blog/feed.xml b/content/blog/feed.xml
index 081697646..7ea5b9629 100644
--- a/content/blog/feed.xml
+++ b/content/blog/feed.xml
@@ -6,6 +6,200 @@
<link>https://flink.apache.org/blog</link>
<atom:link href="https://flink.apache.org/blog/feed.xml" rel="self"
type="application/rss+xml" />
+<item>
+<title>Optimising the throughput of async sinks using a custom
RateLimitingStrategy</title>
+<description><h2 id="introduction">Introduction</h2>
+
+<p>When designing a Flink data processing job, one of the key concerns
is maximising job throughput. Sink throughput is a crucial factor because it
can determine the entire job’s throughput. We generally want the highest
possible write rate in the sink without overloading the destination. However,
since the factors impacting a destination’s performance are variable over the
job’s lifetime, the sink needs to adjust its write rate dynamically. Depending
on the sink’s destination, it [...]
+
+<p><strong>This post explains how you can optimise sink throughput
by configuring a custom RateLimitingStrategy on a connector that builds on
the</strong> <a
href="https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink"><strong>AsyncSinkBase
(FLIP-171)</strong></a><strong>.</strong> In the
sections below, we cover the design logic behind the AsyncSinkBase and the
RateLimitingStrategy, then we take you throu [...]
+
+<h3 id="background-of-the-asyncsinkbase">Background of the
AsyncSinkBase</h3>
+
+<p>When implementing the AsyncSinkBase, our goal was to simplify
building new async sinks to custom destinations by providing common async sink
functionality used with at-least-once processing. This has allowed users to
more easily write sinks to custom destinations, such as Amazon Kinesis Data
Streams and Amazon Kinesis Firehose. An additional async sink to Amazon
DynamoDB (<a
href="https://cwiki.apache.org/confluence/display/FLINK/FLIP-252%3A+Amazon+DynamoDB+Sink+Connecto
[...]
+
+<p>The AsyncSinkBase provides the core implementation which handles the
mechanics of async requests and responses. This includes retrying failed
messages, deciding when to flush records to the destination, and persisting
un-flushed records to state during checkpointing. In order to increase
throughput, the async sink also dynamically adjusts the request rate depending
on the destination’s responses. Read more about this in our <a
href="https://flink.apache.org/2022/05/06/as [...]
+
+<h3 id="configuring-the-asyncsinkbase">Configuring the
AsyncSinkBase</h3>
+
+<p>When designing the AsyncSinkBase, we wanted users to be able to tune
their custom connector implementations based on their use case and needs,
without having to understand the low-level workings of the base sink
itself.</p>
+
+<p>So, as part of our initial implementation in Flink 1.15, we exposed
configurations such as <code>maxBatchSize</code>,
<code>maxInFlightRequests</code>,
<code>maxBufferedRequests</code>,
<code>maxBatchSizeInBytes</code>,
<code>maxTimeInBufferMS</code> and
<code>maxRecordSizeInBytes</code> so that users can adapt the
flushing and writing behaviour of the sink.</p>
+
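<p>As a hypothetical illustration of these options (the values below are
placeholders, and we assume the builder setters mirror the option names), they
are applied through the AsyncSinkWriterConfiguration builder:</p>

```java
// Illustrative only: placeholder values; setter names assumed to mirror the options above.
AsyncSinkWriterConfiguration configuration =
        AsyncSinkWriterConfiguration.builder()
                .setMaxBatchSize(500)                      // records per batched request
                .setMaxInFlightRequests(50)                // concurrent async requests
                .setMaxBufferedRequests(10_000)            // records buffered before backpressure
                .setMaxBatchSizeInBytes(5 * 1_024 * 1_024) // at most 5 MiB per request
                .setMaxTimeInBufferMS(5_000)               // flush at least every 5 seconds
                .setMaxRecordSizeInBytes(1_024 * 1_024)    // reject records above 1 MiB
                .build();
```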
+<p>In Flink 1.16, we have further extended this configurability to the
RateLimitingStrategy used by the AsyncSinkBase (<a
href="https://cwiki.apache.org/confluence/display/FLINK/FLIP-242%3A+Introduce+configurable+RateLimitingStrategy+for+Async+Sink">FLIP-242</a>).
With this change, users can now customise how the AsyncSinkBase dynamically
adjusts the request rate in real-time to optimise throughput whilst mitigating
back pressure. Example customisations includ [...]
+
+<h2
id="rationale-behind-the-ratelimitingstrategy-interface">Rationale
behind the RateLimitingStrategy interface</h2>
+
+<div class="highlight"><pre><code
class="language-java"><span
class="kd">public</span> <span
class="kd">interface</span> <span
class="nc">RateLimitingStrategy</span> <span
class="o">{</span>
+
+ <span class="c1">// Information provided to the
RateLimitingStrategy</span>
+ <span class="kt">void</span> <span
class="nf">registerInFlightRequest</span><span
class="o">(</span><span
class="n">RequestInfo</span> <span
class="n">requestInfo</span><span
class="o">);</span>
+ <span class="kt">void</span> <span
class="nf">registerCompletedRequest</span><span
class="o">(</span><span
class="n">ResultInfo</span> <span
class="n">resultInfo</span><span
class="o">);</span>
+
+ <span class="c1">// Controls offered to the
RateLimitingStrategy</span>
+ <span class="kt">boolean</span> <span
class="nf">shouldBlock</span><span
class="o">(</span><span
class="n">RequestInfo</span> <span
class="n">requestInfo</span><span
class="o">);</span>
+ <span class="kt">int</span> <span
class="nf">getMaxBatchSize</span><span
class="o">();</span>
+
+<span
class="o">}</span></code></pre></div>
+
+<p>There are two core ideas behind the RateLimitingStrategy
interface:</p>
+
+<ul>
+  <li><strong>Information methods:</strong> We need methods
to provide the RateLimitingStrategy with sufficient information to track the
rate of requests or the rate of sent messages (each request can comprise
multiple messages).</li>
+  <li><strong>Control methods:</strong> We also need methods
to allow the RateLimitingStrategy to control the sink’s request rate.</li>
+</ul>
+
+<p>These are the types of methods that we see in the RateLimitingStrategy
interface. With <code>registerInFlightRequest()</code> and
<code>registerCompletedRequest()</code>, the RateLimitingStrategy
has sufficient information to track the number of in-flight requests and
messages, as well as the rate of these requests.</p>
+
+<p>With <code>shouldBlock()</code>, the RateLimitingStrategy
can decide to postpone new requests until a specified condition is met (e.g.
current in-flight requests must not exceed a given number). This allows the
RateLimitingStrategy to control the rate of requests to the destination. It can
decide to increase throughput or to increase backpressure in the Flink job
graph.</p>
+
+<p>With <code>getMaxBatchSize()</code>, the
RateLimitingStrategy can dynamically adjust the number of messages packaged
into a single request. This can be useful to optimise sink throughput if the
request size affects the destination’s performance.</p>
+
+<h2
id="implementing-a-custom-ratelimitingstrategy">Implementing a
custom RateLimitingStrategy</h2>
+
+<h3
id="example-1--congestioncontrolratelimitingstrategy">[Example 1]
CongestionControlRateLimitingStrategy</h3>
+
+<p>The AsyncSinkBase comes pre-packaged with the
CongestionControlRateLimitingStrategy. In this section, we explore its
implementation.</p>
+
+<p>This strategy is modelled after <a
href="https://en.wikipedia.org/wiki/TCP_congestion_control">TCP
congestion control</a>, and aims to discover a destination’s highest
possible request rate. It achieves this by increasing the request rate until
the sink is throttled by the destination, at which point it will reduce the
request rate.</p>
+
+<p>In this RateLimitingStrategy, we want to dynamically adjust the
request rate by:</p>
+
+<ul>
+ <li>Setting a maximum number of in-flight requests at any
time</li>
+ <li>Setting a maximum number of in-flight messages at any time (each
request can comprise multiple messages)</li>
+ <li>Increasing the maximum number of in-flight messages after each
successful request, to maximise the request rate</li>
+ <li>Decreasing the maximum number of in-flight messages after an
unsuccessful request, to prevent overloading the destination</li>
+ <li>Independently keeping track of the maximum number of in-flight
messages if there are multiple sink subtasks</li>
+</ul>
+
+<p>This strategy means we will start with a low request rate (slow
start), but aggressively increase it until the destination throttles us, which
allows us to discover the highest possible request rate. It will also adjust
the request rate if the conditions of the destination change (e.g. another
client starts writing to the same destination). This strategy works well if
destinations implement traffic shaping and throttle once the bandwidth limit
is reached (e.g. Amazon Kinesis D [...]
+
+<p>First, we implement the information methods to keep track of the
number of in-flight requests and in-flight messages.</p>
+
+<div class="highlight"><pre><code
class="language-java"><span
class="kd">public</span> <span
class="kd">class</span> <span
class="nc">CongestionControlRateLimitingStrategy</span>
<span class="kd">implements</span> <span
class="n">RateLimitingStrategy</span> <span
class="o">{</span>
+ <span class="c1">// ...</span>
+ <span class="nd">@Override</span>
+ <span class="kd">public</span> <span
class="kt">void</span> <span
class="nf">registerInFlightRequest</span><span
class="o">(</span><span
class="n">RequestInfo</span> <span
class="n">requestInfo</span><span
class="o">)</span> <span
class="o">{</span>
+ <span
class="n">currentInFlightRequests</span><span
class="o">++;</span>
+ <span class="n">currentInFlightMessages</span>
<span class="o">+=</span> <span
class="n">requestInfo</span><span
class="o">.</span><span
class="na">getBatchSize</span><span
class="o">();</span>
+ <span class="o">}</span>
+
+ <span class="nd">@Override</span>
+ <span class="kd">public</span> <span
class="kt">void</span> <span
class="nf">registerCompletedRequest</span><span
class="o">(</span><span
class="n">ResultInfo</span> <span
class="n">resultInfo</span><span
class="o">)</span> <span
class="o">{</span>
+ <span class="n">currentInFlightRequests</span>
<span class="o">=</span> <span
class="n">Math</span><span
class="o">.</span><span
class="na">max</span><span
class="o">(</span><span
class="mi">0</span><span
class="o">,</span> <span
class="n">currentInFlightRequests</span> <span class=
[...]
+ <span class="n">currentInFlightMessages</span>
<span class="o">=</span> <span
class="n">Math</span><span
class="o">.</span><span
class="na">max</span><span
class="o">(</span><span
class="mi">0</span><span
class="o">,</span> <span
class="n">currentInFlightMessages</span> <span class=
[...]
+
+ <span class="k">if</span> <span
class="o">(</span><span
class="n">resultInfo</span><span
class="o">.</span><span
class="na">getFailedMessages</span><span
class="o">()</span> <span
class="o">&gt;</span> <span
class="mi">0</span><span
class="o">)</span> <span class="o"&g [...]
+ <span class="n">maxInFlightMessages</span>
<span class="o">=</span> <span
class="n">scalingStrategy</span><span
class="o">.</span><span
class="na">scaleDown</span><span
class="o">(</span><span
class="n">maxInFlightMessages</span><span
class="o">);</span>
+ <span class="o">}</span> <span
class="k">else</span> <span
class="o">{</span>
+ <span class="n">maxInFlightMessages</span>
<span class="o">=</span> <span
class="n">scalingStrategy</span><span
class="o">.</span><span
class="na">scaleUp</span><span
class="o">(</span><span
class="n">maxInFlightMessages</span><span
class="o">);</span>
+ <span class="o">}</span>
+ <span class="o">}</span>
+ <span class="c1">// ...</span>
+<span
class="o">}</span></code></pre></div>
+
+<p>Then we implement the control methods to dynamically adjust the
request rate.</p>
+
+<p>We keep a current value for maxInFlightMessages and
maxInFlightRequests, and postpone all new requests if either maxInFlightRequests
or maxInFlightMessages has been reached.</p>
+
+<p>Every time a request completes, the
CongestionControlRateLimitingStrategy will check if there are any failed
messages in the response. If there are, it will decrease maxInFlightMessages.
If there are no failed messages, it will increase maxInFlightMessages. This
gives us indirect control of the rate of messages being written to the
destination.</p>
+
+<p>Side note: The default CongestionControlRateLimitingStrategy uses an
Additive Increase / Multiplicative Decrease (AIMD) scaling strategy. This is
also used in TCP congestion control to avoid overloading the destination by
increasing the write rate slowly but backing off quickly if throttled.</p>
+
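<p>The AIMD idea can be sketched in a few lines. The constants below are
hypothetical placeholders; Flink’s built-in scaling strategy differs in its
defaults and configuration:</p>

```java
// Hypothetical AIMD sketch: add a constant on success, multiply down on failure.
public final class AimdSketch {

    private static final int INCREASE_STEP = 10;       // additive increase per successful request
    private static final double DECREASE_FACTOR = 0.5; // multiplicative decrease when throttled
    private static final int MIN_MESSAGES = 1;
    private static final int MAX_MESSAGES = 10_000;

    /** Additive increase after a fully successful request. */
    public static int scaleUp(int maxInFlightMessages) {
        return Math.min(MAX_MESSAGES, maxInFlightMessages + INCREASE_STEP);
    }

    /** Multiplicative decrease after a request containing failed messages. */
    public static int scaleDown(int maxInFlightMessages) {
        return Math.max(MIN_MESSAGES, (int) (maxInFlightMessages * DECREASE_FACTOR));
    }
}
```

<p>Slow additive growth probes for spare capacity, while halving on failure backs
off quickly enough to relieve an overloaded destination.</p>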
+<div class="highlight"><pre><code
class="language-java"><span
class="kd">public</span> <span
class="kd">class</span> <span
class="nc">CongestionControlRateLimitingStrategy</span>
<span class="kd">implements</span> <span
class="n">RateLimitingStrategy</span> <span
class="o">{</span>
+ <span class="c1">// ...</span>
+ <span class="nd">@Override</span>
+ <span class="kd">public</span> <span
class="kt">void</span> <span
class="nf">registerCompletedRequest</span><span
class="o">(</span><span
class="n">ResultInfo</span> <span
class="n">resultInfo</span><span
class="o">)</span> <span
class="o">{</span>
+ <span class="c1">// ...</span>
+ <span class="k">if</span> <span
class="o">(</span><span
class="n">resultInfo</span><span
class="o">.</span><span
class="na">getFailedMessages</span><span
class="o">()</span> <span
class="o">&gt;</span> <span
class="mi">0</span><span
class="o">)</span> <span class="o"&g [...]
+ <span class="n">maxInFlightMessages</span>
<span class="o">=</span> <span
class="n">scalingStrategy</span><span
class="o">.</span><span
class="na">scaleDown</span><span
class="o">(</span><span
class="n">maxInFlightMessages</span><span
class="o">);</span>
+ <span class="o">}</span> <span
class="k">else</span> <span
class="o">{</span>
+ <span class="n">maxInFlightMessages</span>
<span class="o">=</span> <span
class="n">scalingStrategy</span><span
class="o">.</span><span
class="na">scaleUp</span><span
class="o">(</span><span
class="n">maxInFlightMessages</span><span
class="o">);</span>
+ <span class="o">}</span>
+ <span class="o">}</span>
+
+ <span class="kd">public</span> <span
class="kt">boolean</span> <span
class="nf">shouldBlock</span><span
class="o">(</span><span
class="n">RequestInfo</span> <span
class="n">requestInfo</span><span
class="o">)</span> <span
class="o">{</span>
+ <span class="k">return</span> <span
class="n">currentInFlightRequests</span> <span
class="o">&gt;=</span> <span
class="n">maxInFlightRequests</span>
+ <span class="o">||</span> <span
class="o">(</span><span
class="n">currentInFlightMessages</span> <span
class="o">+</span> <span
class="n">requestInfo</span><span
class="o">.</span><span
class="na">getBatchSize</span><span
class="o">()</span> <span
class="o">&gt;</span> < [...]
+ <span class="o">}</span>
+ <span class="c1">// ...</span>
+<span
class="o">}</span></code></pre></div>
+
+<h3 id="example-2-tokenbucketratelimitingstrategy">[Example 2]
TokenBucketRateLimitingStrategy</h3>
+
+<p>The CongestionControlRateLimitingStrategy is rather aggressive, and
relies on the destination having a robust server-side rate limiting strategy.
If the destination does not, we can implement a client-side rate limiting
strategy instead.</p>
+
+<p>As an example, we can look at the <a
href="https://en.wikipedia.org/wiki/Token_bucket">token bucket
rate limiting strategy</a>. This strategy allows us to set the exact rate
of the sink (e.g. requests per second, messages per second). If the limits are
set correctly, we will avoid overloading the destination altogether.</p>
+
+<p>In this strategy, we want to do the following:</p>
+
+<ul>
+ <li>Implement a TokenBucket that has a given initial number of tokens
(e.g. 10). These tokens refill at a given rate (e.g. 1 token per
second).</li>
+ <li>When preparing an async request, we check if the token bucket has
sufficient tokens. If not, we postpone the request.</li>
+</ul>
+
+<p>Let’s look at an example implementation:</p>
+
+<div class="highlight"><pre><code
class="language-java"><span
class="kd">public</span> <span
class="kd">class</span> <span
class="nc">TokenBucketRateLimitingStrategy</span> <span
class="kd">implements</span> <span
class="n">RateLimitingStrategy</span> <span
class="o">{</span>
+
+ <span class="kd">private</span> <span
class="kd">final</span> <span
class="n">Bucket</span> <span
class="n">bucket</span><span
class="o">;</span>
+
+ <span class="kd">public</span> <span
class="nf">TokenBucketRateLimitingStrategy</span><span
class="o">()</span> <span
class="o">{</span>
+ <span class="n">Refill</span> <span
class="n">refill</span> <span
class="o">=</span> <span
class="n">Refill</span><span
class="o">.</span><span
class="na">intervally</span><span
class="o">(</span><span
class="mi">1</span><span
class="o">,</span> <span class="n">Duration
[...]
+ <span class="n">Bandwidth</span> <span
class="n">limit</span> <span
class="o">=</span> <span
class="n">Bandwidth</span><span
class="o">.</span><span
class="na">classic</span><span
class="o">(</span><span
class="mi">10</span><span
class="o">,</span> <span class="n">refil
[...]
+ <span class="k">this</span><span
class="o">.</span><span
class="na">bucket</span> <span
class="o">=</span> <span
class="n">Bucket4j</span><span
class="o">.</span><span
class="na">builder</span><span
class="o">()</span>
+ <span class="o">.</span><span
class="na">addLimit</span><span
class="o">(</span><span
class="n">limit</span><span
class="o">)</span>
+ <span class="o">.</span><span
class="na">build</span><span
class="o">();</span>
+ <span class="o">}</span>
+
+ <span class="c1">// ... (information methods not
needed)</span>
+
+ <span class="nd">@Override</span>
+ <span class="kd">public</span> <span
class="kt">boolean</span> <span
class="nf">shouldBlock</span><span
class="o">(</span><span
class="n">RequestInfo</span> <span
class="n">requestInfo</span><span
class="o">)</span> <span
class="o">{</span>
+ <span class="k">return</span> <span
class="n">bucket</span><span
class="o">.</span><span
class="na">tryConsume</span><span
class="o">(</span><span
class="n">requestInfo</span><span
class="o">.</span><span
class="na">getBatchSize</span><span
class="o">());</span>
+ <span class="o">}</span>
+
+<span
class="o">}</span></code></pre></div>
+
+<p>In the above example, we use the <a
href="https://github.com/bucket4j/bucket4j">Bucket4j</a>
library’s Token Bucket implementation. We also map 1 message to 1 token. Since
our token bucket has a size of 10 tokens and a refill rate of 1 token per
second, we can be sure that we will not exceed a burst of 10 messages, and will
also not exceed a sustained rate of 1 message per second.</p>
+
+<p>This would be useful if we know that our destination will fail over
ungracefully if a rate of 1 message per second is exceeded, or if we
intentionally want to limit our sink’s throughput to provide higher bandwidth
for other clients writing to the same destination.</p>
+
+<h2 id="specifying-a-custom-ratelimitingstrategy">Specifying a
custom RateLimitingStrategy</h2>
+
+<p>To use a custom RateLimitingStrategy, we specify it in
the AsyncSinkWriterConfiguration that is passed into the constructor of the
AsyncSinkWriter. For example:</p>
+
+<div class="highlight"><pre><code
class="language-java"><span
class="kd">class</span> <span
class="nc">MyCustomSinkWriter</span><span
class="o">&lt;</span><span
class="n">InputT</span><span
class="o">&gt;</span> <span
class="kd">extends</span> <span
class="n">AsyncSinkWriter</span><span c [...]
+
+ <span class="n">MyCustomSinkWriter</span><span
class="o">(</span>
+ <span class="n">ElementConverter</span><span
class="o">&lt;</span><span
class="n">InputT</span><span
class="o">,</span> <span
class="n">MyCustomRequestEntry</span><span
class="o">&gt;</span> <span
class="n">elementConverter</span><span
class="o">,</span>
+ <span class="n">Sink</span><span
class="o">.</span><span
class="na">InitContext</span> <span
class="n">context</span><span
class="o">,</span>
+ <span class="n">Collection</span><span
class="o">&lt;</span><span
class="n">BufferedRequestState</span><span
class="o">&lt;</span><span
class="n">MyCustomRequestEntry</span><span
class="o">&gt;&gt;</span> <span
class="n">states</span><span
class="o">)</span> <span class="o">{ [...]
+ <span class="kd">super</span><span
class="o">(</span>
+ <span
class="n">elementConverter</span><span
class="o">,</span>
+ <span class="n">context</span><span
class="o">,</span>
+ <span
class="n">AsyncSinkWriterConfiguration</span><span
class="o">.</span><span
class="na">builder</span><span
class="o">()</span>
+ <span class="c1">// ...</span>
+ <span class="o">.</span><span
class="na">setRateLimitingStrategy</span><span
class="o">(</span><span
class="k">new</span> <span
class="nf">TokenBucketRateLimitingStrategy</span><span
class="o">())</span>
+ <span class="o">.</span><span
class="na">build</span><span
class="o">(),</span>
+ <span class="n">states</span><span
class="o">);</span>
+ <span class="o">}</span>
+
+<span
class="o">}</span></code></pre></div>
+
+<h2 id="summary">Summary</h2>
+
+<p>From Apache Flink 1.16, we can customise the RateLimitingStrategy used
to dynamically adjust the behaviour of the Async Sink at runtime. This allows
users to tune their connector implementations based on specific use cases and
needs, without having to understand the base sink’s low-level
workings.</p>
+
+<p>We hope this extension will be useful for you. If you have any
feedback, feel free to reach out!</p>
+
+</description>
+<pubDate>Fri, 25 Nov 2022 13:00:00 +0100</pubDate>
+<link>https://flink.apache.org/2022/11/25/async-sink-rate-limiting-strategy.html</link>
+<guid
isPermaLink="true">/2022/11/25/async-sink-rate-limiting-strategy.html</guid>
+</item>
+
<item>
<title>Announcing the Release of Apache Flink 1.16</title>
<description><p>Apache Flink continues to grow at a rapid pace and is
one of the most active
@@ -19789,131 +19983,5 @@ The upcoming release will focus on new features and
integrations that broaden th
<guid isPermaLink="true">/news/2020/04/01/community-update.html</guid>
</item>
-<item>
-<title>Flink as Unified Engine for Modern Data Warehousing: Production-Ready
Hive Integration</title>
-<description><p>In this blog post, you will learn our motivation behind
the Flink-Hive integration, and how Flink 1.10 can help modernize your data
warehouse.</p>
-
-<div class="page-toc">
-<ul id="markdown-toc">
- <li><a href="#introduction"
id="markdown-toc-introduction">Introduction</a></li>
- <li><a
href="#flink-and-its-integration-with-hive-comes-into-the-scene"
id="markdown-toc-flink-and-its-integration-with-hive-comes-into-the-scene">Flink
and Its Integration With Hive Comes into the Scene</a> <ul>
- <li><a href="#unified-metadata-management"
id="markdown-toc-unified-metadata-management">Unified Metadata
Management</a></li>
- <li><a href="#stream-processing"
id="markdown-toc-stream-processing">Stream
Processing</a></li>
- <li><a href="#compatible-with-more-hive-versions"
id="markdown-toc-compatible-with-more-hive-versions">Compatible
with More Hive Versions</a></li>
- <li><a href="#reuse-hive-user-defined-functions-udfs"
id="markdown-toc-reuse-hive-user-defined-functions-udfs">Reuse
Hive User Defined Functions (UDFs)</a></li>
- <li><a href="#enhanced-read-and-write-on-hive-data"
id="markdown-toc-enhanced-read-and-write-on-hive-data">Enhanced
Read and Write on Hive Data</a></li>
- <li><a href="#formats"
id="markdown-toc-formats">Formats</a></li>
- <li><a href="#more-data-types"
id="markdown-toc-more-data-types">More Data
Types</a></li>
- <li><a href="#roadmap"
id="markdown-toc-roadmap">Roadmap</a></li>
- </ul>
- </li>
- <li><a href="#summary"
id="markdown-toc-summary">Summary</a></li>
-</ul>
-
-</div>
-
-<h2 id="introduction">Introduction</h2>
-
-<p>What are some of the latest requirements for your data warehouse and
data infrastructure in 2020?</p>
-
-<p>We’ve came up with some for you.</p>
-
-<p>Firstly, today’s business is shifting to a more real-time fashion,
and thus demands abilities to process online streaming data with low latency
for near-real-time or even real-time analytics. People become less and less
tolerant of delays between when data is generated and when it arrives at their
hands, ready to use. Hours or even days of delay is not acceptable anymore.
Users are expecting minutes, or even seconds, of end-to-end latency for data in
their warehouse, to get quic [...]
-
-<p>Secondly, the infrastructure should be able to handle both offline
batch data for offline analytics and exploration, and online streaming data for
more timely analytics. Both are indispensable as they both have very valid use
cases. Apart from the real time processing mentioned above, batch processing
would still exist as it’s good for ad hoc queries and explorations, and
full-size calculations. Your modern infrastructure should not force users to
choose between one or the other [...]
-
-<p>Thirdly, the data players, including data engineers, data scientists,
analysts, and operations, urge a more unified infrastructure than ever before
for easier ramp-up and higher working efficiency. The big data landscape has
been fragmented for years - companies may have one set of infrastructure for
real time processing, one set for batch, one set for OLAP, etc. That,
oftentimes, comes as a result of the legacy of lambda architecture, which was
popular in the era when stream pr [...]
-
-<p>If any of these resonate with you, you just found the right post to
read: we have never been this close to the vision by strengthening Flink’s
integration with Hive to a production grade.</p>
-
-<h2
id="flink-and-its-integration-with-hive-comes-into-the-scene">Flink
and Its Integration With Hive Comes into the Scene</h2>
-
-<p>Apache Flink has been a proven scalable system to handle extremely
high workload of streaming data in super low latency in many giant tech
companies.</p>
-
-<p>Despite its huge success in the real time processing domain, at its
deep root, Flink has been faithfully following its inborn philosophy of being
<a
href="https://flink.apache.org/news/2019/02/13/unified-batch-streaming-blink.html">a
unified data processing engine for both batch and streaming</a>, and
taking a streaming-first approach in its architecture to do batch processing.
By making batch a special case for streaming, Flink really leverages its
cutting [...]
-
-<p>On the other hand, Apache Hive has established itself as a focal
point of the data warehousing ecosystem. It serves as not only a SQL engine for
big data analytics and ETL, but also a data management platform, where data is
discovered and defined. As business evolves, it puts new requirements on data
warehouse.</p>
-
-<p>Thus we started integrating Flink and Hive as a beta version in Flink
1.9. Over the past few months, we have been listening to users’ requests and
feedback, extensively enhancing our product, and running rigorous benchmarks
(which will be published soon separately). I’m glad to announce that the
integration between Flink and Hive is at production grade in <a
href="https://flink.apache.org/news/2020/02/11/release-1.10.0.html">Flink
1.10</a> and we can’t wait [...]
-
-<h3 id="unified-metadata-management">Unified Metadata
Management</h3>
-
-<p>Hive Metastore has evolved into the de facto metadata hub over the
years in the Hadoop, or even the cloud, ecosystem. Many companies have a single
Hive Metastore service instance in production to manage all of their schemas,
either Hive or non-Hive metadata, as the single source of truth.</p>
-
-<p>In 1.9 we introduced Flink’s <a
href="https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/hive/hive_catalog.html">HiveCatalog</a>,
connecting Flink to users’ rich metadata pool. The meaning of
<code>HiveCatalog</code> is two-fold here. First, it allows Apache
Flink users to utilize Hive Metastore to store and manage Flink’s metadata,
including tables, UDFs, and statistics of data. Second, it enables Flink to
access Hive’s existi [...]
-
-<p>In Flink 1.10, users can store Flink’s own tables, views, UDFs,
statistics in Hive Metastore on all of the compatible Hive versions mentioned
above. <a
href="https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/hive/hive_catalog.html#example">Here’s
an end-to-end example</a> of how to store a Flink’s Kafka source table
in Hive Metastore and later query the table in Flink SQL.</p>
-
-<h3 id="stream-processing">Stream Processing</h3>
-
-<p>The Hive integration feature in Flink 1.10 empowers users to
re-imagine what they can accomplish with their Hive data and unlock stream
processing use cases:</p>
-
-<ul>
- <li>join real-time streaming data in Flink with offline Hive data for
more complex data processing</li>
- <li>backfill Hive data with Flink directly in a unified
fashion</li>
- <li>leverage Flink to move real-time data into Hive more quickly,
greatly shortening the end-to-end latency between when data is generated and
when it arrives at your data warehouse for analytics, from hours — or even days
— to minutes</li>
-</ul>
-
-<h3 id="compatible-with-more-hive-versions">Compatible with
More Hive Versions</h3>
-
-<p>In Flink 1.10, we brought full coverage to most Hive versions
including 1.0, 1.1, 1.2, 2.0, 2.1, 2.2, 2.3, and 3.1. Take a look <a
href="https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/hive/#supported-hive-versions">here</a>.</p>
-
-<h3 id="reuse-hive-user-defined-functions-udfs">Reuse Hive
User Defined Functions (UDFs)</h3>
-
-<p>Users can <a
href="https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/hive/hive_functions.html#hive-user-defined-functions">reuse
all kinds of Hive UDFs in Flink</a> since Flink 1.9.</p>
-
-<p>This is a great win for Flink users with a history in the Hive
ecosystem, as they may have developed custom business logic in their Hive UDFs.
Being able to run these functions without any rewrite saves users a lot of time
and brings them a much smoother experience when they migrate to Flink.</p>
-
-<p>To take it a step further, Flink 1.10 introduces <a
href="https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/hive/hive_functions.html#use-hive-built-in-functions-via-hivemodule">compatibility
of Hive built-in functions via HiveModule</a>. Over the years, the Hive
community has developed a few hundred built-in functions that are super
handy for users. For those built-in functions that don’t exist in Flink yet,
users are now able to leve [...]
-
-<h3 id="enhanced-read-and-write-on-hive-data">Enhanced Read
and Write on Hive Data</h3>
-
-<p>Flink 1.10 extends its read and write capabilities on Hive data to
all the common use cases with better performance.</p>
-
-<p>On the reading side, Flink can now read Hive regular tables,
partitioned tables, and views. Many optimization techniques have been developed
around reading, including partition pruning and projection pushdown to
transport less data from file storage, limit pushdown for faster experimentation
and exploration, and a vectorized reader for ORC files.</p>
-
-<p>On the writing side, Flink 1.10 introduces &#8220;INSERT INTO&#8221; and &#8220;INSERT
OVERWRITE&#8221; to its syntax, and can write not only to Hive&#8217;s regular tables but
also to partitioned tables with either static or dynamic partitions.</p>
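As a rough illustration of the new syntax (table, column, and partition names are made up; the exact dynamic-partition semantics are described in the Flink 1.10 documentation):

```sql
-- Append to a regular Hive table
INSERT INTO page_views
SELECT user_id, url FROM staging_views;

-- Replace the contents of one static partition
INSERT OVERWRITE page_views PARTITION (dt = '2020-03-27')
SELECT user_id, url FROM staging_views;

-- Dynamic partitioning: the partition value comes from the query result
INSERT OVERWRITE page_views
SELECT user_id, url, dt FROM staging_views;
```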
-
-<h3 id="formats">Formats</h3>
-
-<p>Your engine should be able to handle all common file formats so that
you are free to choose whichever one fits your business needs, and Flink is
no exception. We have tested the following table
storage formats: text, CSV, SequenceFile, ORC, and Parquet.</p>
-
-<h3 id="more-data-types">More Data Types</h3>
-
-<p>In Flink 1.10, we added support for a few more frequently used Hive
data types that were not covered by Flink 1.9. Flink users should now have a
full, smooth experience querying and manipulating Hive data from Flink.</p>
-
-<h3 id="roadmap">Roadmap</h3>
-
-<p>Integration between any two systems is a never-ending story.</p>
-
-<p>We are constantly improving Flink itself, and the Flink-Hive
integration also keeps improving as we collect user feedback and work with
folks in this vibrant community.</p>
-
-<p>After careful consideration of the feedback we
received, we have prioritized many of the requests below for the next Flink
release, 1.11.</p>
-
-<ul>
- <li>Hive streaming sink so that Flink can stream data into Hive
tables, bringing a real streaming experience to Hive</li>
- <li>Native Parquet reader for better performance</li>
- <li>Additional interoperability - support creating Hive tables, views,
functions in Flink</li>
- <li>Better out-of-the-box experience with built-in dependencies, including
documentation</li>
- <li>JDBC driver so that users can reuse their existing tooling to run
SQL jobs on Flink</li>
- <li>Hive syntax and semantics compatibility mode</li>
-</ul>
-
-<p>If you have more feature requests or discover bugs, please reach out
to the community through the mailing lists and JIRA.</p>
-
-<h2 id="summary">Summary</h2>
-
-<p>Data warehousing is shifting toward real-time, and Apache
Flink can make a difference for your organization in this space.</p>
-
-<p>Flink 1.10 brings production-ready Hive integration and empowers
users to achieve more in both metadata management and unified/batch data
processing.</p>
-
-<p>We encourage all our users to get their hands on Flink 1.10. You are
very welcome to join the community in development, discussions, and all other
kinds of collaboration on this topic.</p>
-
-</description>
-<pubDate>Fri, 27 Mar 2020 03:30:00 +0100</pubDate>
-<link>https://flink.apache.org/features/2020/03/27/flink-for-data-warehouse.html</link>
-<guid
isPermaLink="true">/features/2020/03/27/flink-for-data-warehouse.html</guid>
-</item>
-
</channel>
</rss>
diff --git a/content/blog/index.html b/content/blog/index.html
index 41c7e5357..b5641f787 100644
--- a/content/blog/index.html
+++ b/content/blog/index.html
@@ -240,6 +240,19 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></h2>
+
+ <p>25 Nov 2022
+ Hong Liang Teoh </p>
+
+ <p>An overview of how to optimise the throughput of async sinks using a
custom RateLimitingStrategy</p>
+
+ <p><a href="/2022/11/25/async-sink-rate-limiting-strategy.html">Continue
reading »</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/news/2022/10/28/1.16-announcement.html">Announcing the Release of Apache
Flink 1.16</a></h2>
@@ -367,19 +380,6 @@ with 19 FLIPs and 1100+ issues completed, bringing a lot
of exciting features to
<hr>
- <article>
- <h2 class="blog-title"><a
href="/2022/07/11/final-checkpoint-part2.html">FLIP-147: Support Checkpoints
After Tasks Finished - Part Two</a></h2>
-
- <p>11 Jul 2022
- Yun Gao , Dawid Wysakowicz , & Daisy Tsang </p>
-
- <p>This post presents more details on the changes on the checkpoint
procedure and task finish process made by the final checkpoint mechanism.</p>
-
- <p><a href="/2022/07/11/final-checkpoint-part2.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -412,6 +412,16 @@ with 19 FLIPs and 1100+ issues completed, bringing a lot
of exciting features to
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page10/index.html b/content/blog/page10/index.html
index 6aa66a3f7..9ea8b2c6b 100644
--- a/content/blog/page10/index.html
+++ b/content/blog/page10/index.html
@@ -240,6 +240,21 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a
href="/news/2020/06/09/release-statefun-2.1.0.html">Stateful Functions 2.1.0
Release Announcement</a></h2>
+
+ <p>09 Jun 2020
+ Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
+
+ <p><p>The Apache Flink community is happy to announce the release of
Stateful Functions (StateFun) 2.1.0! This release introduces new features
around state expiration and performance improvements for co-located
deployments, as well as other important changes that improve the stability and
testability of the project. As the community around StateFun grows, the release
cycle will follow this pattern of smaller and more frequent releases to
incorporate user feedback and allow for fast [...]
+
+</p>
+
+ <p><a href="/news/2020/06/09/release-statefun-2.1.0.html">Continue
reading »</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/news/2020/05/12/release-1.10.1.html">Apache Flink 1.10.1
Released</a></h2>
@@ -364,21 +379,6 @@ This release marks a big milestone: Stateful Functions 2.0
is not only an API up
<hr>
- <article>
- <h2 class="blog-title"><a
href="/features/2020/03/27/flink-for-data-warehouse.html">Flink as Unified
Engine for Modern Data Warehousing: Production-Ready Hive Integration</a></h2>
-
- <p>27 Mar 2020
- Bowen Li (<a href="https://twitter.com/Bowen__Li">@Bowen__Li</a>)</p>
-
- <p><p>In this blog post, you will learn our motivation behind the
Flink-Hive integration, and how Flink 1.10 can help modernize your data
warehouse.</p>
-
-</p>
-
- <p><a href="/features/2020/03/27/flink-for-data-warehouse.html">Continue
reading »</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -411,6 +411,16 @@ This release marks a big milestone: Stateful Functions 2.0
is not only an API up
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page11/index.html b/content/blog/page11/index.html
index 7f79b8e82..d6a792354 100644
--- a/content/blog/page11/index.html
+++ b/content/blog/page11/index.html
@@ -240,6 +240,21 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a
href="/features/2020/03/27/flink-for-data-warehouse.html">Flink as Unified
Engine for Modern Data Warehousing: Production-Ready Hive Integration</a></h2>
+
+ <p>27 Mar 2020
+ Bowen Li (<a href="https://twitter.com/Bowen__Li">@Bowen__Li</a>)</p>
+
+ <p><p>In this blog post, you will learn our motivation behind the
Flink-Hive integration, and how Flink 1.10 can help modernize your data
warehouse.</p>
+
+</p>
+
+ <p><a href="/features/2020/03/27/flink-for-data-warehouse.html">Continue
reading »</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/news/2020/03/24/demo-fraud-detection-2.html">Advanced Flink Application
Patterns Vol.2: Dynamic Updates of Application Logic</a></h2>
@@ -363,19 +378,6 @@
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2019/12/09/flink-kubernetes-kudo.html">Running Apache Flink on
Kubernetes with KUDO</a></h2>
-
- <p>09 Dec 2019
- Gerred Dillon </p>
-
- <p>A common use case for Apache Flink is streaming data analytics
together with Apache Kafka, which provides a pub/sub model and durability for
data streams. In this post, we demonstrate how to orchestrate Flink and Kafka
with KUDO.</p>
-
- <p><a href="/news/2019/12/09/flink-kubernetes-kudo.html">Continue
reading »</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -408,6 +410,16 @@
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page12/index.html b/content/blog/page12/index.html
index b41125b60..1c937f2f4 100644
--- a/content/blog/page12/index.html
+++ b/content/blog/page12/index.html
@@ -240,6 +240,19 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a
href="/news/2019/12/09/flink-kubernetes-kudo.html">Running Apache Flink on
Kubernetes with KUDO</a></h2>
+
+ <p>09 Dec 2019
+ Gerred Dillon </p>
+
+ <p>A common use case for Apache Flink is streaming data analytics
together with Apache Kafka, which provides a pub/sub model and durability for
data streams. In this post, we demonstrate how to orchestrate Flink and Kafka
with KUDO.</p>
+
+ <p><a href="/news/2019/12/09/flink-kubernetes-kudo.html">Continue
reading »</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/news/2019/11/25/query-pulsar-streams-using-apache-flink.html">How to
query Pulsar Streams using Apache Flink</a></h2>
@@ -366,19 +379,6 @@
<hr>
- <article>
- <h2 class="blog-title"><a href="/2019/06/05/flink-network-stack.html">A
Deep-Dive into Flink's Network Stack</a></h2>
-
- <p>05 Jun 2019
- Nico Kruber </p>
-
- <p>Flink’s network stack is one of the core components that make up
Apache Flink's runtime module sitting at the core of every Flink job. In this
post, which is the first in a series of posts about the network stack, we look
at the abstractions exposed to the stream operators and detail their physical
implementation and various optimisations in Apache Flink.</p>
-
- <p><a href="/2019/06/05/flink-network-stack.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -411,6 +411,16 @@
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page13/index.html b/content/blog/page13/index.html
index 22c25acb5..03ea093f0 100644
--- a/content/blog/page13/index.html
+++ b/content/blog/page13/index.html
@@ -240,6 +240,19 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a href="/2019/06/05/flink-network-stack.html">A
Deep-Dive into Flink's Network Stack</a></h2>
+
+ <p>05 Jun 2019
+ Nico Kruber </p>
+
+ <p>Flink’s network stack is one of the core components that make up
Apache Flink's runtime module sitting at the core of every Flink job. In this
post, which is the first in a series of posts about the network stack, we look
at the abstractions exposed to the stream operators and detail their physical
implementation and various optimisations in Apache Flink.</p>
+
+ <p><a href="/2019/06/05/flink-network-stack.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a href="/2019/05/19/state-ttl.html">State TTL in
Flink 1.8.0: How to Automatically Cleanup Application State in Apache
Flink</a></h2>
@@ -367,21 +380,6 @@ for more details.</p>
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2019/02/15/release-1.7.2.html">Apache Flink 1.7.2 Released</a></h2>
-
- <p>15 Feb 2019
- </p>
-
- <p><p>The Apache Flink community released the second bugfix version of
the Apache Flink 1.7 series.</p>
-
-</p>
-
- <p><a href="/news/2019/02/15/release-1.7.2.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -414,6 +412,16 @@ for more details.</p>
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page14/index.html b/content/blog/page14/index.html
index cfa152d08..b747b5936 100644
--- a/content/blog/page14/index.html
+++ b/content/blog/page14/index.html
@@ -240,6 +240,21 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a
href="/news/2019/02/15/release-1.7.2.html">Apache Flink 1.7.2 Released</a></h2>
+
+ <p>15 Feb 2019
+ </p>
+
+ <p><p>The Apache Flink community released the second bugfix version of
the Apache Flink 1.7 series.</p>
+
+</p>
+
+ <p><a href="/news/2019/02/15/release-1.7.2.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/news/2019/02/13/unified-batch-streaming-blink.html">Batch as a Special
Case of Streaming and Alibaba's contribution of Blink</a></h2>
@@ -375,21 +390,6 @@ Please check the <a
href="https://issues.apache.org/jira/secure/ReleaseNote.jspa
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2018/08/21/release-1.5.3.html">Apache Flink 1.5.3 Released</a></h2>
-
- <p>21 Aug 2018
- </p>
-
- <p><p>The Apache Flink community released the third bugfix version of
the Apache Flink 1.5 series.</p>
-
-</p>
-
- <p><a href="/news/2018/08/21/release-1.5.3.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -422,6 +422,16 @@ Please check the <a
href="https://issues.apache.org/jira/secure/ReleaseNote.jspa
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page15/index.html b/content/blog/page15/index.html
index 75cc1853b..4b911eb8c 100644
--- a/content/blog/page15/index.html
+++ b/content/blog/page15/index.html
@@ -240,6 +240,21 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a
href="/news/2018/08/21/release-1.5.3.html">Apache Flink 1.5.3 Released</a></h2>
+
+ <p>21 Aug 2018
+ </p>
+
+ <p><p>The Apache Flink community released the third bugfix version of
the Apache Flink 1.5 series.</p>
+
+</p>
+
+ <p><a href="/news/2018/08/21/release-1.5.3.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/news/2018/08/09/release-1.6.0.html">Apache Flink 1.6.0 Release
Announcement</a></h2>
@@ -371,19 +386,6 @@
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2017/12/21/2017-year-in-review.html">Apache Flink in 2017: Year in
Review</a></h2>
-
- <p>21 Dec 2017
- Chris Ward (<a href="https://twitter.com/chrischinch">@chrischinch</a>)
& Mike Winters (<a href="https://twitter.com/wints">@wints</a>)</p>
-
- <p>As 2017 comes to a close, let's take a moment to look back on the
Flink community's great work during the past year.</p>
-
- <p><a href="/news/2017/12/21/2017-year-in-review.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -416,6 +418,16 @@
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page16/index.html b/content/blog/page16/index.html
index 3e61875fd..d969397fa 100644
--- a/content/blog/page16/index.html
+++ b/content/blog/page16/index.html
@@ -240,6 +240,19 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a
href="/news/2017/12/21/2017-year-in-review.html">Apache Flink in 2017: Year in
Review</a></h2>
+
+ <p>21 Dec 2017
+ Chris Ward (<a href="https://twitter.com/chrischinch">@chrischinch</a>)
& Mike Winters (<a href="https://twitter.com/wints">@wints</a>)</p>
+
+ <p>As 2017 comes to a close, let's take a moment to look back on the
Flink community's great work during the past year.</p>
+
+ <p><a href="/news/2017/12/21/2017-year-in-review.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/news/2017/12/12/release-1.4.0.html">Apache Flink 1.4.0 Release
Announcement</a></h2>
@@ -377,19 +390,6 @@ what’s coming in Flink 1.4.0 as well as a preview of what
the Flink community
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2017/03/29/table-sql-api-update.html">From Streams to Tables and
Back Again: An Update on Flink's Table & SQL API</a></h2>
-
- <p>29 Mar 2017 by Timo Walther (<a
href="https://twitter.com/">@twalthr</a>)
- </p>
-
- <p><p>Broadening the user base and unifying batch & streaming with
relational APIs</p></p>
-
- <p><a href="/news/2017/03/29/table-sql-api-update.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -422,6 +422,16 @@ what’s coming in Flink 1.4.0 as well as a preview of what
the Flink community
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page17/index.html b/content/blog/page17/index.html
index 895384071..ed761461a 100644
--- a/content/blog/page17/index.html
+++ b/content/blog/page17/index.html
@@ -240,6 +240,19 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a
href="/news/2017/03/29/table-sql-api-update.html">From Streams to Tables and
Back Again: An Update on Flink's Table & SQL API</a></h2>
+
+ <p>29 Mar 2017 by Timo Walther (<a
href="https://twitter.com/">@twalthr</a>)
+ </p>
+
+ <p><p>Broadening the user base and unifying batch & streaming with
relational APIs</p></p>
+
+ <p><a href="/news/2017/03/29/table-sql-api-update.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/news/2017/03/23/release-1.1.5.html">Apache Flink 1.1.5 Released</a></h2>
@@ -371,20 +384,6 @@
<hr>
- <article>
- <h2 class="blog-title"><a href="/news/2016/05/24/stream-sql.html">Stream
Processing for Everyone with SQL and Apache Flink</a></h2>
-
- <p>24 May 2016 by Fabian Hueske (<a
href="https://twitter.com/">@fhueske</a>)
- </p>
-
- <p><p>About six months ago, the Apache Flink community started an effort
to add a SQL interface for stream data analysis. SQL is <i>the</i> standard
language to access and process data. Everybody who occasionally analyzes data
is familiar with SQL. Consequently, a SQL interface for stream data processing
will make this technology accessible to a much wider audience. Moreover, SQL
support for streaming data will also enable new use cases such as interactive
and ad-hoc stream analysi [...]
-<p>In this blog post, we report on the current status, architectural design,
and future plans of the Apache Flink community to implement support for SQL as
a language for analyzing data streams.</p></p>
-
- <p><a href="/news/2016/05/24/stream-sql.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -417,6 +416,16 @@
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page18/index.html b/content/blog/page18/index.html
index ba0cda9cc..0571d4a32 100644
--- a/content/blog/page18/index.html
+++ b/content/blog/page18/index.html
@@ -240,6 +240,20 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a href="/news/2016/05/24/stream-sql.html">Stream
Processing for Everyone with SQL and Apache Flink</a></h2>
+
+ <p>24 May 2016 by Fabian Hueske (<a
href="https://twitter.com/">@fhueske</a>)
+ </p>
+
+ <p><p>About six months ago, the Apache Flink community started an effort
to add a SQL interface for stream data analysis. SQL is <i>the</i> standard
language to access and process data. Everybody who occasionally analyzes data
is familiar with SQL. Consequently, a SQL interface for stream data processing
will make this technology accessible to a much wider audience. Moreover, SQL
support for streaming data will also enable new use cases such as interactive
and ad-hoc stream analysi [...]
+<p>In this blog post, we report on the current status, architectural design,
and future plans of the Apache Flink community to implement support for SQL as
a language for analyzing data streams.</p></p>
+
+ <p><a href="/news/2016/05/24/stream-sql.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/news/2016/05/11/release-1.0.3.html">Flink 1.0.3 Released</a></h2>
@@ -369,20 +383,6 @@
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2015/12/04/Introducing-windows.html">Introducing Stream Windows in
Apache Flink</a></h2>
-
- <p>04 Dec 2015 by Fabian Hueske (<a
href="https://twitter.com/">@fhueske</a>)
- </p>
-
- <p><p>The data analysis space is witnessing an evolution from batch to
stream processing for many use cases. Although batch can be handled as a
special case of stream processing, analyzing never-ending streaming data often
requires a shift in the mindset and comes with its own terminology (for
example, “windowing” and “at-least-once”/”exactly-once” processing). This shift
and the new terminology can be quite confusing for people being new to the
space of stream processing. Apache F [...]
-<p>In this blog post, we discuss the concept of windows for stream processing,
present Flink's built-in windows, and explain its support for custom windowing
semantics.</p></p>
-
- <p><a href="/news/2015/12/04/Introducing-windows.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -415,6 +415,16 @@
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page19/index.html b/content/blog/page19/index.html
index 255f0777f..4697e619b 100644
--- a/content/blog/page19/index.html
+++ b/content/blog/page19/index.html
@@ -240,6 +240,20 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a
href="/news/2015/12/04/Introducing-windows.html">Introducing Stream Windows in
Apache Flink</a></h2>
+
+ <p>04 Dec 2015 by Fabian Hueske (<a
href="https://twitter.com/">@fhueske</a>)
+ </p>
+
+ <p><p>The data analysis space is witnessing an evolution from batch to
stream processing for many use cases. Although batch can be handled as a
special case of stream processing, analyzing never-ending streaming data often
requires a shift in the mindset and comes with its own terminology (for
example, “windowing” and “at-least-once”/”exactly-once” processing). This shift
and the new terminology can be quite confusing for people being new to the
space of stream processing. Apache F [...]
+<p>In this blog post, we discuss the concept of windows for stream processing,
present Flink's built-in windows, and explain its support for custom windowing
semantics.</p></p>
+
+ <p><a href="/news/2015/12/04/Introducing-windows.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/news/2015/11/27/release-0.10.1.html">Flink 0.10.1 released</a></h2>
@@ -378,26 +392,6 @@ vertex-centric or gather-sum-apply to Flink dataflows.</p>
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2015/04/13/release-0.9.0-milestone1.html">Announcing Flink
0.9.0-milestone1 preview release</a></h2>
-
- <p>13 Apr 2015
- </p>
-
- <p><p>The Apache Flink community is pleased to announce the availability
of
-the 0.9.0-milestone-1 release. The release is a preview of the
-upcoming 0.9.0 release. It contains many new features which will be
-available in the upcoming 0.9 release. Interested users are encouraged
-to try it out and give feedback. As the version number indicates, this
-release is a preview release that contains known issues.</p>
-
-</p>
-
- <p><a href="/news/2015/04/13/release-0.9.0-milestone1.html">Continue
reading »</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -430,6 +424,16 @@ release is a preview release that contains known
issues.</p>
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page2/index.html b/content/blog/page2/index.html
index 64c35ad74..ce9d4280a 100644
--- a/content/blog/page2/index.html
+++ b/content/blog/page2/index.html
@@ -240,6 +240,19 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a
href="/2022/07/11/final-checkpoint-part2.html">FLIP-147: Support Checkpoints
After Tasks Finished - Part Two</a></h2>
+
+ <p>11 Jul 2022
+ Yun Gao , Dawid Wysakowicz , & Daisy Tsang </p>
+
+ <p>This post presents more details on the changes on the checkpoint
procedure and task finish process made by the final checkpoint mechanism.</p>
+
+ <p><a href="/2022/07/11/final-checkpoint-part2.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/2022/07/11/final-checkpoint-part1.html">FLIP-147: Support Checkpoints
After Tasks Finished - Part One</a></h2>
@@ -363,19 +376,6 @@ We are now proud to announce the first production ready
release of the operator
<hr>
- <article>
- <h2 class="blog-title"><a href="/2022/05/06/async-sink-base.html">The
Generic Asynchronous Base Sink</a></h2>
-
- <p>06 May 2022
- Zichen Liu </p>
-
- <p>An overview of the new AsyncBaseSink and how to use it for building
your own concrete sink</p>
-
- <p><a href="/2022/05/06/async-sink-base.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -408,6 +408,16 @@ We are now proud to announce the first production ready
release of the operator
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page20/index.html b/content/blog/page20/index.html
index 4e538c98c..40ee555af 100644
--- a/content/blog/page20/index.html
+++ b/content/blog/page20/index.html
@@ -240,6 +240,26 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a
href="/news/2015/04/13/release-0.9.0-milestone1.html">Announcing Flink
0.9.0-milestone1 preview release</a></h2>
+
+ <p>13 Apr 2015
+ </p>
+
+ <p><p>The Apache Flink community is pleased to announce the availability
of
+the 0.9.0-milestone-1 release. The release is a preview of the
+upcoming 0.9.0 release. It contains many new features which will be
+available in the upcoming 0.9 release. Interested users are encouraged
+to try it out and give feedback. As the version number indicates, this
+release is a preview release that contains known issues.</p>
+
+</p>
+
+ <p><a href="/news/2015/04/13/release-0.9.0-milestone1.html">Continue
reading »</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/news/2015/04/07/march-in-flink.html">March 2015 in the Flink
community</a></h2>
@@ -380,21 +400,6 @@ and offers a new API including definition of flexible
windows.</p>
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2014/10/03/upcoming_events.html">Upcoming Events</a></h2>
-
- <p>03 Oct 2014
- </p>
-
- <p><p>We are happy to announce several upcoming Flink events both in
Europe and the US. Starting with a <strong>Flink hackathon in
Stockholm</strong> (Oct 8-9) and a talk about Flink at the <strong>Stockholm
Hadoop User Group</strong> (Oct 8). This is followed by the very first
<strong>Flink Meetup in Berlin</strong> (Oct 15). In the US, there will be two
Flink Meetup talks: the first one at the <strong>Pasadena Big Data User
Group</strong> (Oct 29) and the second one at <strong>Si [...]
-
-</p>
-
- <p><a href="/news/2014/10/03/upcoming_events.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -427,6 +432,16 @@ and offers a new API including definition of flexible
windows.</p>
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page21/index.html b/content/blog/page21/index.html
index 2e6665106..9392a30dd 100644
--- a/content/blog/page21/index.html
+++ b/content/blog/page21/index.html
@@ -240,6 +240,21 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a
href="/news/2014/10/03/upcoming_events.html">Upcoming Events</a></h2>
+
+ <p>03 Oct 2014
+ </p>
+
+ <p><p>We are happy to announce several upcoming Flink events both in
Europe and the US. Starting with a <strong>Flink hackathon in
Stockholm</strong> (Oct 8-9) and a talk about Flink at the <strong>Stockholm
Hadoop User Group</strong> (Oct 8). This is followed by the very first
<strong>Flink Meetup in Berlin</strong> (Oct 15). In the US, there will be two
Flink Meetup talks: the first one at the <strong>Pasadena Big Data User
Group</strong> (Oct 29) and the second one at <strong>Si [...]
+
+</p>
+
+ <p><a href="/news/2014/10/03/upcoming_events.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/news/2014/09/26/release-0.6.1.html">Apache Flink 0.6.1 available</a></h2>
@@ -305,6 +320,16 @@ academic and open source project that Flink originates
from.</p>
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page3/index.html b/content/blog/page3/index.html
index cc3765168..cf5d2a1c8 100644
--- a/content/blog/page3/index.html
+++ b/content/blog/page3/index.html
@@ -240,6 +240,19 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a href="/2022/05/06/async-sink-base.html">The
Generic Asynchronous Base Sink</a></h2>
+
+ <p>06 May 2022
+ Zichen Liu </p>
+
+ <p>An overview of the new AsyncBaseSink and how to use it for building
your own concrete sink</p>
+
+ <p><a href="/2022/05/06/async-sink-base.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/2022/05/06/pyflink-1.15-thread-mode.html">Exploring the thread mode in
PyFlink</a></h2>
@@ -368,19 +381,6 @@ This new release brings various improvements to the
StateFun runtime, a leaner w
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release
Announcement</a></h2>
-
- <p>17 Jan 2022
- Thomas Weise (<a href="https://twitter.com/thweise">@thweise</a>) &
Martijn Visser (<a
href="https://twitter.com/martijnvisser82">@martijnvisser82</a>)</p>
-
- <p>The Apache Flink community released the second bugfix version of the
Apache Flink 1.14 series.</p>
-
- <p><a href="/news/2022/01/17/release-1.14.3.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -413,6 +413,16 @@ This new release brings various improvements to the
StateFun runtime, a leaner w
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page4/index.html b/content/blog/page4/index.html
index b480012b8..5d8909c67 100644
--- a/content/blog/page4/index.html
+++ b/content/blog/page4/index.html
@@ -240,6 +240,19 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a
href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release
Announcement</a></h2>
+
+ <p>17 Jan 2022
+ Thomas Weise (<a href="https://twitter.com/thweise">@thweise</a>) &
Martijn Visser (<a
href="https://twitter.com/martijnvisser82">@martijnvisser82</a>)</p>
+
+ <p>The Apache Flink community released the second bugfix version of the
Apache Flink 1.14 series.</p>
+
+ <p><a href="/news/2022/01/17/release-1.14.3.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/news/2022/01/07/release-ml-2.0.0.html">Apache Flink ML 2.0.0 Release
Announcement</a></h2>
@@ -361,21 +374,6 @@
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2021/10/19/release-1.13.3.html">Apache Flink 1.13.3
Released</a></h2>
-
- <p>19 Oct 2021
- Chesnay Schepler </p>
-
- <p><p>The Apache Flink community released the third bugfix version of
the Apache Flink 1.13 series.</p>
-
-</p>
-
- <p><a href="/news/2021/10/19/release-1.13.3.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -408,6 +406,16 @@
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page5/index.html b/content/blog/page5/index.html
index 1004fa02a..465b960b1 100644
--- a/content/blog/page5/index.html
+++ b/content/blog/page5/index.html
@@ -240,6 +240,21 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a
href="/news/2021/10/19/release-1.13.3.html">Apache Flink 1.13.3
Released</a></h2>
+
+ <p>19 Oct 2021
+ Chesnay Schepler </p>
+
+ <p><p>The Apache Flink community released the third bugfix version of
the Apache Flink 1.13 series.</p>
+
+</p>
+
+ <p><a href="/news/2021/10/19/release-1.13.3.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/news/2021/09/29/release-1.14.0.html">Apache Flink 1.14.0 Release
Announcement</a></h2>
@@ -379,21 +394,6 @@ This new release brings various improvements to the
StateFun runtime, a leaner w
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1
Released</a></h2>
-
- <p>28 May 2021
- Dawid Wysakowicz (<a
href="https://twitter.com/dwysakowicz">@dwysakowicz</a>)</p>
-
- <p><p>The Apache Flink community released the first bugfix version of
the Apache Flink 1.13 series.</p>
-
-</p>
-
- <p><a href="/news/2021/05/28/release-1.13.1.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -426,6 +426,16 @@ This new release brings various improvements to the
StateFun runtime, a leaner w
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page6/index.html b/content/blog/page6/index.html
index 749f9f9dc..8cefc045a 100644
--- a/content/blog/page6/index.html
+++ b/content/blog/page6/index.html
@@ -240,6 +240,21 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a
href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1
Released</a></h2>
+
+ <p>28 May 2021
+ Dawid Wysakowicz (<a
href="https://twitter.com/dwysakowicz">@dwysakowicz</a>)</p>
+
+ <p><p>The Apache Flink community released the first bugfix version of
the Apache Flink 1.13 series.</p>
+
+</p>
+
+ <p><a href="/news/2021/05/28/release-1.13.1.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/news/2021/05/21/release-1.12.4.html">Apache Flink 1.12.4
Released</a></h2>
@@ -369,21 +384,6 @@ to develop scalable, consistent, and elastic distributed
applications.</p>
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2021/01/19/release-1.12.1.html">Apache Flink 1.12.1
Released</a></h2>
-
- <p>19 Jan 2021
- Xintong Song </p>
-
- <p><p>The Apache Flink community released the first bugfix version of
the Apache Flink 1.12 series.</p>
-
-</p>
-
- <p><a href="/news/2021/01/19/release-1.12.1.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -416,6 +416,16 @@ to develop scalable, consistent, and elastic distributed
applications.</p>
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page7/index.html b/content/blog/page7/index.html
index 654db2141..7cf5f7f84 100644
--- a/content/blog/page7/index.html
+++ b/content/blog/page7/index.html
@@ -240,6 +240,21 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a
href="/news/2021/01/19/release-1.12.1.html">Apache Flink 1.12.1
Released</a></h2>
+
+ <p>19 Jan 2021
+ Xintong Song </p>
+
+ <p><p>The Apache Flink community released the first bugfix version of
the Apache Flink 1.12 series.</p>
+
+</p>
+
+ <p><a href="/news/2021/01/19/release-1.12.1.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a href="/2021/01/18/rocksdb.html">Using RocksDB
State Backend in Apache Flink: When and How</a></h2>
@@ -363,19 +378,6 @@
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2020/10/13/stateful-serverless-internals.html">Stateful Functions
Internals: Behind the scenes of Stateful Serverless</a></h2>
-
- <p>13 Oct 2020
- Tzu-Li (Gordon) Tai (<a
href="https://twitter.com/tzulitai">@tzulitai</a>)</p>
-
- <p>This blog post dives deep into the internals of the StateFun runtime,
taking a look at how it enables consistent and fault-tolerant stateful
serverless applications.</p>
-
- <p><a
href="/news/2020/10/13/stateful-serverless-internals.html">Continue reading
»</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -408,6 +410,16 @@
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page8/index.html b/content/blog/page8/index.html
index 54f7e022b..9eb2f6a9d 100644
--- a/content/blog/page8/index.html
+++ b/content/blog/page8/index.html
@@ -240,6 +240,19 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a
href="/news/2020/10/13/stateful-serverless-internals.html">Stateful Functions
Internals: Behind the scenes of Stateful Serverless</a></h2>
+
+ <p>13 Oct 2020
+ Tzu-Li (Gordon) Tai (<a
href="https://twitter.com/tzulitai">@tzulitai</a>)</p>
+
+ <p>This blog post dives deep into the internals of the StateFun runtime,
taking a look at how it enables consistent and fault-tolerant stateful
serverless applications.</p>
+
+ <p><a
href="/news/2020/10/13/stateful-serverless-internals.html">Continue reading
»</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/news/2020/09/28/release-statefun-2.2.0.html">Stateful Functions 2.2.0
Release Announcement</a></h2>
@@ -369,19 +382,6 @@ as well as increased observability for operational
purposes.</p>
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2020/07/30/demo-fraud-detection-3.html">Advanced Flink Application
Patterns Vol.3: Custom Window Processing</a></h2>
-
- <p>30 Jul 2020
- Alexander Fedulov (<a
href="https://twitter.com/alex_fedulov">@alex_fedulov</a>)</p>
-
- <p>In this series of blog posts you will learn about powerful Flink
patterns for building streaming applications.</p>
-
- <p><a href="/news/2020/07/30/demo-fraud-detection-3.html">Continue
reading »</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -414,6 +414,16 @@ as well as increased observability for operational
purposes.</p>
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/blog/page9/index.html b/content/blog/page9/index.html
index ed887ed39..ce4793e62 100644
--- a/content/blog/page9/index.html
+++ b/content/blog/page9/index.html
@@ -240,6 +240,19 @@
<div class="col-sm-8">
<!-- Blog posts -->
+ <article>
+ <h2 class="blog-title"><a
href="/news/2020/07/30/demo-fraud-detection-3.html">Advanced Flink Application
Patterns Vol.3: Custom Window Processing</a></h2>
+
+ <p>30 Jul 2020
+ Alexander Fedulov (<a
href="https://twitter.com/alex_fedulov">@alex_fedulov</a>)</p>
+
+ <p>In this series of blog posts you will learn about powerful Flink
patterns for building streaming applications.</p>
+
+ <p><a href="/news/2020/07/30/demo-fraud-detection-3.html">Continue
reading »</a></p>
+ </article>
+
+ <hr>
+
<article>
<h2 class="blog-title"><a
href="/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html">Flink
SQL Demo: Building an End-to-End Streaming Application</a></h2>
@@ -375,21 +388,6 @@ and provide a tutorial for running Streaming ETL with
Flink on Zeppelin.</p>
<hr>
- <article>
- <h2 class="blog-title"><a
href="/news/2020/06/09/release-statefun-2.1.0.html">Stateful Functions 2.1.0
Release Announcement</a></h2>
-
- <p>09 Jun 2020
- Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
-
- <p><p>The Apache Flink community is happy to announce the release of
Stateful Functions (StateFun) 2.1.0! This release introduces new features
around state expiration and performance improvements for co-located
deployments, as well as other important changes that improve the stability and
testability of the project. As the community around StateFun grows, the release
cycle will follow this pattern of smaller and more frequent releases to
incorporate user feedback and allow for fast [...]
-
-</p>
-
- <p><a href="/news/2020/06/09/release-statefun-2.1.0.html">Continue
reading »</a></p>
- </article>
-
- <hr>
-
<!-- Pagination links -->
@@ -422,6 +420,16 @@ and provide a tutorial for running Streaming ETL with
Flink on Zeppelin.</p>
<ul id="markdown-toc">
+ <li><a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></li>
+
+
+
+
+
+
+
+
+
<li><a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></li>
diff --git a/content/index.html b/content/index.html
index 3e1937dfa..bd63af82d 100644
--- a/content/index.html
+++ b/content/index.html
@@ -405,6 +405,9 @@
<dl>
+ <dt> <a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></dt>
+ <dd>An overview of how to optimise the throughput of async sinks using
a custom RateLimitingStrategy</dd>
+
<dt> <a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></dt>
<dd><p>Apache Flink continues to grow at a rapid pace and is one of
the most active
communities in Apache. Flink 1.16 had over 240 contributors enthusiastically
participating,
@@ -422,11 +425,6 @@ with 19 FLIPs and 1100+ issues completed, bringing a lot
of exciting features to
<dt> <a href="/news/2022/09/28/release-1.14.6.html">Apache Flink
1.14.6 Release Announcement</a></dt>
<dd>The Apache Flink Community is pleased to announce another bug fix
release for Flink 1.14.</dd>
-
- <dt> <a href="/news/2022/09/08/akka-license-change.html">Regarding
Akka's licensing change</a></dt>
- <dd><p>On September 7th Lightbend announced a <a
href="https://www.lightbend.com/blog/why-we-are-changing-the-license-for-akka">license
change</a> for the Akka project, the TL;DR being that you will need a
commercial license to use future versions of Akka (2.7+) in production if you
exceed a certain revenue threshold.</p>
-
-</dd>
</dl>
diff --git a/content/zh/index.html b/content/zh/index.html
index bc20c0d3b..fe04cad0e 100644
--- a/content/zh/index.html
+++ b/content/zh/index.html
@@ -402,6 +402,9 @@
<dl>
+ <dt> <a
href="/2022/11/25/async-sink-rate-limiting-strategy.html">Optimising the
throughput of async sinks using a custom RateLimitingStrategy</a></dt>
+ <dd>An overview of how to optimise the throughput of async sinks using
a custom RateLimitingStrategy</dd>
+
<dt> <a href="/news/2022/10/28/1.16-announcement.html">Announcing the
Release of Apache Flink 1.16</a></dt>
<dd><p>Apache Flink continues to grow at a rapid pace and is one of
the most active
communities in Apache. Flink 1.16 had over 240 contributors enthusiastically
participating,
@@ -419,11 +422,6 @@ with 19 FLIPs and 1100+ issues completed, bringing a lot
of exciting features to
<dt> <a href="/news/2022/09/28/release-1.14.6.html">Apache Flink
1.14.6 Release Announcement</a></dt>
<dd>The Apache Flink Community is pleased to announce another bug fix
release for Flink 1.14.</dd>
-
- <dt> <a href="/news/2022/09/08/akka-license-change.html">Regarding
Akka's licensing change</a></dt>
- <dd><p>On September 7th Lightbend announced a <a
href="https://www.lightbend.com/blog/why-we-are-changing-the-license-for-akka">license
change</a> for the Akka project, the TL;DR being that you will need a
commercial license to use future versions of Akka (2.7+) in production if you
exceed a certain revenue threshold.</p>
-
-</dd>
</dl>