This is an automated email from the ASF dual-hosted git repository. pnowojski pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/flink-web.git
commit 207954c45c2e85b873921c063fa282e8880849eb Author: Piotr Nowojski <[email protected]> AuthorDate: Fri Jul 2 13:09:02 2021 +0200 Backpressure monitoring and analysis blog post --- _posts/2021-07-07-backpressure.md | 189 +++++++++++++ content/blog/feed.xml | 291 ++++++++++++++------- content/blog/index.html | 38 +-- content/blog/page10/index.html | 38 ++- content/blog/page11/index.html | 37 ++- content/blog/page12/index.html | 39 +-- content/blog/page13/index.html | 38 ++- content/blog/page14/index.html | 37 ++- content/blog/page15/index.html | 39 +-- content/blog/page16/index.html | 25 ++ content/blog/page2/index.html | 38 ++- content/blog/page3/index.html | 36 ++- content/blog/page4/index.html | 36 ++- content/blog/page5/index.html | 36 ++- content/blog/page6/index.html | 38 +-- content/blog/page7/index.html | 38 ++- content/blog/page8/index.html | 38 +-- content/blog/page9/index.html | 40 +-- content/index.html | 8 +- content/zh/index.html | 8 +- img/blog/2021-07-07-backpressure/animated.png | Bin 0 -> 847082 bytes .../2021-07-07-backpressure/bottleneck-zoom.png | Bin 0 -> 185048 bytes .../2021-07-07-backpressure/simple-example.png | Bin 0 -> 102308 bytes .../2021-07-07-backpressure/sliding-window.png | Bin 0 -> 22391 bytes .../2021-07-07-backpressure/source-task-busy.png | Bin 0 -> 25852 bytes img/blog/2021-07-07-backpressure/subtasks.png | Bin 0 -> 137756 bytes 26 files changed, 778 insertions(+), 309 deletions(-) diff --git a/_posts/2021-07-07-backpressure.md b/_posts/2021-07-07-backpressure.md new file mode 100644 index 0000000..b144d1b --- /dev/null +++ b/_posts/2021-07-07-backpressure.md @@ -0,0 +1,189 @@ +--- +layout: post +title: "How to identify the source of backpressure?" +date: 2021-07-07T00:00:00.000Z +authors: +- pnowojski: + name: "Piotr Nowojski" + twitter: "PiotrNowojski" +excerpt: Apache Flink 1.13 introduced a couple of important changes in the area of backpressure monitoring and performance analysis of Flink Jobs. 
This blog post aims to introduce those changes and explain how to use them.
+---
+
+{% toc %}
+
+<div class="row front-graphic">
+  <img src="{{ site.baseurl }}/img/blog/2021-07-07-backpressure/animated.png" alt="Backpressure monitoring in the web UI"/>
+  <p class="align-center">Backpressure monitoring in the web UI</p>
+</div>
+
+The backpressure topic was tackled from different angles over the last couple of years. However, when it comes
+to identifying and analyzing sources of backpressure, things have changed quite a bit in the recent Flink releases
+(especially with new additions to metrics and the web UI in Flink 1.13). This post will try to clarify some of
+these changes and go into more detail about how to track down the source of backpressure, but first...
+
+## What is backpressure?
+
+This has been explained very well in an old, but still accurate, [post by Ufuk Celebi](https://www.ververica.com/blog/how-flink-handles-backpressure).
+I highly recommend reading it if you are not familiar with this concept. For a much deeper, low-level understanding of
+the topic and of how Flink’s network stack works, there is a more [advanced explanation available here](https://alibabacloud.com/blog/analysis-of-network-flow-control-and-back-pressure-flink-advanced-tutorials_596632).
+
+At a high level, backpressure happens if some operator(s) in the Job Graph cannot process records at the
+same rate as they are received. This fills up the input buffers of the subtask that is running the slow operator.
+Once the input buffers are full, backpressure propagates to the output buffers of the upstream subtasks.
+Once those are filled up as well, the upstream subtasks are forced to slow down their record processing
+rate to match the processing rate of the bottleneck operator downstream. Backpressure propagates further
+upstream in this fashion until it reaches the source operators.
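The propagation just described can be modeled outside of Flink with a toy producer/consumer program. The sketch below is purely conceptual and is not Flink code: a bounded `BlockingQueue` stands in for Flink's bounded network buffers, and the `10 ms` consumer delay is an arbitrary illustrative value. Once the queue fills up, `put()` blocks and the fast producer is forced down to the slow consumer's rate:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Conceptual model only -- this is NOT Flink code. A bounded buffer sits
// between a fast producer and a slow consumer, much like Flink's network
// buffers sit between subtasks. Once the buffer is full, put() blocks and
// the producer is forced down to the consumer's processing rate.
public class BackpressureDemo {

    /**
     * Pushes {@code records} into a bounded buffer drained by a slow
     * consumer and returns the elapsed wall-clock time in milliseconds.
     */
    static long produce(int records, int bufferSize, long consumerDelayMs) {
        BlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(bufferSize);

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    buffer.take();                 // "process" one record...
                    Thread.sleep(consumerDelayMs); // ...slowly
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.setDaemon(true);
        consumer.start();

        long start = System.nanoTime();
        try {
            for (int i = 0; i < records; i++) {
                buffer.put(i); // blocks while the buffer is full: backpressure
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        consumer.interrupt();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // The producer could emit instantly, but once the 5-slot buffer is
        // full it is throttled to roughly the consumer's ~10 ms/record rate.
        System.out.println("Produced 50 records in " + produce(50, 5, 10) + " ms");
    }
}
```

Running it shows the producer taking roughly `records × consumerDelayMs` wall-clock time, even though producing into an unbounded buffer would be near-instant; this blocking-and-slowing-down is exactly what propagates upstream in a Flink job graph.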
+
+As long as the load and available resources are static and none of the operators produce short bursts of
+data (like windowing operators), those input/output buffers should only be in one of two states: almost empty
+or almost full. If the downstream operator or subtask is able to keep up with the influx of data, the
+buffers will be empty. If not, then the buffers will be full [<sup>1</sup>]. In fact, checking the buffers’ usage metrics
+was the basis of the previously recommended way to detect and analyze backpressure, described [a couple
+of years back by Nico Kruber](https://flink.apache.org/2019/07/23/flink-network-stack-2.html#backpressure).
+As I mentioned in the beginning, Flink now offers much better tools to do the same job, but before we get to that,
+there are two questions worth asking.
+
+### Why should I care about backpressure?
+
+Backpressure is an indicator that your machines or operators are overloaded. The buildup of backpressure
+directly affects the end-to-end latency of the system, as records wait longer in the queues before
+being processed. Secondly, aligned checkpointing takes longer under backpressure, while unaligned checkpoints
+will be larger (you can read more about aligned and unaligned checkpoints [in the documentation](https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/concepts/stateful-stream-processing/#checkpointing)).
+If you are struggling with checkpoint barrier propagation times, taking care of backpressure would most
+likely help to solve the problem. Lastly, you might simply want to optimize your job in order to reduce
+the cost of running it.
+
+In order to address the problem in any of these cases, one first needs to be aware of it, then locate and analyze it.
+
+### Why shouldn’t I care about backpressure?
+
+Frankly, you do not always have to care about the presence of backpressure.
Almost by definition, the lack
+of backpressure means that your cluster is at least ever so slightly underutilized and over-provisioned.
+If you want to minimize idling resources, you probably cannot avoid incurring some backpressure. This
+is especially true for batch processing.
+
+## How to detect and track down the source of backpressure?
+
+One way to detect backpressure is to use [metrics](https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/metrics/#system-metrics);
+however, in Flink 1.13 it’s no longer necessary to dig so deep. In most cases, it should be enough to just
+look at the job graph in the web UI.
+
+<div class="row front-graphic">
+  <img src="{{ site.baseurl }}/img/blog/2021-07-07-backpressure/simple-example.png"/>
+</div>
+
+The first thing to note in the example above is that different tasks have different colors. Those colors
+represent a combination of two factors: how backpressured the task is and how busy it is. Idling
+tasks will be blue, fully busy tasks will be red hot, and fully backpressured tasks will be black. Anything
+in between will be a combination/shade of those three colors. With this knowledge, one can easily spot the
+backpressured tasks (black). The busiest (red) task downstream of the backpressured tasks will most likely
+be the source of the backpressure (the bottleneck).
+
+If you click on one particular task and go into the “BackPressure” tab, you will be able to further dissect
+the problem and check the busy/backpressured/idle status of every subtask in that task. This is especially
+handy, for example, if there is data skew and not all subtasks are equally utilized.
+
+<div class="row front-graphic">
+  <img src="{{ site.baseurl }}/img/blog/2021-07-07-backpressure/subtasks.png" alt="Backpressure among subtasks"/>
+  <p class="align-center">Backpressure among subtasks</p>
+</div>
+
+In the above example, we can clearly see which subtasks are idling, which are backpressured, and that
+none of them are busy. Frankly, that should be enough to quickly understand what is
+happening with your job :) However, there are a couple more details worth explaining.
+
+### What are those numbers?
+
+If you are curious how this works under the hood, we can go a little deeper. At the base of this new mechanism
+we have three [new metrics](https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/metrics/#io)
+that are exposed and calculated by each subtask:
+
+- `idleTimeMsPerSecond`
+- `busyTimeMsPerSecond`
+- `backPressuredTimeMsPerSecond`
+
+Each of them measures the average time in milliseconds per second that the subtask spent being idle,
+busy, or backpressured, respectively. Apart from some rounding errors, they should complement each other and
+add up to 1000 ms/s. In essence, they are quite similar to, for example, CPU usage metrics.
+
+Another important detail is that they are averaged over a short period of time (a couple of seconds)
+and that they take into account everything that is happening inside the subtask’s thread: operators, functions,
+timers, checkpointing, record serialization/deserialization, the network stack, and other Flink-internal
+overheads. A `WindowOperator` that is busy firing timers and producing results will be reported as busy or backpressured.
+A function doing some expensive computation in a `CheckpointedFunction#snapshotState` call, for instance flushing
+internal buffers, will also be reported as busy.
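To make the relationship between those three metrics concrete, here is a small sketch in plain Java. It is illustrative only: the `SubtaskTimeMetrics` class is hypothetical and merely mirrors the metric names above; it is not Flink's internal implementation.

```java
// Illustrative sketch only -- models the per-subtask time metrics described
// above, not Flink's implementation. The invariant: idle + busy +
// backpressured time should add up to roughly 1000 ms per second.
public class SubtaskTimeMetrics {
    final double idleTimeMsPerSecond;
    final double busyTimeMsPerSecond;
    final double backPressuredTimeMsPerSecond;

    SubtaskTimeMetrics(double idle, double busy, double backPressured) {
        this.idleTimeMsPerSecond = idle;
        this.busyTimeMsPerSecond = busy;
        this.backPressuredTimeMsPerSecond = backPressured;
    }

    /** Fraction of wall-clock time spent busy, comparable to a CPU usage reading. */
    double busyRatio() {
        return busyTimeMsPerSecond / 1000.0;
    }

    /** The three metrics should sum to ~1000 ms/s, up to rounding errors. */
    boolean isConsistent(double toleranceMs) {
        double total = idleTimeMsPerSecond
                + busyTimeMsPerSecond
                + backPressuredTimeMsPerSecond;
        return Math.abs(total - 1000.0) <= toleranceMs;
    }

    public static void main(String[] args) {
        // Hypothetical sample: 250 ms idle, 500 ms busy, 250 ms backpressured.
        SubtaskTimeMetrics m = new SubtaskTimeMetrics(250.0, 500.0, 250.0);
        System.out.println("busy ratio: " + m.busyRatio());       // 0.5
        System.out.println("consistent: " + m.isConsistent(5.0)); // true
    }
}
```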
+
+One limitation, however, is that the `busyTimeMsPerSecond` and `idleTimeMsPerSecond` metrics are oblivious
+to anything that is happening in separate threads, outside of the main subtask’s execution loop.
+Fortunately, this is only relevant for two cases:
+
+- Custom threads that you manually spawn in your operators (a discouraged practice).
+- Old-style sources that implement the deprecated `SourceFunction` interface. Such sources will report `NaN`/`N/A`
+as the value for `busyTimeMsPerSecond`. For more information on the topic of data sources, please
+[take a look here](https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/datastream/sources/).
+
+<div class="row front-graphic">
+  <img src="{{ site.baseurl }}/img/blog/2021-07-07-backpressure/source-task-busy.png" alt="Old-style sources do not report busy time"/>
+  <p class="align-center">Old-style sources do not report busy time</p>
+</div>
+
+To present those raw numbers in the web UI, the metrics need to be aggregated across all subtasks
+(on the job graph we are showing only tasks). This is why the web UI presents the maximum value from all
+subtasks of a given task and why the aggregated maximum values of busy and backpressured may not add up to 100%.
+One subtask can be backpressured at 60%, while another can be busy at 60%. This can result in a task that
+is reported as both backpressured and busy at 60%.
+
+### Varying load
+
+There is one more thing. Do you remember that those metrics are measured and averaged over a couple of seconds?
+Keep this in mind when analyzing jobs or tasks with varying load, such as (sub)tasks containing a `WindowOperator`
+that fires periodically. Both a subtask with a constant load of 50% and a subtask that alternates every
+second between being fully busy and fully idle will report the same `busyTimeMsPerSecond` value
+of 500 ms/s.
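That averaging effect is easy to reproduce with a toy computation (illustrative Java with made-up per-second sample values; not Flink code):

```java
import java.util.stream.DoubleStream;

// Illustrative only: averaging per-second busy-time samples over a
// measurement window hides whether the load was steady or bursty.
public class BusyTimeAveraging {

    /** Average of per-second busyTimeMsPerSecond samples over a window. */
    static double windowAverage(double... busyMsPerSecondSamples) {
        return DoubleStream.of(busyMsPerSecondSamples).average().orElse(0.0);
    }

    public static void main(String[] args) {
        // A constant 50% load and a load that alternates between fully busy
        // and fully idle every second both average out to 500 ms/s.
        System.out.println(windowAverage(500, 500, 500, 500)); // 500.0
        System.out.println(windowAverage(1000, 0, 1000, 0));   // 500.0
    }
}
```

Both windows report the identical averaged value, so the metric alone cannot tell a steady load apart from a bursty one.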
+
+Furthermore, varying load and especially firing windows can move the bottleneck to a different place in
+the job graph:
+
+<div class="row front-graphic">
+  <img src="{{ site.baseurl }}/img/blog/2021-07-07-backpressure/bottleneck-zoom.png" alt="Bottleneck alternating between two tasks"/>
+  <p class="align-center">Bottleneck alternating between two tasks</p>
+</div>
+
+<div class="row front-graphic">
+  <img src="{{ site.baseurl }}/img/blog/2021-07-07-backpressure/sliding-window.png" alt="SlidingWindowOperator"/>
+  <p class="align-center">SlidingWindowOperator</p>
+</div>
+
+In this particular example, `SlidingWindowOperator` is the bottleneck as long as it is accumulating records.
+However, as soon as it starts to fire its windows (once every 10 seconds), the downstream task
+`SlidingWindowCheckMapper -> Sink: SlidingWindowCheckPrintSink` becomes the bottleneck and `SlidingWindowOperator`
+gets backpressured. As the busy/backpressured/idle metrics average time over a couple of seconds,
+this subtlety is not immediately visible and has to be read between the lines. On top of that, the web UI
+updates its state only once every 10 seconds, which makes spotting more frequent changes a bit more difficult.
+
+## What can I do with backpressure?
+
+In general, this is a complex topic that is worth a dedicated blog post. It was, to a certain extent,
+addressed in [previous blog posts](https://flink.apache.org/2019/07/23/flink-network-stack-2.html#:~:text=this%20is%20unnecessary.-,What%20to%20do%20with%20Backpressure%3F,-Assuming%20that%20you).
+In short, there are two high-level ways of dealing with backpressure. Either add more resources (more machines,
+faster CPUs, more RAM, a better network, SSDs…) or optimize usage of the resources you already have
+(optimize the code, tune the configuration, avoid data skew). In either case, you first need to analyze
+what is causing backpressure by:
+
+1. Identifying the presence of backpressure.
+2. 
Locating which subtask(s) or machines are causing it.
+3. Digging deeper into what part of the code is causing it and which resource is scarce.
+
+Backpressure monitoring improvements and metrics can help you with the first two points. To tackle the
+last one, profiling the code can be the way to go. To help with profiling, starting from Flink 1.13,
+[Flame Graphs](http://www.brendangregg.com/flamegraphs.html) are also [integrated into Flink's web UI](https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/debugging/flame_graphs/).
+Flame Graphs are a well-known profiling and visualization technique, and I encourage you to give them a try.
+
+But keep in mind that after locating the bottleneck, you can analyze it the same way you would
+any other non-distributed application (by checking resource utilization, attaching a profiler, etc.).
+Usually there is no silver bullet for problems like this. You can try to scale up, but sometimes it might
+not be easy or practical to do so.
+
+Anyway... The aforementioned improvements to backpressure monitoring allow us to easily detect the source of backpressure,
+and Flame Graphs can help us analyze why a particular subtask is causing problems. Together, those two
+features should make the previously quite tedious process of debugging and performance analysis of Flink
+jobs that much easier! Please upgrade to Flink 1.13.x and try them out!
+
+[<sup>1</sup>] There is a third possibility. In the rare case where the network exchange is actually the bottleneck in your job,
+the downstream task will have empty input buffers, while the upstream output buffers will be full.
<a class="anchor" id="1"></a> \ No newline at end of file diff --git a/content/blog/feed.xml b/content/blog/feed.xml index 8e79808..dcf63e1 100644 --- a/content/blog/feed.xml +++ b/content/blog/feed.xml @@ -7,6 +7,207 @@ <atom:link href="https://flink.apache.org/blog/feed.xml" rel="self" type="application/rss+xml" /> <item> +<title>How to identify the source of backpressure?</title> +<description><div class="page-toc"> +<ul id="markdown-toc"> + <li><a href="#what-is-backpressure" id="markdown-toc-what-is-backpressure">What is backpressure?</a> <ul> + <li><a href="#why-should-i-care-about-backpressure" id="markdown-toc-why-should-i-care-about-backpressure">Why should I care about backpressure?</a></li> + <li><a href="#why-shouldnt-i-care-about-backpressure" id="markdown-toc-why-shouldnt-i-care-about-backpressure">Why shouldn’t I care about backpressure?</a></li> + </ul> + </li> + <li><a href="#how-to-detect-and-track-down-the-source-of-backpressure" id="markdown-toc-how-to-detect-and-track-down-the-source-of-backpressure">How to detect and track down the source of backpressure?</a> <ul> + <li><a href="#what-are-those-numbers" id="markdown-toc-what-are-those-numbers">What are those numbers?</a></li> + <li><a href="#varying-load" id="markdown-toc-varying-load">Varying load</a></li> + </ul> + </li> + <li><a href="#what-can-i-do-with-backpressure" id="markdown-toc-what-can-i-do-with-backpressure">What can I do with backpressure?</a></li> +</ul> + +</div> + +<div class="row front-graphic"> + <img src="/img/blog/2021-07-07-backpressure/animated.png" alt="Backpressure monitoring in the web UI" /> + <p class="align-center">Backpressure monitoring in the web UI</p> +</div> + +<p>The backpressure topic was tackled from different angles over the last couple of years. However, when it comes +to identifying and analyzing sources of backpressure, things have changed quite a bit in the recent Flink releases +(especially with new additions to metrics and the web UI in Flink 1.13). 
This post will try to clarify some of
+these changes and go into more detail about how to track down the source of backpressure, but first…</p>
+
+<h2 id="what-is-backpressure">What is backpressure?</h2>
+
+<p>This has been explained very well in an old, but still accurate, <a href="https://www.ververica.com/blog/how-flink-handles-backpressure">post by Ufuk Celebi</a>.
+I highly recommend reading it if you are not familiar with this concept. For a much deeper, low-level understanding of
+the topic and of how Flink’s network stack works, there is a more <a href="https://alibabacloud.com/blog/analysis-of-network-flow-control-and-back-pressure-flink-advanced-tutorials_596632">advanced explanation available here</a>.</p>
+
+<p>At a high level, backpressure happens if some operator(s) in the Job Graph cannot process records at the
+same rate as they are received. This fills up the input buffers of the subtask that is running the slow operator.
+Once the input buffers are full, backpressure propagates to the output buffers of the upstream subtasks.
+Once those are filled up as well, the upstream subtasks are forced to slow down their record processing
+rate to match the processing rate of the bottleneck operator downstream. Backpressure propagates further
+upstream in this fashion until it reaches the source operators.</p>
+
+<p>As long as the load and available resources are static and none of the operators produce short bursts of
+data (like windowing operators), those input/output buffers should only be in one of two states: almost empty
+or almost full. If the downstream operator or subtask is able to keep up with the influx of data, the
+buffers will be empty. If not, then the buffers will be full [<sup>1</sup>]. 
In fact, checking the buffers’ usage metrics
+was the basis of the previously recommended way to detect and analyze backpressure, described <a href="https://flink.apache.org/2019/07/23/flink-network-stack-2.html#backpressure">a couple
+of years back by Nico Kruber</a>.
+As I mentioned in the beginning, Flink now offers much better tools to do the same job, but before we get to that,
+there are two questions worth asking.</p>
+
+<h3 id="why-should-i-care-about-backpressure">Why should I care about backpressure?</h3>
+
+<p>Backpressure is an indicator that your machines or operators are overloaded. The buildup of backpressure
+directly affects the end-to-end latency of the system, as records wait longer in the queues before
+being processed. Secondly, aligned checkpointing takes longer under backpressure, while unaligned checkpoints
+will be larger (you can read more about aligned and unaligned checkpoints <a href="https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/concepts/stateful-stream-processing/#checkpointing">in the documentation</a>).
+If you are struggling with checkpoint barrier propagation times, taking care of backpressure would most
+likely help to solve the problem. Lastly, you might simply want to optimize your job in order to reduce
+the cost of running it.</p>
+
+<p>In order to address the problem in any of these cases, one first needs to be aware of it, then locate and analyze it.</p>
+
+<h3 id="why-shouldnt-i-care-about-backpressure">Why shouldn’t I care about backpressure?</h3>
+
+<p>Frankly, you do not always have to care about the presence of backpressure. Almost by definition, the lack
+of backpressure means that your cluster is at least ever so slightly underutilized and over-provisioned.
+If you want to minimize idling resources, you probably cannot avoid incurring some backpressure.
This
+is especially true for batch processing.</p>
+
+<h2 id="how-to-detect-and-track-down-the-source-of-backpressure">How to detect and track down the source of backpressure?</h2>
+
+<p>One way to detect backpressure is to use <a href="https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/metrics/#system-metrics">metrics</a>;
+however, in Flink 1.13 it’s no longer necessary to dig so deep. In most cases, it should be enough to just
+look at the job graph in the web UI.</p>
+
+<div class="row front-graphic">
+  <img src="/img/blog/2021-07-07-backpressure/simple-example.png" />
+</div>
+
+<p>The first thing to note in the example above is that different tasks have different colors. Those colors
+represent a combination of two factors: how backpressured the task is and how busy it is. Idling
+tasks will be blue, fully busy tasks will be red hot, and fully backpressured tasks will be black. Anything
+in between will be a combination/shade of those three colors. With this knowledge, one can easily spot the
+backpressured tasks (black). The busiest (red) task downstream of the backpressured tasks will most likely
+be the source of the backpressure (the bottleneck).</p>
+
+<p>If you click on one particular task and go into the “BackPressure” tab, you will be able to further dissect
+the problem and check the busy/backpressured/idle status of every subtask in that task. This is especially
+handy, for example, if there is data skew and not all subtasks are equally utilized.</p>
+
+<div class="row front-graphic">
+  <img src="/img/blog/2021-07-07-backpressure/subtasks.png" alt="Backpressure among subtasks" />
+  <p class="align-center">Backpressure among subtasks</p>
+</div>
+
+<p>In the above example, we can clearly see which subtasks are idling, which are backpressured, and that
+none of them are busy.
Frankly, that should be enough to quickly understand what is
+happening with your job :) However, there are a couple more details worth explaining.</p>
+
+<h3 id="what-are-those-numbers">What are those numbers?</h3>
+
+<p>If you are curious how this works under the hood, we can go a little deeper. At the base of this new mechanism
+we have three <a href="https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/metrics/#io">new metrics</a>
+that are exposed and calculated by each subtask:</p>
+
+<ul>
+  <li><code>idleTimeMsPerSecond</code></li>
+  <li><code>busyTimeMsPerSecond</code></li>
+  <li><code>backPressuredTimeMsPerSecond</code></li>
+</ul>
+
+<p>Each of them measures the average time in milliseconds per second that the subtask spent being idle,
+busy, or backpressured, respectively. Apart from some rounding errors, they should complement each other and
+add up to 1000 ms/s. In essence, they are quite similar to, for example, CPU usage metrics.</p>
+
+<p>Another important detail is that they are averaged over a short period of time (a couple of seconds)
+and that they take into account everything that is happening inside the subtask’s thread: operators, functions,
+timers, checkpointing, record serialization/deserialization, the network stack, and other Flink-internal
+overheads. A <code>WindowOperator</code> that is busy firing timers and producing results will be reported as busy or backpressured.
+A function doing some expensive computation in a <code>CheckpointedFunction#snapshotState</code> call, for instance flushing
+internal buffers, will also be reported as busy.</p>
+
+<p>One limitation, however, is that the <code>busyTimeMsPerSecond</code> and <code>idleTimeMsPerSecond</code> metrics are oblivious
+to anything that is happening in separate threads, outside of the main subtask’s execution loop.
+Fortunately, this is only relevant for two cases:
+- Custom threads that you manually spawn in your operators (a discouraged practice).
+- Old-style sources that implement the deprecated <code>SourceFunction</code> interface. Such sources will report <code>NaN</code>/<code>N/A</code>
+as the value for <code>busyTimeMsPerSecond</code>. For more information on the topic of data sources, please
+<a href="https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/datastream/sources/">take a look here</a>.</p>
+
+<div class="row front-graphic">
+  <img src="/img/blog/2021-07-07-backpressure/source-task-busy.png" alt="Old-style sources do not report busy time" />
+  <p class="align-center">Old-style sources do not report busy time</p>
+</div>
+
+<p>To present those raw numbers in the web UI, the metrics need to be aggregated across all subtasks
+(on the job graph we are showing only tasks). This is why the web UI presents the maximum value from all
+subtasks of a given task and why the aggregated maximum values of busy and backpressured may not add up to 100%.
+One subtask can be backpressured at 60%, while another can be busy at 60%. This can result in a task that
+is reported as both backpressured and busy at 60%.</p>
+
+<h3 id="varying-load">Varying load</h3>
+
+<p>There is one more thing. Do you remember that those metrics are measured and averaged over a couple of seconds?
+Keep this in mind when analyzing jobs or tasks with varying load, such as (sub)tasks containing a <code>WindowOperator</code>
+that fires periodically. 
Both a subtask with a constant load of 50% and a subtask that alternates every
+second between being fully busy and fully idle will report the same <code>busyTimeMsPerSecond</code> value
+of 500 ms/s.</p>
+
+<p>Furthermore, varying load and especially firing windows can move the bottleneck to a different place in
+the job graph:</p>
+
+<div class="row front-graphic">
+  <img src="/img/blog/2021-07-07-backpressure/bottleneck-zoom.png" alt="Bottleneck alternating between two tasks" />
+  <p class="align-center">Bottleneck alternating between two tasks</p>
+</div>
+
+<div class="row front-graphic">
+  <img src="/img/blog/2021-07-07-backpressure/sliding-window.png" alt="SlidingWindowOperator" />
+  <p class="align-center">SlidingWindowOperator</p>
+</div>
+
+<p>In this particular example, <code>SlidingWindowOperator</code> is the bottleneck as long as it is accumulating records.
+However, as soon as it starts to fire its windows (once every 10 seconds), the downstream task
+<code>SlidingWindowCheckMapper -&gt; Sink: SlidingWindowCheckPrintSink</code> becomes the bottleneck and <code>SlidingWindowOperator</code>
+gets backpressured. As the busy/backpressured/idle metrics average time over a couple of seconds,
+this subtlety is not immediately visible and has to be read between the lines. On top of that, the web UI
+updates its state only once every 10 seconds, which makes spotting more frequent changes a bit more difficult.</p>
+
+<h2 id="what-can-i-do-with-backpressure">What can I do with backpressure?</h2>
+
+<p>In general, this is a complex topic that is worth a dedicated blog post. It was, to a certain extent,
+addressed in <a href="https://flink.apache.org/2019/07/23/flink-network-stack-2.html#:~:text=this%20is%20unnecessary.-,What%20to%20do%20with%20Backpressure%3F,-Assuming%20that%20you">previous blog posts</a>.
+In short, there are two high-level ways of dealing with backpressure.
Either add more resources (more machines,
+faster CPUs, more RAM, a better network, SSDs…) or optimize usage of the resources you already have
+(optimize the code, tune the configuration, avoid data skew). In either case, you first need to analyze
+what is causing backpressure by:</p>
+
+<ol>
+  <li>Identifying the presence of backpressure.</li>
+  <li>Locating which subtask(s) or machines are causing it.</li>
+  <li>Digging deeper into what part of the code is causing it and which resource is scarce.</li>
+</ol>
+
+<p>Backpressure monitoring improvements and metrics can help you with the first two points. To tackle the
+last one, profiling the code can be the way to go. To help with profiling, starting from Flink 1.13,
+<a href="http://www.brendangregg.com/flamegraphs.html">Flame Graphs</a> are also <a href="https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/debugging/flame_graphs/">integrated into Flink’s web UI</a>.
+Flame Graphs are a well-known profiling and visualization technique, and I encourage you to give them a try.</p>
+
+<p>But keep in mind that after locating the bottleneck, you can analyze it the same way you would
+any other non-distributed application (by checking resource utilization, attaching a profiler, etc.).
+Usually there is no silver bullet for problems like this. You can try to scale up, but sometimes it might
+not be easy or practical to do so.</p>
+
+<p>Anyway… The aforementioned improvements to backpressure monitoring allow us to easily detect the source of backpressure,
+and Flame Graphs can help us analyze why a particular subtask is causing problems. Together, those two
+features should make the previously quite tedious process of debugging and performance analysis of Flink
+jobs that much easier! Please upgrade to Flink 1.13.x and try them out!</p>
+
+<p>[<sup>1</sup>] There is a third possibility.
In the rare case where the network exchange is actually the bottleneck in your job, +the downstream task will have empty input buffers, while the upstream output buffers will be full. <a class="anchor" id="1"></a></p> +</description> +<pubDate>Wed, 07 Jul 2021 02:00:00 +0200</pubDate> +<link>https://flink.apache.org/2021/07/07/backpressure.html</link> +<guid isPermaLink="true">/2021/07/07/backpressure.html</guid> +</item> + +<item> <title>Apache Flink 1.13.1 Released</title> <description><p>The Apache Flink community released the first bugfix version of the Apache Flink 1.13 series.</p> @@ -19040,95 +19241,5 @@ Feedback through the Flink <a href="http://flink.apache.org/community.ht <guid isPermaLink="true">/news/2018/02/15/release-1.4.1.html</guid> </item> -<item> -<title>Managing Large State in Apache Flink: An Intro to Incremental Checkpointing</title> -<description><p>Apache Flink was purpose-built for <em>stateful</em> stream processing. However, what is state in a stream processing application? I defined state and stateful stream processing in a <a href="http://flink.apache.org/features/2017/07/04/flink-rescalable-state.html">previous blog post</a>, and in case you need a refresher, <em>state is defined as memory in an application’s operators that stores information about previously-seen [...] - -<p>State is a fundamental, enabling concept in stream processing required for a majority of complex use cases. 
Some examples highlighted in the <a href="https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/state.html">Flink documentation</a>:</p> - -<ul> - <li>When an application searches for certain event patterns, the state stores the sequence of events encountered so far.</li> - <li>When aggregating events per minute, the state holds the pending aggregates.</li> - <li>When training a machine learning model over a stream of data points, the state holds the current version of the model parameters.</li> -</ul> - -<p>However, stateful stream processing is only useful in production environments if the state is fault tolerant. “Fault tolerance” means that even if there’s a software or machine failure, the computed end-result is accurate, with no data loss or double-counting of events.</p> - -<p>Flink’s fault tolerance has always been a powerful and popular feature, minimizing the impact of software or machine failure on your business and making it possible to guarantee exactly-once results from a Flink application.</p> - -<p>Core to this is <a href="https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/checkpointing.html">checkpointing</a>, which is the mechanism Flink uses to make application state fault tolerant. A checkpoint in Flink is a global, asynchronous snapshot of application state that’s taken on a regular interval and sent to durable storage (usually, a distributed file system). In the event of a failure, Flink restarts an application using the most [...] - -<p>Before incremental checkpointing, every single Flink checkpoint consisted of the full state of an application. We created the incremental checkpointing feature after we noticed that writing the full state for every checkpoint was often unnecessary, as the state changes from one checkpoint to the next were rarely that large. 
Incremental checkpointing instead maintains the differences (or ‘delta’) between each checkpoint and stores only the differences between the last checkpoint [...] - -<p>Incremental checkpoints can provide a significant performance improvement for jobs with a very large state. Early testing of the feature by a production user with terabytes of state shows a drop in checkpoint time from more than 3 minutes down to 30 seconds after implementing incremental checkpoints. This is because Flink no longer needs to transfer the full state to durable storage on each checkpoint.</p> - -<h3 id="how-to-start">How to Start</h3> - -<p>Currently, you can only use incremental checkpointing with a RocksDB state back-end, and Flink uses RocksDB’s internal backup mechanism to consolidate checkpoint data over time. As a result, the incremental checkpoint history in Flink does not grow indefinitely, and Flink eventually consumes and prunes old checkpoints automatically.</p> - -<p>To enable incremental checkpointing in your application, I recommend you read <a href="https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/state/large_state_tuning.html#tuning-rocksdb">the Apache Flink documentation on checkpointing</a> for full details; in summary, you enable checkpointing as normal and additionally enable incremental checkpointing in the constructor by setting the second parameter to <code>true</code>.</p> - -<h4 id="java-example">Java Example</h4> - -<div class="highlight"><pre><code class="language-java"><span class="n">StreamExecutionEnvironment</span> <span class="n">env</span> <span class="o">=</span> <span class="n">StreamExecutionEnvironment</span><span class="o">.</span><span class="na">getExecutionEnvironment</span><span class="o">();</span> -<span class="n">env</span><span class="o">.</span><span class="na">setStateBackend</span><span class="o">(</span><span class="k">new</span> <span class="nf">RocksDBStateBackend</span><span class="o">(</span><span
class="n">filebackend</span><span class="o">,</span> <span class="kc" [...] - -<h4 id="scala-example">Scala Example</h4> - -<div class="highlight"><pre><code class="language-scala"><span class="k">val</span> <span class="n">env</span> <span class="k">=</span> <span class="nc">StreamExecutionEnvironment</span><span class="o">.</span><span class="n">getExecutionEnvironment</span><span class="o">()</span> -<span class="n">env</span><span class="o">.</span><span class="n">setStateBackend</span><span class="o">(</span><span class="k">new</span> <span class="nc">RocksDBStateBackend</span><span class="o">(</span><span class="n">filebackend</span><span class="o">,</span> <span class="kc" [...] - -<p>By default, Flink retains 1 completed checkpoint, so if you need a higher number, <a href="https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/state/checkpointing.html#related-config-options">you can configure it with the following flag</a>:</p> - -<div class="highlight"><pre><code class="language-java"><span class="n">state</span><span class="o">.</span><span class="na">checkpoints</span><span class="o">.</span><span class="na">num</span><span class="o">-</span><span class="n">retained</span></code></pre></div> - -<h3 id="how-it-works">How it Works</h3> - -<p>Flink’s incremental checkpointing uses <a href="https://github.com/facebook/rocksdb/wiki/Checkpoints">RocksDB checkpoints</a> as a foundation. RocksDB is a key-value store based on ‘<a href="https://en.wikipedia.org/wiki/Log-structured_merge-tree">log-structured-merge</a>’ (LSM) trees that collects all changes in a mutable (changeable) in-memory buffer called a ‘memtable’. Any updates to the same key in the memtable replace previous va [...] 
- -<p>A ‘compaction’ background task merges sstables to consolidate potential duplicates for each key, and over time RocksDB deletes the original sstables, with the merged sstable containing all information from across all the other sstables.</p> - -<p>On top of this, Flink tracks which sstable files RocksDB has created and deleted since the previous checkpoint, and as the sstables are immutable, Flink uses this to figure out the state changes. To do this, Flink triggers a flush in RocksDB, forcing all memtables into sstables on disk, which are then hard-linked in a local temporary directory. This part of the process is synchronous to the processing pipeline; Flink performs all further steps asynchronously and does not block processing.</p> - -<p>Then Flink copies all new sstables to stable storage (e.g., HDFS, S3) to reference in the new checkpoint. Flink doesn’t copy the sstables that already existed in the previous checkpoint to stable storage again but re-references them. Any new checkpoints will no longer reference deleted files, as deleted sstables in RocksDB are always the result of compaction, which eventually replaces old tables with an sstable that is the result of a merge. This is how, in Flink’s incremental checkpoints [...] - -<p>For tracking changes between checkpoints, the uploading of consolidated tables is redundant work. Flink performs the process incrementally, and it typically adds only a small overhead, so we consider this worthwhile because it allows Flink to keep a shorter history of checkpoints to consider in a recovery.</p> - -<h4 id="an-example">An Example</h4> - -<p><img src="/img/blog/incremental_cp_impl_example.svg" alt="Example setup" /> -<em>Example setup</em></p> - -<p>Take an example with a subtask of one operator that has a keyed state, and the number of retained checkpoints set at <strong>2</strong>.
The columns in the figure above show the state of the local RocksDB instance for each checkpoint, the files it references, and the counts in the shared state registry after the checkpoint completes.</p> - -<p>For checkpoint ‘CP 1’, the local RocksDB directory contains two sstable files; Flink considers these new and uploads them to stable storage using directory names that match the checkpoint name. When the checkpoint completes, Flink creates the two entries in the shared state registry and sets their counts to ‘1’. The key in the shared state registry is a composite of an operator, subtask, and the original sstable file name. The registry also keeps a mapping from the key to the file [...] - -<p>For checkpoint ‘CP 2’, RocksDB has created two new sstable files, and the two older ones still exist. Flink adds the two new files to stable storage and can reference the previous two files. When the checkpoint completes, Flink increases the counts for all referenced files by 1.</p> - -<p>For checkpoint ‘CP 3’, RocksDB’s compaction has merged <code>sstable-(1)</code>, <code>sstable-(2)</code>, and <code>sstable-(3)</code> into <code>sstable-(1,2,3)</code> and deleted the original files. This merged file contains the same information as the source files, with all duplicate entries eliminated. In addition to this merged file, <code>sstable-(4)</code> still exists and there is now a new <code>sstable- [...] - -<p>For checkpoint ‘CP-4’, RocksDB has merged <code>sstable-(4)</code>, <code>sstable-(5)</code>, and a new <code>sstable-(6)</code> into <code>sstable-(4,5,6)</code>. Flink adds this new table to stable storage and references it together with <code>sstable-(1,2,3)</code>; it increases the counts for <code>sstable-(1,2,3)</code> and <code>sstable-(4,5,6)</code> by 1 and then deletes ‘CP-2’ as the num [...]
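The reference counting walked through in this example can be condensed into a small, self-contained simulation. Note that the class and method names below (`SharedStateRegistrySketch`, `completeCheckpoint`) are purely illustrative and are not Flink's internal `SharedStateRegistry` API; this is a sketch of the bookkeeping scheme, not of the real implementation:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Toy model of the shared-state-registry bookkeeping described above:
 * each completed checkpoint registers the sstable files it references,
 * every reference increments a per-file count, and when the oldest
 * retained checkpoint is dropped, files whose count reaches zero can be
 * deleted from stable storage. Names are hypothetical, not Flink's API.
 */
public class SharedStateRegistrySketch {

    /** File name -> number of retained checkpoints referencing it. */
    private final Map<String, Integer> refCounts = new HashMap<>();

    /** Retained checkpoints, oldest first; each is the list of files it references. */
    private final Deque<List<String>> retained = new ArrayDeque<>();

    private final int numRetained;

    public SharedStateRegistrySketch(int numRetained) {
        this.numRetained = numRetained;
    }

    /** Registers a newly completed checkpoint and prunes the oldest one if needed. */
    public void completeCheckpoint(List<String> referencedFiles) {
        for (String file : referencedFiles) {
            refCounts.merge(file, 1, Integer::sum); // +1 per referencing checkpoint
        }
        retained.addLast(referencedFiles);
        if (retained.size() > numRetained) {
            // Drop the oldest checkpoint and decrement its references.
            for (String file : retained.removeFirst()) {
                if (refCounts.merge(file, -1, Integer::sum) == 0) {
                    // No retained checkpoint references it any more:
                    // it could now be deleted from stable storage.
                    refCounts.remove(file);
                }
            }
        }
    }

    /** Current reference count for a file (0 if unknown or already pruned). */
    public int count(String file) {
        return refCounts.getOrDefault(file, 0);
    }
}
```

Replaying the checkpoints ‘CP 1’ through ‘CP 4’ from the figure against this sketch reproduces the counts described in the text: once ‘CP 2’ is dropped, the counts of <code>sstable-(1)</code> to <code>sstable-(3)</code> reach zero and those files become deletable, while <code>sstable-(1,2,3)</code> remains referenced by the two retained checkpoints.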
- -<h3 id="race-conditions-and-concurrent-checkpoints">Race Conditions and Concurrent Checkpoints</h3> - -<p>As Flink can execute multiple checkpoints in parallel, sometimes new checkpoints start before previous checkpoints are confirmed as completed. Because of this, you should consider which previous checkpoint to use as the basis for a new incremental checkpoint. Flink only references state from a checkpoint confirmed by the checkpoint coordinator so that it doesn’t unintentionally reference a deleted shared file.</p> - -<h3 id="restoring-checkpoints-and-performance-considerations">Restoring Checkpoints and Performance Considerations</h3> - -<p>If you enable incremental checkpointing, there are no further configuration steps needed to recover your state in case of failure. If a failure occurs, Flink’s <code>JobManager</code> tells all tasks to restore from the last completed checkpoint, be it a full or incremental checkpoint. Each <code>TaskManager</code> then downloads its share of the state from the checkpoint on the distributed file system.</p> - -<p>Though the feature can lead to a substantial improvement in checkpoint time for users with a large state, there are trade-offs to consider with incremental checkpointing. Overall, the process reduces the checkpointing time during normal operations but can lead to a longer recovery time depending on the size of your state. If the cluster failure is particularly severe and the Flink <code>TaskManager</code>s have to read from multiple checkpoints, recovery can be a slo [...]
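For reference, the setup discussed here (RocksDB back-end, incremental checkpoints, two retained checkpoints) can also be expressed as a configuration-file sketch. This assumes a Flink version in which these `flink-conf.yaml` keys are available; in the 1.4-era API shown earlier, the incremental flag is instead passed as a `RocksDBStateBackend` constructor argument:

```yaml
# Sketch: enabling incremental checkpoints via flink-conf.yaml.
# Key availability depends on the Flink version in use.
state.backend: rocksdb             # incremental checkpoints require the RocksDB back-end
state.backend.incremental: true    # upload only new sstable files on each checkpoint
state.checkpoints.num-retained: 2  # completed checkpoints to retain (2 in the example above)
```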
- -<p>There are some strategies for improving the convenience/performance trade-off, and I recommend you read <a href="https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/state/checkpoints.html#basics-of-incremental-checkpoints">the Flink documentation</a> for more details.</p> - -<p><em>This post <a href="https://data-artisans.com/blog/managing-large-state-apache-flink-incremental-checkpointing-overview" target="_blank"> originally appeared on the data Artisans blog </a>and was contributed to the Flink blog by Stefan Richter and Chris Ward.</em></p> -<link rel="canonical" href="https://data-artisans.com/blog/managing-large-state-apache-flink-incremental-checkpointing-overview" /> - -</description> -<pubDate>Tue, 30 Jan 2018 13:00:00 +0100</pubDate> -<link>https://flink.apache.org/features/2018/01/30/incremental-checkpointing.html</link> -<guid isPermaLink="true">/features/2018/01/30/incremental-checkpointing.html</guid> -</item> - </channel> </rss> diff --git a/content/blog/index.html b/content/blog/index.html index b944645..2531f3e 100644 --- a/content/blog/index.html +++ b/content/blog/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></h2> + + <p>07 Jul 2021 + Piotr Nowojski (<a href="https://twitter.com/PiotrNowojski">@PiotrNowojski</a>)</p> + + <p>Apache Flink 1.13 introduced a couple of important changes in the area of backpressure monitoring and performance analysis of Flink Jobs.
This blog post aims to introduce those changes and explain how to use them.</p> + + <p><a href="/2021/07/07/backpressure.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></h2> <p>28 May 2021 @@ -329,21 +342,6 @@ to develop scalable, consistent, and elastic distributed applications.</p> <hr> - <article> - <h2 class="blog-title"><a href="/news/2021/01/29/release-1.10.3.html">Apache Flink 1.10.3 Released</a></h2> - - <p>29 Jan 2021 - Xintong Song </p> - - <p><p>The Apache Flink community released the third bugfix version of the Apache Flink 1.10 series.</p> - -</p> - - <p><a href="/news/2021/01/29/release-1.10.3.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -376,6 +374,16 @@ to develop scalable, consistent, and elastic distributed applications.</p> <ul id="markdown-toc"> + <li><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></li> + + + + + + + + + <li><a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></li> diff --git a/content/blog/page10/index.html b/content/blog/page10/index.html index f6f6c19..b3253dc 100644 --- a/content/blog/page10/index.html +++ b/content/blog/page10/index.html @@ -201,6 +201,21 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2018/09/20/release-1.5.4.html">Apache Flink 1.5.4 Released</a></h2> + + <p>20 Sep 2018 + </p> + + <p><p>The Apache Flink community released the fourth bugfix version of the Apache Flink 1.5 series.</p> + +</p> + + <p><a href="/news/2018/09/20/release-1.5.4.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2018/08/21/release-1.5.3.html">Apache Flink 1.5.3 Released</a></h2> <p>21 Aug 2018 @@ -333,19 +348,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/features/2018/01/30/incremental-checkpointing.html">Managing 
Large State in Apache Flink: An Intro to Incremental Checkpointing</a></h2> - - <p>30 Jan 2018 - Stefan Richter (<a href="https://twitter.com/StefanRRicther">@StefanRRicther</a>) & Chris Ward (<a href="https://twitter.com/chrischinch">@chrischinch</a>)</p> - - <p>Flink 1.3.0 introduced incremental checkpointing, making it possible for applications with large state to generate checkpoints more efficiently.</p> - - <p><a href="/features/2018/01/30/incremental-checkpointing.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -378,6 +380,16 @@ <ul id="markdown-toc"> + <li><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></li> + + + + + + + + + <li><a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></li> diff --git a/content/blog/page11/index.html b/content/blog/page11/index.html index d7914bb..8b86f5f 100644 --- a/content/blog/page11/index.html +++ b/content/blog/page11/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/features/2018/01/30/incremental-checkpointing.html">Managing Large State in Apache Flink: An Intro to Incremental Checkpointing</a></h2> + + <p>30 Jan 2018 + Stefan Richter (<a href="https://twitter.com/StefanRRicther">@StefanRRicther</a>) & Chris Ward (<a href="https://twitter.com/chrischinch">@chrischinch</a>)</p> + + <p>Flink 1.3.0 introduced incremental checkpointing, making it possible for applications with large state to generate checkpoints more efficiently.</p> + + <p><a href="/features/2018/01/30/incremental-checkpointing.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2017/12/21/2017-year-in-review.html">Apache Flink in 2017: Year in Review</a></h2> <p>21 Dec 2017 @@ -336,20 +349,6 @@ what’s coming in Flink 1.4.0 as well as a preview of what the Flink community <hr> - <article> - <h2 class="blog-title"><a
href="/news/2017/04/04/dynamic-tables.html">Continuous Queries on Dynamic Tables</a></h2> - - <p>04 Apr 2017 by Fabian Hueske, Shaoxuan Wang, and Xiaowei Jiang - </p> - - <p><p>Flink's relational APIs, the Table API and SQL, are unified APIs for stream and batch processing, meaning that a query produces the same result when being evaluated on streaming or static data.</p> -<p>In this blog post we discuss the future of these APIs and introduce the concept of Dynamic Tables. Dynamic tables will significantly expand the scope of the Table API and SQL on streams and enable many more advanced use cases. We discuss how streams and dynamic tables relate to each other and explain the semantics of continuously evaluating queries on dynamic tables.</p></p> - - <p><a href="/news/2017/04/04/dynamic-tables.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -382,6 +381,16 @@ what’s coming in Flink 1.4.0 as well as a preview of what the Flink community <ul id="markdown-toc"> + <li><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></li> + + + + + + + + + <li><a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></li> diff --git a/content/blog/page12/index.html b/content/blog/page12/index.html index b4e09e1..e70b8b2 100644 --- a/content/blog/page12/index.html +++ b/content/blog/page12/index.html @@ -201,6 +201,20 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2017/04/04/dynamic-tables.html">Continuous Queries on Dynamic Tables</a></h2> + + <p>04 Apr 2017 by Fabian Hueske, Shaoxuan Wang, and Xiaowei Jiang + </p> + + <p><p>Flink's relational APIs, the Table API and SQL, are unified APIs for stream and batch processing, meaning that a query produces the same result when being evaluated on streaming or static data.</p> +<p>In this blog post we discuss the future of these APIs and introduce the concept of Dynamic Tables. 
Dynamic tables will significantly expand the scope of the Table API and SQL on streams and enable many more advanced use cases. We discuss how streams and dynamic tables relate to each other and explain the semantics of continuously evaluating queries on dynamic tables.</p></p> + + <p><a href="/news/2017/04/04/dynamic-tables.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2017/03/29/table-sql-api-update.html">From Streams to Tables and Back Again: An Update on Flink's Table & SQL API</a></h2> <p>29 Mar 2017 by Timo Walther (<a href="https://twitter.com/">@twalthr</a>) @@ -329,21 +343,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/news/2016/08/08/release-1.1.0.html">Announcing Apache Flink 1.1.0</a></h2> - - <p>08 Aug 2016 - </p> - - <p><div class="alert alert-success"><strong>Important</strong>: The Maven artifacts published with version 1.1.0 on Maven central have a Hadoop dependency issue. It is highly recommended to use <strong>1.1.1</strong> or <strong>1.1.1-hadoop1</strong> as the Flink version.</div> - -</p> - - <p><a href="/news/2016/08/08/release-1.1.0.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -376,6 +375,16 @@ <ul id="markdown-toc"> + <li><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></li> + + + + + + + + + <li><a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></li> diff --git a/content/blog/page13/index.html b/content/blog/page13/index.html index ed7dadd..fae12ab 100644 --- a/content/blog/page13/index.html +++ b/content/blog/page13/index.html @@ -201,6 +201,21 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2016/08/08/release-1.1.0.html">Announcing Apache Flink 1.1.0</a></h2> + + <p>08 Aug 2016 + </p> + + <p><div class="alert alert-success"><strong>Important</strong>: The Maven artifacts published with version 1.1.0 on Maven central have a Hadoop 
dependency issue. It is highly recommended to use <strong>1.1.1</strong> or <strong>1.1.1-hadoop1</strong> as the Flink version.</div> + +</p> + + <p><a href="/news/2016/08/08/release-1.1.0.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2016/05/24/stream-sql.html">Stream Processing for Everyone with SQL and Apache Flink</a></h2> <p>24 May 2016 by Fabian Hueske (<a href="https://twitter.com/">@fhueske</a>) @@ -330,19 +345,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/news/2015/12/11/storm-compatibility.html">Storm Compatibility in Apache Flink: How to run existing Storm topologies on Flink</a></h2> - - <p>11 Dec 2015 by Matthias J. Sax (<a href="https://twitter.com/">@MatthiasJSax</a>) - </p> - - <p>In this blog post, we describe Flink's compatibility package for <a href="https://storm.apache.org">Apache Storm</a> that allows embedding Spouts (sources) and Bolts (operators) in a regular Flink streaming job. Furthermore, the compatibility package provides a Storm-compatible API in order to execute whole Storm topologies with (almost) no code adaptation.</p> - - <p><a href="/news/2015/12/11/storm-compatibility.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -375,6 +377,16 @@ <ul id="markdown-toc"> + <li><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></li> + + + + + + + + + <li><a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></li> diff --git a/content/blog/page14/index.html b/content/blog/page14/index.html index 992d112..6d2893a 100644 --- a/content/blog/page14/index.html +++ b/content/blog/page14/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2015/12/11/storm-compatibility.html">Storm Compatibility in Apache Flink: How to run existing Storm topologies on Flink</a></h2> + + <p>11 Dec 2015 by Matthias J.
Sax (<a href="https://twitter.com/">@MatthiasJSax</a>) + </p> + + <p>In this blog post, we describe Flink's compatibility package for <a href="https://storm.apache.org">Apache Storm</a> that allows embedding Spouts (sources) and Bolts (operators) in a regular Flink streaming job. Furthermore, the compatibility package provides a Storm-compatible API in order to execute whole Storm topologies with (almost) no code adaptation.</p> + + <p><a href="/news/2015/12/11/storm-compatibility.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2015/12/04/Introducing-windows.html">Introducing Stream Windows in Apache Flink</a></h2> <p>04 Dec 2015 by Fabian Hueske (<a href="https://twitter.com/">@fhueske</a>) @@ -338,20 +351,6 @@ vertex-centric or gather-sum-apply to Flink dataflows.</p> <hr> - <article> - <h2 class="blog-title"><a href="/news/2015/05/11/Juggling-with-Bits-and-Bytes.html">Juggling with Bits and Bytes</a></h2> - - <p>11 May 2015 by Fabian Hüske (<a href="https://twitter.com/">@fhueske</a>) - </p> - - <p><p>Nowadays, a lot of open-source systems for analyzing large data sets are implemented in Java or other JVM-based programming languages. The most well-known example is Apache Hadoop, but also newer frameworks such as Apache Spark, Apache Drill, and also Apache Flink run on JVMs. A common challenge that JVM-based data analysis engines face is to store large amounts of data in memory - both for caching and for efficient processing such as sorting and joining of data. Managing the [...]
-<p>In this blog post we discuss how Apache Flink manages memory, talk about its custom data de/serialization stack, and show how it operates on binary data.</p></p> - - <p><a href="/news/2015/05/11/Juggling-with-Bits-and-Bytes.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -384,6 +383,16 @@ vertex-centric or gather-sum-apply to Flink dataflows.</p> <ul id="markdown-toc"> + <li><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></li> + + + + + + + + + <li><a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></li> diff --git a/content/blog/page15/index.html b/content/blog/page15/index.html index 79b69bf..1a821be 100644 --- a/content/blog/page15/index.html +++ b/content/blog/page15/index.html @@ -201,6 +201,20 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2015/05/11/Juggling-with-Bits-and-Bytes.html">Juggling with Bits and Bytes</a></h2> + + <p>11 May 2015 by Fabian Hüske (<a href="https://twitter.com/">@fhueske</a>) + </p> + + <p><p>Nowadays, a lot of open-source systems for analyzing large data sets are implemented in Java or other JVM-based programming languages. The most well-known example is Apache Hadoop, but also newer frameworks such as Apache Spark, Apache Drill, and also Apache Flink run on JVMs. A common challenge that JVM-based data analysis engines face is to store large amounts of data in memory - both for caching and for efficient processing such as sorting and joining of data. Managing the [...] 
+<p>In this blog post we discuss how Apache Flink manages memory, talk about its custom data de/serialization stack, and show how it operates on binary data.</p></p> + + <p><a href="/news/2015/05/11/Juggling-with-Bits-and-Bytes.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2015/04/13/release-0.9.0-milestone1.html">Announcing Flink 0.9.0-milestone1 preview release</a></h2> <p>13 Apr 2015 @@ -345,21 +359,6 @@ and offers a new API including definition of flexible windows.</p> <hr> - <article> - <h2 class="blog-title"><a href="/news/2014/11/04/release-0.7.0.html">Apache Flink 0.7.0 available</a></h2> - - <p>04 Nov 2014 - </p> - - <p><p>We are pleased to announce the availability of Flink 0.7.0. This release includes new user-facing features as well as performance and bug fixes, brings the Scala and Java APIs in sync, and introduces Flink Streaming. A total of 34 people have contributed to this release, a big thanks to all of them!</p> - -</p> - - <p><a href="/news/2014/11/04/release-0.7.0.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -392,6 +391,16 @@ and offers a new API including definition of flexible windows.</p> <ul id="markdown-toc"> + <li><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></li> + + + + + + + + + <li><a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></li> diff --git a/content/blog/page16/index.html b/content/blog/page16/index.html index 5dce80c..03c1059 100644 --- a/content/blog/page16/index.html +++ b/content/blog/page16/index.html @@ -201,6 +201,21 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2014/11/04/release-0.7.0.html">Apache Flink 0.7.0 available</a></h2> + + <p>04 Nov 2014 + </p> + + <p><p>We are pleased to announce the availability of Flink 0.7.0. 
This release includes new user-facing features as well as performance and bug fixes, brings the Scala and Java APIs in sync, and introduces Flink Streaming. A total of 34 people have contributed to this release, a big thanks to all of them!</p> + +</p> + + <p><a href="/news/2014/11/04/release-0.7.0.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2014/10/03/upcoming_events.html">Upcoming Events</a></h2> <p>03 Oct 2014 @@ -280,6 +295,16 @@ academic and open source project that Flink originates from.</p> <ul id="markdown-toc"> + <li><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></li> + + + + + + + + + <li><a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></li> diff --git a/content/blog/page2/index.html b/content/blog/page2/index.html index 091a3d3..5a770ab 100644 --- a/content/blog/page2/index.html +++ b/content/blog/page2/index.html @@ -201,6 +201,21 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2021/01/29/release-1.10.3.html">Apache Flink 1.10.3 Released</a></h2> + + <p>29 Jan 2021 + Xintong Song </p> + + <p><p>The Apache Flink community released the third bugfix version of the Apache Flink 1.10 series.</p> + +</p> + + <p><a href="/news/2021/01/29/release-1.10.3.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2021/01/19/release-1.12.1.html">Apache Flink 1.12.1 Released</a></h2> <p>19 Jan 2021 @@ -325,19 +340,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/2020/10/15/from-aligned-to-unaligned-checkpoints-part-1.html">From Aligned to Unaligned Checkpoints - Part 1: Checkpoints, Alignment, and Backpressure</a></h2> - - <p>15 Oct 2020 - Arvid Heise & Stephan Ewen </p> - - <p>Apache Flink’s checkpoint-based fault tolerance mechanism is one of its defining features. 
Because of that design, Flink unifies batch and stream processing, can easily scale to both very small and extremely large scenarios and provides support for many operational features. In this post we recap the original checkpointing process in Flink, its core properties and issues under backpressure.</p> - - <p><a href="/2020/10/15/from-aligned-to-unaligned-checkpoints-part-1.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -370,6 +372,16 @@ <ul id="markdown-toc"> + <li><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></li> + + + + + + + + + <li><a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></li> diff --git a/content/blog/page3/index.html b/content/blog/page3/index.html index 28673c4..1a2c858 100644 --- a/content/blog/page3/index.html +++ b/content/blog/page3/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/2020/10/15/from-aligned-to-unaligned-checkpoints-part-1.html">From Aligned to Unaligned Checkpoints - Part 1: Checkpoints, Alignment, and Backpressure</a></h2> + + <p>15 Oct 2020 + Arvid Heise & Stephan Ewen </p> + + <p>Apache Flink’s checkpoint-based fault tolerance mechanism is one of its defining features. Because of that design, Flink unifies batch and stream processing, can easily scale to both very small and extremely large scenarios and provides support for many operational features. 
In this post we recap the original checkpointing process in Flink, its core properties and issues under backpressure.</p> + + <p><a href="/2020/10/15/from-aligned-to-unaligned-checkpoints-part-1.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2020/10/13/stateful-serverless-internals.html">Stateful Functions Internals: Behind the scenes of Stateful Serverless</a></h2> <p>13 Oct 2020 @@ -329,19 +342,6 @@ as well as increased observability for operational purposes.</p> <hr> - <article> - <h2 class="blog-title"><a href="/2020/08/04/pyflink-pandas-udf-support-flink.html">PyFlink: The integration of Pandas into PyFlink</a></h2> - - <p>04 Aug 2020 - Jincheng Sun (<a href="https://twitter.com/sunjincheng121">@sunjincheng121</a>) & Markos Sfikas (<a href="https://twitter.com/MarkSfik">@MarkSfik</a>)</p> - - <p>The Apache Flink community put some great effort into integrating Pandas with PyFlink in the latest Flink version 1.11. Some of the added features include support for Pandas UDF and the conversion between Pandas DataFrame and Table. 
In this article, we will introduce how these functionalities work and how to use them with a step-by-step example.</p> - - <p><a href="/2020/08/04/pyflink-pandas-udf-support-flink.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -374,6 +374,16 @@ as well as increased observability for operational purposes.</p> <ul id="markdown-toc"> + <li><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></li> + + + + + + + + + <li><a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></li> diff --git a/content/blog/page4/index.html b/content/blog/page4/index.html index fd0830c..636dbc3 100644 --- a/content/blog/page4/index.html +++ b/content/blog/page4/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/2020/08/04/pyflink-pandas-udf-support-flink.html">PyFlink: The integration of Pandas into PyFlink</a></h2> + + <p>04 Aug 2020 + Jincheng Sun (<a href="https://twitter.com/sunjincheng121">@sunjincheng121</a>) & Markos Sfikas (<a href="https://twitter.com/MarkSfik">@MarkSfik</a>)</p> + + <p>The Apache Flink community put some great effort into integrating Pandas with PyFlink in the latest Flink version 1.11. Some of the added features include support for Pandas UDF and the conversion between Pandas DataFrame and Table. 
In this article, we will introduce how these functionalities work and how to use them with a step-by-step example.</p> + + <p><a href="/2020/08/04/pyflink-pandas-udf-support-flink.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2020/07/30/demo-fraud-detection-3.html">Advanced Flink Application Patterns Vol.3: Custom Window Processing</a></h2> <p>30 Jul 2020 @@ -335,19 +348,6 @@ and provide a tutorial for running Streaming ETL with Flink on Zeppelin.</p> <hr> - <article> - <h2 class="blog-title"><a href="/news/2020/06/11/community-update.html">Flink Community Update - June'20</a></h2> - - <p>11 Jun 2020 - Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p> - - <p>And suddenly it’s June. The previous month has been calm on the surface, but quite hectic underneath — the final testing phase for Flink 1.11 is moving at full speed, Stateful Functions 2.1 is out in the wild and Flink has made it into Google Season of Docs 2020.</p> - - <p><a href="/news/2020/06/11/community-update.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -380,6 +380,16 @@ and provide a tutorial for running Streaming ETL with Flink on Zeppelin.</p> <ul id="markdown-toc"> + <li><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></li> + + + + + + + + + <li><a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></li> diff --git a/content/blog/page5/index.html b/content/blog/page5/index.html index fbede13..2246b0f 100644 --- a/content/blog/page5/index.html +++ b/content/blog/page5/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2020/06/11/community-update.html">Flink Community Update - June'20</a></h2> + + <p>11 Jun 2020 + Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p> + + <p>And suddenly it’s June. 
The previous month has been calm on the surface, but quite hectic underneath — the final testing phase for Flink 1.11 is moving at full speed, Stateful Functions 2.1 is out in the wild and Flink has made it into Google Season of Docs 2020.</p> + + <p><a href="/news/2020/06/11/community-update.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2020/06/09/release-statefun-2.1.0.html">Stateful Functions 2.1.0 Release Announcement</a></h2> <p>09 Jun 2020 @@ -326,19 +339,6 @@ This release marks a big milestone: Stateful Functions 2.0 is not only an API up <hr> - <article> - <h2 class="blog-title"><a href="/news/2020/04/01/community-update.html">Flink Community Update - April'20</a></h2> - - <p>01 Apr 2020 - Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p> - - <p>While things slow down around us, the Apache Flink community is privileged to remain as active as ever. This blogpost combs through the past few months to give you an update on the state of things in Flink — from core releases to Stateful Functions; from some good old community stats to a new development blog.</p> - - <p><a href="/news/2020/04/01/community-update.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -371,6 +371,16 @@ This release marks a big milestone: Stateful Functions 2.0 is not only an API up <ul id="markdown-toc"> + <li><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></li> + + + + + + + + + <li><a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></li> diff --git a/content/blog/page6/index.html b/content/blog/page6/index.html index bb47949..735cada 100644 --- a/content/blog/page6/index.html +++ b/content/blog/page6/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2020/04/01/community-update.html">Flink Community Update - April'20</a></h2> + + <p>01 Apr 2020 + Marta 
Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p> + + <p>While things slow down around us, the Apache Flink community is privileged to remain as active as ever. This blogpost combs through the past few months to give you an update on the state of things in Flink — from core releases to Stateful Functions; from some good old community stats to a new development blog.</p> + + <p><a href="/news/2020/04/01/community-update.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/features/2020/03/27/flink-for-data-warehouse.html">Flink as Unified Engine for Modern Data Warehousing: Production-Ready Hive Integration</a></h2> <p>27 Mar 2020 @@ -323,21 +336,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/news/2019/12/11/release-1.8.3.html">Apache Flink 1.8.3 Released</a></h2> - - <p>11 Dec 2019 - Hequn Cheng </p> - - <p><p>The Apache Flink community released the third bugfix version of the Apache Flink 1.8 series.</p> - -</p> - - <p><a href="/news/2019/12/11/release-1.8.3.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -370,6 +368,16 @@ <ul id="markdown-toc"> + <li><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></li> + + + + + + + + + <li><a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></li> diff --git a/content/blog/page7/index.html b/content/blog/page7/index.html index 0ad6d47..698af1e 100644 --- a/content/blog/page7/index.html +++ b/content/blog/page7/index.html @@ -201,6 +201,21 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2019/12/11/release-1.8.3.html">Apache Flink 1.8.3 Released</a></h2> + + <p>11 Dec 2019 + Hequn Cheng </p> + + <p><p>The Apache Flink community released the third bugfix version of the Apache Flink 1.8 series.</p> + +</p> + + <p><a href="/news/2019/12/11/release-1.8.3.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 
class="blog-title"><a href="/news/2019/12/09/flink-kubernetes-kudo.html">Running Apache Flink on Kubernetes with KUDO</a></h2> <p>09 Dec 2019 @@ -326,19 +341,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/2019/06/26/broadcast-state.html">A Practical Guide to Broadcast State in Apache Flink</a></h2> - - <p>26 Jun 2019 - Fabian Hueske (<a href="https://twitter.com/fhueske">@fhueske</a>)</p> - - <p>Apache Flink has multiple types of operator state, one of which is called Broadcast State. In this post, we explain what Broadcast State is, and show an example of how it can be applied to an application that evaluates dynamic patterns on an event stream.</p> - - <p><a href="/2019/06/26/broadcast-state.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -371,6 +373,16 @@ <ul id="markdown-toc"> + <li><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></li> + + + + + + + + + <li><a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></li> diff --git a/content/blog/page8/index.html b/content/blog/page8/index.html index 70f9c98..1cd63ff 100644 --- a/content/blog/page8/index.html +++ b/content/blog/page8/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/2019/06/26/broadcast-state.html">A Practical Guide to Broadcast State in Apache Flink</a></h2> + + <p>26 Jun 2019 + Fabian Hueske (<a href="https://twitter.com/fhueske">@fhueske</a>)</p> + + <p>Apache Flink has multiple types of operator state, one of which is called Broadcast State. 
In this post, we explain what Broadcast State is, and show an example of how it can be applied to an application that evaluates dynamic patterns on an event stream.</p> + + <p><a href="/2019/06/26/broadcast-state.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/2019/06/05/flink-network-stack.html">A Deep-Dive into Flink's Network Stack</a></h2> <p>05 Jun 2019 @@ -325,21 +338,6 @@ for more details.</p> <hr> - <article> - <h2 class="blog-title"><a href="/news/2019/02/25/release-1.6.4.html">Apache Flink 1.6.4 Released</a></h2> - - <p>25 Feb 2019 - </p> - - <p><p>The Apache Flink community released the fourth bugfix version of the Apache Flink 1.6 series.</p> - -</p> - - <p><a href="/news/2019/02/25/release-1.6.4.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -372,6 +370,16 @@ for more details.</p> <ul id="markdown-toc"> + <li><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></li> + + + + + + + + + <li><a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></li> diff --git a/content/blog/page9/index.html b/content/blog/page9/index.html index 14d2492..7bed33b 100644 --- a/content/blog/page9/index.html +++ b/content/blog/page9/index.html @@ -201,6 +201,21 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2019/02/25/release-1.6.4.html">Apache Flink 1.6.4 Released</a></h2> + + <p>25 Feb 2019 + </p> + + <p><p>The Apache Flink community released the fourth bugfix version of the Apache Flink 1.6 series.</p> + +</p> + + <p><a href="/news/2019/02/25/release-1.6.4.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2019/02/15/release-1.7.2.html">Apache Flink 1.7.2 Released</a></h2> <p>15 Feb 2019 @@ -335,21 +350,6 @@ Please check the <a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa <hr> - <article> - <h2 class="blog-title"><a 
href="/news/2018/09/20/release-1.5.4.html">Apache Flink 1.5.4 Released</a></h2> - - <p>20 Sep 2018 - </p> - - <p><p>The Apache Flink community released the fourth bugfix version of the Apache Flink 1.5 series.</p> - -</p> - - <p><a href="/news/2018/09/20/release-1.5.4.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -382,6 +382,16 @@ Please check the <a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa <ul id="markdown-toc"> + <li><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></li> + + + + + + + + + <li><a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></li> diff --git a/content/index.html b/content/index.html index e356c66..fbc42c6 100644 --- a/content/index.html +++ b/content/index.html @@ -577,6 +577,9 @@ <dl> + <dt> <a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></dt> + <dd>Apache Flink 1.13 introduced a couple of important changes in the area of backpressure monitoring and performance analysis of Flink Jobs. This blog post aims to introduce those changes and explain how to use them.</dd> + <dt> <a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></dt> <dd><p>The Apache Flink community released the first bugfix version of the Apache Flink 1.13 series.</p> @@ -592,11 +595,6 @@ <dt> <a href="/news/2021/05/03/release-1.13.0.html">Apache Flink 1.13.0 Release Announcement</a></dt> <dd>The Apache Flink community is excited to announce the release of Flink 1.13.0! 
Around 200 contributors worked on over 1,000 issues to bring significant improvements to usability and observability as well as new features that improve the elasticity of Flink's Application-style deployments.</dd> - - <dt> <a href="/news/2021/04/29/release-1.12.3.html">Apache Flink 1.12.3 Released</a></dt> - <dd><p>The Apache Flink community released the next bugfix version of the Apache Flink 1.12 series.</p> - -</dd> </dl> diff --git a/content/zh/index.html b/content/zh/index.html index fece035..3296781 100644 --- a/content/zh/index.html +++ b/content/zh/index.html @@ -570,6 +570,9 @@ <dl> + <dt> <a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></dt> + <dd>Apache Flink 1.13 introduced a couple of important changes in the area of backpressure monitoring and performance analysis of Flink Jobs. This blog post aims to introduce those changes and explain how to use them.</dd> + <dt> <a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></dt> <dd><p>The Apache Flink community released the first bugfix version of the Apache Flink 1.13 series.</p> @@ -585,11 +588,6 @@ <dt> <a href="/news/2021/05/03/release-1.13.0.html">Apache Flink 1.13.0 Release Announcement</a></dt> <dd>The Apache Flink community is excited to announce the release of Flink 1.13.0! 
Around 200 contributors worked on over 1,000 issues to bring significant improvements to usability and observability as well as new features that improve the elasticity of Flink's Application-style deployments.</dd> - - <dt> <a href="/news/2021/04/29/release-1.12.3.html">Apache Flink 1.12.3 Released</a></dt> - <dd><p>The Apache Flink community released the next bugfix version of the Apache Flink 1.12 series.</p> - -</dd> </dl> diff --git a/img/blog/2021-07-07-backpressure/animated.png b/img/blog/2021-07-07-backpressure/animated.png new file mode 100644 index 0000000..40ea809 Binary files /dev/null and b/img/blog/2021-07-07-backpressure/animated.png differ diff --git a/img/blog/2021-07-07-backpressure/bottleneck-zoom.png b/img/blog/2021-07-07-backpressure/bottleneck-zoom.png new file mode 100644 index 0000000..b4e5b80 Binary files /dev/null and b/img/blog/2021-07-07-backpressure/bottleneck-zoom.png differ diff --git a/img/blog/2021-07-07-backpressure/simple-example.png b/img/blog/2021-07-07-backpressure/simple-example.png new file mode 100644 index 0000000..fceccb3 Binary files /dev/null and b/img/blog/2021-07-07-backpressure/simple-example.png differ diff --git a/img/blog/2021-07-07-backpressure/sliding-window.png b/img/blog/2021-07-07-backpressure/sliding-window.png new file mode 100644 index 0000000..dcd3240 Binary files /dev/null and b/img/blog/2021-07-07-backpressure/sliding-window.png differ diff --git a/img/blog/2021-07-07-backpressure/source-task-busy.png b/img/blog/2021-07-07-backpressure/source-task-busy.png new file mode 100644 index 0000000..8f72b54 Binary files /dev/null and b/img/blog/2021-07-07-backpressure/source-task-busy.png differ diff --git a/img/blog/2021-07-07-backpressure/subtasks.png b/img/blog/2021-07-07-backpressure/subtasks.png new file mode 100644 index 0000000..bf6ede6 Binary files /dev/null and b/img/blog/2021-07-07-backpressure/subtasks.png differ
