[flink-web] 01/02: Add blog post "How We Improved Scheduler Performance for Large-scale Jobs"

trohrmann Tue, 04 Jan 2022 05:58:58 -0800

This is an automated email from the ASF dual-hosted git repository.

trohrmann pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/flink-web.git


commit 6ffe80f166cfef73ae98a2242b1e7a3df8a52bf0
Author: Thesharing <[email protected]>
AuthorDate: Wed Dec 29 20:47:53 2021 +0800

    Add blog post "How We Improved Scheduler Performance for Large-scale Jobs"
    
    This closes #494.
---
 .../2022-01-04-scheduler-performance-part-one.md   |  76 +++++++++++
 .../2022-01-04-scheduler-performance-part-two.md   | 148 +++++++++++++++++++++
 .../1-distribution-pattern.svg                     |   4 +
 .../2022-01-05-scheduler-performance/2-groups.svg  |   4 +
 .../3-how-shuffle-descriptors-are-distributed.svg  |   4 +
 .../4-pipelined-region.svg                         |   4 +
 .../5-scheduling-deadlock.svg                      |   4 +
 .../6-building-pipelined-region.svg                |   4 +
 8 files changed, 248 insertions(+)

diff --git a/_posts/2022-01-04-scheduler-performance-part-one.md 
b/_posts/2022-01-04-scheduler-performance-part-one.md
new file mode 100644
index 0000000..97404a5
--- /dev/null
+++ b/_posts/2022-01-04-scheduler-performance-part-one.md
@@ -0,0 +1,76 @@
+---
+layout: post
+title: "How We Improved Scheduler Performance for Large-scale Jobs - Part One"
+date: 2022-01-04T08:00:00.000Z
+authors:
+- Zhilong Hong:
+  name: "Zhilong Hong"
+- Zhu Zhu:
+  name: "Zhu Zhu"
+- Daisy Tsang:
+  name: "Daisy Tsang"
+- Till Rohrmann:
+  name: "Till Rohrmann"
+  twitter: "stsffap"
+
+excerpt: To improve the performance of the scheduler for large-scale jobs, 
several optimizations were introduced in Flink 1.13 and 1.14. In this blog post 
we'll take a look at them.
+---
+
+# Introduction
+
+When scheduling large-scale jobs in Flink 1.12, a lot of time is required to 
initialize jobs and deploy tasks. The scheduler also requires a large amount of 
heap memory in order to store the execution topology and host temporary 
deployment descriptors. For example, for a job with a topology that contains 
two vertices connected with an all-to-all edge and a parallelism of 10k (which 
means there are 10k source tasks and 10k sink tasks and every source task is 
connected to all sink tasks),  [...]
+
+Furthermore, task deployment may block the JobManager's main thread for a long 
time and the JobManager will not be able to respond to any other requests from 
TaskManagers. This could lead to heartbeat timeouts that trigger a failover. In 
the worst case, this will render the Flink cluster unusable because it cannot 
deploy the job.
+
+To improve the performance of the scheduler for large-scale jobs, we've 
implemented several optimizations in Flink 1.13 and 1.14:
+
+1. Introduce the concept of consuming groups to optimize procedures related to 
the complexity of topologies, including the initialization, scheduling, 
failover, and partition release. This also reduces the memory required to store 
the topology;
+2. Introduce a cache to optimize task deployment, which makes the process 
faster and requires less memory;
+3. Leverage characteristics of the logical topology and the scheduling 
topology to speed up the building of pipelined regions.
+
+# Benchmarking Results
+
+To estimate the effect of our optimizations, we conducted several experiments 
to compare the performance of Flink 1.12 (before the optimization) with Flink 
1.14 (after the optimization). The job in our experiments contains two vertices 
connected with an all-to-all edge. The parallelisms of these vertices are both 
10K. To make temporary deployment descriptors distributed via the blob server, 
we set the configuration 
[blob.offload.minsize]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/docs [...]
+
+<center>
+Table 1 - The comparison of time cost between Flink 1.12 and 1.14
+<table width="95%" border="1">
+  <thead>
+    <tr>
+      <th style="text-align: center">Procedure</th>
+      <th style="text-align: center">1.12</th>
+      <th style="text-align: center">1.14</th>
+      <th style="text-align: center">Reduction (%)</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td style="text-align: center">Job Initialization</td>
+      <td style="text-align: center">11,431ms</td>
+      <td style="text-align: center">627ms</td>
+      <td style="text-align: center">94.51%</td>
+    </tr>
+    <tr>
+      <td style="text-align: center">Task Deployment</td>
+      <td style="text-align: center">63,118ms</td>
+      <td style="text-align: center">17,183ms</td>
+      <td style="text-align: center">72.78%</td>
+    </tr>
+    <tr>
+      <td style="text-align: center">Computing tasks to restart on 
failover</td>
+      <td style="text-align: center">37,195ms</td>
+      <td style="text-align: center">170ms</td>
+      <td style="text-align: center">99.55%</td>
+    </tr>
+  </tbody>
+</table>
+</center>
+
+<br/>
+In addition to the speedup, memory usage is significantly reduced. With Flink 
1.12, the JobManager requires 30 GiB of heap memory to deploy the test job and 
keep it running stably, while with Flink 1.14 the minimum heap memory required 
by the JobManager is only 2 GiB.
+
+There are also fewer occurrences of long-term garbage collection. When running 
the test job with Flink 1.12, a garbage collection that lasts more than 10 
seconds occurs during both job initialization and task deployment. With Flink 
1.14, since there is no long-term garbage collection, there is also a decreased 
risk of heartbeat timeouts, which improves cluster stability.
+
+In our experiment, it took more than 4 minutes for the large-scale job with 
Flink 1.12 to transition to running (excluding the time spent on allocating 
resources). With Flink 1.14, it took no more than 30 seconds (excluding the 
time spent on allocating resources). The time cost is reduced by 87%. Thus, if 
you are running large-scale jobs in production and want better scheduling 
performance, please consider upgrading to Flink 1.14.
+
+In [part two](/2022/01/04/scheduler-performance-part-two) of this blog post, 
we are going to talk about these improvements in detail.
diff --git a/_posts/2022-01-04-scheduler-performance-part-two.md 
b/_posts/2022-01-04-scheduler-performance-part-two.md
new file mode 100644
index 0000000..87ce0af
--- /dev/null
+++ b/_posts/2022-01-04-scheduler-performance-part-two.md
@@ -0,0 +1,148 @@
+---
+layout: post
+title: "How We Improved Scheduler Performance for Large-scale Jobs - Part Two"
+date: 2022-01-04T08:00:00.000Z
+authors:
+- Zhilong Hong:
+  name: "Zhilong Hong"
+- Zhu Zhu:
+  name: "Zhu Zhu"
+- Daisy Tsang:
+  name: "Daisy Tsang"
+- Till Rohrmann:
+  name: "Till Rohrmann"
+  twitter: "stsffap"
+
+excerpt: Part one of this blog post briefly introduced the optimizations we’ve 
made to improve the performance of the scheduler; compared to Flink 1.12, the 
time cost and memory usage of scheduling large-scale jobs in Flink 1.14 are 
significantly reduced. In part two, we will elaborate on the details of these 
optimizations.
+---
+
+[Part one](/2022/01/04/scheduler-performance-part-one) of this blog post 
briefly introduced the optimizations we’ve made to improve the performance of 
the scheduler; compared to Flink 1.12, the time cost and memory usage of 
scheduling large-scale jobs in Flink 1.14 are significantly reduced. In part 
two, we will elaborate on the details of these optimizations.
+
+{% toc %}
+
+# Reducing complexity with groups
+
+A distribution pattern describes how consumer tasks are connected to producer 
tasks. Currently, there are two distribution patterns in Flink: pointwise and 
all-to-all. When the distribution pattern is pointwise between two vertices, 
the [computational complexity](https://en.wikipedia.org/wiki/Big_O_notation) of 
traversing all edges is O(n). When the distribution pattern is all-to-all, the 
complexity of traversing all edges is O(n<sup>2</sup>), which means that 
complexity increases rapidl [...]
+
+<center>
+<br/>
+<img 
src="{{site.baseurl}}/img/blog/2022-01-05-scheduler-performance/1-distribution-pattern.svg"
 width="75%"/>
+<br/>
+Fig. 1 - Two distribution patterns in Flink
+</center>
+
+<br/>
+In Flink 1.12, the 
[ExecutionEdge]({{site.DOCS_BASE_URL}}flink-docs-release-1.12/api/java/org/apache/flink/runtime/executiongraph/ExecutionEdge.html)
 class is used to store the information of connections between tasks. This 
means that for the all-to-all distribution pattern, there would be 
O(n<sup>2</sup>) ExecutionEdges, which would take up a lot of memory for 
large-scale jobs. For two 
[JobVertices]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/jobgraph
 [...]
+
+As we can see in Fig. 1, for two JobVertices connected with the all-to-all 
distribution pattern, all 
[IntermediateResultPartitions]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/executiongraph/IntermediateResultPartition.html)
 produced by upstream 
[ExecutionVertices]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/executiongraph/ExecutionVertex.html)
 are [isomorphic](https://en.wikipedia.org/wiki/Isomorphism), which means  [...]
+
+For the all-to-all distribution pattern, since all downstream 
ExecutionVertices belonging to the same JobVertex are isomorphic and belong to 
a single group, all the result partitions they consume are connected to this 
group. This group is called 
[ConsumerVertexGroup]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/scheduler/strategy/ConsumerVertexGroup.html).
 Inversely, all the upstream result partitions are grouped into a single group, 
and all the consume [...]
+
+The basic idea of our optimizations is to put all the vertices that consume 
the same result partitions into one ConsumerVertexGroup, and put all the result 
partitions with the same consumer vertices into one ConsumedPartitionGroup.
+
+<center>
+<br/>
+<img 
src="{{site.baseurl}}/img/blog/2022-01-05-scheduler-performance/2-groups.svg" 
width="80%"/>
+<br/>
+Fig. 2 - How partitions and vertices are grouped w.r.t. distribution patterns
+</center>
+
+<br/>
+When scheduling tasks, Flink needs to iterate over all the connections between 
result partitions and consumer vertices. In the past, since there were 
O(n<sup>2</sup>) edges in total, the overall complexity of the iteration was 
O(n<sup>2</sup>). Now ExecutionEdge is replaced with ConsumerVertexGroup and 
ConsumedPartitionGroup. As all the isomorphic result partitions are connected 
to the same downstream ConsumerVertexGroup, when the scheduler iterates over 
all the connections, it just need [...]
+
+For the pointwise distribution pattern, one ConsumedPartitionGroup is 
connected to one ConsumerVertexGroup point-to-point. The number of groups is 
the same as the number of ExecutionEdges. Thus, the computational complexity of 
iterating over the groups is still O(n).
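
As a rough illustration of the grouping idea, the following Python sketch groups result partitions by their set of consumers and compares the number of groups with the number of edges. It is a simplified model and not Flink's implementation: `build_groups`, `all_to_all`, and `pointwise` are hypothetical helpers, and partitions and vertices are represented as plain strings.

```python
from collections import defaultdict


def build_groups(consumers_of):
    """Group partitions that share exactly the same set of consumer vertices.

    consumers_of maps a partition id to the set of vertex ids consuming it.
    Returns a list of (consumed_partition_group, consumer_vertex_group) pairs.
    """
    partitions_by_consumers = defaultdict(list)
    for partition, consumers in consumers_of.items():
        # frozenset makes the consumer set usable as a dictionary key.
        partitions_by_consumers[frozenset(consumers)].append(partition)
    return [
        (sorted(partitions), sorted(consumers))
        for consumers, partitions in partitions_by_consumers.items()
    ]


def all_to_all(n):
    # Every partition is consumed by every downstream vertex.
    return {f"p{i}": {f"v{j}" for j in range(n)} for i in range(n)}


def pointwise(n):
    # Partition i is consumed only by downstream vertex i.
    return {f"p{i}": {f"v{i}"} for i in range(n)}


n = 100
edges_all_to_all = sum(len(c) for c in all_to_all(n).values())
groups_all_to_all = build_groups(all_to_all(n))
groups_pointwise = build_groups(pointwise(n))
print(edges_all_to_all, len(groups_all_to_all), len(groups_pointwise))
```

For n = 100, the all-to-all pattern has 10,000 edges but collapses into a single pair of groups, while the pointwise pattern yields 100 groups, one per edge.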
+
+For the example job we mentioned above, replacing ExecutionEdges with the 
groups can effectively reduce the memory usage of 
[ExecutionGraph]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.html)
 from more than 4 GiB to about 12 MiB. Based on the concept of groups, we 
further optimized several procedures, including job initialization, scheduling 
tasks, failover, and partition releasing. These procedures are all involved 
with tr [...]
+
+# Optimizations related to task deployment
+
+## The problem
+
+In Flink 1.12, it takes a long time to deploy tasks for large-scale jobs if 
they contain all-to-all edges. Furthermore, a heartbeat timeout may happen 
during or after task deployment, which makes the cluster unstable.
+
+Currently, task deployment includes the following steps:
+
+1. A JobManager creates 
[TaskDeploymentDescriptors]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/deployment/TaskDeploymentDescriptor.html)
 for each task, which happens in the JobManager's main thread;
+2. The JobManager serializes TaskDeploymentDescriptors asynchronously;
+3. The JobManager ships serialized TaskDeploymentDescriptors to TaskManagers 
via RPC messages;
+4. TaskManagers create new tasks based on the TaskDeploymentDescriptors and 
execute them.
+
+A TaskDeploymentDescriptor (TDD) contains all the information required by 
TaskManagers to create a task. At the beginning of task deployment, a 
JobManager creates the TDDs for all tasks. Since this happens in the main 
thread, the JobManager cannot respond to any other requests. For large-scale 
jobs, the main thread may get blocked for a long time, heartbeat timeouts may 
happen, and a failover would be triggered.
+
+A JobManager can become a bottleneck during task deployment since all 
descriptors are transmitted from it to all TaskManagers. For large-scale jobs, 
these temporary descriptors would require a lot of heap memory and cause 
frequent long-term garbage collection pauses.
+
+Thus, we need to speed up the creation of the TDDs. Furthermore, if the size 
of descriptors can be reduced, then they will be transmitted faster, which 
leads to faster task deployments.
+
+## The solution
+
+### Cache ShuffleDescriptors
+
+[ShuffleDescriptor]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/shuffle/ShuffleDescriptor.html)s
 are used to describe the information of result partitions that a task consumes 
and can be the largest part of a TaskDeploymentDescriptor. For an all-to-all 
edge, when the parallelisms of both upstream and downstream vertices are n, the 
number of ShuffleDescriptors for each downstream vertex is n, since they are 
connected to n upstream vertices. Thus, the to [...]
+
+However, the ShuffleDescriptors for the downstream vertices are all the same 
since they all consume the same upstream result partitions. Therefore, Flink 
doesn't need to create ShuffleDescriptors for each downstream vertex 
individually. Instead, it can create them once and cache them to be reused. 
This will decrease the overall complexity of creating TaskDeploymentDescriptors 
for tasks from O(n<sup>2</sup>) to O(n).
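
The effect of the cache can be sketched by counting how many descriptors get created with and without it. This is a minimal Python model under stated assumptions: `DescriptorCache`, `deploy`, and the `"group-0"` key are hypothetical names, and a descriptor is just a tuple rather than Flink's ShuffleDescriptor.

```python
class DescriptorCache:
    """Caches the descriptors of a partition group so they are built once."""

    def __init__(self):
        self._cache = {}
        self.creations = 0  # counts how many descriptors were actually built

    def create(self, partition):
        self.creations += 1
        return ("descriptor", partition)

    def get(self, group_id, partitions):
        # Build the descriptors on the first request, reuse them afterwards.
        if group_id not in self._cache:
            self._cache[group_id] = tuple(self.create(p) for p in partitions)
        return self._cache[group_id]


def deploy(n, use_cache):
    """Simulate building descriptors for n downstream tasks of an all-to-all edge."""
    partitions = [f"p{i}" for i in range(n)]
    cache = DescriptorCache()
    for _task in range(n):
        if use_cache:
            cache.get("group-0", partitions)
        else:
            for p in partitions:
                cache.create(p)
    return cache.creations


print(deploy(50, use_cache=False), deploy(50, use_cache=True))
```

With 50 upstream and 50 downstream tasks, the uncached variant builds 2,500 descriptors and the cached variant builds 50, which mirrors the drop from O(n<sup>2</sup>) to O(n).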
+
+To decrease the size of RPC messages and reduce the transmission of replicated 
data over the network, the cached ShuffleDescriptors can be compressed. For the 
example job we mentioned above, if the parallelisms of vertices are both 10k, 
then each downstream vertex has 10k ShuffleDescriptors. After compression, the 
size of the serialized value would be reduced by 72%.
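
To see why compression helps for this kind of data, consider a small Python sketch that serializes a list of highly repetitive, descriptor-like records and compresses the result. The record fields, `pickle`, and `zlib` are assumptions chosen for illustration; the post does not state which serialization or compression format Flink uses, and the sketch is not expected to reproduce the 72% figure.

```python
import pickle
import zlib

# Hypothetical descriptor-like records: the fields are illustrative only.
descriptors = [
    {"partition": f"result-{i:05d}", "host": f"tm-{i % 100}", "port": 6121}
    for i in range(10_000)
]

serialized = pickle.dumps(descriptors)
compressed = zlib.compress(serialized)

# The receiver reverses both steps to get the original records back.
restored = pickle.loads(zlib.decompress(compressed))

print(len(serialized), len(compressed), restored == descriptors)
```

Because neighbouring records differ only in a few characters, the compressed payload is much smaller than the serialized one, and the round trip is lossless.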
+
+### Distribute ShuffleDescriptors via the blob server
+
+A [blob](https://en.wikipedia.org/wiki/Binary_large_object) (binary large 
object) is a collection of binary data used to store large files. Flink hosts 
a blob server to transport large-sized data between the JobManager and 
TaskManagers. When a JobManager decides to transmit a large file to 
TaskManagers, it would first store the file in the blob server (will also 
upload files to the distributed file system) and get a token representing the 
blob, called the blob key. It would then transmi [...]
+
+During task deployment, the JobManager is responsible for distributing the 
ShuffleDescriptors to TaskManagers via RPC messages. The messages will be 
garbage collected once they are sent. However, if the JobManager cannot send 
the messages as fast as they are created, these messages would take up a lot of 
space in heap memory and become a heavy burden for the garbage collector to 
deal with. There will be more long-term garbage collections that stop the world 
and slow down the task deployment.
+
+To solve this problem, the blob server can be used to distribute large 
ShuffleDescriptors. The JobManager first sends ShuffleDescriptors to the blob 
server, which stores ShuffleDescriptors in the DFS. TaskManagers request 
ShuffleDescriptors from the DFS once they begin to process 
TaskDeploymentDescriptors. With this change, the JobManager doesn't need to 
keep all the copies of ShuffleDescriptors in heap memory until they are sent. 
Moreover, the frequency of garbage collections for large- [...]
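
The offloading decision can be sketched as follows. This is a simplified Python model, assuming a size threshold decides whether a payload travels inline or through the blob store, loosely modeled on the `blob.offload.minsize` option mentioned in part one. `BlobStore`, `pack`, `unpack`, and the threshold value are hypothetical and do not reflect Flink's actual classes.

```python
import hashlib


class BlobStore:
    """In-memory stand-in for a blob server backed by a file system."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # Hypothetical choice: use a content hash as the blob key.
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]


OFFLOAD_MIN_SIZE = 1024  # hypothetical threshold in bytes


def pack(serialized: bytes, store: BlobStore):
    """Sender side: offload large payloads and ship only the blob key."""
    if len(serialized) >= OFFLOAD_MIN_SIZE:
        return ("offloaded", store.put(serialized))
    return ("inline", serialized)


def unpack(message, store: BlobStore) -> bytes:
    """Receiver side: fetch the payload from the store if it was offloaded."""
    kind, payload = message
    return store.get(payload) if kind == "offloaded" else payload


store = BlobStore()
small, large = b"x" * 10, b"y" * 4096
small_msg, large_msg = pack(small, store), pack(large, store)
print(small_msg[0], large_msg[0], len(large_msg[1]))
```

The message for the large payload carries only a 64-character key, so the sender does not have to hold one copy of the payload per receiver while the messages are in flight.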
+
+<center>
+<br/>
+<img 
src="{{site.baseurl}}/img/blog/2022-01-05-scheduler-performance/3-how-shuffle-descriptors-are-distributed.svg"
 width="80%"/>
+<br/>
+Fig. 3 - How ShuffleDescriptors are distributed
+</center>
+
+<br/>
+To avoid running out of space on the local disk, the cache is cleared when 
the related partitions are no longer valid, and a size limit applies to the 
ShuffleDescriptors in the blob cache on TaskManagers. If the overall size 
exceeds the limit, the least recently used cached value is removed. This 
ensures that the local disks on the JobManager and TaskManagers won't be filled 
up with ShuffleDescriptors, especially in session mode.
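
A size-limited cache with least-recently-used eviction can be sketched in a few lines of Python. This is an illustrative model only: `SizeLimitedLruCache` and its methods are hypothetical names, sizes are measured as the byte length of each value, and Flink's blob cache is more involved.

```python
from collections import OrderedDict


class SizeLimitedLruCache:
    """Evicts the least recently used entries once max_bytes is exceeded."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.current_bytes = 0
        self._entries = OrderedDict()  # oldest entry first

    def put(self, key, value: bytes):
        if key in self._entries:
            self.current_bytes -= len(self._entries.pop(key))
        self._entries[key] = value
        self.current_bytes += len(value)
        # Drop entries from the least recently used end until we fit again.
        while self.current_bytes > self.max_bytes and self._entries:
            _, evicted = self._entries.popitem(last=False)
            self.current_bytes -= len(evicted)

    def get(self, key):
        value = self._entries.get(key)
        if value is not None:
            self._entries.move_to_end(key)  # mark as most recently used
        return value

    def invalidate(self, key):
        """Remove an entry explicitly, e.g. when its partitions become invalid."""
        value = self._entries.pop(key, None)
        if value is not None:
            self.current_bytes -= len(value)


cache = SizeLimitedLruCache(max_bytes=10)
cache.put("a", b"12345")
cache.put("b", b"12345")
cache.get("a")            # "a" is now more recently used than "b"
cache.put("c", b"12345")  # exceeds the limit, so "b" is evicted
print(cache.get("a"), cache.get("b"), cache.current_bytes)
```

The `invalidate` method corresponds to clearing entries whose partitions are no longer valid, while the loop in `put` enforces the size limit.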
+
+# Optimizations when building pipelined regions
+
+In Flink, there are two types of data exchanges: pipelined and blocking. When 
using blocking data exchanges, result partitions are first fully produced and 
then consumed by the downstream vertices. The produced results are persisted 
and can be consumed multiple times. When using pipelined data exchanges, result 
partitions are produced and consumed concurrently. The produced results are not 
persisted and can be consumed only once.
+
+Since the pipelined data stream is produced and consumed simultaneously, Flink 
needs to make sure that the vertices connected via pipelined data exchanges 
execute at the same time. These vertices form a [pipelined 
region]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/topology/PipelinedRegion.html).
 The pipelined region is the basic unit of scheduling and failover by default. 
During scheduling, all vertices in a pipelined region will be scheduled 
together [...]
+
+<center>
+<br/>
+<img 
src="{{site.baseurl}}/img/blog/2022-01-05-scheduler-performance/4-pipelined-region.svg"
 width="90%"/>
+<br/>
+Fig. 4 - The LogicalPipelinedRegion and the SchedulingPipelinedRegion
+</center>
+
+<br/>
+Currently, there are two types of pipelined regions in the scheduler: 
[LogicalPipelinedRegion]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/jobgraph/topology/LogicalPipelinedRegion.html)
 and 
[SchedulingPipelinedRegion]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/scheduler/strategy/SchedulingPipelinedRegion.html).
 The LogicalPipelinedRegion denotes the pipelined regions on the logical level. 
It consists of JobVertices  [...]
+
+During the construction of pipelined regions, a problem arises: There may be 
cyclic dependencies between pipelined regions. A pipelined region can be 
scheduled if and only if all its dependencies have finished. However, if there 
are two pipelined regions with cyclic dependencies between each other, there 
will be a scheduling [deadlock](https://en.wikipedia.org/wiki/Deadlock). They 
are both waiting for the other one to be scheduled first, and neither of them can 
be scheduled. Therefore, [Tar [...]
+
+<center>
+<br/>
+<img 
src="{{site.baseurl}}/img/blog/2022-01-05-scheduler-performance/5-scheduling-deadlock.svg"
 width="90%"/>
+<br/>
+Fig. 5 - The topology with scheduling deadlock
+</center>
+
+<br/>
+To speed up the construction of pipelined regions, the relationship between 
the logical topology and the scheduling topology can be leveraged. Since a 
SchedulingPipelinedRegion is derived from just one LogicalPipelinedRegion, 
Flink traverses all LogicalPipelinedRegions and converts them into 
SchedulingPipelinedRegions one by one. The conversion varies based on the 
distribution patterns of edges that connect vertices in the 
LogicalPipelinedRegion.
+
+If there are any all-to-all distribution patterns inside the region, the 
entire region can just be converted into one SchedulingPipelinedRegion 
directly. That's because for the all-to-all edge with the pipelined data 
exchange, all the regions connected to this edge must execute simultaneously, 
which means they are merged into one region. For the all-to-all edge with a 
blocking data exchange, it will introduce cyclic dependencies, as Fig. 5 shows. 
All the regions it connects must be merge [...]
+
+If there are only pointwise distribution patterns inside a region, Tarjan's 
strongly connected components algorithm is still used to ensure that there are 
no cyclic dependencies. Since there are only pointwise distribution patterns, 
the number of edges in the topology is O(n), and the computational complexity 
of the algorithm will be O(n).
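
For reference, here is a compact Python sketch of Tarjan's strongly connected components algorithm applied to a small dependency graph between regions. It is a textbook recursive version and not Flink's code: the function name and the region names `r1` to `r4` are hypothetical, and a production implementation for large graphs would likely be iterative to avoid deep recursion.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; graph maps each node to a list of successor nodes.

    Runs in O(V + E) and returns a list of components (sorted node lists).
    """
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        # A node whose lowlink equals its index is the root of a component.
        if lowlink[node] == index[node]:
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(sorted(component))

    for node in graph:
        if node not in index:
            visit(node)
    return components


# r1 and r2 depend on each other, so they must be merged into one region.
regions = {"r1": ["r2"], "r2": ["r1"], "r3": ["r1"], "r4": []}
components = strongly_connected_components(regions)
print(sorted(components))
```

Components with more than one member contain a cyclic dependency, which is why such regions are merged into a single pipelined region before scheduling.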
+
+<center>
+<br/>
+<img 
src="{{site.baseurl}}/img/blog/2022-01-05-scheduler-performance/6-building-pipelined-region.svg"
 width="90%"/>
+<br/>
+Fig. 6 - How to convert a LogicalPipelinedRegion to SchedulingPipelinedRegions
+</center>
+
+<br/>
+After the optimization, the overall computational complexity of building 
pipelined regions decreases from O(n<sup>2</sup>) to O(n). In our experiments, 
for the job which contains two vertices connected with a blocking all-to-all 
edge, when their parallelisms are both 10K, the time of building pipelined 
regions decreases by 99%, from 8,257 ms to 120 ms.
+
+# Summary
+
+All in all, we've made several optimizations to improve the scheduler’s 
performance for large-scale jobs in Flink 1.13 and 1.14. The optimizations 
involve procedures including job initialization, scheduling, task deployment, 
and failover. If you have any questions about them, please feel free to start a 
discussion on the dev mailing list.
diff --git 
a/img/blog/2022-01-05-scheduler-performance/1-distribution-pattern.svg 
b/img/blog/2022-01-05-scheduler-performance/1-distribution-pattern.svg
new file mode 100644
index 0000000..5424fbd
--- /dev/null
+++ b/img/blog/2022-01-05-scheduler-performance/1-distribution-pattern.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" 
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" 
xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="987px" 
height="357px" viewBox="-0.5 -0.5 987 357" content="&lt;mxfile 
host=&quot;app.diagrams.net&quot; modified=&quot;2021-12-29T11:32:47.369Z&quot; 
agent=&quot;5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, 
like Gecko) Chrome/96.0.4664.110 Safari/537.36&quot; 
etag=&quot;WNKLNoexVU8kdb9qBtNl&quot; version=&quot;16.1.0&quot; 
type=&quot;google&quot;&gt;&lt;diagram id [...]
\ No newline at end of file
diff --git a/img/blog/2022-01-05-scheduler-performance/2-groups.svg 
b/img/blog/2022-01-05-scheduler-performance/2-groups.svg
new file mode 100644
index 0000000..f62484b
--- /dev/null
+++ b/img/blog/2022-01-05-scheduler-performance/2-groups.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" 
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" 
xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="1117px" 
height="367px" viewBox="-0.5 -0.5 1117 367" content="&lt;mxfile 
host=&quot;app.diagrams.net&quot; modified=&quot;2021-12-29T11:48:39.835Z&quot; 
agent=&quot;5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, 
like Gecko) Chrome/96.0.4664.110 Safari/537.36&quot; 
etag=&quot;r17mJOWVV4jHEWX0ACX3&quot; version=&quot;16.1.0&quot; 
type=&quot;google&quot;&gt;&lt;diagram  [...]
\ No newline at end of file
diff --git 
a/img/blog/2022-01-05-scheduler-performance/3-how-shuffle-descriptors-are-distributed.svg
 
b/img/blog/2022-01-05-scheduler-performance/3-how-shuffle-descriptors-are-distributed.svg
new file mode 100644
index 0000000..9032535
--- /dev/null
+++ 
b/img/blog/2022-01-05-scheduler-performance/3-how-shuffle-descriptors-are-distributed.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" 
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" 
xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="832px" 
height="422px" viewBox="-0.5 -0.5 832 422" content="&lt;mxfile 
host=&quot;app.diagrams.net&quot; modified=&quot;2021-12-29T11:49:38.587Z&quot; 
agent=&quot;5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, 
like Gecko) Chrome/96.0.4664.110 Safari/537.36&quot; 
etag=&quot;sj7fJ-_3TWIaCKJk82m5&quot; version=&quot;16.1.0&quot; 
type=&quot;google&quot;&gt;&lt;diagram id [...]
\ No newline at end of file
diff --git a/img/blog/2022-01-05-scheduler-performance/4-pipelined-region.svg 
b/img/blog/2022-01-05-scheduler-performance/4-pipelined-region.svg
new file mode 100644
index 0000000..0f4494c
--- /dev/null
+++ b/img/blog/2022-01-05-scheduler-performance/4-pipelined-region.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" 
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" 
xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="962px" 
height="382px" viewBox="-0.5 -0.5 962 382" content="&lt;mxfile 
host=&quot;app.diagrams.net&quot; modified=&quot;2022-01-04T12:41:09.588Z&quot; 
agent=&quot;5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, 
like Gecko) Chrome/96.0.4664.110 Safari/537.36&quot; 
etag=&quot;M1L6mcgOaCav-WM3zpr-&quot; version=&quot;16.1.4&quot; 
type=&quot;google&quot;&gt;&lt;diagram id [...]
\ No newline at end of file
diff --git 
a/img/blog/2022-01-05-scheduler-performance/5-scheduling-deadlock.svg 
b/img/blog/2022-01-05-scheduler-performance/5-scheduling-deadlock.svg
new file mode 100644
index 0000000..2c743e8
--- /dev/null
+++ b/img/blog/2022-01-05-scheduler-performance/5-scheduling-deadlock.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" 
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" 
xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="918px" 
height="361px" viewBox="-0.5 -0.5 918 361" content="&lt;mxfile 
host=&quot;app.diagrams.net&quot; modified=&quot;2022-01-04T12:36:25.839Z&quot; 
agent=&quot;5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, 
like Gecko) Chrome/96.0.4664.110 Safari/537.36&quot; 
etag=&quot;bH7J1WTlE5dDkxqGV3PL&quot; version=&quot;16.1.0&quot; 
type=&quot;google&quot;&gt;&lt;diagram id [...]
\ No newline at end of file
diff --git 
a/img/blog/2022-01-05-scheduler-performance/6-building-pipelined-region.svg 
b/img/blog/2022-01-05-scheduler-performance/6-building-pipelined-region.svg
new file mode 100644
index 0000000..b2a44e0
--- /dev/null
+++ b/img/blog/2022-01-05-scheduler-performance/6-building-pipelined-region.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" 
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" 
xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="971px" 
height="942px" viewBox="-0.5 -0.5 971 942" content="&lt;mxfile 
host=&quot;app.diagrams.net&quot; modified=&quot;2021-12-29T11:52:06.980Z&quot; 
agent=&quot;5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, 
like Gecko) Chrome/96.0.4664.110 Safari/537.36&quot; 
etag=&quot;W0obqumORf-6iY1HI_oF&quot; version=&quot;16.1.0&quot; 
type=&quot;google&quot;&gt;&lt;diagram id [...]
\ No newline at end of file
