MarkSfik commented on a change in pull request #328:
URL: https://github.com/apache/flink-web/pull/328#discussion_r411337742
##########
File path: _posts/2020-04-17-memory-management-improvements-flink-1.10.md
##########
@@ -0,0 +1,87 @@
+---
+layout: post
+title: "Memory Management improvements with Apache Flink 1.10"
+date: 2020-04-17T12:00:00.000Z
+authors:
+- andrey:
+ name: "Andrey Zagrebin"
+categories: news
+excerpt: This post discusses the recent changes to the memory model of the
task managers and configuration options for your Flink applications in Flink
1.10.
+---
+
+Apache Flink 1.10 comes with significant changes to the memory model of the
task managers and configuration options for your Flink applications. These
recently-introduced changes make Flink more adaptable to all kinds of
deployment environments (e.g. Kubernetes, Yarn, Mesos), providing strict
control over its memory consumption. In this post, we describe Flink’s memory
model, as it stands in Flink 1.10, how to set up and manage memory consumption
of your Flink applications and the recent changes the community implemented in
the latest Apache Flink release.
+
+## Introduction to Flink’s memory model
+
+Having a clear understanding of Apache Flink’s memory model allows you to
manage resources for the various workloads more efficiently. The following
diagram illustrates the main memory components in Flink:
+
+<center>
+<img src="{{ site.baseurl
}}/img/blog/2020-04-17-memory-management-improvements-flink-1.10/total-process-memory.svg"
width="400px" alt="Flink: Total Process Memory"/>
+<br/>
+<i><small>Flink: Total Process Memory</small></i>
+</center>
+<br/>
+
+The task manager process is a JVM process. At a high level, its memory consists of the *JVM Heap* and *Off-Heap* memory. These types of memory are consumed either directly by Flink or by the JVM for its specific purposes (e.g. metaspace). There are two major memory consumers within Flink: the user code of job operator tasks and the framework itself, which consumes memory for internal data structures, network buffers, etc.
+
+**Please note that** the user code has direct access to all memory types: *JVM Heap, Direct* and *Native memory*. Therefore, Flink cannot really control their allocation and usage. There are, however, two types of Off-Heap memory which are consumed by tasks and controlled explicitly by Flink:
+
+- *Managed Off-Heap Memory*
+- *Network Buffers*
+
+The latter is part of the *JVM Direct Memory*, allocated for user record data
exchange between operator tasks.
+
+## How to set up Flink memory
+
+To provide a better user experience, Flink 1.10 comes with both high-level and fine-grained tuning of memory components. There are essentially three alternatives for setting up memory in task managers.
+
+The first two — and simplest — alternatives are configuring one of the two
following options for total memory available for the task manager:
+
+- *Total Process Memory*: memory consumed by the Flink application and by the
JVM to run the process.
+- *Total Flink Memory*: only memory consumed by the Flink application
+
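+For example, each of these two alternatives corresponds to a single option in `flink-conf.yaml` (the sizes below are placeholders, not recommendations; set only one of the two):

```yaml
# Alternative 1: total memory of the task manager process,
# including JVM metaspace and overhead (matches a container size)
taskmanager.memory.process.size: 4096m

# Alternative 2: memory consumed by Flink only
# taskmanager.memory.flink.size: 3584m
```
+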
+It is advisable to configure the *Total Flink Memory* for standalone
deployments where explicitly declaring how much memory is given to Flink is a
common practice, while the outer *JVM overhead* is of little interest. For the
cases of deploying Flink in containerized environments (such as
[Kubernetes](https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/deployment/kubernetes.html),
[Yarn](https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/deployment/yarn_setup.html)
or
[Mesos](https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/deployment/mesos.html)),
the *Total Process Memory* option is recommended instead, because it corresponds to the total memory of the requested container.
+
+If you want more fine-grained control over the size of the *JVM Heap* and *Managed* Off-Heap Memory, there is also a third alternative: configuring both
*[Task
Heap](https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_setup.html#task-operator-heap-memory)*
and *[Managed
Memory](https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_setup.html#managed-memory)*.
This alternative gives a clear separation between the heap memory and any
other memory types.
+
+In line with the community’s efforts to [unify batch and stream
processing](https://flink.apache.org/news/2019/02/13/unified-batch-streaming-blink.html),
this model works universally for both scenarios. It allows sharing the *JVM
Heap* memory between the user code of operator tasks in any workload and the
heap state backend in stream processing scenarios. The *Managed Off-Heap
Memory* can be used for batch spilling and for the RocksDB state backend in
streaming.
+
+The remaining memory components are automatically adjusted either based on
their default values or additionally configured parameters. Flink also checks
the overall consistency. You can find more information about the different
memory components in the corresponding
[documentation](https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_detail.html).
Additionally, you can try different configuration options with the
[configuration
spreadsheet](https://docs.google.com/spreadsheets/d/1mJaMkMPfDJJ-w6nMXALYmTc4XxiV30P5U7DzgwLkSoE/edit#gid=0)
of
[FLIP-49](https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors)
and check the corresponding results for your individual case.
+
+If you are migrating from a Flink version older than 1.10, we suggest
following the steps in the [migration
guide](https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_migration.html)
of the Flink documentation.
+
+## Other components
+
+While configuring Flink’s memory, the size of different memory components can
either be fixed with the value of the respective option or tuned using multiple
options. Below we provide some more insight about the memory setup.
+
+### Fractions of the Total Flink Memory
+
+This method allows a proportional breakdown of the *Total Flink Memory* where
the [Managed
Memory](https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_setup.html#managed-memory)
(if not set explicitly) and [Network
Buffers](https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_detail.html#capped-fractionated-components)
can take certain fractions of it. The remaining memory is then assigned to the
[Task
Heap](https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_setup.html#task-operator-heap-memory)
(if not set explicitly) and other fixed *JVM Heap* and *Off-Heap components*.
The following picture represents an example of such a setup:
+
+<center>
+<img src="{{ site.baseurl
}}/img/blog/2020-04-17-memory-management-improvements-flink-1.10/flink-memory-setup.svg"
width="800px" alt="Flink: Example of Memory Setup"/>
+<br/>
+<i><small>Flink: Example of Memory Setup</small></i>
+</center>
+<br/>
+
+**Please note that**
+
+- Flink will verify that the size of the derived *Network Memory* is between its minimum and maximum value; otherwise, Flink’s startup will fail. The maximum and minimum limits have default values which can be overridden by the respective configuration options.
+- In general, the configured fractions are treated by Flink as hints. Under
certain scenarios, the derived value might not match the fraction. For example,
if the *Total Flink Memory* and the *Task Heap* are configured to fixed values,
the *Managed Memory* will get a certain fraction and the *Network Memory* will
get the remaining memory which might not exactly match its fraction.
+
+### More hints to control the container memory limit
+
+The heap and direct memory usage is managed by the JVM. There are also many
other possible sources of native memory consumption in Apache Flink or its user
applications which are not managed by Flink or the JVM. Controlling their
limits is often difficult which complicates debugging of potential memory
leaks. If Flink’s process allocates too much memory in an unmanaged way, it can
often result in killing task manager containers in containerized environments.
In this case it may be hard to understand which type of memory consumption has
exceeded its limit. Flink 1.10 introduces some specific tuning options to
clearly represent such components. Although Flink cannot always enforce strict
limits and borders among them, the idea here is to explicitly plan the memory
usage. Below we provide some examples of how memory setup can prevent
containers exceeding their memory limit:
Review comment:
```suggestion
The heap and direct memory usage are managed by the JVM. There are also many
other possible sources of native memory consumption in Apache Flink or its user
applications which are not managed by Flink or the JVM. Controlling their
limits is often difficult which complicates debugging of potential memory
leaks. If Flink’s process allocates too much memory in an unmanaged way, it can
often result in killing Task Manager containers in containerized environments.
In this case, it may be hard to understand which type of memory consumption has
exceeded its limit. Flink 1.10 introduces some specific tuning options to
clearly represent such components. Although Flink cannot always enforce strict
limits and borders among them, the idea here is to explicitly plan the memory
usage. Below we provide some examples of how memory setup can prevent
containers exceeding their memory limit:
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]