1996fanrui commented on code in PR #20038:
URL: https://github.com/apache/flink/pull/20038#discussion_r902508106
##########
docs/content/docs/deployment/memory/network_mem_tuning.md:
##########
@@ -120,6 +120,19 @@ In order to avoid excessive data skew, the number of
buffers for each subpartiti
Unlike the input buffer pool, the configured amount of exclusive buffers and
floating buffers is only treated as recommended values. If there are not enough
buffers available, Flink can make progress with only a single exclusive buffer
per output subpartition and zero floating buffers.
+#### Overdraft buffers
+
+For each output subtask can also request up to
`taskmanager.network.memory.max-overdraft-buffers-per-gate` (by default 5)
extra overdraft buffers.
+Those buffers are only used, if despite presence of a backpressure, Flink can
not stop producing more records to the output.
+This can happen in situations like:
+- Serializing very large records, that do not fit into a single network buffer.
+- Flat Map like operator, that produces many output records per single input
record.
+- Operators that output many records either periodically or on a reaction to
some event (for example `WindowOperator`).
+
+Without overdraft buffers in such situations Flink subtask thread would block
on the backpressure, preventing for example unaligned checkpoints
+from being triggered. To mitigate this, the overdraft buffers concept has been
added. Those buffers are strictly optional and Flink can
+make progress even if the Task Manager doesn't have any spare buffers in the
global pool to be used as overdraft buffers.
Review Comment:
As I understand it, a subtask can't request an overdraft buffer when the global
pool is empty.
So why does the text say `Flink can
make progress even if the Task Manager doesn't have any spare buffers in the
global pool to be used as overdraft buffers.`?
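To make the question concrete, here is a toy model of how I read the behavior. It is only a sketch under my own assumptions, not Flink's implementation: the class names `GlobalPool` and `LocalPool` and the `request` method are hypothetical, and the only value taken from the docs is the default of 5 overdraft buffers per gate.

```python
class GlobalPool:
    """Toy stand-in for the TaskManager-wide network buffer pool (hypothetical)."""

    def __init__(self, free):
        self.free = free  # number of spare buffers currently available

    def take(self):
        """Hand out one spare buffer if any is left; report success."""
        if self.free == 0:
            return False
        self.free -= 1
        return True


class LocalPool:
    """Toy per-gate pool: regular buffers first, then optional overdraft buffers."""

    def __init__(self, global_pool, regular, max_overdraft=5):
        self.global_pool = global_pool
        self.regular_free = regular
        self.max_overdraft = max_overdraft  # mirrors the documented default of 5
        self.overdraft_in_use = 0

    def request(self):
        # Regular buffers are always tried first.
        if self.regular_free > 0:
            self.regular_free -= 1
            return "regular"
        # Assumption: overdraft buffers are borrowed from the global pool,
        # so an empty global pool means no overdraft buffer can be granted.
        if self.overdraft_in_use < self.max_overdraft and self.global_pool.take():
            self.overdraft_in_use += 1
            return "overdraft"
        # Nothing available: the caller would have to wait (backpressure).
        return None
```

In this model, a pool with an empty `GlobalPool` behaves exactly as if overdraft buffers did not exist: `request()` returns `None` once the regular buffers are used up. If that matches the real behavior, "optional" is accurate, but "can make progress" reads as if the subtask would not block.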
##########
docs/content/docs/deployment/memory/network_mem_tuning.md:
##########
@@ -120,6 +120,19 @@ In order to avoid excessive data skew, the number of
buffers for each subpartiti
Unlike the input buffer pool, the configured amount of exclusive buffers and
floating buffers is only treated as recommended values. If there are not enough
buffers available, Flink can make progress with only a single exclusive buffer
per output subpartition and zero floating buffers.
+#### Overdraft buffers
+
+For each output subtask can also request up to
`taskmanager.network.memory.max-overdraft-buffers-per-gate` (by default 5)
extra overdraft buffers.
+Those buffers are only used, if despite presence of a backpressure, Flink can
not stop producing more records to the output.
+This can happen in situations like:
+- Serializing very large records, that do not fit into a single network buffer.
+- Flat Map like operator, that produces many output records per single input
record.
+- Operators that output many records either periodically or on a reaction to
some event (for example `WindowOperator`).
Review Comment:
Nit: `some event` should be `some events`.
##########
docs/content/docs/deployment/memory/network_mem_tuning.md:
##########
@@ -120,6 +120,19 @@ In order to avoid excessive data skew, the number of
buffers for each subpartiti
Unlike the input buffer pool, the configured amount of exclusive buffers and
floating buffers is only treated as recommended values. If there are not enough
buffers available, Flink can make progress with only a single exclusive buffer
per output subpartition and zero floating buffers.
+#### Overdraft buffers
+
+For each output subtask can also request up to
`taskmanager.network.memory.max-overdraft-buffers-per-gate` (by default 5)
extra overdraft buffers.
+Those buffers are only used, if despite presence of a backpressure, Flink can
not stop producing more records to the output.
+This can happen in situations like:
+- Serializing very large records, that do not fit into a single network buffer.
+- Flat Map like operator, that produces many output records per single input
record.
+- Operators that output many records either periodically or on a reaction to
some event (for example `WindowOperator`).
+
+Without overdraft buffers in such situations Flink subtask thread would block
on the backpressure, preventing for example unaligned checkpoints
+from being triggered. To mitigate this, the overdraft buffers concept has been
added. Those buffers are strictly optional and Flink can
Review Comment:
> preventing for example unaligned checkpoints from being triggered.

Overdraft buffers can speed up triggering the unaligned checkpoint of a
subtask, but they cannot speed up triggering the checkpoint of the Flink job.
When we talk about triggering a checkpoint we usually mean the job level,
whereas here it is the subtask level. So I think we should state `subtask level`
explicitly.
As written, this may confuse users. Is "trigger" the appropriate word here, or
is there a better one? How can we let the user know that with overdraft
buffers the subtask can start the unaligned checkpoint as soon as possible,
instead of staying blocked in `requestMemory` (or on backpressure)?
##########
docs/content/docs/deployment/memory/network_mem_tuning.md:
##########
@@ -120,6 +120,19 @@ In order to avoid excessive data skew, the number of
buffers for each subpartiti
Unlike the input buffer pool, the configured amount of exclusive buffers and
floating buffers is only treated as recommended values. If there are not enough
buffers available, Flink can make progress with only a single exclusive buffer
per output subpartition and zero floating buffers.
+#### Overdraft buffers
+
+For each output subtask can also request up to
`taskmanager.network.memory.max-overdraft-buffers-per-gate` (by default 5)
extra overdraft buffers.
+Those buffers are only used, if despite presence of a backpressure, Flink can
not stop producing more records to the output.
Review Comment:
> These buffers are only used if the subtask is backpressured by downstream
> subtasks and the subtask still needs to produce more records to the output.

I'd prefer this wording. What do you think?