pnowojski commented on a change in pull request #18354:
URL: https://github.com/apache/flink/pull/18354#discussion_r785918553



##########
File path: docs/content.zh/docs/ops/state/checkpointing_under_backpressure.md
##########
@@ -129,6 +129,10 @@ In-flight 数据后再生成 Watermark **。如果您的 Pipeline 中使用了**
 使用对齐 Checkpoint产生**不同的结果**。如果您的 Operator 依赖于最新的 Watermark 始终可用,解决办法是将 
Watermark 
 存放在 OperatorState 中。在这种情况下,Watermark 应该使用单键 group 存放在 UnionState 以方便扩缩容。
 
+#### Interplay with long-running record processing
+
+Despite that unaligned checkpoints barriers are able to overtake all other 
records in the queue. The handling of this barrier still can be delayed if the 
current record takes a lot of time to be processed. This situation can occur 
when window operators emit heavy result or the flat map produce a lot of 
records for a single input. It also can happen in any other situation when the 
processing of the single record takes a while(a long record). As result, the 
time of the checkpoint can be higher than expected or it can be volatile from 
time to time.

Review comment:
   - Please break the long line into a couple of shorter lines
   - `window operators emit heavy result` -> `firing many timers all at once, for example in windowed operations`
   - Maybe let's rephrase the large record and flatMap part to something like:
   
   > A second problematic scenario can occur when the system is blocked waiting for more than one
   > memory segment to become available while processing a single input record. Flink cannot
   > interrupt the processing of a single input record, so unaligned checkpoints have to wait until
   > the currently processed record is fully processed. This can cause problems in two cases: when
   > serializing a large record that doesn't fit into a single memory segment, or in a flatMap
   > operation that produces many output records for one input record. In such cases, back pressure
   > can block unaligned checkpoints until all of the memory segments required to process the single
   > input record are available.
   ?
   
   note: I'm not sure whether we should use the name "memory segment" or "buffer" (depending on what's used elsewhere in the docs).
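
   To make the two cases concrete, here is a rough back-of-the-envelope sketch (not Flink code). It assumes the default 32 KiB segment size and simply divides the serialized output of one input record by the segment size; the class and method names are hypothetical, and it ignores per-record length headers and the split of output across channels. Anything above one segment means the record may have to wait for further segments under back pressure.

```java
public class SegmentEstimate {
    // Assumption: default value of taskmanager.memory.segment-size (32 KiB).
    static final int DEFAULT_SEGMENT_SIZE = 32 * 1024;

    /** Segments needed to hold totalBytes of serialized output (ceiling division). */
    static long segmentsFor(long totalBytes, int segmentSize) {
        if (totalBytes < 0 || segmentSize <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative / positive");
        }
        return (totalBytes + segmentSize - 1) / segmentSize;
    }

    /** Segments needed when one input record fans out into many output records. */
    static long segmentsForFlatMap(long outputRecords, long bytesPerRecord, int segmentSize) {
        long totalBytes = Math.multiplyExact(outputRecords, bytesPerRecord);
        return segmentsFor(totalBytes, segmentSize);
    }

    public static void main(String[] args) {
        // Case 1: a single large record that does not fit into one segment.
        System.out.println("large record: "
                + segmentsFor(100_000, DEFAULT_SEGMENT_SIZE) + " segments");
        // Case 2: a flatMap emitting 1000 records of 200 bytes for one input record.
        System.out.println("flatMap fan-out: "
                + segmentsForFlatMap(1000, 200, DEFAULT_SEGMENT_SIZE) + " segments");
    }
}
```

   In both cases the barrier cannot be handled until the last of those segments has been obtained, which is the delay the paragraph should describe.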




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
