AHeise commented on a change in pull request #10931: [FLINK-15603][metrics] Add checkpointStartDelay metric
URL: https://github.com/apache/flink/pull/10931#discussion_r370522304
 
 

 ##########
 File path: docs/monitoring/metrics.md
 ##########
 @@ -1341,11 +1341,16 @@ Metrics related to data exchange between task executors using netty network comm
       <td>Gauge</td>
     </tr>
     <tr>
-      <th rowspan="1">Task</th>
+      <th rowspan="2">Task</th>
       <td>checkpointAlignmentTime</td>
       <td>The time in nanoseconds that the last barrier alignment took to complete, or how long the current alignment has taken so far (in nanoseconds).</td>
       <td>Gauge</td>
     </tr>
+    <tr>
+      <td>checkpointStartDelay</td>
+      <td>The time in nanoseconds that elapsed between the creation of the last checkpoint and the time when the checkpointing process has started by this Task. This delay shows how long it takes for a first checkpoint barrier to reach the task. Back-pressure will increase this value.</td>
 
 Review comment:
   Just to make sure that this was not just an oversight:
   I also suggested adding "A high value indicates back-pressure. If only a specific task has a long start delay, the most likely reason is data skew." instead of "Back-pressure will increase this value."
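
To illustrate the heuristic in the suggested wording, here is a minimal sketch of how someone reading the metric might tell the two cases apart. It is not part of Flink or of this PR: the function name, the input mapping of subtask names to `checkpointStartDelay` values, and both thresholds are hypothetical and chosen only for illustration.

```python
from statistics import median


def classify_start_delays(delays_ns, high_ns=1_000_000_000, skew_factor=5.0):
    """Roughly interpret per-subtask checkpointStartDelay values (nanoseconds).

    delays_ns:   hypothetical mapping of subtask name -> start delay in ns
    high_ns:     assumed threshold above which a delay counts as "high"
    skew_factor: assumed ratio to the median that marks a subtask as an outlier
    """
    if not delays_ns:
        return "no-data", []

    med = median(delays_ns.values())

    # Subtasks that are slow in absolute terms and far above their peers.
    outliers = sorted(
        task
        for task, delay in delays_ns.items()
        if delay >= high_ns and delay > skew_factor * max(med, 1)
    )

    if med >= high_ns:
        # Most subtasks wait a long time for the first barrier.
        return "back-pressure", outliers
    if outliers:
        # Only a few subtasks are delayed: data skew is the likely cause.
        return "possible-skew", outliers
    return "ok", []
```

A uniformly high delay is reported as back-pressure, while a high delay on only a few subtasks is reported as possible skew, which mirrors the two sentences proposed for the docs.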
