[ 
https://issues.apache.org/jira/browse/FLINK-39160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18061217#comment-18061217
 ] 

Arun Lakshman commented on FLINK-39160:
---------------------------------------

Pull request adding metrics for `pekko.framesize`: 
https://github.com/apache/flink/pull/27677

> [Runtime][Rpc][Metrics] Expose RPC response frame size and oversized-response 
> rejection metrics
> -----------------------------------------------------------------------------------------------
>
>                 Key: FLINK-39160
>                 URL: https://issues.apache.org/jira/browse/FLINK-39160
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / RPC
>    Affects Versions: 2.2.0
>            Reporter: Arun Lakshman
>            Priority: Minor
>              Labels: metrics, rpc
>
> Flink currently lacks RPC-level metrics for serialized response frame sizes 
> and oversized-response rejections. When responses exceed pekko.framesize, 
> they are rejected, but there is no easy way to see the response-size trend. 
> This makes it difficult to diagnose RPC failures, tune frame-size settings, 
> and detect payload-size regressions in production.
> Today, oversized RPC responses are visible primarily through error logs, 
> with no dedicated metric to track response sizes or rejection frequency over 
> time. Diagnosis is therefore reactive and noisy, since operators must grep 
> logs instead of using dashboards/alerts.
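
For illustration only, the kind of bookkeeping the description asks for could be sketched as below. This is not the code from the pull request and does not use Flink's metrics or RPC APIs: the class `RpcResponseSizeTracker`, its methods, and the byte limit passed to the constructor are all hypothetical names chosen for this sketch. It assumes a response is rejected when its serialized size is strictly greater than the configured frame size.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper (not Flink's implementation): records serialized RPC
 * response sizes and counts responses that exceed a maximum frame size.
 */
final class RpcResponseSizeTracker {

    // Assumed to mirror the configured frame size limit, in bytes.
    private final long maxFrameSizeBytes;

    private final AtomicLong responseCount = new AtomicLong();
    private final AtomicLong oversizedCount = new AtomicLong();
    private final AtomicLong maxObservedBytes = new AtomicLong();

    RpcResponseSizeTracker(long maxFrameSizeBytes) {
        if (maxFrameSizeBytes <= 0) {
            throw new IllegalArgumentException("maxFrameSizeBytes must be positive");
        }
        this.maxFrameSizeBytes = maxFrameSizeBytes;
    }

    /**
     * Records one serialized response.
     *
     * @return true if the response fits in a frame, false if it is oversized
     */
    boolean record(long serializedSizeBytes) {
        responseCount.incrementAndGet();
        // Keep the largest size seen so far, so a trend toward the limit is visible.
        maxObservedBytes.accumulateAndGet(serializedSizeBytes, Math::max);
        if (serializedSizeBytes > maxFrameSizeBytes) {
            oversizedCount.incrementAndGet();
            return false;
        }
        return true;
    }

    long getResponseCount() {
        return responseCount.get();
    }

    long getOversizedCount() {
        return oversizedCount.get();
    }

    long getMaxObservedBytes() {
        return maxObservedBytes.get();
    }
}
```

In a real integration the three counters would presumably be exposed through the metrics system rather than through plain getters, so that dashboards and alerts can track them over time.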



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
