GGraziadei opened a new pull request, #8586:
URL: https://github.com/apache/storm/pull/8586

   ## What is the purpose of the change
   In deterministic real-time processing, predictability of latency is as 
important as latency itself: bounded latency variation (jitter) is a 
prerequisite for building a deterministic system. 
   
   - Micro-burst detection: high jitter reveals short spikes that average 
latency smooths out.
   - Compliance: modern SLAs rely on percentiles (e.g., P99). Jitter is a 
strong leading indicator of tail-latency degradation.
   - Root cause analysis: high component jitter typically points to GC 
pressure or resource contention, whereas high global jitter with stable 
components suggests network congestion or shuffle bottlenecks.
   - Bottleneck identification: jitter enables precise identification of where 
bottlenecks occur in the topology and helps distinguish their underlying 
causes, making performance issues easier to diagnose and resolve.
   
   To keep the performance impact negligible, I propose using an Exponentially 
Weighted Moving Average (EWMA), following the interarrival jitter estimator of 
RFC 1889 (Appendix A.8): 
https://www.rfc-editor.org/rfc/rfc1889#appendix-A.8 
   
   Mathematical Model:
   J_new = J_old + (|D_current - D_previous| - J_old) * smoothing_factor
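
The update rule above can be sketched as follows. This is a minimal illustration, not the `EwmaGauge` from this PR: the class name `JitterEwma`, its methods, the use of `double` fields, and the choice to report zero jitter until two samples have been seen are all assumptions made for the example. RFC 1889 itself uses a gain of 1/16 as the smoothing factor.

```java
/**
 * Hypothetical sketch of an RFC 1889-style jitter estimator.
 * Names and the seeding policy are assumptions, not this PR's code.
 * Not thread-safe: assumes a single thread feeds samples.
 */
final class JitterEwma {
    private final double smoothingFactor;         // RFC 1889 uses 1/16
    private double previousLatency = Double.NaN;  // D_previous; NaN until the first sample
    private double jitter = 0.0;                  // J

    JitterEwma(double smoothingFactor) {
        if (!(smoothingFactor > 0.0 && smoothingFactor <= 1.0)) {
            throw new IllegalArgumentException("smoothingFactor must be in (0, 1]");
        }
        this.smoothingFactor = smoothingFactor;
    }

    /** Feeds one latency sample and returns the updated jitter estimate. */
    double update(double latency) {
        if (!Double.isNaN(previousLatency)) {
            // J_new = J_old + (|D_current - D_previous| - J_old) * smoothing_factor
            double delta = Math.abs(latency - previousLatency);
            jitter += (delta - jitter) * smoothingFactor;
        }
        previousLatency = latency;
        return jitter;
    }

    /** Returns the current jitter estimate without adding a sample. */
    double value() {
        return jitter;
    }
}
```

With a smoothing factor of 1/16, feeding latencies 10, 14, 14 gives jitter values 0, 0.25, and 0.234375: the first sample only seeds `previousLatency`, the second contributes |14 - 10| / 16, and the third decays the estimate by 1/16 because the latency did not change.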
   
   Performance impact
   - Minimal computational overhead: the EWMA update is a few arithmetic 
operations per sample.
   - Memory efficiency: only two persistent variables (8 bytes) per task.
   - System calls: no system calls required to track the latency (the latencies 
are already computed).
   
   ## How was the change tested
   - Unit test: introduced new test cases for `Config`, `TaskMetrics`, 
`EwmaGauge`
   - Local smoke test: registered a topology metrics reporter and saved the 
captured metrics in the attached file
   - The package `metrics2` doesn't affect it. 
   
[worker_log.zip](https://github.com/user-attachments/files/27413715/worker_log.zip)
   
   Example results in worker logs
   ```
   2026-05-05 17:52:07.993 c.c.m.ConsoleReporter metrics-console-reporter-1-thread-1 [INFO] storm.worker.WordCountTopology-4-1777995769.ggraziadei-ThinkPad-E14-Gen-5.count.default.10.6700-__emit-count-default.m1_rate
   2026-05-05 17:52:07.993 c.c.m.ConsoleReporter metrics-console-reporter-1-thread-1 [INFO]              value = 30.0
   2026-05-05 17:52:07.993 c.c.m.ConsoleReporter metrics-console-reporter-1-thread-1 [INFO] storm.worker.WordCountTopology-4-1777995769.ggraziadei-ThinkPad-E14-Gen-5.count.default.10.6700-__execute-count-split:default.m1_rate
   2026-05-05 17:52:07.993 c.c.m.ConsoleReporter metrics-console-reporter-1-thread-1 [INFO]              value = 30.0
   2026-05-05 17:52:07.993 c.c.m.ConsoleReporter metrics-console-reporter-1-thread-1 [INFO] storm.worker.WordCountTopology-4-1777995769.ggraziadei-ThinkPad-E14-Gen-5.count.default.10.6700-__execute-latency-split:default
   2026-05-05 17:52:07.993 c.c.m.ConsoleReporter metrics-console-reporter-1-thread-1 [INFO]              value = 0.0
   2026-05-05 17:52:07.993 c.c.m.ConsoleReporter metrics-console-reporter-1-thread-1 [INFO] storm.worker.WordCountTopology-4-1777995769.ggraziadei-ThinkPad-E14-Gen-5.count.default.10.6700-__execute-rfc1889a-jitter-split:default
   2026-05-05 17:52:07.993 c.c.m.ConsoleReporter metrics-console-reporter-1-thread-1 [INFO]              value = 0.2557194505051832
   2026-05-05 17:52:07.993 c.c.m.ConsoleReporter metrics-console-reporter-1-thread-1 [INFO] storm.worker.WordCountTopology-4-1777995769.ggraziadei-ThinkPad-E14-Gen-5.count.default.10.6700-__process-latency-split:default
   2026-05-05 17:52:07.993 c.c.m.ConsoleReporter metrics-console-reporter-1-thread-1 [INFO]              value = 0.3333333333333333
   2026-05-05 17:52:07.993 c.c.m.ConsoleReporter metrics-console-reporter-1-thread-1 [INFO] storm.worker.WordCountTopology-4-1777995769.ggraziadei-ThinkPad-E14-Gen-5.count.default.10.6700-__process-rfc1889a-jitter-split:default
   2026-05-05 17:52:07.993 c.c.m.ConsoleReporter metrics-console-reporter-1-thread-1 [INFO]              value = 0.145830156234796
   ```
   
   In the context of #8583 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
