GGraziadei opened a new issue, #8583:
URL: https://github.com/apache/storm/issues/8583

   Currently, Apache Storm provides comprehensive metrics for throughput and 
average latency (execute-latency, process-latency). However, in high-precision 
real-time systems, averages often mask critical performance instabilities.
   
   This proposal introduces a native Jitter Metric calculated at two levels:
   - Component level (Step Jitter): measures the variability of execution time 
within individual Bolts and Spouts.
   - Topology level (Global Jitter): measures the variability of end-to-end 
completion latency for fully acked tuples.
   
   ## Why analysing jitter matters for real-time
   
   In deterministic real-time processing, the variability of latency is as 
important as the latency itself 
(https://ieeexplore.ieee.org/abstract/document/10877871). Predictable latency 
is a prerequisite for building a deterministic system.
   
   - Micro-burst detection: high jitter reveals short spikes that average 
latency smooths out.
   - Compliance: modern SLAs rely on percentiles (e.g., P99). Rising jitter 
can act as a leading indicator of tail-latency degradation.
   - Root cause analysis: high component jitter points to local causes such as 
GC pressure or resource contention, whereas high global jitter with stable 
component jitter suggests network congestion or shuffle bottlenecks.
   - Bottleneck identification: per-component jitter shows where in the 
topology latency becomes unstable, which makes performance issues easier to 
diagnose and resolve.
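
To illustrate the micro-burst point, here is a minimal Java sketch with made-up latency samples. The class name, method names, and sample values are all hypothetical and not part of Storm. It uses a plain mean of absolute differences between consecutive samples as a stand-in for jitter, not the EWMA proposed in this issue.

```java
/** Illustration only: two latency series with the same mean but different jitter. */
final class MeanVsJitterDemo {

    /** Arithmetic mean of the samples. */
    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Mean absolute difference between consecutive samples (needs at least two samples). */
    static double meanAbsSuccessiveDiff(double[] xs) {
        double sum = 0.0;
        for (int i = 1; i < xs.length; i++) {
            sum += Math.abs(xs[i] - xs[i - 1]);
        }
        return sum / (xs.length - 1);
    }

    public static void main(String[] args) {
        // Hypothetical execute latencies in milliseconds.
        double[] steady = {10, 10, 10, 10, 10, 10, 10, 10};
        double[] spiky = {5, 5, 5, 45, 5, 5, 5, 5}; // one short spike

        System.out.println("steady: mean=" + mean(steady)
                + " jitter=" + meanAbsSuccessiveDiff(steady));
        System.out.println("spiky:  mean=" + mean(spiky)
                + " jitter=" + meanAbsSuccessiveDiff(spiky));
    }
}
```

Both series average 10 ms, so an average-latency metric cannot tell them apart. The successive-difference measure is 0 ms for the steady series and 80/7 (about 11.4 ms) for the spiky one.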
   
   ### Proposed model: Exponentially Weighted Moving Average (EWMA)
   
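   To keep the performance impact negligible, I propose using an Exponentially 
Weighted Moving Average (EWMA), following the interarrival jitter estimator of 
RFC 1889 (since obsoleted by RFC 3550, which keeps the same estimator): 
https://www.rfc-editor.org/rfc/rfc1889#appendix-A.8 
   
   Mathematical Model:
   J_new = J_old + (|D_current - D_previous| - J_old) / 16
   where D_current and D_previous are the transit times (latencies) of the 
current and previous sample, and J is the jitter estimate.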
   
   ```
   GIVEN a State {ewmaJitter = 0, lastTransit = UNINITIALIZED}
   GIVEN a Constant RFC1889_ALPHA = 1 / 16
   PROCEDURE addValue(transitMs)
       // Ignore invalid (negative) samples
       IF transitMs < 0 THEN 
           EXIT PROCEDURE
   
       IF lastTransit IS INITIALIZED THEN
           // Absolute difference between the current and previous transit time
           deviation = ABS(transitMs - lastTransit)
           
           // Update the EWMA using the RFC 1889 smoothing factor
           ewmaJitter = ewmaJitter + (deviation - ewmaJitter) * RFC1889_ALPHA
       END IF
   
       // Store current transit time for the next iteration
       lastTransit = transitMs
   END PROCEDURE
   ```
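
The pseudocode above could translate to Java roughly as follows. This is a sketch under assumptions, not a proposed patch: the class name `EwmaJitter` and its methods are hypothetical, it is not wired into any Storm metrics API, and it is not thread-safe, so it assumes a single thread per instance.

```java
/** Sketch of an RFC 1889-style EWMA jitter estimator (hypothetical class, not Storm API). */
final class EwmaJitter {

    /** Smoothing gain of 1/16, as in RFC 1889 appendix A.8. */
    static final double RFC1889_ALPHA = 1.0 / 16.0;

    private double ewmaJitter = 0.0;         // current jitter estimate, in ms
    private double lastTransit = Double.NaN; // NaN marks "no sample seen yet"

    /** Feeds one transit-time (latency) sample, in milliseconds. */
    void addValue(double transitMs) {
        if (Double.isNaN(transitMs) || transitMs < 0) {
            return; // ignore invalid samples
        }
        if (!Double.isNaN(lastTransit)) {
            // Absolute difference between the current and previous transit time.
            double deviation = Math.abs(transitMs - lastTransit);
            // Move the estimate 1/16 of the way towards the new deviation.
            ewmaJitter += (deviation - ewmaJitter) * RFC1889_ALPHA;
        }
        // Remember this sample for the next call.
        lastTransit = transitMs;
    }

    /** Returns the current jitter estimate, in milliseconds. */
    double getJitter() {
        return ewmaJitter;
    }

    public static void main(String[] args) {
        EwmaJitter jitter = new EwmaJitter();
        jitter.addValue(10.0); // first sample only primes lastTransit
        jitter.addValue(26.0); // deviation 16 -> 0 + (16 - 0) / 16 = 1.0
        System.out.println(jitter.getJitter());
    }
}
```

Using `NaN` as the "uninitialized" marker keeps the state at two primitive `double` fields and avoids boxing. A real implementation would also need to decide how the value is reset or reported per metrics window, which this sketch leaves open.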
   
   ### Performance impact
   - Minimal computational overhead: by using an EWMA, we avoid storing large 
datasets or sliding-window buffers. The jitter is updated with a single linear 
equation that requires only basic arithmetic.
   - Memory efficiency: the EWMA needs only two values per executor: the 
moving-average state (8 bytes) and the previous latency sample.
   - System calls: to avoid redundant overhead, the metric hooks into the 
existing latency tracking logic instead of taking its own timestamps. How this 
interacts with metrics that are already sampled still needs discussion.
   
   ### Limitations and constraints
   - Clock skew: global jitter may be affected when node clocks are not 
synchronised. However, since jitter is computed from the difference between 
consecutive samples, a constant offset cancels out mathematically. Skew that 
changes between samples (clock drift) does not cancel.
   - Sampling bias: low sampling rates may miss high-frequency jitter spikes.
   - Warm-up: as an EWMA-based metric, the value starts at zero and needs a 
few dozen samples to stabilize.
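
The clock-skew and warm-up points can be checked with a short derivation. The symbols `\delta` (constant clock offset between nodes), `\tilde{D}` (measured transit time), and `d` (a constant deviation fed to the estimator) are introduced here for illustration; `D` and `J` are as in the model above, with `J_0 = 0`.

```latex
% Constant clock offset \delta: it cancels in the difference of consecutive samples.
\tilde{D}_i = D_i + \delta
\;\Rightarrow\;
|\tilde{D}_i - \tilde{D}_{i-1}|
  = |(D_i + \delta) - (D_{i-1} + \delta)|
  = |D_i - D_{i-1}|

% Warm-up: feeding a constant deviation d into J_n = J_{n-1} + (d - J_{n-1}) / 16
J_n = d \left( 1 - \left( \tfrac{15}{16} \right)^{n} \right),
\qquad
J_n \ge 0.95\, d \iff n \ge \frac{\ln 0.05}{\ln (15/16)} \approx 46.4
```

Under these assumptions the estimate reaches 95% of a steady deviation after 47 updates. Here `n` counts updates rather than samples, since the first sample only initializes the previous transit time.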

