GGraziadei opened a new pull request, #8707:
URL: https://github.com/apache/storm/pull/8707

   ## What is the purpose of the change
   
   I would like to identify the scenarios where compressing serialized tuples 
actually provides a performance benefit, and to introduce a way to control 
compression.
   Compressing a serialized tuple only makes sense for inter-worker 
communication where the developer expects very large tuples.
   A good example within the codebase is examples/FileReadWordCountTopo. In 
this topology, the FileReadSpout emits entire lines/sentences of text to the 
SplitSentenceBolt. If these two components end up on different remote workers, 
compressing the serialized tuples on this specific stream can significantly 
reduce network I/O.
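
To make the size argument concrete, here is a minimal sketch that compresses a sentence-sized payload and reports the byte counts. It is not the code from this PR: the class name, the helper methods, the sample vocabulary, and the choice of DEFLATE via `java.util.zip` are all assumptions made for illustration. The synthetic text uses a tiny vocabulary, so it compresses better than real sentences would.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration; not the serializer introduced by the PR.
final class TupleCompressionSketch {

    /** Compresses a payload with DEFLATE (an assumed codec, chosen for illustration). */
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Restores the original payload; wraps the checked exception for brevity. */
    static byte[] decompress(byte[] input) {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or unsupported input");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed payload", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Builds a synthetic sentence of at least minBytes bytes from a small vocabulary. */
    static byte[] sampleSentence(int minBytes) {
        String[] words = {"storm", "tuple", "worker", "spout", "bolt", "stream", "netty", "kryo"};
        StringBuilder sb = new StringBuilder();
        for (int i = 0; sb.length() < minBytes; i++) {
            sb.append(words[(i * 7 + i / 3) % words.length]).append(' ');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = sampleSentence(1500);
        byte[] packed = compress(raw);
        byte[] restored = decompress(packed);
        System.out.printf("raw=%d bytes, compressed=%d bytes, round-trip ok=%b%n",
                raw.length, packed.length, Arrays.equals(raw, restored));
    }
}
```

The CPU cost of `compress` and `decompress` is paid on every tuple, which is why the benefit is limited to large tuples that actually cross the network between workers.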
   
   I added more details in `docs/Serialization.md`.
   
   ## How was the change tested
   - Unit test
   - Smoke test on a cluster environment
   
   ## Benchmark
   
   I executed a benchmark to evaluate the improvements introduced by this PR.
   
   I used the dev Storm cluster environment proposed in this PR:
   https://github.com/apache/storm/pull/8706
   
   I executed the WordCount topology in two versions:
   
   - `FileReadWordCountSpoutCompressionTopo` (with compression)
   - `FileReadWordCountTopo` (without compression)
   
   Each transmitted sentence is approximately 1500 bytes long (all taken from 
`longrandomwords.txt`).  
   This value was intentionally selected to reproduce cases where the payload 
spans more than one TCP segment.
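
The segment arithmetic behind that choice can be sketched as follows. The 1460-byte maximum segment size is an assumption (a 1500-byte Ethernet MTU minus 20-byte IPv4 and 20-byte TCP headers, with no TCP options), the 700-byte compressed size is a hypothetical value, and the class and method names are illustrative only. The sketch also treats each tuple as if it were sent on its own, which is a simplification because the transport may batch several tuples into one write.

```java
// Hypothetical illustration of how many TCP segments a payload needs.
final class SegmentCountSketch {

    /** Assumed MSS: 1500-byte MTU minus 40 bytes of IPv4 and TCP headers. */
    static final int ASSUMED_MSS = 1460;

    /** Number of segments needed for payloadBytes, using ceiling division. */
    static int segmentsFor(int payloadBytes, int mss) {
        if (payloadBytes < 0 || mss <= 0) {
            throw new IllegalArgumentException("payloadBytes must be >= 0 and mss > 0");
        }
        return (payloadBytes + mss - 1) / mss;
    }

    public static void main(String[] args) {
        int rawBytes = 1500;          // sentence size used in the benchmark
        int compressedBytes = 700;    // hypothetical size after compression
        System.out.println("raw segments: " + segmentsFor(rawBytes, ASSUMED_MSS));
        System.out.println("compressed segments: " + segmentsFor(compressedBytes, ASSUMED_MSS));
    }
}
```

Under these assumptions a 1500-byte payload needs two segments, while any payload compressed to 1460 bytes or fewer fits in one.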
   
   I simulated a network with 10 ms latency and 0.5 ms jitter between workers.
   
   This setup does not exactly represent a typical intra-datacenter network, 
but it emphasizes the maximum potential advantage that this PR can provide when 
properly configured. Note that the tuple size used here is the minimum that 
yields a real advantage.
   
   ---
   
   ## Simulated Network Ping
   
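   This is a sample round-trip ping between two supervisors in the Docker 
network. The steady-state RTT of about 20 ms is consistent with the 10 ms 
delay being applied in each direction.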
   
   ```bash
   ./netsim.sh ping
   ==> RTT from cluster-supervisor1-1 to cluster-supervisor2-1 (5 pings)
   
   PING cluster-supervisor2-1 (172.22.0.9) 56(84) bytes of data.
   64 bytes from cluster-supervisor2-1.cluster_storm (172.22.0.9): icmp_seq=1 
ttl=64 time=42.5 ms
   64 bytes from cluster-supervisor2-1.cluster_storm (172.22.0.9): icmp_seq=2 
ttl=64 time=18.8 ms
   64 bytes from cluster-supervisor2-1.cluster_storm (172.22.0.9): icmp_seq=3 
ttl=64 time=20.0 ms
   64 bytes from cluster-supervisor2-1.cluster_storm (172.22.0.9): icmp_seq=4 
ttl=64 time=20.4 ms
   64 bytes from cluster-supervisor2-1.cluster_storm (172.22.0.9): icmp_seq=5 
ttl=64 time=20.1 ms
   
   --- cluster-supervisor2-1 ping statistics ---
   5 packets transmitted, 5 received, 0% packet loss, time 4004ms
   rtt min/avg/max/mdev = 18.767/24.353/42.486/9.083 ms
   ```
   
   ## Benchmark Results
   
   | Metric | FileReadWordCountSpoutCompressionTopo | FileReadWordCountTopo | Difference | Better |
   |---|---:|---:|---:|---|
   | Avg Transfer Rate (msg/s) | 776,389 | 744,544 | +31,845 (+4.3%) | Compression |
   | Peak Transfer Rate (msg/s) | 805,700 | 790,300 | +15,400 | Compression |
   | Avg Spout Throughput (acks/s) | 98,167 | 92,844 | +5,323 (+5.7%) | Compression |
   | Peak Spout Throughput (acks/s) | 100,300 | 98,666 | +1,634 | Compression |
   | Avg Complete Latency (ms) | 362.48 | 376.73 | -14.25 ms (-3.8%) | Compression |
   | Max Complete Latency (ms) | 366.44 | 385.72 | -19.28 ms | Compression |
   | Runtime Stability | More consistent | More fluctuation | N/A | Compression (see the per-task jitter in the Grafana snapshots) |
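
For readers who want to re-derive the percentages in the Difference column, the sketch below computes the relative change of the compressed run against the uncompressed baseline, using the averages from the table. The class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper for re-deriving the relative differences in the table.
final class BenchmarkDiffSketch {

    /** Relative change of candidate versus baseline, in percent. */
    static double percentChange(double candidate, double baseline) {
        if (baseline == 0.0) {
            throw new IllegalArgumentException("baseline must be non-zero");
        }
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        String[] names = {
            "Avg Transfer Rate (msg/s)",
            "Avg Spout Throughput (acks/s)",
            "Avg Complete Latency (ms)"
        };
        // Each row: {with compression, without compression}, copied from the table.
        double[][] rows = {
            {776_389, 744_544},
            {98_167, 92_844},
            {362.48, 376.73}
        };
        for (int i = 0; i < names.length; i++) {
            System.out.printf(Locale.ROOT, "%-32s %+.2f%%%n",
                    names[i], percentChange(rows[i][0], rows[i][1]));
        }
    }
}
```

A positive value is better for the two rate metrics, while a negative value is better for latency.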
   
   ---
   
   ## Grafana snapshots with v2 metrics per task
   
   FileReadWordCountTopo
   
   
https://snapshots.raintank.io/dashboard/snapshot/T0Z6BqAnQlA0aMQMypNgAaHN70NZa0T4?orgId=0&from=2026-05-23T11:08:29.369Z&to=2026-05-23T11:16:41.770Z&timezone=browser&var-topology=FileReadWordCountTopo-2-1779534460&var-host=$__all&var-component=$__all&var-task=$__all&refresh=10s
   
   
   FileReadWordCountSpoutCompressionTopo
   
   
https://snapshots.raintank.io/dashboard/snapshot/hIZ7uQuN4w0C6hunVmY1C9Hh0sGUYg0y?orgId=0&from=2026-05-23T11:03:22.690Z&to=2026-05-23T11:11:48.769Z&timezone=browser&var-topology=FileReadWordCountTopo-1-1779534165&var-host=$__all&var-component=$__all&var-task=$__all&refresh=10s
   
   In the context of #8701 

