Hello all

I am facing an issue, or at least something I cannot figure out, regarding
the end-to-end latency of my topology, so I hope you will be able to help
me.

*Topology*
I am running a Trident topology:
- an IBatchSpout emits events of different types (1 spout only)
- a filter (b-1) checks the type of each event with "instanceof"; in most
cases this filter returns true
- a bolt (b-0) flattens the event
- the events are grouped by 3 of their fields
- the grouped events are aggregated using the persistentAggregate method, a
custom reducer, and a simple MemoryMapState object

Due to my current data set, the events are grouped into 19 different
aggregates.
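In code, the topology is wired roughly like this (TypeFilter, FlattenEvent,
MyReducer, and the field names are placeholders for my actual classes, so
take it as a sketch only):

```java
TridentTopology topology = new TridentTopology();
topology.newStream("events", batchSpout)                    // the single IBatchSpout
        .each(new Fields("event"), new TypeFilter())        // b-1: "instanceof" filter
        .each(new Fields("event"), new FlattenEvent(),      // b-0: flatten the event
              new Fields("k1", "k2", "k3", "value"))
        .groupBy(new Fields("k1", "k2", "k3"))              // group by 3 fields
        .persistentAggregate(new MemoryMapState.Factory(),  // simple in-memory state
                             new Fields("value"),
                             new MyReducer(),               // custom reducer
                             new Fields("aggregate"));
```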

In both cases below, I run:
- a single worker with a single spout and multiple executors (10)
- an emit interval of 1 ms (emit interval millis = 1)
- TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE and TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE
set to a large value (16384)
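For reference, assuming these settings map to the usual Storm config keys,
the equivalent storm.yaml fragment would look roughly like this (a sketch,
not my exact file):

```yaml
topology.workers: 1
topology.trident.batch.emit.interval.millis: 1
topology.executor.receive.buffer.size: 16384
topology.executor.send.buffer.size: 16384
```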

I measure my end-to-end latency by adding a timestamp to the event just
before the _collector.emit() call and then, in the reducer, measuring the
delta. So this latency is the time an event needs to be aggregated.
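To make the measurement concrete, here is a minimal sketch of it
(LatencyTracker and its method names are made up for illustration; in the
real topology the timestamp travels inside the event and the reducer
records the delta on arrival):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper illustrating the measurement: the spout stamps the
// event with the current time just before _collector.emit(), and the
// reducer records the delta when the event reaches the aggregate.
public class LatencyTracker {
    private final List<Long> deltasMillis = new ArrayList<>();

    // Called from the reducer: emitMillis is the timestamp the spout put
    // into the event, nowMillis is System.currentTimeMillis() on arrival.
    public void record(long emitMillis, long nowMillis) {
        deltasMillis.add(nowMillis - emitMillis);
    }

    // 99th percentile of the recorded end-to-end latencies, in ms.
    public long percentile99() {
        List<Long> sorted = new ArrayList<>(deltasMillis);
        Collections.sort(sorted);
        int idx = (int) Math.ceil(0.99 * sorted.size()) - 1;
        return sorted.get(Math.max(idx, 0));
    }
}
```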

Here are the numbers I got:

Case #1 - events throughput: 40000

  Component            Capacity   Execute latency (ms)   Process latency (ms)   Complete latency (ms)
  mastercoord-bg0      -          -                      -                      10.031
  $spoutcoord-spout0   0.022      0.243                  0.181                  -
  __acker              0.004      0.004                  0.002                  -
  b-0                  0.045      0.006                  0.002                  -
  b-1                  0.159      0.004                  0.003                  -
  spout0               0.150      1.673                  1.919                  -

  Measured end-to-end latency (99th percentile): 9.5 ms

Case #2 - events throughput: 250

  Component            Capacity   Execute latency (ms)   Process latency (ms)   Complete latency (ms)
  mastercoord-bg0      -          -                      -                      10.026
  $spoutcoord-spout0   0.024      0.255                  0.190                  -
  __acker              0.006      0.005                  0.002                  -
  b-0                  0.006      0.020                  0.024                  -
  b-1                  0.004      0.018                  0.016                  -
  spout0               0.017      0.185                  0.135                  -

  Measured end-to-end latency (99th percentile): 9.5 ms


The good news is that the end-to-end latency is quite stable even though
the throughput of generated events has increased significantly.
The 'not so good' news is that I would like my latency to be under 1 ms
(again, with a single worker and at least for low throughput). Do you have
any ideas about how to configure the topology? Maybe avoiding Trident and
using the Storm API directly could help?

Do you know how I can better understand where the time is spent in this
topology?

Note that I have tried to increase max.spout.pending, but the end-to-end
latency gets worse (as if everything were, in the end, single-threaded).

For sure, there is something I do not understand. Many thanks for your help.

Olivier

PS: when I run the second case in local mode (on my dev laptop), I observe
an end-to-end latency of around 1 to 2 ms.
