Hi Everyone,

I have a topology where a highly CPU-intensive bolt (Bolt A) requires a
much higher degree of parallelism than the bolt it emits tuples to (Bolt B)
(200 Bolt A executors vs <= 100 Bolt B executors).

I find that throughput, measured in tuples acked, drops from 7
million/minute to ~1 million/minute when I wire in Bolt B, even if all of
the logic in Bolt B's execute method is disabled, so that Bolt B is simply
acking the tuples it receives from Bolt A. In addition, going from 50 to
100 Bolt B executors only raises throughput from ~900K/minute to ~1.1
million/minute.

Is the fan-in from 200 Bolt A executors down to 100 or fewer Bolt B
executors the problem? I've already experimented with
executor.send.buffer.size and executor.receive.buffer.size, which helped
drive throughput from 800K to 900K. I will try
topology.transfer.buffer.size next, perhaps setting it higher, to 2048.
Any other ideas?

Thanks

--John
