Hi, I am observing some very interesting behaviour with Apache Storm. I have been testing different parameter values for my benchmarking topologies.
I previously tested on several 4-node setups (3 workers and a supervisor). Recently I tried a 16-node setup (1 supervisor and 15 workers), and found that with the same "max.spout.pending" and buffer size values, the topology stalls after a while on the 16-node cluster, i.e. throughput drops to 0. The same parameter values work fine on the 4-node setup. If I increase the buffer sizes beyond what was set for the 4-node cluster, the topology works again.

If someone could elaborate on how the buffer sizes (e.g. topology.executor.receive.buffer.size, topology.executor.send.buffer.size and topology.transfer.buffer.size) are linked to max.spout.pending, that would be very helpful. Why does the topology work with a certain buffer size and max.spout.pending on the smaller cluster, while on the larger cluster I need to increase the buffer sizes significantly for the same max.spout.pending setting to work?

Thank you.
Muhammad Bilal
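
For reference, here is a minimal sketch of the four settings discussed above, gathered into one configuration map. It uses a plain java.util.Map so that it runs without Storm on the classpath. The class name, the helper buildConf and all numeric values are hypothetical placeholders, not the values from the tests described above. The key topology.max.spout.pending is assumed to be the full name of the "max.spout.pending" setting.

```java
import java.util.HashMap;
import java.util.Map;

public class BufferSettingsSketch {

    // Hypothetical helper: collects the settings named in the question
    // into a plain map keyed by their configuration names.
    static Map<String, Object> buildConf(int maxSpoutPending,
                                         int executorBufferSize,
                                         int transferBufferSize) {
        Map<String, Object> conf = new HashMap<>();
        // Cap on un-acked tuples in flight per spout task.
        conf.put("topology.max.spout.pending", maxSpoutPending);
        // Per-executor incoming and outgoing queue sizes.
        conf.put("topology.executor.receive.buffer.size", executorBufferSize);
        conf.put("topology.executor.send.buffer.size", executorBufferSize);
        // Per-worker outgoing transfer queue size.
        conf.put("topology.transfer.buffer.size", transferBufferSize);
        return conf;
    }

    public static void main(String[] args) {
        // Placeholder values only; not the values used on either cluster.
        Map<String, Object> conf = buildConf(1000, 1024, 32);
        conf.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

In a real topology these entries would go into the Storm Config object (which is itself a map of string keys to values) before the topology is submitted. Keeping them in one place makes it easier to vary the buffer sizes and max.spout.pending together when comparing the 4-node and 16-node runs.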
