Hi Espen,

Kafka's throughput can be increased by adding more than one spout executor. Your Kafka spout parallelism should match the number of topic partitions, since partitions are the units of parallelism. In your bolt, are you acking each tuple, and how complex is the bolt's execute method? Does it make any external calls to other systems?

Thanks,
Harsha
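To make the partition/executor relationship concrete, here is a minimal illustration (plain Java, no Storm dependency; the method name is mine, not a Storm API): each Kafka partition is consumed by at most one spout executor, so executors beyond the partition count simply sit idle.

```java
public class SpoutParallelism {
    // Each Kafka partition is read by at most one spout executor,
    // so the effective read parallelism is the smaller of the two numbers.
    static int effectiveParallelism(int partitions, int executors) {
        return Math.min(partitions, executors);
    }

    public static void main(String[] args) {
        // 8 partitions, 4 executors: each executor reads 2 partitions.
        System.out.println(effectiveParallelism(8, 4));  // 4
        // 8 partitions, 12 executors: 4 executors have nothing to read.
        System.out.println(effectiveParallelism(8, 12)); // 8
    }
}
```

This is why adding spout executors past the partition count gives no extra throughput, while fewer executors than partitions leaves Kafka-side parallelism on the table.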
On March 25, 2015 at 12:45:01 PM, Espen Fjellvær Olsen ([email protected]) wrote:

Hi,

We are only just starting to play with Storm, and are a bit baffled by all the knobs and levers that can be tweaked for more or less throughput/latency.

We are using Kafka and the Storm KafkaSpout to pull in data that is somewhere between 500 and 700 bytes per message. With only one spout executor and one CountBolt executor (a thingie that counts and logs the number of tuples per second) we get about 60k tuples per second. This is a bit low, as we do not see any resource exhaustion on either the Kafka or the Storm side.

If we turn up the Kafka spout's fetchSize (a lot, 1000 times the default) we exhaust the network while the spout is filling its buffers; once the buffers are filled, the network speed drops to zero and the spout starts emitting tuples. So the overall tuples per second isn't as high as it could be, since the spout doesn't read data from Kafka and emit tuples at the same time. Anyway, 60k isn't that bad, and with more executors we will get more.

However, the real problem is that the throughput drops about 60% once we introduce our first bolt, and I cannot fathom why. The execute latency of the bolt is about 0.047, and the capacity is about 0.095, so from what I understand we have miles to go on capacity. CPU, disk and network resources are nowhere near their max levels.

If I turn off acking in Storm we see only about a 20% throughput drop, which is better, but still very bad considering the topology is basically idling. Increasing the number of ackers does not help. Raising topology.max.spout.pending (by many orders of magnitude) does not seem to have any impact on the total tps. Neither does fiddling with the various topology.executor.receive.buffer.size / topology.executor.send.buffer.size / topology.receiver.buffer.size / topology.transfer.buffer.size settings.

I feel like we have fiddled with all the obvious settings without seeing any noteworthy throughput increase.
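For reference, the knobs discussed in this thread are set when wiring the topology. Below is a minimal sketch, not a tested configuration: it assumes the storm-kafka module on the classpath, a topic with 8 partitions, and placeholder names ("events", "kafka-spout", "count-bolt", CountBolt, the ZooKeeper address). It needs a Storm cluster to actually run, so treat it as a config fragment.

```java
import backtype.storm.Config;
import backtype.storm.topology.TopologyBuilder;
import storm.kafka.BrokerHosts;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.ZkHosts;

// Hypothetical wiring for the setup described above.
BrokerHosts hosts = new ZkHosts("zk1:2181");
SpoutConfig spoutConfig = new SpoutConfig(hosts, "events", "/kafka-spout", "espen-topo");
// The fetchSize knob mentioned in the mail; 1 MB here, purely illustrative.
spoutConfig.fetchSizeBytes = 1024 * 1024;

TopologyBuilder builder = new TopologyBuilder();
// Spout parallelism hint set equal to the partition count (8 assumed).
builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig), 8);
builder.setBolt("count-bolt", new CountBolt(), 8)
       .shuffleGrouping("kafka-spout");

Config conf = new Config();
conf.setNumAckers(4);          // number of acker executors
conf.setMaxSpoutPending(5000); // caps un-acked tuples in flight per spout task
```

Note that topology.max.spout.pending only throttles when acking is on; with acking disabled it has no effect, which may explain why raising it changes nothing in some of the runs described.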
Hope someone can shed some light on what we should look into.

--
Best regards
Espen Fjellvær Olsen
[email protected]
