Hi Espen,
Kafka's throughput can be increased by adding more than one
executor. Your Kafka spout parallelism should match the number of topic
partitions, since the partitions are the units of parallelism. From your
bolt, are you acking each tuple? And how complex is the bolt's execute
method? Does it make any external calls to other systems?
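For example, something like this (a rough sketch against the 0.9.x-era storm-kafka API; the topic name, ZooKeeper host and bolt names are just placeholders):

```java
import backtype.storm.topology.TopologyBuilder;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.ZkHosts;

// Suppose the topic has 4 partitions: give the spout 4 executors,
// since each partition is consumed by at most one spout task.
SpoutConfig spoutConfig = new SpoutConfig(
        new ZkHosts("zk1:2181"), "my-topic", "/kafka-spout", "spout-id");

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig), 4);
builder.setBolt("count-bolt", new CountBolt(), 4)
       .shuffleGrouping("kafka-spout");

// And inside the bolt's execute(): ack every tuple, otherwise the
// spout's pending window fills up and tuples are replayed on timeout.
//
// public void execute(Tuple tuple) {
//     count.incrementAndGet();
//     collector.ack(tuple);
// }
```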
Thanks,
Harsha


On March 25, 2015 at 12:45:01 PM, Espen Fjellvær Olsen ([email protected]) wrote:

Hi,  

We are only just starting to play with Storm, and are a bit baffled by  
all the knobs and levers that can be tweaked for more or less  
throughput/latency.  

We are using Kafka and the Storm KafkaSpout to pull in messages that are  
somewhere between 500 and 700 bytes in size. With only one spout executor  
and one CountBolt executor (a thingie that counts and logs the number of  
tuples per second) we get about 60k tuples per second. This seems a bit  
low, as we do not see any resource exhaustion on either the Kafka or the  
Storm side. If we turn up the Kafka spout's fetchSize a lot (1000 times  
the default), we exhaust the network while the spout is filling its  
buffers; once the buffers are filled, the network speed drops to zero and  
the spout starts emitting tuples. So the overall tuples per second isn't  
as high as it could be, since the spout doesn't read data from Kafka and  
emit tuples at the same time.  
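(The knob I mean is, if I am reading the 0.9.x storm-kafka API right, the fetchSizeBytes field on the spout config; the numbers below are just to illustrate what "1000 times the default" means:)

```java
// Both fields default to 1 MB in the storm-kafka KafkaConfig.
spoutConfig.fetchSizeBytes = 1024 * 1024 * 1000;
spoutConfig.bufferSizeBytes = 1024 * 1024 * 1000;
```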

Anyway, 60k isn't that bad, and with more executors we will get more.  


However, the real problem is that the throughput drops by about 60% once  
we introduce our first bolt, and I cannot fathom why.  

Execute latency of the bolt is about 0.047, and the capacity is about  
0.095. From what I understand we have miles to go on the capacity.  
CPU, disk or network resources are nowhere near their max levels.  

If I turn off acking in Storm we see only about a 20% throughput drop,  
which is better, but still very bad considering the topology is  
basically idling. Increasing the number of ackers does not help.  
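(For reference, this is how we toggled acking, assuming I am using the Config API correctly:)

```java
Config conf = new Config();
conf.setNumAckers(0);   // disables tuple-tree tracking entirely
// conf.setNumAckers(4);  // what we also tried: more acker executors
```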

Raising topology.max.spout.pending (by many orders of magnitude) does not  
seem to have any impact on the total tps. Neither does fiddling with the  
various buffer settings:  
topology.executor.receive.buffer.size, topology.executor.send.buffer.size,  
topology.receiver.buffer.size and topology.transfer.buffer.size.  
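(Concretely, what we tried looks roughly like this; the values are examples, not our exact numbers:)

```java
Config conf = new Config();
conf.setMaxSpoutPending(1000000);  // bound on un-acked tuples per spout task
// The buffer sizes are in tuples; I believe the executor queues must be
// powers of two since they are backed by disruptor queues.
conf.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 16384);
conf.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 16384);
conf.put(Config.TOPOLOGY_RECEIVER_BUFFER_SIZE, 8);
conf.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 32);
```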


I feel like we have fiddled with all the obvious settings without seeing  
any noteworthy throughput increase.  

Hope someone can shed some light on what we should look into.  


--  
Best regards  
Espen Fjellvær Olsen  
[email protected]  

