Try the following:

- 1 worker per machine (to minimize inter-JVM messaging), and adjust worker.childopts so the worker takes as much memory as it can without bringing down the machine
- as many threads as you have available CPU cores, and no more (to avoid thread context switching)
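As a rough sketch, the topology-side settings could look something like this (Storm 0.9.x package names; the spout/bolt classes, parallelism numbers and the exact -Xmx value are placeholders, not taken from your setup):

    import backtype.storm.Config;
    import backtype.storm.StormSubmitter;
    import backtype.storm.topology.TopologyBuilder;

    public class TopologySubmit {
        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            // one spout executor per Kafka partition
            builder.setSpout("kafka-spout", new KafkaMessageSpout(), 30);
            // roughly one bolt executor per available core across the cluster
            builder.setBolt("dynamo-fetch", new DynamoFetchBolt(), 48)
                   .shuffleGrouping("kafka-spout");

            Config conf = new Config();
            conf.setNumWorkers(6);  // one worker per machine, 6 machines
            // give that single worker most of the box's RAM, leaving headroom for the OS and supervisor
            conf.put(Config.TOPOLOGY_WORKER_CHILDOPTS, "-Xmx10g");
            conf.setNumAckers(6);
            conf.setMessageTimeoutSecs(600);

            StormSubmitter.submitTopology("kafka-dynamo", conf, builder.createTopology());
        }
    }

On the supervisor side you would also cut supervisor.slots.ports in storm.yaml down to a single port per machine, so only one worker can be scheduled on each box.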
That should give you some reduction of the Storm-related overhead. Another thing you can try is batching: pack many messages from the spout into batches, so the bolts process a bulk of messages at a time instead of one by one (a rough sketch of such a batching bolt is at the bottom of this mail, below the quoted message).

Regards,
Javier

On Tue, Jul 7, 2015 at 2:47 AM, bhargav sarvepalli <[email protected]> wrote:
> Hi,
>
> I have a topology running on AWS. I use m3.xlarge machines with 15 GB RAM
> and 8 supervisors. My topology is simple, I read from:
> kafka spout -> [db o/p1] -> [db o/p2] -> [dynamo fetch] -> [dynamo write] -> kafka
>
> The db o/ps are conditional, with latency around 100 - 150 ms.
>
> But I have never been able to achieve a throughput of more than 300
> msgs/sec. What configuration changes need to be made so I can get a
> throughput of more than 3k msgs/sec?
>
> The dynamo fetch bolt execute latency is around 150 - 220 ms,
> and the dynamo read bolt execute latency is also around this number.
>
> There are four bolts with parallelism 90 each and one spout with
> parallelism 30 (30 Kafka partitions).
>
> Overall latency is greater than 4 secs.
>
> topology.message.timeout.secs: 600
> worker.childopts: "-Xmx5120m"
> no. of worker ports per machine: 2
> no. of workers: 6
> no. of threads: 414
> executor send buffer size: 16384
> executor receive buffer size: 16384
> transfer buffer size: 34
> no. of ackers: 24
>
> Thanks,
> Bhargav S.

--
Javier González Nicolini
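The batching idea mentioned above, as a minimal sketch (the bolt class, the writeBatchToDynamo helper and the batch size are placeholders, not real code from this thread; in practice you would also flush partial batches on a tick tuple so they do not sit in the buffer until the message timeout):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Tuple;

    public class BatchingDynamoWriteBolt extends BaseRichBolt {
        private static final int BATCH_SIZE = 25;  // DynamoDB BatchWriteItem accepts up to 25 items
        private OutputCollector collector;
        private List<Tuple> buffer;

        @Override
        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
            this.buffer = new ArrayList<Tuple>();
        }

        @Override
        public void execute(Tuple tuple) {
            buffer.add(tuple);
            if (buffer.size() >= BATCH_SIZE) {
                flush();
            }
        }

        private void flush() {
            writeBatchToDynamo(buffer);      // one bulk request instead of BATCH_SIZE single writes
            for (Tuple t : buffer) {
                collector.ack(t);            // ack only after the bulk write has succeeded
            }
            buffer.clear();
        }

        private void writeBatchToDynamo(List<Tuple> batch) {
            // placeholder: build and send a single BatchWriteItem request for the whole batch
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // nothing emitted downstream in this sketch
        }
    }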
