Hi Danijel - What sort of hardware are your Kafka brokers and Storm workers running on in your example that fetches 450k msgs/s from Kafka? (We're also running into a throughput problem, but we haven't yet run a simplified topology such as the one you mention as a benchmark. I'll email out our specs and stuff in a post to the list soon.)

-Cody

On Tue, Jun 24, 2014 at 11:13 AM, <[email protected]> wrote:

> Hi,
>
> Perhaps MySQL is the bottleneck; I'll try it. However, if some bolt is
> very busy, will Storm be slower to emit tuples? My messages are Avro
> records from Kafka, and each one is about 3 KB. What types of messages
> do you fetch from Kafka?
>
> Another important question: which kafka-storm do you use? I see so many
> different versions of it that I am confused. Can you share the Storm
> config of your topology and your kafkaSpout's config with me?
>
> Thank you very much!
>
> Best regards,
> James Fu
>
> Danijel Schiavuzzi <[email protected]> wrote on 2014/6/25 at 12:02 AM:
>
> Try to run the topology without the MySQL bolt to find out if that's
> the bottleneck. Do you update the database in batches? That's an
> essential optimization you should implement.
>
> With a two-node Storm cluster I can fetch 450 000 messages/s from
> Kafka, and that's with a Trident transactional topology (just the
> spout and a debug filter bolt). Kafka has two nodes with 4 partitions
> only. Basic Storm should be faster.
>
> On Jun 24, 2014 4:12 PM, <[email protected]> wrote:
> >
> > Hi all,
> >
> > I am facing a critical performance problem with my Storm topology.
> > I can only process 1000 tuples/sec from Kafka with kafkaSpout. I use
> > standard Storm (not Trident) for my topology, and my topology
> > information is as follows:
> >
> > [Machines]
> > I have 1 nimbus and 3 supervisors, each with a 2-core CPU, on GCE
> > (Google Compute Engine)
> > Number of workers: 12
> > Number of executors: 51
> >
> > [Topology]
> > Number of kafkaSpouts: 13 (fetching 13 topics from the Kafka brokers)
> > Number of bolts: 12 (5 of them are mysql-dumper bolts)
> >
> > KafkaSpout(topic) emits to boltA and boltB
> > boltA (parallelism=9): parses the Avro tuple from kafkaSpout
> > boltB (parallelism=1): counting only
> >
> > I found that boltA's capacity is sometimes 1 or above in the Storm
> > UI, and the execute latency of my 5 mysql-dumper bolts is more than
> > 300 ms (the other bolts are below 10 ms). In addition, the complete
> > latency of these kafkaSpouts is more than 2000 ms in the beginning,
> > but it drops to 1000 ms after a while.
> >
> > I found that this topology can only process 1000 tuples/s or less,
> > but my goal is to process 10000 tuples/s. Is anything wrong with my
> > topology config? Actually, my topology only does simple things like
> > counting and dumping to MySQL. Storm does not seem to perform as
> > well as claimed (a million tuples per second on a 10-node cluster).
> > Can anyone give me some suggestions?
> >
> > Thanks a lot.
> >
> > Best regards,
> > James

-- 
Cody A. Ray, LEED AP
[email protected]
215.501.7891
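
On the batching suggestion in the quoted thread: at roughly 300 ms per execute call, a single executor can finish at most about 3.3 calls per second (1 / 0.3 s), so each database round trip has to carry many rows to get anywhere near 10000 tuples/s. Below is a minimal sketch of the buffering idea only, not anyone's actual bolt. The names `BatchWriter`, `flush_fn` and `batch_size` are made up for illustration, and `flush_fn` stands in for whatever performs one multi-row write (for example `executemany` on a Python DB-API cursor, or `addBatch`/`executeBatch` on a JDBC `PreparedStatement`).

```python
from typing import Callable, List, Sequence, Tuple

Row = Tuple  # one row of column values, e.g. (key, count)


class BatchWriter:
    """Buffers rows and hands them to flush_fn in groups of batch_size.

    Hypothetical helper: flush_fn is assumed to write all rows it
    receives in a single database round trip.
    """

    def __init__(self, flush_fn: Callable[[Sequence[Row]], None],
                 batch_size: int = 500) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._flush_fn = flush_fn
        self._batch_size = batch_size
        self._buffer: List[Row] = []

    def add(self, row: Row) -> None:
        """Queue one row; write the batch out once it is full."""
        self._buffer.append(row)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Write out whatever is buffered; does nothing if empty."""
        if not self._buffer:
            return
        # Swap the buffer out first so a failing flush_fn cannot
        # cause the same rows to be appended to twice.
        batch, self._buffer = self._buffer, []
        self._flush_fn(batch)
```

In a real bolt the partial batch would also need to be flushed on a timer (Storm's tick tuples are one way to do that), and tuples should only be acked after their batch has been written, otherwise a failed write loses data silently.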
