Hi all,

I am facing a critical performance problem with my Storm topology: I can only process about 1000 tuples/sec from Kafka through KafkaSpout. I use standard Storm (not Trident) to build my topology, and its details are as follows:

[Machines]
I have 1 nimbus and 3 supervisors, each with a 2-core CPU, on GCE (Google Compute Engine).
Number of workers: 12
Number of executors: 51

[Topology]
Number of KafkaSpouts: 13 (fetching 13 topics from the Kafka brokers)
Number of bolts: 12 (5 of them are mysql-dumper bolts)
KafkaSpout (topic) emits to boltA and boltB:
boltA (parallelism=9): parses the Avro tuple from KafkaSpout
boltB (parallelism=1): counting only

I found that boltA's capacity is sometimes 1 or above in the Storm UI, and the execute latency of my 5 mysql-dumper bolts is more than 300 ms (the other bolts are below 10 ms). In addition, the complete latency of these KafkaSpouts is more than 2000 ms in the beginning, but it drops to 1000 ms after a while.

This topology can only process 1000 tuples/s or less, but my goal is to process 10000 tuples/s. Is there anything wrong with my topology config? My topology only does simple things like counting and dumping to MySQL. Storm does not seem to perform as well as it claims (millions of tuples per second on a 10-node cluster).

Can anyone give me some suggestions? Thanks a lot.

Best regards,
James
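
P.S. To make the numbers above concrete, here is a rough back-of-the-envelope sketch in plain Java (no Storm dependency). It assumes that each execute() call handles exactly one tuple and that an executor processes tuples one at a time, so one executor cannot exceed 1000 / (execute latency in ms) tuples per second. The class and method names are made up for illustration, and whether every tuple really reaches a mysql-dumper bolt depends on the topology, so please treat the result as a rough upper-bound estimate only.

```java
// Hypothetical helper, not part of Storm: estimates per-executor throughput
// from the execute latency shown in the Storm UI.
class CapacityEstimate {

    // Upper bound on tuples/s for one executor, ASSUMING one tuple per
    // execute() call and strictly sequential processing.
    static double maxTuplesPerSecond(double executeLatencyMs) {
        return 1000.0 / executeLatencyMs;
    }

    // Number of executors needed to sustain the target rate under the
    // same assumption.
    static long executorsNeeded(double targetTuplesPerSecond, double executeLatencyMs) {
        return (long) Math.ceil(targetTuplesPerSecond * executeLatencyMs / 1000.0);
    }

    public static void main(String[] args) {
        double mysqlLatencyMs = 300.0;  // mysql-dumper bolts, from the Storm UI
        double otherLatencyMs = 10.0;   // worst case reported for the other bolts
        double targetRate = 10000.0;    // my goal in tuples/s

        System.out.printf("mysql-dumper: %.2f tuples/s per executor, %d executors for the target%n",
                maxTuplesPerSecond(mysqlLatencyMs), executorsNeeded(targetRate, mysqlLatencyMs));
        System.out.printf("other bolts:  %.2f tuples/s per executor, %d executors for the target%n",
                maxTuplesPerSecond(otherLatencyMs), executorsNeeded(targetRate, otherLatencyMs));
    }
}
```

If these assumptions hold, a bolt with 300 ms execute latency is limited to a few tuples per second per executor. That would suggest the mysql-dumper bolts are where I should look first, for example by writing to MySQL in batches instead of one row per tuple.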
