Hi, I do indeed emit tuples from one bolt to another using Java serialization only. The tuple carries a class that stores some results to be sent to the next bolt, so all I did was add "implements Serializable" after the class name. Is there a simple way, or an example, showing how to use Kryo serialization for such a class? I read the Storm manual but did not really understand it. An example would be much appreciated. Thank you very much.

Best regards,
James Fu

Robert Turner <[email protected]> wrote on Thu, 2014/6/26 12:30 AM:

Serialisation across workers might be your problem. If you can use the "localOrShuffle" grouping and arrange that the number of spouts and bolts is a multiple of the number of workers, then this will minimise the serialisation across workers. If there is only one counting bolt for the topology, then tuples are serialised and sent to the worker with the single counting bolt. A better approach might be to have a single counting bolt per worker and aggregate those periodically.

Regards
Rob Turner.

On 24 June 2014 15:10, <[email protected]> wrote:

Hi all,
>
>I face a critical performance problem with my Storm topology. I can only
>process 1000 tuples/sec from Kafka through kafkaSpout. I use standard Storm
>for my topology (not Trident), and my topology information is as follows:
>[Machines]
>I have 1 nimbus and 3 supervisors, each with a 2-core CPU, on GCE (Google
>Compute Engine)
>Number of workers: 12
>Number of executors: 51
>[Topology]
>Number of kafkaSpout: 13 (fetch 13 topics from Kafka brokers)
>Number of Bolts: 12 (5 of them are mysql-dumper bolts)
>
>KafkaSpout(topic) emits to boltA and boltB
>boltA(parallelism=9): parse the Avro tuple from kafkaSpout
>boltB(parallelism=1): Counting number of bolt only
>
>I found that boltA's capacity is sometimes 1 or above in the Storm UI, and
>the execute latency of my 5 mysql-dumper bolts is more than 300ms (other bolts
>are less than 10ms). In addition, the complete latency of these kafkaSpouts is
>more than 2000ms in the beginning, but it drops to 1000ms after a while.
>
>I found this topology can only process 1000 tuples/s or less, but my goal is
>to process 10000 tuples/s. Is anything wrong with my topology config? Actually,
>my topology only does simple things like counting and dumping to MySQL. It
>seems Storm does not perform as well as claimed (millions of tuples per second
>on a 10-node cluster). Can anyone give me some suggestions?
>
>Thanks a lot.
>
>Best regards,
>James

--
Cheers
Rob.
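
For reference, Rob's suggestion of counting locally in each worker and aggregating periodically could be sketched roughly as below. This is only a minimal illustration in plain Java, not Storm code: the class name LocalCounter, the flushEvery threshold and the mergeInto helper are all hypothetical, and a real bolt would more likely flush on a timer than on a tuple count. The point is that only the flushed partial counts would need to be serialised and sent to the single aggregating bolt, instead of every tuple.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-worker counter: accumulates counts locally and emits them in batches. */
class LocalCounter {
    private final Map<String, Long> pending = new HashMap<>();
    private final int flushEvery;   // assumed threshold: flush after this many tuples
    private int seenSinceFlush = 0;

    LocalCounter(int flushEvery) {
        this.flushEvery = flushEvery;
    }

    /** Records one tuple. Returns the partial counts when a flush is due, else an empty map. */
    Map<String, Long> record(String topic) {
        pending.merge(topic, 1L, Long::sum);
        seenSinceFlush++;
        if (seenSinceFlush < flushEvery) {
            return Collections.emptyMap();
        }
        return flush();
    }

    /** Returns a copy of the counts accumulated so far and resets the local state. */
    Map<String, Long> flush() {
        Map<String, Long> out = new HashMap<>(pending);
        pending.clear();
        seenSinceFlush = 0;
        return out;
    }

    /** What the single aggregating step would do: add a partial count into the totals. */
    static void mergeInto(Map<String, Long> totals, Map<String, Long> partial) {
        partial.forEach((topic, n) -> totals.merge(topic, n, Long::sum));
    }
}
```

With flushEvery set to 1000, for example, the aggregating side would receive one small map per 1000 tuples from each worker rather than 1000 individual tuples.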
