RE: Storm's performance limits to 1000 tuples/sec

傅駿浩 Wed, 25 Jun 2014 22:22:26 -0700

Hi,
I do emit tuples from one bolt to another using plain Java serialization. The
tuple carries a class that holds results to be sent to the next bolt, so all I
did was add "implements Serializable" after the class name. Is there a simple
way, or an example, of using Kryo serialization for such a class? I read the
Storm manual but did not really understand it. An example would help a lot.
Thank you very much.
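
For context on why the choice of serializer matters, here is a minimal plain-Java sketch. It is not Storm or Kryo code, and the class ParseResult and its fields are hypothetical stand-ins for the result class described above. It compares the payload produced by default Java serialization with a hand-written encoding that writes only the field values, which is roughly the kind of saving a registered serializer aims for.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationSizeSketch {

    /** Hypothetical result class passed between bolts (an assumption). */
    static class ParseResult implements Serializable {
        private static final long serialVersionUID = 1L;
        final String topic;
        final long count;

        ParseResult(String topic, long count) {
            this.topic = topic;
            this.count = count;
        }
    }

    /** Default Java serialization: writes a stream header and class metadata too. */
    static byte[] javaSerialize(ParseResult r) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(r);
        }
        return bytes.toByteArray();
    }

    /** Field-only encoding: writes just the two field values. */
    static byte[] compactSerialize(ParseResult r) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(r.topic);   // 2-byte length prefix + UTF-8 bytes
            out.writeLong(r.count);  // 8 bytes
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        ParseResult r = new ParseResult("clicks", 42L);
        System.out.println("Java serialization:  " + javaSerialize(r).length + " bytes");
        System.out.println("Field-only encoding: " + compactSerialize(r).length + " bytes");
    }
}
```

In a Storm topology the usual route is to register the class with the topology configuration (for example through Config.registerSerialization) rather than to hand-write the encoding; the sketch only illustrates the size difference.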




Best regards,
James Fu
Robert Turner <[email protected]> wrote on 2014/6/26 (Thu) 12:30 AM:
 


Serialisation across workers might be your problem. If you can use the
"localOrShuffle" grouping and arrange for the number of spout and bolt
instances to be a multiple of the number of workers, this will minimise
serialisation across workers. If there is only one counting bolt for the
topology, then every tuple is serialised and sent to the worker hosting that
single counting bolt. A better approach might be to have one counting bolt per
worker and aggregate their counts periodically.
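
The per-worker counting idea can be sketched in plain Java as below. This is a simulation under stated assumptions, not Storm code: LocalCounter stands in for a counting bolt instance in each worker, GlobalCounter for a single downstream aggregator, and the flush is triggered every N tuples rather than on a timer. All class names and the simulate helper are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class PartialCountSketch {

    /** Stands in for the single aggregating bolt (hypothetical). */
    static class GlobalCounter {
        final Map<String, Long> totals = new HashMap<>();
        int messagesReceived = 0; // each merge models one cross-worker message

        void merge(Map<String, Long> partial) {
            messagesReceived++;
            for (Map.Entry<String, Long> e : partial.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
    }

    /** Stands in for one counting bolt instance inside one worker (hypothetical). */
    static class LocalCounter {
        private final Map<String, Long> counts = new HashMap<>();
        private final GlobalCounter downstream;
        private final int flushEvery;
        private int sinceFlush = 0;

        LocalCounter(GlobalCounter downstream, int flushEvery) {
            this.downstream = downstream;
            this.flushEvery = flushEvery;
        }

        /** Count locally; only forward a partial result every flushEvery tuples. */
        void execute(String topic) {
            counts.merge(topic, 1L, Long::sum);
            if (++sinceFlush >= flushEvery) {
                flush();
            }
        }

        void flush() {
            if (counts.isEmpty()) {
                return; // nothing new to report
            }
            downstream.merge(new HashMap<>(counts));
            counts.clear();
            sinceFlush = 0;
        }
    }

    /** Returns {total tuples counted, cross-worker messages sent}. */
    static long[] simulate(int workers, int tuples, int flushEvery) {
        GlobalCounter global = new GlobalCounter();
        LocalCounter[] local = new LocalCounter[workers];
        for (int i = 0; i < workers; i++) {
            local[i] = new LocalCounter(global, flushEvery);
        }
        for (int i = 0; i < tuples; i++) {
            local[i % workers].execute("topic-" + (i % 13));
        }
        for (LocalCounter c : local) {
            c.flush(); // drain whatever is left
        }
        long total = global.totals.values().stream().mapToLong(Long::longValue).sum();
        return new long[] {total, global.messagesReceived};
    }

    public static void main(String[] args) {
        long[] r = simulate(12, 120_000, 1000);
        System.out.println("tuples counted: " + r[0]);
        System.out.println("cross-worker messages: " + r[1]);
    }
}
```

With 12 local counters flushing every 1000 tuples, 120,000 tuples reach the aggregator as 120 messages instead of 120,000. In a real topology the local counters would be bolts subscribed with localOrShuffleGrouping, and the flush would more likely be driven by a timer than by a tuple count.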

Regards
   Rob Turner. 



On 24 June 2014 15:10, <[email protected]> wrote:

Hi all,
>
>I am facing a critical performance problem with my Storm topology. I can only
>process 1000 tuples/sec from Kafka through KafkaSpout. I use standard Storm
>(not Trident) to build my topology, and the topology information is as follows:
>[Machines]
>I have 1 nimbus and 3 supervisors, each with a 2-core CPU, on GCE (Google
>Compute Engine)
>Number of workers: 12
>Number of executors: 51
>[Topology]
>Number of kafkaSpouts: 13 (fetching 13 topics from the Kafka brokers)
>Number of bolts: 12 (5 of them are mysql-dumper bolts)
>
>KafkaSpout(topic) emits to boltA and boltB
>boltA(parallelism=9): parses the Avro tuple from kafkaSpout
>boltB(parallelism=1): counting only
>
>I found that boltA's capacity is sometimes 1 or above in the Storm UI, and the
>execute latency of my 5 mysql-dumper bolts is more than 300 ms (the other bolts
>are below 10 ms). In addition, the complete latency of these kafkaSpouts is
>more than 2000 ms in the beginning, but it drops to 1000 ms after a while.
>
>I found this topology can only process 1000 tuples/s or less, but my goal is
>to process 10000 tuples/s. Is anything wrong with my topology config? My
>topology only does simple things like counting and dumping to MySQL. Storm
>does not seem to perform as well as advertised (millions of tuples per second
>on a 10-node cluster). Can anyone give me some suggestions?
>
>Thanks a lot.
>
>Best regards,
>James


-- 

Cheers
   Rob.
