Hi Danijel -

What sort of hardware are your Kafka brokers and Storm workers running on
for the 400k msgs/s Kafka example? (We're also running into a throughput
problem, but we haven't yet benchmarked a simplified topology such as the
one you mention. I'll post our specs and other details to the list soon.)

-Cody


On Tue, Jun 24, 2014 at 11:13 AM, <[email protected]> wrote:

> Hi,
>
> Perhaps MySQL is the bottleneck; I'll try it. However, if some bolt is
> very busy, will Storm be slower to emit tuples? My messages are Avro
> records from Kafka, and each one is about 3 KB. What type of messages do
> you fetch from Kafka?
>
> Another important question: which kafka-storm do you use? I see so many
> different versions of it that I'm confused. Could you share the Storm
> config of your topology and your KafkaSpout config with me?
>
> Thank you very much!
>
>
> Best regards,
> James Fu
>
>
>
> Danijel Schiavuzzi <[email protected]> wrote on 2014/6/25 at 12:02 AM:
>
> Try to run the topology without the MySQL bolt to find out if that's the
> bottleneck. Do you update the database in batches?  That's an essential
> optimization you should implement.
>
> With a two-node Storm cluster I can fetch 450,000 messages/s from Kafka,
> and that's with a Trident transactional topology (just the spout and a
> debug filter bolt). Kafka has only two nodes with 4 partitions. Basic
> (non-Trident) Storm should be faster.
> On Jun 24, 2014 4:12 PM, <[email protected]> wrote:
> >
> > Hi all,
> >
> > I'm facing a critical performance problem with my Storm topology. I can
> > only process 1000 tuples/s from Kafka with KafkaSpout. I use standard
> > Storm (not Trident) for my topology, and its details are as follows:
> > [Machines]
> > I have 1 Nimbus and 3 supervisors, each with a 2-core CPU, on GCE
> > (Google Compute Engine)
> > Number of workers: 12
> > Number of executors: 51
> > [Topology]
> > Number of KafkaSpouts: 13 (fetching 13 topics from the Kafka brokers)
> > Number of bolts: 12 (5 of them are mysql-dumper bolts)
> >
> > KafkaSpout (one per topic) emits to boltA and boltB
> > boltA (parallelism=9): parses the Avro tuple from KafkaSpout
> > boltB (parallelism=1): counting only
> >
> > I found that boltA's capacity is sometimes 1 or above in the Storm UI,
> > and the execute latency of my 5 mysql-dumper bolts is more than 300 ms
> > (the other bolts are below 10 ms). In addition, the complete latency of
> > these KafkaSpouts is more than 2000 ms in the beginning, but it drops to
> > 1000 ms after a while.
> >
> > This topology can only process 1000 tuples/s or less, but my goal is
> > to process 10000 tuples/s. Is anything wrong with my topology config?
> > My topology only does simple things such as counting and dumping to
> > MySQL. Storm does not seem to perform as well as it claims (a million
> > tuples per second on a 10-node cluster). Can anyone give me some
> > suggestions?
> >
> > Thanks a lot.
> >
> > Best regards,
> > James
>
>
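
A note on the batching advice quoted above: the idea is to buffer rows and
write them in one statement and one commit per batch, rather than one round
trip per tuple. Below is a minimal sketch of that pattern, not the code from
either topology in this thread. It uses Python's built-in sqlite3 module as a
stand-in for MySQL, and the BatchWriter class, the events table, and the batch
size of 500 are all hypothetical names and values chosen for illustration.

```python
import sqlite3


class BatchWriter:
    """Buffers rows and writes each full batch with one executemany() call."""

    def __init__(self, conn, batch_size=500):
        self.conn = conn
        self.batch_size = batch_size  # hypothetical value; tune for your load
        self.pending = []
        self.flushes = 0  # number of batch writes performed so far

    def add(self, row):
        # Called once per incoming tuple; only touches the database
        # when the buffer is full.
        self.pending.append(row)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        with self.conn:  # one transaction, hence one commit, per batch
            self.conn.executemany(
                "INSERT INTO events (topic, payload) VALUES (?, ?)",
                self.pending,
            )
        self.pending = []
        self.flushes += 1


# In-memory SQLite database standing in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (topic TEXT, payload TEXT)")

writer = BatchWriter(conn, batch_size=500)
for i in range(1200):
    writer.add(("topic-%d" % (i % 13), "msg-%d" % i))
writer.flush()  # write the final partial batch

row_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Here 1200 rows reach the table in 3 batch writes instead of 1200 single
inserts. In a real bolt the buffer would presumably also need a time-based
flush so that rows do not wait indefinitely on a quiet topic, and tuples
should only be acked once their batch has been written.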


-- 
Cody A. Ray, LEED AP
[email protected]
215.501.7891
