[Question] Storm leads to many fails and large latency with Kafka

傅駿浩 Tue, 17 Jun 2014 19:28:17 -0700

Hi all,
My environment and configuration are as follows:
[Machines] 1 nimbus and 3 supervisors on AWS m1.medium instances
[Topology] 4 spouts (one per Kafka topic, each with parallelism hint 2) and 10 
bolts
[Topology] 6 workers, 34 executors, 34 tasks


My first bolt (parallelism hint 5) parses data from the spout, and its capacity 
is often over 1.0. My concerns are as follows:

1. Using the tick tuple feature to write my results into a MySQL database:
    if (TupleHelpers.isTickTuple(tuple)) {
        // On each tick, emit the accumulated result to the next bolt (unanchored)
        collector.emit(new Values(result));
    } else {
        // Store the result in memory, then ack the input tuple
        collector.ack(tuple);
    }
I set TOPOLOGY_TICK_TUPLE_FREQ_SECS to 30 seconds. Is it correct to emit 
unanchored here, so that the emitted tuple is not tracked? I'm afraid something 
is wrong here.

2. Is one KafkaSpout per topic a bad approach?
Eventually I will use 12 topics, so my topology will have 12 spouts. Is one 
spout per topic a good design?

3. My topology is slow.
One of my bolts is connected directly to a spout and counts the number of 
tuples it receives. I found that it processes only 300~400 tuples/sec. What's 
wrong with my topology?
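
For reference, the capacity value shown in the Storm UI is roughly the fraction of the measurement window (the last 10 minutes) that an executor spent executing tuples: executed count times average execute latency, divided by the window length. A minimal sketch of that arithmetic follows. The executed count and execute latency are made-up illustrative values, not measurements from this topology, and the class and method names are hypothetical.

```java
// Sketch only: the input numbers are assumptions chosen for illustration.
final class CapacityEstimate {

    /** capacity = executed tuples * average execute latency (ms) / window (ms). */
    static double capacity(long executed, double executeLatencyMs, long windowMs) {
        return executed * executeLatencyMs / windowMs;
    }

    public static void main(String[] args) {
        long windowMs = 10L * 60 * 1000;   // 10-minute window used by the UI
        long executed = 210_000;           // assumed: about 350 tuples/sec for 10 minutes
        double executeLatencyMs = 3.0;     // assumed average execute latency per tuple

        double c = capacity(executed, executeLatencyMs, windowMs);
        // A value near or above 1.0 means the executor is busy almost all the time.
        System.out.println("capacity = " + c);
    }
}
```

A capacity above 1.0 suggests the bolt cannot keep up with its input, so either more executors or a lower per-tuple execute latency would be needed.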

[Storm UI]
Right after the topology starts, the complete latency is over 30000 ms, and 
there are many failed tuples under "spouts" but none under "bolts". Can anyone 
give me some advice on how to speed up my topology?
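
As a rough check on these numbers: Storm's default topology.message.timeout.secs is 30, and a tuple tree that is not fully acked within that time is failed at the spout, which would be consistent with failures appearing under "spouts" only. The sketch below applies Little's law (in-flight tuples = throughput times latency) to the figures quoted above. The 31000 ms latency is an assumed value standing in for "over 30000 ms", and the class and method names are hypothetical.

```java
// Sketch only: back-of-the-envelope arithmetic, not a Storm API call.
final class PendingEstimate {

    /** Little's law: tuples in flight = throughput (tuples/sec) * complete latency (sec). */
    static long inFlight(double tuplesPerSec, double completeLatencyMs) {
        return Math.round(tuplesPerSec * completeLatencyMs / 1000.0);
    }

    /** True if the complete latency is longer than the message timeout. */
    static boolean exceedsTimeout(double completeLatencyMs, int messageTimeoutSecs) {
        return completeLatencyMs > messageTimeoutSecs * 1000.0;
    }

    public static void main(String[] args) {
        double tuplesPerSec = 350.0;        // midpoint of the observed 300~400 tuples/sec
        double completeLatencyMs = 31000.0; // assumed value for "over 30000 ms"
        int messageTimeoutSecs = 30;        // Storm's default message timeout

        System.out.println("tuples in flight: " + inFlight(tuplesPerSec, completeLatencyMs));
        System.out.println("timed out at spout: " + exceedsTimeout(completeLatencyMs, messageTimeoutSecs));
    }
}
```

If the in-flight estimate is large, capping pending tuples per spout task with topology.max.spout.pending is one commonly used way to keep the complete latency under the timeout.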

Best regards,
James
