I have this issue with any amount of load. Different values of max spout pending do not seem to make much of a difference. I've lowered this parameter to 100 and still see little difference. At this point the bolt consuming the data does no processing.
On Wed, Feb 4, 2015 at 3:26 PM, Haralds Ulmanis <[email protected]> wrote:

> I'm not sure that I understand your problem, but here are a few points:
> If you have a large pending spout size and slow processing, you will
> probably see large latency at the kafka spout. The spout emits a message,
> it stays in the queue for a long time (which adds latency), and finally it
> is processed and the ack is received. You will see queue time + processing
> time in the kafka spout latency.
> Take a look at the load factors of your bolts: are they close to 1 or more?
> Also check the load factor of the kafka spout.
>
> On 4 February 2015 at 21:19, Andrey Yegorov <[email protected]> wrote:
>
>> Have you tried increasing the max spout pending parameter for the spout?
>>
>> builder.setSpout("kafka",
>>                  new KafkaSpout(spoutConfig),
>>                  TOPOLOGY_NUM_TASKS_KAFKA_SPOUT)
>>        .setNumTasks(TOPOLOGY_NUM_TASKS_KAFKA_SPOUT)
>>        // the maximum parallelism you can have on a KafkaSpout is the
>>        // number of partitions
>>        .setMaxSpoutPending(TOPOLOGY_MAX_SPOUT_PENDING);
>>
>> ----------
>> Andrey Yegorov
>>
>> On Tue, Feb 3, 2015 at 4:03 AM, clay teahouse <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> In my topology, the kafka spout is responsible for over 85% of the
>>> latency. I have tried different spout max pending values and played with
>>> the buffer size and fetch size, still no luck. Any hints on how to
>>> optimize the spout? The issue doesn't seem to be on the kafka side, as I
>>> see high throughput with the simple kafka consumer.
>>>
>>> Thank you for your feedback,
>>> Clay
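For reference, here is a minimal sketch of where the knobs discussed in this thread (max spout pending, fetch size, buffer size) live in storm-kafka. The ZooKeeper address, topic name, parallelism, and literal values are placeholders, not the poster's actual settings:

```java
import backtype.storm.topology.TopologyBuilder;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.ZkHosts;

public class KafkaSpoutTuningSketch {
    public static TopologyBuilder build() {
        // Assumed ZooKeeper ensemble, topic, ZK root, and consumer id.
        ZkHosts hosts = new ZkHosts("zk1:2181");
        SpoutConfig spoutConfig =
                new SpoutConfig(hosts, "events", "/kafka-spout", "spout-id");

        // The fetch/buffer knobs mentioned in the original question.
        spoutConfig.fetchSizeBytes  = 1024 * 1024; // bytes pulled per fetch request
        spoutConfig.bufferSizeBytes = 1024 * 1024; // SimpleConsumer socket buffer

        TopologyBuilder builder = new TopologyBuilder();
        // Spout parallelism should not exceed the number of topic partitions;
        // setMaxSpoutPending caps un-acked tuples per spout task.
        builder.setSpout("kafka", new KafkaSpout(spoutConfig), 4)
               .setNumTasks(4)
               .setMaxSpoutPending(100);
        return builder;
    }
}
```

Note that with acking enabled, the "complete latency" reported for the spout includes the time a tuple spends queued behind slow bolts, which is why a large max spout pending can inflate the spout's apparent latency without the spout itself being slow.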
