Ah, thanks, figured it out now. 

What kind of bottlenecks should I expect to run into if I'm looking at tens of 
thousands of partitions for a topic? The amount of data passing through each 
partition, and in aggregate, is fairly small (a few hundred GB per day across 
all partitions). The high partition count is because it simplifies application 
semantics.

----- Original Message -----
From: balaji.sesha...@dish.com
To: Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN), users@kafka.apache.org
At: Apr  8 2014 14:08:41

I think you are looking to read messages from a set of partitions according to 
your own policy. You should use simple consumers in 0.8 and maintain the 
offsets you have read yourself.

https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
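
For what it's worth, here is a rough sketch of what such a policy could look 
like. It is independent of the Kafka client API. It only shows one possible 
priority-weighted round robin over partition ids, plus the offset bookkeeping 
you would have to do yourself. The names weighted_round_robin, poll and fetch 
are made up for illustration; fetch stands in for whatever call you use to 
read from one partition at a given offset. It also assumes offsets advance by 
one per message, which is a simplification.

```python
from typing import Callable, Dict, Iterator, List, Tuple


def weighted_round_robin(weights: Dict[int, int]) -> Iterator[int]:
    """Yield partition ids forever, each in proportion to its weight.

    "Smooth" weighted round robin: on every step each partition's running
    score grows by its weight, the highest score wins, and the total weight
    is subtracted from the winner's score.
    """
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be non-empty and positive")
    total = sum(weights.values())
    current = {p: 0 for p in weights}
    while True:
        for p, w in weights.items():
            current[p] += w
        # Highest score wins; ties go to the lowest partition id.
        chosen = max(current, key=lambda p: (current[p], -p))
        current[chosen] -= total
        yield chosen


def poll(
    weights: Dict[int, int],
    fetch: Callable[[int, int], List[str]],  # hypothetical: (partition, offset) -> messages
    offsets: Dict[int, int],
    steps: int,
) -> List[Tuple[int, str]]:
    """Do `steps` reads, picking the partition by weight and tracking offsets."""
    schedule = weighted_round_robin(weights)
    out: List[Tuple[int, str]] = []
    for _ in range(steps):
        partition = next(schedule)
        start = offsets.get(partition, 0)
        messages = fetch(partition, start)
        # Assumption: offsets are dense, so the next offset is start + count.
        offsets[partition] = start + len(messages)
        out.extend((partition, m) for m in messages)
    return out
```

Each pick scans every partition, so with around 10K partitions a heap or a 
precomputed schedule would probably be a better fit than this loop.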

If it is 0.9, I have yet to come up to speed on it.

Thanks,

Balaji


-----Original Message-----
From: Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) [mailto:skada...@bloomberg.net] 
Sent: Tuesday, April 08, 2014 11:58 AM
To: users@kafka.apache.org
Subject: Single thread, Multiple partitions

Let's say I have a single consumer thread reading off multiple partitions (I'll 
have around 10K partitions). As per the documentation on 
https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example, there 
are no guarantees on the order in which messages are read off the set of 
partitions. If I wanted to enforce priority-weighted round robin reads off the 
partitions, could I get a pointer on what code to fiddle with? Thanks!

