Re: Questions about new consumer API

2015-11-18 Thread hsy...@gmail.com
That sounds like a good suggestion. I'm actually looking at the code and I will start another thread for questions about that. On Tue, Nov 17, 2015 at 5:42 PM, Jason Gustafson wrote: > Thanks for the explanation. Certainly you'd use fewer connections with this > approach, but

Re: Questions about new consumer API

2015-11-17 Thread hsy...@gmail.com
Thanks Guozhang. Maybe I should say a few words about what I'm trying to achieve with the new API. Currently, I'm building a new Kafka connector for Apache Apex (http://apex.incubator.apache.org/) using the 0.9.0 API. Apex supports dynamic partitioning, so in the old version, we manage all the consumer

Re: Questions about new consumer API

2015-11-17 Thread Jason Gustafson
Hi Siyuan, Your understanding about assign/subscribe is correct. We think of topic subscription as enabling automatic assignment as opposed to doing manual assignment through assign(). We don't currently allow them to be mixed. Can you elaborate on your findings with respect to using one thread per

Re: Questions about new consumer API

2015-11-17 Thread hsy...@gmail.com
By efficiency, I mean maximizing throughput while minimizing resources on both the broker side and the consumer side. One example: if you have over 200 partitions on 10 brokers and you can start 5 consumer processes to consume data, if each one is single-threaded and you do round-robin to distribute the
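To make the scenario in this message concrete, here is a minimal sketch of a round-robin split of partitions across consumer processes. It is a toy model only and does not use any Kafka client API. The function name `round_robin` is hypothetical, and the partition count is fixed at exactly 200 for illustration (the message says "over 200").

```python
# Toy model: distribute partition ids across consumers round-robin.
# Illustrative only; this is not how any Kafka assignor is implemented.

NUM_PARTITIONS = 200  # assumption: the message says "over 200"
NUM_CONSUMERS = 5     # from the message: 5 consumer processes


def round_robin(num_partitions, num_consumers):
    """Return {consumer_id: [partition ids]} using a round-robin split."""
    assignment = {c: [] for c in range(num_consumers)}
    for p in range(num_partitions):
        assignment[p % num_consumers].append(p)
    return assignment


assignment = round_robin(NUM_PARTITIONS, NUM_CONSUMERS)

# Number of partitions each single-threaded consumer would have to serve.
sizes = {c: len(parts) for c, parts in assignment.items()}
print(sizes)
```

Under these assumptions each consumer process ends up responsible for 40 partitions, which is the per-process load the efficiency question is about.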

Re: Questions about new consumer API

2015-11-17 Thread Jason Gustafson
Thanks for the explanation. Certainly you'd use fewer connections with this approach, but it might be worthwhile to do some performance analysis to see whether there is much difference in throughput (I'd be interested in seeing these results myself). Another approach that might be interesting would

Questions about new consumer API

2015-11-16 Thread hsy...@gmail.com
The new consumer API looks good. If I understand it correctly, you can use it like the simple consumer or the high-level consumer. But I have a couple of questions about its internal implementation. First of all, does the consumer have any internal fetcher threads like the high-level consumer? When you assign

Re: Questions about new consumer API

2015-11-16 Thread Guozhang Wang
Hi Siyuan, 1) The new consumer is single-threaded; it does not maintain any internal threads as the old high-level consumer does. 2) Each consumer will only maintain one TCP connection with each broker. The only extra socket is the one with its coordinator. That is, if there are three brokers S1, S2, S3,

Questions about new consumer API

2014-12-02 Thread hsy...@gmail.com
Hi guys, I'm interested in the new Consumer API. http://people.apache.org/~nehanarkhede/kafka-0.9-consumer-javadoc/doc/ I have a couple of questions. 1. In this doc it says the Kafka consumer will automatically do load balancing. Is it based on throughput or same as what we have now balance the

Re: Questions about new consumer API

2014-12-02 Thread Neha Narkhede
1. In this doc it says the Kafka consumer will automatically do load balancing. Is it based on throughput or same as what we have now balance the cardinality among all consumers in same ConsumerGroup? In a real case different partitions could have different peak time. Load balancing is still based on

Re: Questions about new consumer API

2014-12-02 Thread hsy...@gmail.com
Thanks Neha, another question: if offsets are stored under group.id, does it mean that in one group, there should be at most one subscriber for each topic partition? Best, Siyuan On Tue, Dec 2, 2014 at 12:55 PM, Neha Narkhede neha.narkh...@gmail.com wrote: 1. In this doc it says kafka consumer

Re: Questions about new consumer API

2014-12-02 Thread Neha Narkhede
The offsets are keyed on group, topic, partition so if you have more than one owner per partition, they will rewrite each other's offsets and lead to incorrect state. On Tue, Dec 2, 2014 at 2:32 PM, hsy...@gmail.com hsy...@gmail.com wrote: Thanks Neha, another question, so if offsets are stored
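The point about offsets being keyed on group, topic, and partition can be sketched with a small in-memory model. This is a toy illustration, not Kafka's offset storage; the class `OffsetStore`, its methods, and the group and topic names are all hypothetical.

```python
# Toy model of an offset store keyed on (group, topic, partition).
# Hypothetical names throughout; this is not Kafka's implementation.


class OffsetStore:
    def __init__(self):
        self._offsets = {}

    def commit(self, group, topic, partition, offset):
        # The last write for a given key wins.
        self._offsets[(group, topic, partition)] = offset

    def committed(self, group, topic, partition):
        return self._offsets.get((group, topic, partition))


store = OffsetStore()

# Two owners of the same partition inside the same group share one key.
store.commit("g1", "events", 0, 500)  # owner A has processed up to 500
store.commit("g1", "events", 0, 120)  # owner B is only at 120
shared = store.committed("g1", "events", 0)

# A different group uses a different key, so its progress is independent.
store.commit("g2", "events", 0, 500)
independent = store.committed("g2", "events", 0)

print(shared, independent)
```

In this model the second commit in group "g1" replaces the first, so owner A's progress is lost, while group "g2" keeps its own offset. That is the incorrect state the reply warns about when a partition has more than one owner within a group.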