Hi Guozhang,
Thank you!
Could I say the consumer 'take turns to consume' is resulted by the correspond
partition got the 'message write'?
The problem I am facing is my 'enrichment' (getting more data based on raw
data) consumer took too much time to complete one message consumption. To
explore more parallel, could I say my only choice is 'decouple consumer
consumption with enrichment'?
Mingtao Sent from iPhone
> On Aug 12, 2014, at 1:10 AM, Guozhang Wang <wangg...@gmail.com> wrote:
>
> Hello Mingtao,
>
> The partition will not be re-assigned to other consumers unless the current
> consumer fails, so the behavior you described will not be expected.
>
> Guozhang
>
>
> On Mon, Aug 11, 2014 at 6:27 PM, Mingtao Zhang <mail2ming...@gmail.com>
> wrote:
>
>> Hi Guozhang,
>>
>> I do have another Email talking about Partitions per topic. I paste it
>> within this Email.
>>
>> I am expecting those consumers will work concurrently. The behavior I
>> observed here is consumer thread-1 will work a while, then thread-3 will
>> work, then thread-0 ..., is it normal?
>>
>> version is 2.2.0.
>>
>> Best Regards,
>> Mingtao
>>
>>> On Wed, Jul 23, 2014 at 7:57 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>>>
>>> num.partitions is only used as a default value when the createTopic
>> command
>>> does not specify the num.partitions or it is automatically created. In
>> your
>>> case since you always use its value in the createTopic you will always
>> can
>>> one partition. Try change your code to sth. like:
>>>
>>> String[] args = new String[]{
>>> "--zookeeper", config.getString("zookeeper"),
>>> "--topic", config.getString("topic"),
>>> "--replica", config.getString("replicas"),
>>> "--partition", "8"
>>> };
>>>
>>> CreateTopicCommand.main(args);
>>>
>>>
>>>
>>> On Wed, Jul 23, 2014 at 4:38 PM, Mingtao Zhang <mail2ming...@gmail.com>
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> In kafka.properties, I put (forgot to change):
>>>>
>>>> num.partitions=1
>>>>
>>>> While I create topics programatically:
>>>>
>>>> String[] args = new String[]{
>>>> "--zookeeper", config.getString("zookeeper"),
>>>> "--topic", config.getString("topic"),
>>>> "--replica", config.getString("replicas"),
>>>> "--partition", config.getString("partitions")
>>>> };
>>>>
>>>> CreateTopicCommand.main(args);
>>>>
>>>> The performance engineer told me only one consumer thread is actively
>>>> working even I have 4 consumer threads started (could see when
>> debugging
>>> or
>>>> in thread dump); and 4 partitions configured from the args.
>>>>
>>>> It seems that num.partitions is still controlling the parallelism. Do I
>>>> need to change this num.partitions accordingly? Could I remove it? What
>>> is
>>>> I have different parallel requirement for different topic?
>>>>
>>>> Thank you in advance!
>>>>
>>>> Best Regards,
>>>> Mingtao
>>
>>
>>> On Mon, Aug 11, 2014 at 7:37 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>>>
>>> Mingtao,
>>>
>>> How many partitions of the consumed topic has? Basically the data is
>>> distributed per-partition, and hence if the number of consumers is larger
>>> than the number of partitions, some consumers will not get any data.
>>>
>>> Guozhang
>>>
>>>
>>> On Mon, Aug 11, 2014 at 3:29 PM, Mingtao Zhang <mail2ming...@gmail.com>
>>> wrote:
>>>
>>>> Is it anyhow related to the issue?
>>>>
>>>> WARN No previously checkpointed highwatermark value found for topic RAW
>>>> partition 0. Returning 0 as the highwatermark
>>>> (kafka.server.HighwaterMarkCheckpoint)
>>>>
>>>> Mingtao
>>>
>>>
>>>
>>> --
>>> -- Guozhang
>
>
>
> --
> -- Guozhang