morningman opened a new pull request #870: Optimize the consumer assignment of 
Kafka routine load job
URL: https://github.com/apache/incubator-doris/pull/870
 
 
   1. Use a data consumer group so that multiple data consumers share a single
   stream load pipe. The goal is to increase the consuming speed of Kafka
   messages and to reduce the number of tasks in a routine load job.
   Unfortunately, testing shows that 3 consumers consuming 3 partitions achieve
   the same consuming rate as 1 consumer consuming 3 partitions. The bottleneck
   is fetching messages from Kafka. I don't know why yet, so I added a Backend
   config `max_consumer_num_per_group` to set the number of consumers in a data
   consumer group. Its default value is 1.
   
       * 1 of 3 consumers (consume cost is the time spent in calls to
   `consumer->consume()`):
   total cost(ms): 20165, consume cost(ms): 18665, received rows: 601807
       * 1 of 1 consumers:
   total cost(ms): 20051, consume cost(ms): 17259, received rows: 1686118
   
   2. Add OFFSET_BEGINNING and OFFSET_END support for Kafka routine load 
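
   To illustrate the role of `max_consumer_num_per_group` from item 1, here is
   a minimal sketch of one plausible way to spread partitions over the
   consumers of a group. The function name `assign_partitions` and the
   round-robin policy are assumptions made for illustration; the PR text does
   not describe the actual assignment algorithm.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not taken from the PR: spread partition ids
// round-robin over at most max_consumer_num_per_group consumers.
std::vector<std::vector<int32_t>> assign_partitions(
        const std::vector<int32_t>& partitions, int max_consumer_num_per_group) {
    assert(max_consumer_num_per_group >= 1);
    // Never create more consumers than there are partitions.
    const std::size_t consumer_num = std::min<std::size_t>(
            partitions.size(),
            static_cast<std::size_t>(max_consumer_num_per_group));
    std::vector<std::vector<int32_t>> assignment(consumer_num);
    for (std::size_t i = 0; i < partitions.size(); ++i) {
        // Partition i goes to consumer (i mod consumer_num).
        assignment[i % consumer_num].push_back(partitions[i]);
    }
    return assignment;
}
```

   With the default value of 1, every partition lands on a single consumer,
   which matches the "1 of 1 consumers" measurement in item 1.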
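
   The two measurements in item 1 are easier to compare as rates. The sketch
   below only does the arithmetic on the logged numbers; the helper names are
   made up for illustration. Note that only one of the three consumers was
   logged, so any group-wide figure assumes the other two behaved similarly.

```cpp
#include <cassert>
#include <cstdint>

// Rows per second, given a row count and an elapsed time in milliseconds.
double rows_per_second(int64_t rows, int64_t total_cost_ms) {
    assert(total_cost_ms > 0);
    return static_cast<double>(rows) * 1000.0 /
           static_cast<double>(total_cost_ms);
}

// Fraction of the total time spent inside consume calls.
double consume_fraction(int64_t consume_cost_ms, int64_t total_cost_ms) {
    assert(total_cost_ms > 0);
    return static_cast<double>(consume_cost_ms) /
           static_cast<double>(total_cost_ms);
}
```

   `rows_per_second(601807, 20165)` gives roughly 29,800 rows/s for the one
   logged consumer of three, so about 89,500 rows/s for the group under the
   assumption above, against roughly 84,100 rows/s from
   `rows_per_second(1686118, 20051)` for the single consumer. In both runs
   `consume_fraction` is above 0.85, which is consistent with fetching from
   Kafka being the bottleneck.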
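
   For item 2, a minimal sketch of how the two symbolic offsets could be
   mapped to numeric values. The constants mirror librdkafka's
   `RD_KAFKA_OFFSET_BEGINNING` (-2) and `RD_KAFKA_OFFSET_END` (-1); the
   function `parse_kafka_offset` and the idea that users pass the offset as a
   string are assumptions, not something stated in the PR.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <string>

// Same values as librdkafka's RD_KAFKA_OFFSET_BEGINNING / RD_KAFKA_OFFSET_END.
constexpr int64_t kOffsetBeginning = -2;
constexpr int64_t kOffsetEnd = -1;

// Hypothetical parser: accepts "OFFSET_BEGINNING", "OFFSET_END", or a
// non-negative decimal offset. Returns false on any other input.
bool parse_kafka_offset(const std::string& text, int64_t* offset) {
    assert(offset != nullptr);
    if (text == "OFFSET_BEGINNING") {
        *offset = kOffsetBeginning;
        return true;
    }
    if (text == "OFFSET_END") {
        *offset = kOffsetEnd;
        return true;
    }
    if (text.empty()) {
        return false;
    }
    for (char c : text) {
        if (c < '0' || c > '9') {
            return false;  // reject signs, spaces, and other characters
        }
    }
    errno = 0;
    const long long value = std::strtoll(text.c_str(), nullptr, 10);
    if (errno == ERANGE) {
        return false;  // does not fit in a 64-bit signed integer
    }
    *offset = static_cast<int64_t>(value);
    return true;
}
```

   Rejecting negative numeric input keeps the two sentinel values reachable
   only through their names, so a typo such as `-1` cannot silently mean
   "end of partition".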

