morningman opened a new pull request #870: Optimize the consumer assignment of Kafka routine load job
URL: https://github.com/apache/incubator-doris/pull/870

1. Use a data consumer group so that multiple data consumers share a single stream load pipe. This is intended to increase the consuming speed of Kafka messages and to reduce the number of tasks per routine load job.

   Unfortunately, testing shows that 3 consumers consuming 3 partitions achieve the same consuming rate as 1 consumer consuming 3 partitions. The bottleneck is in fetching messages from Kafka, and I don't know why yet. So I added a Backend config, `max_consumer_num_per_group`, to control the number of consumers in a data consumer group. Its default value is 1.

   Test results (consume cost is the time spent in calls to `consumer->consume()`):

   * 1 of 3 consumers: total cost(ms): 20165, consume cost(ms): 18665, received rows: 601807
   * 1 of 1 consumers: total cost(ms): 20051, consume cost(ms): 17259, received rows: 1686118

2. Add OFFSET_BEGINNING and OFFSET_END support for Kafka routine load.
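To illustrate how `max_consumer_num_per_group` from item 1 could bound the size of a data consumer group, here is a minimal Python sketch. It is not the Backend code in this PR: the function name `assign_partitions` and the round-robin distribution are assumptions made for illustration, and only the config name and its default of 1 come from the description above.

```python
def assign_partitions(partition_ids, max_consumer_num_per_group=1):
    """Distribute Kafka partition ids across the consumers of one group.

    Hypothetical helper: the round-robin policy is an assumption, not
    necessarily what the Backend does.
    """
    if max_consumer_num_per_group < 1:
        raise ValueError("max_consumer_num_per_group must be >= 1")
    if not partition_ids:
        return []
    # Never create more consumers than there are partitions to read.
    consumer_num = min(max_consumer_num_per_group, len(partition_ids))
    assignment = [[] for _ in range(consumer_num)]
    for i, partition_id in enumerate(partition_ids):
        # Consumer i % consumer_num takes every consumer_num-th partition.
        assignment[i % consumer_num].append(partition_id)
    return assignment


if __name__ == "__main__":
    # Default of 1: a single consumer reads all three partitions.
    print(assign_partitions([0, 1, 2]))
    # Raised to 3: one consumer per partition, all feeding the same pipe.
    print(assign_partitions([0, 1, 2], max_consumer_num_per_group=3))
```

With the default value the whole group collapses to one consumer, which matches the "1 of 1 consumers" test case. Setting the value to 3 for 3 partitions corresponds to the "1 of 3 consumers" case.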
