[jira] [Updated] (KAFKA-7132) Consider adding multithreaded form of rebalancing

2018-07-04 Thread Richard Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Yu updated KAFKA-7132:
--
Description: Currently, when a consumer falls out of a consumer group, it 
restarts processing from the last checkpointed offset. However, this design 
can cause lag that some users cannot afford. For example, let's say a consumer 
crashed at offset 100, with the last checkpointed offset at 70. When it 
recovers at a later offset (say, 120), it will be behind by 50 offsets 
(120 - 70). This is because the consumer restarted at 70, forcing it to 
reprocess old data. To prevent this, one option would be to let the current 
consumer start processing not from the last checkpointed offset (70 in the 
example) but from 120, where it recovers. Meanwhile, a new KafkaConsumer would 
be instantiated to read from offset 70 concurrently with the old process, and 
would be terminated once it reaches 120. In this manner, a considerable amount 
of lag can be avoided, particularly since the old consumer could proceed as if 
nothing had happened.   (was: Currently, when a consumer 
falls out of a consumer group, it restarts processing from the last 
checkpointed offset. However, this design can cause lag that some users cannot 
afford. For example, let's say a consumer crashed at offset 100, with the last 
checkpointed offset at 70. When it recovers at a later offset (say, 120), it 
will be behind by 50 offsets (120 - 70). This is because it restarted at 70, 
forcing it to reprocess old data. To prevent this, one option would be to let 
the current consumer start processing not from the last checkpointed offset 
(70 in the example) but from 120, where it recovers. Meanwhile, a new 
KafkaConsumer would be instantiated to read from offset 70 concurrently with 
the old process, and would be terminated once it reaches 120. In this manner, 
a considerable amount of lag can be avoided, particularly since the old 
consumer could proceed as if nothing had happened. )

> Consider adding multithreaded form of rebalancing
> -
>
> Key: KAFKA-7132
> URL: https://issues.apache.org/jira/browse/KAFKA-7132
> Project: Kafka
> Issue Type: Improvement
> Components: consumer
> Reporter: Richard Yu
> Priority: Major
>
> Currently, when a consumer falls out of a consumer group, it restarts 
> processing from the last checkpointed offset. However, this design can 
> cause lag that some users cannot afford. For example, let's say a consumer 
> crashed at offset 100, with the last checkpointed offset at 70. When it 
> recovers at a later offset (say, 120), it will be behind by 50 offsets 
> (120 - 70). This is because the consumer restarted at 70, forcing it to 
> reprocess old data. To prevent this, one option would be to let the current 
> consumer start processing not from the last checkpointed offset (70 in the 
> example) but from 120, where it recovers. Meanwhile, a new KafkaConsumer 
> would be instantiated to read from offset 70 concurrently with the old 
> process, and would be terminated once it reaches 120. In this manner, a 
> considerable amount of lag can be avoided, particularly since the old 
> consumer could proceed as if nothing had happened. 
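
The proposal quoted above can be sketched roughly as follows. This is only an
in-memory illustration under stated assumptions, not the Kafka client API and
not a proposed implementation: the partition is modelled as a plain range of
offsets, and the names CatchUpSketch, run, live and backfill are hypothetical.
In a real client the two readers would presumably be positioned with
KafkaConsumer#assign and KafkaConsumer#seek; that wiring is assumed and not
shown here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory model of the proposal; no Kafka client calls are made. */
final class CatchUpSketch {

    /** Offsets handled by each reader, in processing order. */
    static final class Result {
        final List<Long> live;      // handled by the original consumer
        final List<Long> backfill;  // handled by the temporary helper

        Result(List<Long> live, List<Long> backfill) {
            this.live = live;
            this.backfill = backfill;
        }
    }

    /**
     * committed: last checkpointed offset (70 in the example)
     * resumeAt:  offset at which the original consumer recovers (120)
     * logEnd:    first offset not yet written (assumed, for the simulation only)
     */
    static Result run(long committed, long resumeAt, long logEnd) {
        if (committed > resumeAt || resumeAt > logEnd) {
            throw new IllegalArgumentException("need committed <= resumeAt <= logEnd");
        }
        List<Long> live = new ArrayList<>();
        List<Long> backfill = new ArrayList<>();

        // Helper reader: covers the gap [committed, resumeAt) and then terminates.
        Thread helper = new Thread(() -> {
            for (long offset = committed; offset < resumeAt; offset++) {
                backfill.add(offset);
            }
        }, "backfill-reader");
        helper.start();

        // Original consumer: skips the gap and continues from resumeAt.
        for (long offset = resumeAt; offset < logEnd; offset++) {
            live.add(offset);
        }

        try {
            helper.join(); // wait for the helper to finish the gap
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for backfill", e);
        }
        return new Result(live, backfill);
    }

    public static void main(String[] args) {
        Result r = run(70, 120, 130);
        System.out.println("backfill handled " + r.backfill.size() + " offsets");
        System.out.println("live handled " + r.live.size() + " offsets");
    }
}
```

With the numbers from the example, the helper handles the 50 offsets from 70
to 119 while the original consumer handles 120 onward. Note that records are
then no longer processed in offset order across the two readers, which any
real design along these lines would have to account for.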



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7132) Consider adding multithreaded form of rebalancing

2018-07-04 Thread Richard Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Yu updated KAFKA-7132:
--
Summary: Consider adding multithreaded form of rebalancing  (was: Consider 
adding multithreaded form of recovery)

> Consider adding multithreaded form of rebalancing
> -
>
> Key: KAFKA-7132
> URL: https://issues.apache.org/jira/browse/KAFKA-7132
> Project: Kafka
> Issue Type: Improvement
> Components: consumer
> Reporter: Richard Yu
> Priority: Major
>
> Currently, when a consumer falls out of a consumer group, it restarts 
> processing from the last checkpointed offset. However, this design can 
> cause lag that some users cannot afford. For example, let's say a consumer 
> crashed at offset 100, with the last checkpointed offset at 70. When it 
> recovers at a later offset (say, 120), it will be behind by 50 offsets 
> (120 - 70). This is because it restarted at 70, forcing it to reprocess old 
> data. To prevent this, one option would be to let the current consumer start 
> processing not from the last checkpointed offset (70 in the example) but 
> from 120, where it recovers. Meanwhile, a new KafkaConsumer would be 
> instantiated to read from offset 70 concurrently with the old process, and 
> would be terminated once it reaches 120. In this manner, a considerable 
> amount of lag can be avoided, particularly since the old consumer could 
> proceed as if nothing had happened. 


