Richard Yu created KAFKA-9285:
---------------------------------

             Summary: Implement failed message topic to account for processing 
lag during failure
                 Key: KAFKA-9285
                 URL: https://issues.apache.org/jira/browse/KAFKA-9285
             Project: Kafka
          Issue Type: New Feature
          Components: consumer
            Reporter: Richard Yu


Presently, in current Kafka failure schematics, when a consumer crashes, the 
user is typically responsible for both detecting as well as restarting the 
failed consumer. Therefore, during this period of time, when the consumer is 
dead, it would result in a period of inactivity where no records are consumed, 
hence lag results. Previously, there has been attempts to resolve this problem: 
when failure is detected by broker, a substitute consumer will be started (the 
so-called [Rebalance 
Consumer|[https://cwiki.apache.org/confluence/display/KAFKA/KIP-333%3A+Add+faster+mode+of+rebalancing]])
 which will continue processing records in Kafka's stead. 

However, this has complications, as records will only be stored locally, and in 
case of this consumer failing as well, that data will be lost. Instead, we need 
to consider how we can still process these records and at the same time 
effectively _persist_ them. It is here that I propose the concept of a _failed 
message topic._ At a high level, it works like this. When we find that a 
consumer has failed, messages which was originally meant to be sent to that 
consumer would be redirected to this failed messaged topic. The user can choose 
to assign consumers to this topic, which would consume messages from failed 
consumers while other consumer threads are down. 

Naturally, records from different topics can not go into the same failed 
message topic, since we cannot tell which records belong to which consumer.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to