[ https://issues.apache.org/jira/browse/KAFKA-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024066#comment-16024066 ]
Jason Gustafson edited comment on KAFKA-5211 at 5/25/17 2:21 AM:
-----------------------------------------------------------------

[~enothereska] Looking toward the future, it would be good to get your take on what the ideal behavior would be for Kafka Streams. Thanks to some of the improvements in this release cycle, the current "intended" semantics may now actually be usable (which wasn't the case before, as is apparent from the discussion in KAFKA-4740). When we raise an exception from parsing or record corruption, the consumer's position should now be pointing to the offset of the record that failed parsing, which means the user can seek past it if they wish. For the reasons Becket mentions in the description, I don't think it will be safe to proactively skip past a corrupt record (at least not without refetching the data), but it may be possible for parsing errors.

It has also been proposed to move parsing out of the consumer entirely (see KAFKA-1895), which would neatly solve part of the problem, but would likely require a KIP. Given the short horizon for this release, however, Becket's patch is probably the way to go for now.
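The consumer contract described above (poll() throws on a bad record, the position stays at the failing offset, and only an explicit seek moves past it) can be illustrated with a small self-contained sketch. Note this is a toy model, not the real client: ToyConsumer, CorruptRecordException, and consumeAll are hypothetical names invented for illustration, and the actual KafkaConsumer API and exception types differ.

```java
import java.util.ArrayList;
import java.util.List;

public class SkipCorruptDemo {

    // Hypothetical exception carrying the offset of the record that failed.
    static class CorruptRecordException extends RuntimeException {
        final long offset;
        CorruptRecordException(long offset) {
            super("corrupt record at offset " + offset);
            this.offset = offset;
        }
    }

    // Toy single-partition consumer: records[i] is the payload at offset i,
    // and null marks a corrupt record.
    static class ToyConsumer {
        private final String[] records;
        private long position = 0;

        ToyConsumer(String[] records) { this.records = records; }

        void seek(long offset) { position = offset; }

        // Returns the next record, or null at end of log. On a corrupt
        // record the position is left pointing AT the failing offset, so
        // repeated poll() calls keep failing until the caller seeks past it.
        String poll() {
            if (position >= records.length) return null;
            if (records[(int) position] == null) {
                throw new CorruptRecordException(position);
            }
            return records[(int) position++];
        }
    }

    // Consume everything, skipping corrupt records only by explicit seek,
    // which is the deliberate choice discussed in KAFKA-5211: the consumer
    // blocks on the bad offset rather than silently skipping it.
    static List<String> consumeAll(ToyConsumer consumer) {
        List<String> out = new ArrayList<>();
        while (true) {
            try {
                String record = consumer.poll();
                if (record == null) return out;
                out.add(record);
            } catch (CorruptRecordException e) {
                consumer.seek(e.offset + 1);
            }
        }
    }

    public static void main(String[] args) {
        ToyConsumer c = new ToyConsumer(new String[] {"a", "b", null, "d"});
        System.out.println(consumeAll(c)); // the corrupt offset 2 is skipped
    }
}
```

The point of the blocking behavior is that skipping becomes an application decision: a caller that cannot tolerate data loss simply does not seek, and poll() keeps failing on the same offset.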
> KafkaConsumer should not skip a corrupted record after throwing an exception.
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-5211
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5211
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jiangjie Qin
>            Assignee: Jiangjie Qin
>              Labels: clients, consumer
>             Fix For: 0.11.0.0
>
>
> In 0.10.2, when there is a corrupted record, KafkaConsumer.poll() will throw
> an exception and block on that corrupted record. In the latest trunk this
> behavior has changed to skip the corrupted record (which is the old consumer
> behavior). With KIP-98, skipping corrupted messages would be a little
> dangerous as the message could be a control message for a transaction. We
> should fix the issue to let the KafkaConsumer block on the corrupted messages.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)