[ https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354945#comment-16354945 ]
Ewen Cheslack-Postava commented on KAFKA-6490:
----------------------------------------------
A change in behavior like that would definitely require a KIP – existing users
would not expect this at all.
Connect started with the current behavior because for many users losing data is
worse than suffering some downtime. However, it's clear some alternatives are
warranted; this question comes up from time to time on mailing lists. Generally
there are only a few options that seem to make sense:
* Stop processing (current behavior) and log
* Log and retry (really only makes sense for unusual edge cases where data got
  corrupted in flight between Kafka and Connect)
* Discard and log (I care about uptime more than a bit of lost data)
* Dead letter queue (or some other fallback handler)
The retry case is probably the least important here as it will rarely make a
difference, so the other three are the ones I think we'd want to implement. A KIP
for this should be straightforward, though the implementation will require care
to make sure we handle all the places errors can occur (in the producer/consumer,
during deserialization, during transformations, etc.).
> JSON SerializationException Stops Connect
> -----------------------------------------
>
> Key: KAFKA-6490
> URL: https://issues.apache.org/jira/browse/KAFKA-6490
> Project: Kafka
> Issue Type: Bug
> Components: KafkaConnect
> Affects Versions: 1.0.0
> Reporter: William R. Speirs
> Priority: Major
> Attachments: KAFKA-6490_v1.patch
>
>
> If you configure KafkaConnect to parse JSON messages, and you send it a
> non-JSON message, the SerializationException will bubble up to the
> top and stop KafkaConnect. While I understand sending non-JSON data to a JSON
> serializer is a bad idea, I think that a single malformed message stopping
> all of KafkaConnect is even worse.
> The data exception is thrown here:
> [https://github.com/apache/kafka/blob/trunk/connect/json/src/main/java/org/apache/kafka/connect/json/JsonConverter.java#L305]
>
> From the call here:
> [https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L476]
> This bubbles all the way up to the top, and KafkaConnect simply stops with
> the message: {{ERROR WorkerSinkTask{id=elasticsearch-sink-0} Task threw an
> uncaught and unrecoverable exception
> (org.apache.kafka.connect.runtime.WorkerTask:172)}}
> Thoughts on adding a {{try/catch}} around the {{for}} loop in
> WorkerSinkTask's {{convertMessages}} so messages that don't properly parse
> are logged, but simply ignored? This way KafkaConnect can keep working even
> when it encounters a message it cannot decode.