[ 
https://issues.apache.org/jira/browse/KAFKA-8037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17163021#comment-17163021
 ] 

Almog Gavra commented on KAFKA-8037:
------------------------------------

There are lots of threads going on in this discussion, but re: whether the 
optimization should be opt-in or opt-out:

[~ableegoldman] [~mjsax] - while I agree with you in theory that serdes with 
side effects and/or asymmetric behavior should be discouraged, I don't think 
that's a realistic possibility. Some of the most popular serdes have these 
properties:
 * All Confluent Schema Registry serdes have side effects on serialization
 * Avro reader/writer schemas are built to be asymmetric (and that's how they 
handle schema evolution)
 * JSON serdes are asymmetric if you allow "additional properties"
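
To make "asymmetric" concrete, here is a minimal sketch in plain Java. It does not use any real Kafka, Avro, or JSON library; the `k=v;k=v` wire format, the `KNOWN_FIELDS` reader schema, and all method names are assumptions made up for illustration. It only shows the general shape of the problem: a reader that tolerates additional properties drops them, so `serialize(deserialize(bytes))` no longer equals the original bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AsymmetricSerdeSketch {
    // Hypothetical reader schema: only these fields survive deserialization.
    static final List<String> KNOWN_FIELDS = List.of("id", "name");

    // Parse "k=v;k=v" bytes, silently dropping fields the reader does not know.
    static Map<String, String> deserialize(byte[] bytes) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String pair : new String(bytes, StandardCharsets.UTF_8).split(";")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2 && KNOWN_FIELDS.contains(kv[0])) {
                out.put(kv[0], kv[1]);
            }
        }
        return out;
    }

    // Write the fields back out in insertion order using the same toy format.
    static byte[] serialize(Map<String, String> value) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : value.entrySet()) {
            if (sb.length() > 0) {
                sb.append(';');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // A serde is symmetric for this input only if the bytes survive a round trip.
    static boolean roundTripsExactly(byte[] original) {
        return Arrays.equals(original, serialize(deserialize(original)));
    }

    public static void main(String[] args) {
        byte[] strict = "id=1;name=a".getBytes(StandardCharsets.UTF_8);
        byte[] extra = "id=1;name=a;extra=x".getBytes(StandardCharsets.UTF_8);
        System.out.println(roundTripsExactly(strict)); // prints true
        System.out.println(roundTripsExactly(extra));  // prints false
    }
}
```

A check like `roundTripsExactly` only tells you about the inputs you happen to test, which is part of why users may not know whether their serde is symmetric.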

Moreover, many users might not even know whether their serde is symmetric or 
has side effects, and I think that makes it very difficult to require users to 
opt out as opposed to allowing them to opt in.

> KTable restore may load bad data
> --------------------------------
>
>                 Key: KAFKA-8037
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8037
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Matthias J. Sax
>            Priority: Minor
>              Labels: pull-request-available
>
> If an input topic contains bad data, users can specify a 
> `deserialization.exception.handler` to drop corrupted records on read. 
> However, this mechanism may be bypassed on restore. Assume a 
> `builder.table()` call reads and drops a corrupted record. If the table state 
> is lost and restored from the changelog topic, the corrupted record may be 
> copied into the store, because on restore plain bytes are copied.
> If the KTable is used in a join, an internal `store.get()` call to look up the 
> record would fail with a deserialization exception if the value part cannot 
> be deserialized.
> GlobalKTables are affected, too (cf. KAFKA-7663, which may allow a fix for the 
> GlobalKTable case). It's unclear to me at the moment how this issue could be 
> addressed for KTables, though.
> Note that user state stores are not affected, because they always have a 
> dedicated changelog topic (and don't reuse an input topic) and thus the 
> corrupted record would not be written into the changelog.
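
The restore problem described in the issue can be illustrated with a small simulation in plain Java. This is not Kafka Streams code: `tryDeserialize`, `processPath`, and `restorePath` are hypothetical stand-ins, and the integer "schema" is an assumption chosen only to make one record undeserializable. The sketch shows a read path that drops corrupted records next to a restore path that copies plain bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class RestoreBypassSketch {
    // Stand-in deserializer: a value is valid only if it parses as an integer.
    static Optional<Integer> tryDeserialize(byte[] raw) {
        try {
            return Optional.of(Integer.parseInt(new String(raw, StandardCharsets.UTF_8)));
        } catch (NumberFormatException e) {
            // Roughly what a "drop corrupted records" handler would skip.
            return Optional.empty();
        }
    }

    // Normal processing: deserialize each record and keep only the valid ones.
    static List<byte[]> processPath(List<byte[]> topic) {
        List<byte[]> store = new ArrayList<>();
        for (byte[] raw : topic) {
            if (tryDeserialize(raw).isPresent()) {
                store.add(raw);
            }
        }
        return store;
    }

    // Restore: bytes are copied as-is, so no deserialization check ever runs.
    static List<byte[]> restorePath(List<byte[]> topic) {
        return new ArrayList<>(topic);
    }

    public static void main(String[] args) {
        List<byte[]> topic = List.of(
                "1".getBytes(StandardCharsets.UTF_8),
                "oops".getBytes(StandardCharsets.UTF_8), // corrupted record
                "2".getBytes(StandardCharsets.UTF_8));
        System.out.println(processPath(topic).size()); // prints 2
        System.out.println(restorePath(topic).size()); // prints 3
    }
}
```

In this toy model, the store built by `restorePath` contains the corrupted record, so a later lookup that deserializes the value would fail even though normal processing had dropped it.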



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
