[ https://issues.apache.org/jira/browse/FLINK-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16833569#comment-16833569 ]
Dawid Wysakowicz edited comment on FLINK-9679 at 5/6/19 7:54 AM:
-----------------------------------------------------------------

Hi [~phoenixjiangnan]

*DISCLAIMER* I have not read the design document for [FLINK-12256] thoroughly, only to the extent needed to get the overall idea.

I think it is related and could be used in most cases, though there are some caveats. I think the idea behind the `SerializationSchema` was to always write the schema based on the topic that the record was actually written to (that was the only strategy when the PR was opened). Another problem I see is that this would imply that the `SerializationSchema` bypasses the `Catalog` interface when creating new entries in the Schema Registry, which I think is wrong.

I agree we should probably synchronize this effort with the work around the Schema Registry Catalog. What I would like to see in the document for the Registry Catalog is a more in-depth discussion of the {{topic <> subject}} mapping, both for reading and for writing. Some problems we should solve, off the top of my head:
* what happens if the schema id in the record does not correspond to the subject name used from the catalog, and how do we check for that
* which part is responsible for creating entries in the catalog
* how do we store the information whether the stream is an append stream or a changelog
* how do we define the schemas for the key and the value of a Kafka message?
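For context on the "schema id in the record" point above: Confluent's Schema Registry wire format prefixes every Avro payload with a magic byte and a 4-byte big-endian schema id, and that embedded id is what a catalog-aware reader would have to validate against the schema resolved for the subject. A minimal stdlib-only sketch of that framing (the `WireFormat` class name is mine for illustration, not a Flink or Confluent API):

```java
import java.nio.ByteBuffer;

// Sketch of Confluent's Schema Registry wire format: one magic byte (0x0)
// followed by a 4-byte big-endian schema id, then the Avro-encoded payload.
public class WireFormat {
    private static final byte MAGIC_BYTE = 0x0;
    private static final int HEADER_LENGTH = 5; // magic byte + int schema id

    // Prepend the wire-format header to an already Avro-serialized payload.
    public static byte[] prepend(int schemaId, byte[] avroPayload) {
        return ByteBuffer.allocate(HEADER_LENGTH + avroPayload.length)
                .put(MAGIC_BYTE)
                .putInt(schemaId) // ByteBuffer is big-endian by default
                .put(avroPayload)
                .array();
    }

    // Extract the schema id so it can be checked against the id that the
    // catalog resolved for the subject; rejects records that do not start
    // with the expected magic byte.
    public static int schemaId(byte[] record) {
        ByteBuffer buffer = ByteBuffer.wrap(record);
        if (buffer.get() != MAGIC_BYTE) {
            throw new IllegalArgumentException("Unknown magic byte");
        }
        return buffer.getInt();
    }
}
```

A consistency check of the kind discussed above would then compare `WireFormat.schemaId(record)` with the id the catalog registered (or looked up) for the topic's subject, and fail or re-resolve on mismatch.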
> Implement ConfluentRegistryAvroSerializationSchema
> --------------------------------------------------
>
>                 Key: FLINK-9679
>                 URL: https://issues.apache.org/jira/browse/FLINK-9679
>             Project: Flink
>          Issue Type: Improvement
>          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>    Affects Versions: 1.6.0
>            Reporter: Yazdan Shirvany
>            Assignee: Dominik Wosiński
>            Priority: Major
>              Labels: pull-request-available
>
> Implement AvroSerializationSchema using Confluent Schema Registry

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)