[https://issues.apache.org/jira/browse/FLINK-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16833569#comment-16833569]
Dawid Wysakowicz edited comment on FLINK-9679 at 5/6/19 7:54 AM:
-----------------------------------------------------------------
Hi [~phoenixjiangnan]
*DISCLAIMER* I have not read the design document for [FLINK-12256] thoroughly,
only to the extent needed to get the overall idea.
I think it is somewhat related and could be used in most cases, though there
are some caveats. I think the idea behind the `SerializationSchema` was to
always write the schema based on the topic that the record was actually
written to (that was the only strategy when the PR was opened). Another
problem I see is that this would imply that the `SerializationSchema` would
bypass the `Catalog` interface for creating new entries in the Schema
Registry, which I think is wrong.
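To make the concern concrete, here is a minimal, hypothetical sketch (the class and method names are illustrative, not the final Flink API, and it assumes the pre-5.5 Avro-typed Confluent registry client and the default {{<topic>-value}} subject naming) of a serializer that derives the subject from the target topic and registers the schema directly against the registry; the `register` call is the write path that would sidestep the `Catalog`:
{code:java}
// Hypothetical sketch, not the final Flink API: the subject is derived from
// the target topic and the schema is registered directly with the registry.
import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
import io.confluent.kafka.schemaregistry.client.SchemaRegistryClient;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

public class TopicSubjectAvroSerializer {

    private final SchemaRegistryClient client;
    private final Schema schema;

    public TopicSubjectAvroSerializer(String registryUrl, Schema schema) {
        // Cache up to 1000 schema ids per subject.
        this.client = new CachedSchemaRegistryClient(registryUrl, 1000);
        this.schema = schema;
    }

    public byte[] serialize(GenericRecord record, String targetTopic) throws Exception {
        // Default Confluent TopicNameStrategy: subject = "<topic>-value".
        String subject = targetTopic + "-value";
        // register() creates the subject/version if it does not exist yet;
        // this is exactly the write path that bypasses the Catalog interface.
        int schemaId = client.register(subject, schema);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Confluent wire format: magic byte 0x0 followed by the 4-byte schema id.
        out.write(0);
        out.write(ByteBuffer.allocate(4).putInt(schemaId).array());
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(schema).write(record, encoder);
        encoder.flush();
        return out.toByteArray();
    }
}
{code}
Whether the subject should instead come from the catalog entry, with registration going through the `Catalog`, is the open design question here.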
I agree we should probably synchronize this effort with the work on the Schema
Registry Catalog. What I would like to see in the document for the Registry
Catalog is a more in-depth discussion of the {{topic <> subject}} mapping, both
for reading and for writing. Some problems that we should solve, off the top of
my head:
* what happens if the id in the record does not correspond to the subject name
used by the catalog, and how do we check for that (see the sketch after this
list)?
* which part is responsible for creating entries in the catalog?
* how do we store the information whether the stream is an append-only stream
or a changelog?
* how do we define the schema for the key and the value of a Kafka message?
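For the first point, a minimal sketch of what such a check could look like, again assuming the default {{<topic>-value}} subject naming and the pre-5.5 Avro-typed registry client API (the class name and registry URL are illustrative):
{code:java}
// Hypothetical consistency check: does the schema id embedded in a consumed
// record belong to the subject that the catalog maps the topic to?
import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
import io.confluent.kafka.schemaregistry.client.SchemaRegistryClient;
import org.apache.avro.Schema;

import java.nio.ByteBuffer;

public class SubjectConsistencyCheck {

    private final SchemaRegistryClient client =
            new CachedSchemaRegistryClient("http://registry:8081", 1000);

    /** Default Confluent TopicNameStrategy for message values. */
    static String subjectFor(String topic) {
        return topic + "-value";
    }

    boolean idMatchesSubject(byte[] confluentPayload, String topic) throws Exception {
        ByteBuffer buf = ByteBuffer.wrap(confluentPayload);
        if (buf.get() != 0) {        // magic byte of the Confluent wire format
            throw new IllegalArgumentException("Unknown magic byte");
        }
        int schemaId = buf.getInt(); // 4-byte schema id follows the magic byte
        Schema writerSchema = client.getById(schemaId);
        try {
            // getVersion() throws if the schema is not registered under the subject.
            client.getVersion(subjectFor(topic), writerSchema);
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
{code}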
> Implement ConfluentRegistryAvroSerializationSchema
> --------------------------------------------------
>
> Key: FLINK-9679
> URL: https://issues.apache.org/jira/browse/FLINK-9679
> Project: Flink
> Issue Type: Improvement
> Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
> Affects Versions: 1.6.0
> Reporter: Yazdan Shirvany
> Assignee: Dominik Wosiński
> Priority: Major
> Labels: pull-request-available
>
> Implement AvroSerializationSchema using Confluent Schema Registry