[
https://issues.apache.org/jira/browse/HUDI-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17489796#comment-17489796
]
Pratyaksh Sharma commented on HUDI-3264:
----------------------------------------
The best and easiest fix is to let users set the schema registry URLs the
same way as with a plain HoodieDeltaStreamer instance, while keeping the
current configs so that existing functionality does not break.
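
A minimal sketch of that fallback, not the actual Hudi implementation: an
explicit per-table URL wins if it is set, otherwise the URL is built from the
base URL, the Kafka topic and the suffix as today. The class and method names
are hypothetical, and the Kafka topic key is an assumption standing in for
Constants.KAFKA_TOPIC_PROP; the other keys are the ones quoted in the issue.

```java
import java.util.Properties;

/**
 * Illustrative sketch only, not Hudi's code: resolve schema registry URLs per
 * table, preferring explicit URLs over the baseUrl + topic + suffix convention.
 */
final class SchemaRegistryUrlResolver {
  // Keys quoted in the issue description.
  static final String SOURCE_URL = "hoodie.deltastreamer.schemaprovider.registry.url";
  static final String TARGET_URL = "hoodie.deltastreamer.schemaprovider.registry.targetUrl";
  static final String BASE_URL = "hoodie.deltastreamer.schemaprovider.registry.baseUrl";
  static final String SOURCE_SUFFIX = "hoodie.deltastreamer.schemaprovider.registry.sourceUrlSuffix";
  static final String TARGET_SUFFIX = "hoodie.deltastreamer.schemaprovider.registry.targetUrlSuffix";
  // Assumed key for the source Kafka topic (stand-in for Constants.KAFKA_TOPIC_PROP).
  static final String KAFKA_TOPIC = "hoodie.deltastreamer.source.kafka.topic";

  private SchemaRegistryUrlResolver() {
  }

  static String resolveSourceUrl(Properties tableProps) {
    return resolve(tableProps, SOURCE_URL, SOURCE_SUFFIX);
  }

  static String resolveTargetUrl(Properties tableProps) {
    return resolve(tableProps, TARGET_URL, TARGET_SUFFIX);
  }

  private static String resolve(Properties props, String explicitKey, String suffixKey) {
    // New behaviour: an explicitly configured URL takes precedence.
    String explicit = props.getProperty(explicitKey);
    if (explicit != null && !explicit.isEmpty()) {
      return explicit;
    }
    // Existing behaviour: derive the URL from base URL, topic name and suffix.
    String base = props.getProperty(BASE_URL);
    String topic = props.getProperty(KAFKA_TOPIC);
    String suffix = props.getProperty(suffixKey);
    if (base == null || topic == null || suffix == null) {
      throw new IllegalArgumentException(
          "Set either " + explicitKey + " or all of " + BASE_URL + ", "
              + KAFKA_TOPIC + " and " + suffixKey);
    }
    return base + topic + suffix;
  }
}
```

With this shape, tables that only configure baseUrl and the suffixes keep
resolving to the same URLs as before, while a table that sets targetUrl in its
own properties gets a target schema that is independent of the topic name.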
> Make schema registry configs more flexible with MultiTableDeltaStreamer
> -----------------------------------------------------------------------
>
> Key: HUDI-3264
> URL: https://issues.apache.org/jira/browse/HUDI-3264
> Project: Apache Hudi
> Issue Type: Task
> Components: deltastreamer
> Reporter: sivabalan narayanan
> Assignee: Pratyaksh Sharma
> Priority: Major
> Labels: sev:normal
> Fix For: 0.11.0
>
>
> Ref issue: [https://github.com/apache/hudi/issues/4585]
> Hi guys,
> we ran into a problem setting the target schema of our Hudi table using the
> MultiTableDeltaStreamer.
> Using a normal DeltaStreamer, we are able to set our source and target
> schemas using the properties:
> * hoodie.deltastreamer.schemaprovider.registry.url
> * hoodie.deltastreamer.schemaprovider.registry.targetUrl
> We found that we cannot set these properties on a per-table basis with the
> MultiTableDeltaStreamer, because the MTDS builds the schema registry URLs for
> the source and target schemas from the properties:
> * hoodie.deltastreamer.schemaprovider.registry.baseUrl
> * hoodie.deltastreamer.schemaprovider.registry.sourceUrlSuffix
> * hoodie.deltastreamer.schemaprovider.registry.targetUrlSuffix
> The MultiTableDeltaStreamer then also uses the source Kafka topic name to
> set the name of the target schema:
>
> [hudi/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java|https://github.com/apache/hudi/blob/9fe28e56b49c7bf68ae2d83bfe89755314aa793b/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java#L167]
> Line 167 in
> [9fe28e5|https://github.com/apache/hudi/commit/9fe28e56b49c7bf68ae2d83bfe89755314aa793b]
> typedProperties.setProperty(Constants.TARGET_SCHEMA_REGISTRY_URL_PROP,
>     schemaRegistryBaseUrl + typedProperties.getString(Constants.KAFKA_TOPIC_PROP)
>     + targetSchemaRegistrySuffix);
>
> We think that schema names should be more configurable, as they are with the
> original DeltaStreamer. Currently, the names of the schemas used for reading
> or writing the data are very tightly coupled to the name of the Kafka topic
> the data is loaded from.
>
>
>