nsivabalan edited a comment on issue #2406:
URL: https://github.com/apache/hudi/issues/2406#issuecomment-778777681
This looks like a bug. Here is the reason:
HoodieDeltaStreamer works fine for Dataset\<Row\> sources without a schema
provider, but the multi-table delta streamer does not preserve that
assumption. The existing tests for the multi-table delta streamer use Dataset
\<GenericRecord\> sources, for which a schema provider is mandatory, which is
likely why the missing-provider case was never exercised.
```
git diff
diff --git a/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java b/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java
index 9d5ca3ca..91742ec0 100644
--- a/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java
+++ b/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java
@@ -147,7 +147,7 @@ public class HoodieMultiTableDeltaStreamer {
   }
   private void populateSchemaProviderProps(HoodieDeltaStreamer.Config cfg, TypedProperties typedProperties) {
-    if (cfg.schemaProviderClassName.equals(SchemaRegistryProvider.class.getName())) {
+    if (cfg.schemaProviderClassName != null && cfg.schemaProviderClassName.equals(SchemaRegistryProvider.class.getName())) {
       String schemaRegistryBaseUrl = typedProperties.getString(Constants.SCHEMA_REGISTRY_BASE_URL_PROP);
       String schemaRegistrySuffix = typedProperties.getString(Constants.SCHEMA_REGISTRY_URL_SUFFIX_PROP);
       typedProperties.setProperty(Constants.SOURCE_SCHEMA_REGISTRY_URL_PROP,
           schemaRegistryBaseUrl + typedProperties.getString(Constants.KAFKA_TOPIC_PROP) + schemaRegistrySuffix);
```
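To illustrate why the one-line change matters, here is a minimal, standalone sketch of the same null check outside of Hudi. The class `SchemaProviderNullCheck`, its method names, and the `REGISTRY_PROVIDER` constant are hypothetical stand-ins (the constant is assumed to match `SchemaRegistryProvider.class.getName()`); none of this is code from the Hudi repository.

```java
public class SchemaProviderNullCheck {
    // Hypothetical stand-in for SchemaRegistryProvider.class.getName().
    static final String REGISTRY_PROVIDER =
        "org.apache.hudi.utilities.schema.SchemaRegistryProvider";

    // Mirrors the check before the patch: calling equals() on a null
    // class name throws a NullPointerException.
    static boolean unsafeIsRegistryProvider(String schemaProviderClassName) {
        return schemaProviderClassName.equals(REGISTRY_PROVIDER);
    }

    // Mirrors the patched check: a null class name (no schema provider
    // configured) short-circuits to false instead of throwing.
    static boolean safeIsRegistryProvider(String schemaProviderClassName) {
        return schemaProviderClassName != null
            && schemaProviderClassName.equals(REGISTRY_PROVIDER);
    }

    // Equivalent alternative: call equals() on the known non-null constant.
    static boolean constantFirstIsRegistryProvider(String schemaProviderClassName) {
        return REGISTRY_PROVIDER.equals(schemaProviderClassName);
    }

    public static void main(String[] args) {
        // No schema provider configured, as with a Dataset<Row> source.
        String notConfigured = null;

        System.out.println("safe check: " + safeIsRegistryProvider(notConfigured));
        System.out.println("constant-first check: "
            + constantFirstIsRegistryProvider(notConfigured));
        try {
            unsafeIsRegistryProvider(notConfigured);
        } catch (NullPointerException e) {
            System.out.println("unsafe check threw NullPointerException");
        }
    }
}
```

Both null-safe variants return `false` when no provider is configured, so the registry-specific properties are simply skipped for Dataset\<Row\> sources.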
As you might have guessed, I have no prior experience with this part of the
code base, so I will have to write tests to confirm that the fix works. In the
meantime, if you have access to the StructType (schema), you can try using
[RowBasedSchemaProvider](https://github.com/apache/hudi/blob/master/hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/RowBasedSchemaProvider.java)
to unblock yourself for now.