nsivabalan commented on code in PR #17777:
URL: https://github.com/apache/hudi/pull/17777#discussion_r2710306757
##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java:
##########
@@ -728,9 +729,10 @@ Pair<InputBatch, Boolean> fetchNextBatchFromSource(Option<Checkpoint> resumeChec
 } else {
 // Deduce proper target (writer's) schema for the input dataset, reconciling its
 // schema w/ the table's one
- HoodieSchema incomingSchema = transformed.map(df ->
- HoodieSchemaConversionUtils.convertStructTypeToHoodieSchema(df.schema(), getRecordQualifiedName(cfg.targetTableName)))
- .orElseGet(dataAndCheckpoint.getSchemaProvider()::getTargetHoodieSchema);
+ HoodieSchema incomingSchema = transformed.map(df -> {
+ StructType structType = UtilHelpers.extractSchemaFromDataset(df, props);
Review Comment:
Technically speaking, these two may not fit into the same case,
i.e. the fix here and the fix in RowSource for nullable schema.
But it may not be worth adding two configs. We can go with just one config
and keep it in HoodieStreamerConfig.
If need be, you can add alternatives:
"hoodie.streamer.transformed.row.nullable" with
"hoodie.deltastreamer.transformed.row.nullable" as the alternative.
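
To illustrate the single-config-with-alternative idea, here is a minimal, self-contained sketch of how a primary key with a legacy fallback key could be resolved. It is not Hudi's implementation: the class name, the `resolve` helper, and the default value of `false` are assumptions made for illustration. Only the two key names come from the comment above. In Hudi itself this would more likely be declared through `ConfigProperty` and its alternative-key support rather than hand-rolled.

```java
import java.util.Properties;

// Hypothetical sketch: resolve one boolean config that has a primary key
// and a legacy alternative key. Not taken from the PR.
public class NullableConfigSketch {
  // Key names as suggested in the review comment.
  static final String KEY = "hoodie.streamer.transformed.row.nullable";
  static final String ALT_KEY = "hoodie.deltastreamer.transformed.row.nullable";
  // Assumption: the real default is not stated in the comment.
  static final boolean ASSUMED_DEFAULT = false;

  // Prefer the primary key, fall back to the alternative, then to the default.
  static boolean resolve(Properties props) {
    String value = props.getProperty(KEY);
    if (value == null) {
      value = props.getProperty(ALT_KEY);
    }
    return value == null ? ASSUMED_DEFAULT : Boolean.parseBoolean(value.trim());
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    // Only the legacy key is set, so the fallback path is exercised.
    props.setProperty(ALT_KEY, "true");
    System.out.println(resolve(props));
  }
}
```

With one config and an alternative key, users who still set the `hoodie.deltastreamer.*` name keep working, and both the transformed-dataset path and RowSource could read the same setting.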