vamsikarnika commented on code in PR #11817:
URL: https://github.com/apache/hudi/pull/11817#discussion_r1743542690
##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/CloudObjectsSelectorCommon.java:
##########
@@ -316,6 +343,197 @@ private static Dataset<Row> coalesceOrRepartition(Dataset dataset, int numPartit
return dataset;
}
+ private static boolean isCoalesceRequired(TypedProperties properties, Schema sourceSchema) {
+ return getBooleanWithAltKeys(properties, CloudSourceConfig.SPARK_DATASOURCE_READER_COALESCE_ALIAS_COLUMNS)
+ && Objects.nonNull(sourceSchema)
+ && hasFieldWithAliases(sourceSchema);
+ }
+
+ /**
+ * Recursively checks if an Avro schema or any of its nested fields contain aliases.
+ *
+ * @param schema The Avro schema to check.
+ * @return True if the schema or any of its fields contain aliases, false otherwise.
+ */
+ private static boolean hasFieldWithAliases(Schema schema) {
Review Comment:
No, this is called once per ingestion on the provided schema when reading
from the source.
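
For readers without the full diff, the kind of recursive alias check the Javadoc describes can be sketched roughly as follows. This is only an illustration, not the PR's implementation: `SketchSchema`, `SketchField`, and `AliasCheckSketch` are hypothetical stand-ins for Avro's schema types, so the example runs without the Avro dependency. The walk touches each node of the schema once, which is why running it once per ingestion should be cheap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for an Avro schema (NOT Avro's real Schema class).
final class SketchSchema {
    enum Kind { PRIMITIVE, RECORD, ARRAY, MAP, UNION }

    final Kind kind;
    final List<SketchField> fields = new ArrayList<>();    // used by RECORD only
    final List<SketchSchema> children = new ArrayList<>(); // ARRAY element, MAP value, or UNION branches

    private SketchSchema(Kind kind) {
        this.kind = kind;
    }

    static SketchSchema primitive() {
        return new SketchSchema(Kind.PRIMITIVE);
    }

    static SketchSchema record(SketchField... recordFields) {
        SketchSchema schema = new SketchSchema(Kind.RECORD);
        schema.fields.addAll(Arrays.asList(recordFields));
        return schema;
    }

    // Builds an ARRAY, MAP, or UNION schema around the given child schemas.
    static SketchSchema wrapping(Kind kind, SketchSchema... childSchemas) {
        SketchSchema schema = new SketchSchema(kind);
        schema.children.addAll(Arrays.asList(childSchemas));
        return schema;
    }
}

// Hypothetical stand-in for a record field that may carry aliases.
final class SketchField {
    final String name;
    final SketchSchema schema;
    final List<String> aliases;

    SketchField(String name, SketchSchema schema, String... aliases) {
        this.name = name;
        this.schema = schema;
        this.aliases = Arrays.asList(aliases);
    }
}

final class AliasCheckSketch {
    // Returns true as soon as any field, at any nesting depth, declares an alias.
    static boolean hasFieldWithAliases(SketchSchema schema) {
        switch (schema.kind) {
            case RECORD:
                for (SketchField field : schema.fields) {
                    if (!field.aliases.isEmpty() || hasFieldWithAliases(field.schema)) {
                        return true;
                    }
                }
                return false;
            case ARRAY:
            case MAP:
            case UNION:
                for (SketchSchema child : schema.children) {
                    if (hasFieldWithAliases(child)) {
                        return true;
                    }
                }
                return false;
            default:
                // Primitives have no fields, so there is nothing to check.
                return false;
        }
    }
}
```

In this sketch, a record whose only alias sits on a field inside an array of nested records is still detected, because the walk descends through the array's element schema before giving up.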