bvaradar commented on code in PR #14364:
URL: https://github.com/apache/hudi/pull/14364#discussion_r2586833435


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java:
##########
@@ -337,7 +338,7 @@ public StreamSync(HoodieStreamer.Config cfg, SparkSession 
sparkSession,
     Source source = UtilHelpers.createSource(cfg.sourceClassName, props, 
hoodieSparkContext.jsc(), sparkSession, metrics, streamContext);
     this.formatAdapter = new SourceFormatAdapter(source, 
this.errorTableWriter, Option.of(props));
 
-    Supplier<Option<Schema>> schemaSupplier = schemaProvider == null ? 
Option::empty : () -> Option.ofNullable(schemaProvider.getSourceSchema());
+    Supplier<Option<Schema>> schemaSupplier = schemaProvider == null ? 
Option::empty : () -> 
Option.ofNullable(schemaProvider.getSourceHoodieSchema()).map(HoodieSchema::toAvroSchema);

Review Comment:
   I guess changing schemaSupplier to Supplier<Option<HoodieSchema>>  is in 
subsequent PR.



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/FilebasedSchemaProvider.java:
##########
@@ -70,31 +70,31 @@ public FilebasedSchemaProvider(TypedProperties props, 
JavaSparkContext jssc) {
     }
   }
 
-  private Schema parseSchema(String schemaFile) {
+  private HoodieSchema parseSchema(String schemaFile) {
     return readSchemaFromFile(schemaFile, this.fs, config);
   }
 
   @Override
-  public Schema getSourceSchema() {
+  public HoodieSchema getSourceHoodieSchema() {

Review Comment:
   While, we are the handling external implementation of SchemaProvider 
correctly, we are not preventing breakages when users extend from these 
concrete implementations which unfortunately is present.  If users extend 
FilebasedSchemaProvider and override "public Schema getTargetSchema", then 
their implementation will silently not be used. 
   
   I guess it is safe to  retain the existing implementation of  
getSourceSchema and getTargetSchema in each implementation of SchemaProvider 
with deprecated flag and remove in subsequent release.  
   
   



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/SchemaPostProcessor.java:
##########
@@ -59,5 +59,5 @@ protected SchemaPostProcessor(TypedProperties props, 
JavaSparkContext jssc) {
    * @param schema input schema.
    * @return modified schema.
    */
-  public abstract Schema processSchema(Schema schema);

Review Comment:
   Custom SchemaPostProcessor implementation is  plugged in through 
UtilHelpers.createSchemaPostProcessor. Can you do the same deprecation + 
delegation trick that you did in SchemaProvider here. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to