the-other-tim-brown commented on code in PR #13208:
URL: https://github.com/apache/hudi/pull/13208#discussion_r2061522695


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/common/SparkReaderContextFactory.java:
##########
@@ -88,6 +90,9 @@ class SparkReaderContextFactory implements ReaderContextFactory<InternalRow> {
     // Spark parquet reader has to be instantiated on the driver and broadcast to the executors
     SparkParquetReader parquetFileReader = sparkAdapter.createParquetFileReader(false, sqlConf, options, configs);
     parquetReaderBroadcast = jsc.broadcast(parquetFileReader);
+    // Broadcast: TableConfig.
+    HoodieTableConfig tableConfig = metaClient.getTableConfig();

Review Comment:
   In my opinion, the schema evolution setting should not live in the Hadoop configuration to begin with. It is not related to Spark or Hadoop; it is part of the Hudi logic and therefore belongs in a configuration for the Hudi reader.
   
   It also should not override the user-provided configuration the way it does today. That limits the user's ability to customize their job and to read their data after upgrading.
   
   I think we will generally need to rethink this approach, but is it a blocker for this PR?
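   To illustrate the precedence the comment is asking for, here is a minimal, self-contained Java sketch in which a reader-side default only fills in a setting the user did not supply, never overriding an explicit user value. The class name and the config key are hypothetical placeholders, not actual Hudi APIs or config names:
   
   ```java
   import java.util.HashMap;
   import java.util.Map;
   
   public class ReaderConfigPrecedence {
     // Hypothetical key for illustration only; not a real Hudi config name.
     static final String SCHEMA_EVOLUTION_KEY = "hoodie.reader.schema.evolution.enable";
   
     /**
      * Merge a default under the user's options: the user-provided value,
      * if present, always wins; the default only fills a missing entry.
      */
     static Map<String, String> withDefault(Map<String, String> userOptions,
                                            String key, String defaultValue) {
       Map<String, String> merged = new HashMap<>(userOptions);
       merged.putIfAbsent(key, defaultValue); // no-op when the user already set the key
       return merged;
     }
   
     public static void main(String[] args) {
       Map<String, String> user = new HashMap<>();
       user.put(SCHEMA_EVOLUTION_KEY, "false"); // user explicitly disabled it
   
       Map<String, String> merged = withDefault(user, SCHEMA_EVOLUTION_KEY, "true");
       System.out.println(merged.get(SCHEMA_EVOLUTION_KEY)); // prints "false"
     }
   }
   ```
   
   With this shape, upgrading the table can change the default without clobbering a job that explicitly configured the behavior, which is the customization concern raised above.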



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]