rahil-c commented on code in PR #13756:
URL: https://github.com/apache/hudi/pull/13756#discussion_r2296839504


##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieFileGroupReaderBasedFileFormat.scala:
##########
@@ -211,17 +211,19 @@ class HoodieFileGroupReaderBasedFileFormat(tablePath: String,
     val engineContext = new HoodieSparkEngineContext(new JavaSparkContext(spark.sparkContext))
     val maxMemoryPerCompaction = IOUtils.getMaxMemoryPerCompaction(engineContext.getTaskContextSupplier, options.asJava)
 
+    // Create metaclient on driver to avoid expensive operations on executors
+    val storageConf = new HadoopStorageConfiguration(broadcastedStorageConf.value.value)

Review Comment:
   @yihua I have pushed a commit to try using the `augmentedStorageConf`. I 
wonder, though, whether this is safe: I thought the reason for using `val 
broadcastedStorageConf = spark.sparkContext.broadcast(new 
SerializableConfiguration(augmentedStorageConf.unwrap()))` was that 
`augmentedStorageConf` may not be serializable, because the Hadoop 
`Configuration` it uses does not implement `Serializable`.
   
   If there is an issue with the recent commit, then I think it might be good to 
switch back to what it was before.
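   
   For reference, here is a minimal sketch of the serialization concern, using 
only the JDK so it runs without Spark or Hadoop on the classpath. `PlainConf`, 
`SerializableConfWrapper`, and `Demo` are hypothetical stand-ins (for Hadoop's 
`Configuration` and a wrapper in the spirit of `SerializableConfiguration`), 
not the actual Hudi or Spark classes.

```scala
import java.io._

// Hypothetical stand-in for Hadoop's Configuration: it holds key/value
// pairs and deliberately does NOT extend java.io.Serializable.
class PlainConf {
  private val props = scala.collection.mutable.Map.empty[String, String]
  def set(key: String, value: String): Unit = props(key) = value
  def get(key: String): Option[String] = props.get(key)
  def entries: Map[String, String] = props.toMap
}

// Hypothetical wrapper: the wrapped value is @transient, and the custom
// writeObject/readObject hooks copy its entries by hand.
class SerializableConfWrapper(@transient var value: PlainConf) extends Serializable {
  private def writeObject(out: ObjectOutputStream): Unit = {
    out.defaultWriteObject()
    val entries = value.entries
    out.writeInt(entries.size)
    entries.foreach { case (k, v) =>
      out.writeUTF(k)
      out.writeUTF(v)
    }
  }

  private def readObject(in: ObjectInputStream): Unit = {
    in.defaultReadObject()
    value = new PlainConf
    val count = in.readInt()
    (0 until count).foreach { _ =>
      val k = in.readUTF()
      val v = in.readUTF()
      value.set(k, v)
    }
  }
}

object Demo {
  // Serialize to an in-memory buffer and read the object back.
  def roundTrip[T <: AnyRef](obj: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(obj) finally out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val conf = new PlainConf
    conf.set("fs.defaultFS", "file:///")

    // Java serialization rejects the bare, non-serializable object.
    val rejected =
      try { roundTrip(conf); false }
      catch { case _: NotSerializableException => true }
    println(s"bare conf rejected: $rejected")

    // The wrapper survives the round trip and carries the entries along.
    val copy = roundTrip(new SerializableConfWrapper(conf))
    println(s"restored value: ${copy.value.get("fs.defaultFS")}")
  }
}
```

   If `augmentedStorageConf` already handles this kind of wrapping internally, 
the extra `SerializableConfiguration` layer may be unnecessary; I have not 
confirmed that here.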
   
   



##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieFileGroupReaderBasedFileFormat.scala:
##########
@@ -211,17 +211,19 @@ class HoodieFileGroupReaderBasedFileFormat(tablePath: String,
     val engineContext = new HoodieSparkEngineContext(new JavaSparkContext(spark.sparkContext))
     val maxMemoryPerCompaction = IOUtils.getMaxMemoryPerCompaction(engineContext.getTaskContextSupplier, options.asJava)
 
+    // Create metaclient on driver to avoid expensive operations on executors
+    val storageConf = new HadoopStorageConfiguration(broadcastedStorageConf.value.value)
+    val metaClient: HoodieTableMetaClient = HoodieTableMetaClient
+      .builder().setConf(storageConf).setBasePath(tablePath).build

Review Comment:
   Shall I create a follow-up JIRA for this?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
