yihua commented on code in PR #7528:
URL: https://github.com/apache/hudi/pull/7528#discussion_r1085670887
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieMergeOnReadRDD.scala:
##########
@@ -138,10 +137,16 @@ class HoodieMergeOnReadRDD(@transient sc: SparkContext,
override protected def getPartitions: Array[Partition] =
fileSplits.zipWithIndex.map(file => HoodieMergeOnReadPartition(file._2,
file._1)).toArray
- private def getConfig: Configuration = {
- val conf = confBroadcast.value.value
- CONFIG_INSTANTIATION_LOCK.synchronized {
- new Configuration(conf)
- }
+ private def getHadoopConf: Configuration = {
+ val conf = hadoopConfBroadcast.value.value
+ new Configuration(conf)
Review Comment:
Synced up offline. @alexeykudinkin and I are aligned that we need to
clean up the legacy code and remove any unnecessary code paths. For this
particular case, there could be a problem with concurrent modification of the
Hadoop conf returned by this function (check: `HoodieMergeOnReadRDD#compute` ->
`LogFileIterator.logRecords` -> `scanLog` -> `FSUtils.getFs` ->
`prepareHadoopConf` -> `conf.set`). That is likely why the lock was put in from
the beginning. So this change will be reverted in this PR, and we'll
revisit it in a separate PR to be merged after 0.13.0.
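To illustrate the concern: Hadoop's `Configuration(Configuration)` copy constructor iterates the source's backing properties, so a concurrent `conf.set(...)` from another thread (as in the `FSUtils.getFs` -> `prepareHadoopConf` path above) can break the copy mid-iteration. Below is a hedged, self-contained sketch of the locked-copy pattern that `CONFIG_INSTANTIATION_LOCK` implements; it uses `java.util.Properties` as a stand-in for `Configuration` so it runs without Hadoop on the classpath, and the object and method names are illustrative, not Hudi APIs.

```scala
import java.util.Properties

object ConfCopySketch {
  // Hypothetical lock, mirroring CONFIG_INSTANTIATION_LOCK in HoodieMergeOnReadRDD.
  private val ConfigInstantiationLock = new Object

  // Stand-in for Hadoop's Configuration copy constructor: copying iterates
  // the shared conf's entries, which is where a concurrent set() can bite.
  def unsafeCopy(shared: Properties): Properties = {
    val copy = new Properties()
    copy.putAll(shared) // iterates `shared` while copying
    copy
  }

  // Serializing all copies through one lock ensures no two threads clone
  // (and thereby iterate) the shared broadcast conf at the same time.
  def lockedCopy(shared: Properties): Properties =
    ConfigInstantiationLock.synchronized {
      unsafeCopy(shared)
    }
}
```

Each task then mutates its own private copy freely; only the copy step itself is serialized, which is why removing the lock (as this PR originally did) looked safe but reintroduces the race the lock was guarding against.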
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]