advancedxy commented on a change in pull request #25616: [SPARK-28907][CORE]
Review invalid usage of new Configuration()
URL: https://github.com/apache/spark/pull/25616#discussion_r320109590
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFileWholeTextReader.scala
##########
@@ -45,6 +45,7 @@ class HadoopFileWholeTextReader(file: PartitionedFile, conf: Configuration)
     val attemptId = new TaskAttemptID(new TaskID(new JobID(), TaskType.MAP, 0), 0)
     val hadoopAttemptContext = new TaskAttemptContextImpl(conf, attemptId)
     val reader = new WholeTextFileRecordReader(fileSplit, hadoopAttemptContext, 0)
+    reader.setConf(hadoopAttemptContext.getConfiguration)
Review comment:
> I see, they're not failing in master but can fail if run in an env where
> Hadoop config files are present?

Yes, they are not failing in master. Normally the code in master won't fail
even in an env where Hadoop configs are present. They could fail or produce
unexpected results if the Hadoop configs are incorrectly configured in the
executor env (such as yarn-cluster), even when the user supplies correct
configs (passed to `TaskAttemptContext`
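
To illustrate the concern, here is a minimal, self-contained sketch. It assumes that a reader which never receives a configuration falls back to whatever the executor environment provides. `Conf`, `Conf.fromEnvironment`, `SketchReader` and the host names are hypothetical stand-ins, not the Hadoop or Spark classes, and the sketch does not reproduce the actual behavior of `WholeTextFileRecordReader`.

```scala
// Hypothetical stand-in for a configuration object (not Hadoop's Configuration).
final case class Conf(settings: Map[String, String]) {
  def get(key: String, default: String): String = settings.getOrElse(key, default)
}

object Conf {
  // Assumption: models what a default-constructed configuration would pick up
  // from (possibly wrong) config files in the executor environment.
  val fromEnvironment: Conf = Conf(Map("fs.defaultFS" -> "hdfs://misconfigured:8020"))
}

// Hypothetical reader that uses environment defaults until a conf is injected.
final class SketchReader {
  private var conf: Conf = Conf.fromEnvironment
  def setConf(c: Conf): Unit = { conf = c }
  def defaultFs: String = conf.get("fs.defaultFS", "file:///")
}

object ConfPropagationSketch {
  def main(args: Array[String]): Unit = {
    // The configuration the user actually supplied.
    val userConf = Conf(Map("fs.defaultFS" -> "hdfs://correct:8020"))

    // Without setConf the reader resolves settings from the environment.
    val withoutSetConf = new SketchReader
    println(withoutSetConf.defaultFs)

    // With setConf the reader honors the user-supplied configuration.
    val withSetConf = new SketchReader
    withSetConf.setConf(userConf)
    println(withSetConf.defaultFs)
  }
}
```

Under that assumption, passing the task attempt context's configuration to the reader explicitly, as the added `reader.setConf(...)` line does, keeps the reader on the user-supplied settings regardless of what the executor environment contains.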