mridulm commented on code in PR #40945:
URL: https://github.com/apache/spark/pull/40945#discussion_r1191863241
##########
core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala:
##########
@@ -623,15 +623,13 @@ private[spark] object SparkHadoopUtil extends Logging {
fs.create(path)
} else {
try {
- // Use reflection as this uses APIs only available in Hadoop 3
- val builderMethod = fs.getClass().getMethod("createFile", classOf[Path])
// the builder api does not resolve relative paths, nor does it create parent dirs, while
// the old api does.
if (!fs.mkdirs(path.getParent())) {
throw new IOException(s"Failed to create parents of $path")
}
val qualifiedPath = fs.makeQualified(path)
- val builder = builderMethod.invoke(fs, qualifiedPath)
+ val builder = fs.createFile(qualifiedPath)
val builderCls = builder.getClass()
// this may throw a NoSuchMethodException if the path is not on hdfs
val replicateMethod = builderCls.getMethod("replicate")
Review Comment:
The intent here is not to invoke any arbitrary `replicate` method that might
happen to exist on an `FSDataOutputStreamBuilder`, but to invoke it specifically
for HDFS's `HdfsDataOutputStreamBuilder` and its subclasses.
Unfortunately, this is not yet resolved at the API level (see HDFS-14038).
See [this
discussion](https://github.com/apache/spark/pull/22881#discussion_r229102457)
for more.
If other `FileSystem` implementations break because of this change, we should
address them appropriately on a case-by-case basis, or get the HDFS team to fix
their interfaces, rather than keep a broken contract.
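For illustration, the probe-by-reflection pattern discussed above can be sketched in isolation. This is a hypothetical, self-contained example (the class names `BaseBuilder` and `HdfsLikeBuilder` are stand-ins, not Hadoop types): the caller looks up a subclass-only method by name and falls back gracefully when it is absent, which is exactly how the Spark code detects whether the builder returned by `createFile` supports `replicate()`.

```scala
// Hypothetical stand-in for FSDataOutputStreamBuilder: the base type does
// not declare replicate().
class BaseBuilder {
  def build(): String = "built"
}

// Hypothetical stand-in for HdfsDataOutputStreamBuilder, which does expose
// a replicate() method on top of the base builder API.
class HdfsLikeBuilder extends BaseBuilder {
  def replicate(): HdfsLikeBuilder = this
}

object ReplicateProbe {
  // Returns true if the builder's runtime class declares a public
  // no-arg replicate() method; getMethod throws NoSuchMethodException
  // otherwise, and we treat that as "not HDFS-like".
  def maybeReplicate(builder: BaseBuilder): Boolean = {
    try {
      builder.getClass.getMethod("replicate").invoke(builder)
      true
    } catch {
      case _: NoSuchMethodException => false
    }
  }
}
```

Note that, as the review comment points out, this probe matches any class with a `replicate()` method, not only HDFS builders; until HDFS-14038 lands there is no interface to check against, so the method-name probe is the best available signal.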
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]