mridulm commented on code in PR #40945:
URL: https://github.com/apache/spark/pull/40945#discussion_r1191804004


##########
core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala:
##########
@@ -623,15 +623,13 @@ private[spark] object SparkHadoopUtil extends Logging {
       fs.create(path)
     } else {
       try {
-        // Use reflection as this uses APIs only available in Hadoop 3
-        val builderMethod = fs.getClass().getMethod("createFile", classOf[Path])
         // the builder api does not resolve relative paths, nor does it create parent dirs, while
         // the old api does.
         if (!fs.mkdirs(path.getParent())) {
           throw new IOException(s"Failed to create parents of $path")
         }
         val qualifiedPath = fs.makeQualified(path)
-        val builder = builderMethod.invoke(fs, qualifiedPath)
+        val builder = fs.createFile(qualifiedPath)
         val builderCls = builder.getClass()
         // this may throw a NoSuchMethodException if the path is not on hdfs
         val replicateMethod = builderCls.getMethod("replicate")

Review Comment:
   QQ: If this is specific to `HdfsDataOutputStreamBuilder` and we expect to invoke `replicate` on that class and its subclasses, why not tighten the code to check for that class (and its subclasses) explicitly?
   With Hadoop 2.x, I can understand the generic `replicate` method invocation, but we now have the opportunity to be more nuanced.
   
   I am fine with moving that snippet to a different PR, though.
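   For illustration, something like the sketch below is what I have in mind. This is hypothetical, not the PR's code: it assumes the HDFS client classes are always on the classpath, `createReplicatedFile` is an invented name, and the parent-dir creation from the current code is omitted for brevity.
   
   ```scala
   import org.apache.hadoop.fs.{FSDataOutputStream, FileSystem, Path}
   import org.apache.hadoop.hdfs.DistributedFileSystem.HdfsDataOutputStreamBuilder
   
   // Hypothetical helper: create a file, forcing plain replication
   // (no erasure coding) when the target filesystem is HDFS.
   def createReplicatedFile(fs: FileSystem, path: Path): FSDataOutputStream = {
     fs.createFile(fs.makeQualified(path)) match {
       // HDFS-specific builder (and subclasses): request a replicated file.
       case hdfsBuilder: HdfsDataOutputStreamBuilder =>
         hdfsBuilder.replicate().build()
       // Any other filesystem: the generic builder has no replicate().
       case other =>
         other.build()
     }
   }
   ```
   
   The trade-off, if I read the current code correctly: referencing `HdfsDataOutputStreamBuilder` directly requires the HDFS client jar at runtime, whereas the reflective `getMethod("replicate")` lookup degrades gracefully when it is absent.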



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

