dongjoon-hyun commented on a change in pull request #24018: [SPARK-23749][SQL] Workaround built-in Hive api changes (phase 1)
URL: https://github.com/apache/spark/pull/24018#discussion_r264057222
 
 

 ##########
 File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala
 ##########
 @@ -253,6 +253,13 @@ private[hive] trait SaveAsHiveFile extends DataWritingCommand {
     dir
   }
 
+  // HIVE-14259 removed FileUtils.isSubDir(). Adapted it from Hive 1.2's FileUtils.isSubDir().
 
 Review comment:
   Hi, @wangyum, @felixcheung, @srowen, @gatorsmile, @cloud-fan, @dbtsai, @rxin.
   
   In general, I have no objection to this PR's approach. But I want to point out that we are effectively embedding an old Hive bug into the Spark source code repository. This is one of the typical examples I want to avoid. HIVE-14259 purposely changed `public isSubDir` to `private isSubDir` because of an old bug in which a path like `/dir12` is considered a subdirectory of `/dir1`. Recent Apache Hive releases no longer have this bug; after this PR, only Spark will carry it, in Spark's own code. I'm worried about a situation where the whole responsibility for this code falls on the Apache Spark community.
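   
   To make the referenced bug concrete, here is a minimal sketch (with hypothetical method names; this is not the actual Hive 1.2 or HIVE-14259 code): a plain `startsWith` prefix check treats `/dir12` as a subdirectory of `/dir1`, whereas appending a path separator before comparing avoids that false positive.
   
   ```scala
   import org.apache.hadoop.fs.Path
   
   // Hypothetical sketch of the prefix-matching bug described above, not the actual Hive code.
   // A plain prefix check wrongly treats "/dir12" as a subdirectory of "/dir1".
   def naiveIsSubDir(p1: Path, p2: Path): Boolean =
     p1.toUri.getPath.startsWith(p2.toUri.getPath)
   
   // Appending the path separator before comparing avoids that false positive.
   def isSubDir(p1: Path, p2: Path): Boolean = {
     val path1 = p1.toUri.getPath + Path.SEPARATOR
     val path2 = p2.toUri.getPath + Path.SEPARATOR
     path1.startsWith(path2)
   }
   
   // naiveIsSubDir(new Path("/dir12"), new Path("/dir1"))  // true  (wrong)
   // isSubDir(new Path("/dir12"), new Path("/dir1"))       // false (correct)
   // isSubDir(new Path("/dir1/a"), new Path("/dir1"))      // true  (correct)
   ```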
