dongjoon-hyun commented on a change in pull request #24018: [SPARK-23749][SQL]
Workaround built-in Hive api changes (phase 1)
URL: https://github.com/apache/spark/pull/24018#discussion_r264057222
##########
File path:
sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala
##########
@@ -253,6 +253,13 @@ private[hive] trait SaveAsHiveFile extends
DataWritingCommand {
dir
}
+ // HIVE-14259 removed FileUtils.isSubDir(). Adapted it from Hive 1.2's
FileUtils.isSubDir().
Review comment:
Hi, @wangyum, @felixcheung, @srowen, @gatorsmile, @cloud-fan, @dbtsai, @rxin.
In general, I have no objection to this PR's approach. However, I want to point
out that we would logically be embedding an old Hive bug into the Spark source
code repository, which is exactly the kind of situation I want to avoid.
HIVE-14259 purposely changed the `public isSubDir` to a `private isSubDir`
because of an old bug where a path such as `/dir12` is considered a
subdirectory of `/dir1`. Recent Apache Hive releases don't have this bug;
after this PR, only Spark's copy of the code would. I'm worried that the whole
maintenance responsibility then falls on the Apache Spark community.
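
For context on the prefix bug mentioned above, here is a minimal sketch (plain string paths and illustrative method names, not the actual Hive or Spark source) of why a naive string-prefix check misclassifies `/dir12` as a child of `/dir1`, and of a boundary-aware variant that does not:

```scala
object IsSubDirSketch extends App {
  // Naive check: roughly how a plain string-prefix comparison behaves.
  // "/dir12".startsWith("/dir1") is true, so /dir12 looks like a subdirectory of /dir1.
  def isSubDirBuggy(child: String, parent: String): Boolean =
    child.startsWith(parent)

  // Boundary-aware variant: require a path separator right after the parent prefix.
  def isSubDirFixed(child: String, parent: String): Boolean = {
    val parentWithSep = if (parent.endsWith("/")) parent else parent + "/"
    child == parent || child.startsWith(parentWithSep)
  }

  assert(isSubDirBuggy("/dir12", "/dir1"))     // false positive: the bug described above
  assert(!isSubDirFixed("/dir12", "/dir1"))    // separator check rejects it
  assert(isSubDirFixed("/dir1/sub", "/dir1"))  // real subdirectory still accepted
}
```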