dongjoon-hyun edited a comment on pull request #29895: URL: https://github.com/apache/spark/pull/29895#issuecomment-700821012
Hi, @steveloughran and @tgravescs . No matter what happens in future Hadoop releases, they cannot change history (Apache Hadoop 3.2.0 and all existing Hadoop 3.x versions). And, for now, Apache Spark 3.1 will be stuck on Apache Hadoop 3.2.0 due to the Guava issue. That is why we need to do this right now on the Spark side.

Regarding the following, @steveloughran : as I wrote in the PR description, this PR doesn't override an explicit user-given config. It only sets `v1` when there is no explicit setting.

> V2 is used in places where people have hit the scale limits with v1, and they are happy with the risk of failures.

Eventually, I believe we can depend on `hadoop-client-runtime` only, in order to remove the Guava dependency (#29843), and consume @steveloughran 's new Hadoop release in the future. Until then, Apache Spark 3.1 should provide a migration with no known correctness regressions. If the Apache Spark 3.1 default distribution is unsafe because of a third-party dependency (in this case, Hadoop), how can we recommend it to users?
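For illustration, here is a minimal sketch (not part of this PR) of how a user who has hit the scale limits of v1, and accepts the risk, can still opt back in to v2 explicitly. The app name is hypothetical; the `spark.hadoop.` prefix is Spark's standard way to pass a Hadoop property through to the Hadoop `Configuration`:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical opt-in: because the PR only applies the v1 default when the
// property is unset, an explicit user setting like this one still wins.
val spark = SparkSession.builder()
  .appName("committer-v2-opt-in") // hypothetical app name
  .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")
  .getOrCreate()
```

The same opt-in can also be passed on the command line, e.g. `spark-submit --conf spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2 ...`.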
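As a sketch of the `hadoop-client-runtime` direction above (my assumption of how such a dependency could look in an sbt build; the version is illustrative), the shaded client artifacts relocate Hadoop's third-party dependencies, including Guava, so they no longer clash with Spark's:

```scala
// Hypothetical sbt sketch: replace individual Hadoop modules with the shaded
// client artifacts. hadoop-client-api carries the public Hadoop classes;
// hadoop-client-runtime carries the relocated third-party dependencies.
libraryDependencies ++= Seq(
  "org.apache.hadoop" % "hadoop-client-api"     % "3.2.0",
  "org.apache.hadoop" % "hadoop-client-runtime" % "3.2.0" % Runtime
)
```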
