sunchao commented on a change in pull request #30701:
URL: https://github.com/apache/spark/pull/30701#discussion_r558522227
##########
File path:
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala
##########
@@ -112,11 +112,24 @@ private[hive] object IsolatedClientLoader extends Logging
{
hadoopVersion: String,
ivyPath: Option[String],
remoteRepos: String): Seq[URL] = {
+ val hadoopJarNames = if (hadoopVersion.startsWith("3")) {
Review comment:
Do you mean
[HADOOP-16080](https://issues.apache.org/jira/browse/HADOOP-16080)? Yes, things
could still break in the following cases:
1. Users build Spark without `-Phadoop-cloud` AND use a Hadoop version that
doesn't have the fix from HADOOP-16080, such as:
```
$ bin/spark-shell --packages org.apache.hadoop:hadoop-aws:3.2.0,org.apache.hadoop:hadoop-common:3.2.0
```
However, I think we should recommend that users stick to the same Hadoop
version used by Spark, i.e., 3.2.2.
2. Users build Spark with a custom Hadoop version, such as the 3.1.0/3.2.1 you
mentioned, via the `hadoop.version` property, and use that build to talk to
cloud storage like S3.
To enable these use cases, we may have to introduce another Maven property to
switch back to the non-shaded client, and update the code here as well.
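
As a rough sketch of what the change here could look like, assuming a boolean derived from such a property is available: the `useShadedClient` flag and the `HadoopJarNames` wrapper below are hypothetical names for illustration only, and the artifact coordinates in the Hadoop 3 branch are my assumption about what the shaded client resolves to, not a quote from this PR.

```scala
object HadoopJarNames {
  // `useShadedClient` is a hypothetical flag; no such option exists in Spark today.
  // It stands in for whatever the new Maven property would eventually control.
  def hadoopJarNames(hadoopVersion: String, useShadedClient: Boolean): Seq[String] = {
    if (hadoopVersion.startsWith("3") && useShadedClient) {
      // Shaded client artifacts, published for Hadoop 3.x only.
      Seq(
        s"org.apache.hadoop:hadoop-client-api:$hadoopVersion",
        s"org.apache.hadoop:hadoop-client-runtime:$hadoopVersion")
    } else {
      // Non-shaded client: used for Hadoop 2.x, or when the user opts out of shading.
      Seq(s"org.apache.hadoop:hadoop-client:$hadoopVersion")
    }
  }
}
```

With something like this, a user on 3.2.0 could opt out of the shaded client and still pull a matching `hadoop-aws` via `--packages`.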