xkrogen commented on a change in pull request #30701:
URL: https://github.com/apache/spark/pull/30701#discussion_r558493411



##########
File path: pom.xml
##########
@@ -2420,17 +2473,6 @@
                 </rules>
               </configuration>
             </execution>
-            <execution>
-              <id>enforce-no-duplicate-dependencies</id>
-              <goals>
-                <goal>enforce</goal>
-              </goals>
-              <configuration>
-                <rules>
-                  <banDuplicatePomDependencyVersions/>
-                </rules>
-              </configuration>
-            </execution>

Review comment:
       Is there any way to provide specific exclusions for the enforcement? 
It's a shame to have to completely turn off the rule for this.

##########
File path: project/SparkBuild.scala
##########
@@ -271,6 +271,9 @@ object SparkBuild extends PomBuild {
       DefaultMavenRepository,
       Resolver.mavenLocal,
       Resolver.file("ivyLocal", file(Path.userHome.absolutePath + 
"/.ivy2/local"))(Resolver.ivyStylePatterns)
+    ) ++ Seq(
+      "hive-staged-releases-mirror" at 
"https://repository.apache.org/content/repositories/staging/";,
+      Resolver.file("local", file(Path.userHome.absolutePath + 
"/.ivy2/local"))(Resolver.ivyStylePatterns)

Review comment:
       Why are we adding a duplicate resolver for the Ivy2 cache?

##########
File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala
##########
@@ -112,11 +112,24 @@ private[hive] object IsolatedClientLoader extends Logging 
{
       hadoopVersion: String,
       ivyPath: Option[String],
       remoteRepos: String): Seq[URL] = {
+    val hadoopJarNames = if (hadoopVersion.startsWith("3")) {
+      Seq(s"org.apache.hadoop:hadoop-client-api:$hadoopVersion",
+        s"org.apache.hadoop:hadoop-client-runtime:$hadoopVersion")
+    } else {
+      Seq(s"org.apache.hadoop:hadoop-client:$hadoopVersion")
+    }
     val hiveArtifacts = version.extraDeps ++
       Seq("hive-metastore", "hive-exec", "hive-common", "hive-serde")
         .map(a => s"org.apache.hive:$a:${version.fullVersion}") ++
-      Seq("com.google.guava:guava:14.0.1",
-        s"org.apache.hadoop:hadoop-client:$hadoopVersion")
+      Seq("com.google.guava:guava:14.0.1") ++ hadoopJarNames
+
+    val extraExclusions = if (hadoopVersion.startsWith("3")) {
+      // this introduced from lower version of Hive could conflict with jars 
in Hadoop 3.2+, so
+      // exclude here in favor of the ones in Hadoop 3.2+
+      Seq("org.apache.hadoop:hadoop-auth")

Review comment:
       Doesn't Hive pull in other non-shaded Hadoop dependencies that could 
cause issues? 

##########
File path: project/SparkBuild.scala
##########
@@ -271,6 +271,9 @@ object SparkBuild extends PomBuild {
       DefaultMavenRepository,
       Resolver.mavenLocal,
       Resolver.file("ivyLocal", file(Path.userHome.absolutePath + 
"/.ivy2/local"))(Resolver.ivyStylePatterns)
+    ) ++ Seq(

Review comment:
       Why do we concatenate two `Seq` instead of just adding inside of the 
`Seq()` call above?

##########
File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala
##########
@@ -112,11 +112,24 @@ private[hive] object IsolatedClientLoader extends Logging 
{
       hadoopVersion: String,
       ivyPath: Option[String],
       remoteRepos: String): Seq[URL] = {
+    val hadoopJarNames = if (hadoopVersion.startsWith("3")) {

Review comment:
       Will this break if `hadoopVersion` is 3.1.0, 3.2.1, etc. (due to the 
previous issues with the shaded client JARs)?

##########
File path: project/SparkBuild.scala
##########
@@ -271,6 +271,9 @@ object SparkBuild extends PomBuild {
       DefaultMavenRepository,
       Resolver.mavenLocal,
       Resolver.file("ivyLocal", file(Path.userHome.absolutePath + 
"/.ivy2/local"))(Resolver.ivyStylePatterns)
+    ) ++ Seq(
+      "hive-staged-releases-mirror" at 
"https://repository.apache.org/content/repositories/staging/";,

Review comment:
       Why do we need the staging repository (here and in `SQLConf`) ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to