JoshRosen commented on a change in pull request #34100:
URL: https://github.com/apache/spark/pull/34100#discussion_r715946127
##########
File path: pom.xml
##########
@@ -3273,7 +3273,7 @@
<curator.version>2.7.1</curator.version>
<commons-io.version>2.4</commons-io.version>
<hadoop-client-api.artifact>hadoop-client</hadoop-client-api.artifact>
-
<hadoop-client-runtime.artifact>hadoop-client</hadoop-client-runtime.artifact>
+
<hadoop-client-runtime.artifact>hadoop-yarn-api</hadoop-client-runtime.artifact>
Review comment:
Ahhh, this is a clever fix:
Instead of the `hadoop-2.7` profile resulting in a duplicate direct
dependency on `hadoop-client`, we now just declare an explicit dependency on
one of `hadoop-client`'s transitive dependencies (`hadoop-yarn-api` in this
case). Anything which depends on `hadoop-client-runtime.artifact` must also
depend on `hadoop-client-api.artifact`, so this doesn't end up changing the set
of dependencies pulled in.
It looks like we didn't need to do that for
`hadoop-client-minicluster.artifact` because that's only used in the
`resource-managers/yarn` POM and that's already using Maven profiles to control
the dependency selection (so the other workaround is less invasive in that
context). In principle, though, I guess we could have changed that to some
other transitive dep.
---
Could you maybe add a one or two line comment above these Hadoop 2.7 lines
to explain what's going on? And maybe edit the comment at
https://github.com/apache/spark/blob/d73562ed3635bb3454ac67029ca6541b30ae0c02/pom.xml#L251-L255
to reflect this change? This fix is clever but a little subtle, so I think a
comment calling it out (and maybe mentioning SPARK-36835) might help future
readers.
**Edit:** could you also update the PR description to reflect this final
fix?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]