sunchao commented on pull request #29843: URL: https://github.com/apache/spark/pull/29843#issuecomment-735284760
Thanks @steveloughran, yeah, agreed that the Java 9 modules feature looks promising (it was discussed some years back in [HADOOP-11656](https://issues.apache.org/jira/browse/HADOOP-11656), but the timing should be better now). I can try to spend some time looking at this, though it will take a while.

In the meantime, I'm wondering what we can do to ship this in the upcoming Spark 3.1 release. One possible solution, maybe, is to still use the non-shaded client when the `hadoop-cloud` profile is picked up:

```diff
diff --git a/pom.xml b/pom.xml
index 3ae2e7420e..12c36af557 100644
--- a/pom.xml
+++ b/pom.xml
@@ -3238,6 +3238,11 @@
     <profile>
       <id>hadoop-cloud</id>
+      <properties>
+        <hadoop-client-api.artifact>hadoop-client</hadoop-client-api.artifact>
+        <hadoop-client-runtime.artifact>hadoop-client</hadoop-client-runtime.artifact>
+        <hadoop-client-minicluster.artifact>hadoop-client</hadoop-client-minicluster.artifact>
+      </properties>
       <modules>
         <module>hadoop-cloud</module>
       </modules>
```

Or maybe we could try to shade the `hadoop-aws` jar in the `spark-hadoop-cloud_2.12` module itself so that it invokes the shaded API on the `hadoop-common` side. This won't work if Spark users decide to use their own Hadoop jars (via `hadoop-provided`), so we may have to make `hadoop-aws` a compile-scope dependency.
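
For the second option, a rough and untested sketch of what the shading might look like in `hadoop-cloud/pom.xml`. Everything here is an assumption for illustration: the relocation shown (Guava) is just one example package, the `org.apache.hadoop.shaded` prefix is assumed to match what `hadoop-client-api` uses, and the real list of relocations would have to mirror whatever the shaded client actually relocates.

```xml
<!-- Sketch only: pull hadoop-aws in at compile scope so it can be shaded here. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-aws</artifactId>
  <version>${hadoop.version}</version>
  <scope>compile</scope>
</dependency>

<!-- Sketch only: rewrite hadoop-aws bytecode so its references to third-party
     classes point at the relocated copies used by the shaded Hadoop client. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <artifactSet>
          <includes>
            <include>org.apache.hadoop:hadoop-aws</include>
          </includes>
        </artifactSet>
        <relocations>
          <!-- Illustrative relocation; the actual set of packages is an assumption. -->
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>org.apache.hadoop.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

The trade-off is the one noted above: because `hadoop-aws` gets bundled and rewritten at build time, this would not help users who supply their own Hadoop jars via `hadoop-provided`.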
