srowen commented on a change in pull request #25559: [WIP][DO-NOT-MERGE] Test
updating Kinesis deps and current state of Kinesis Python tests
URL: https://github.com/apache/spark/pull/25559#discussion_r318321559
##########
File path: hadoop-cloud/pom.xml
##########
@@ -100,6 +100,10 @@
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
</exclusion>
+ <exclusion>
Review comment:
Thanks @steveloughran. Given that we are, for better or worse, still on
Hadoop 2.7 here (I think I need to backport this to 2.4 at least), is it
safe to exclude the whole aws-java-sdk dependency? It doesn't seem so, since
the user would then have to re-include it. Or is it safe to assume they would
be running this on Hadoop anyway?
It sounds like you are saying that in Hadoop 2.9 this dependency wouldn't
exist, or could be excluded.
Excluding it definitely solved the problem. Right now I'm
checking what happens if we explicitly manage its version up _as a direct
dependency_, because managing it up with `<dependencyManagement>` alone wasn't
enough. The downside is probably that the assembly then brings in everything
`aws-java-sdk` depends on, which is a lot. We don't distribute the
assembly per se (right?), so it shouldn't require more careful license checks
of all those dependencies.
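
For illustration, the direct-dependency approach might look roughly like the
fragment below. This is only a sketch: the `com.amazonaws` coordinates and the
`${aws.java.sdk.version}` property are assumptions, and no concrete version is
implied.

```xml
<!-- Sketch only: declaring the SDK directly means Maven's nearest-wins
     mediation prefers this version over the transitive one. -->
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk</artifactId>
  <!-- Assumed placeholder property; the actual version is not stated here. -->
  <version>${aws.java.sdk.version}</version>
</dependency>
```
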
Still, if it were somehow fine to exclude this dependency, that would be the
tidiest option from Spark's perspective. Does that fly for Hadoop 2.7, or does
it pretty much defeat the point of `hadoop-cloud`?
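
For reference, the exclusion option would look something like the fragment
below. It is a sketch under assumptions: that the exclusion sits on the
`hadoop-aws` dependency, that the SDK coordinates are
`com.amazonaws:aws-java-sdk`, and that `${hadoop.version}` is the version
property in use.

```xml
<!-- Sketch only: drop the transitive AWS SDK that comes in via hadoop-aws.
     The user (or the cluster's Hadoop install) would then have to supply it. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-aws</artifactId>
  <!-- Assumed placeholder property. -->
  <version>${hadoop.version}</version>
  <exclusions>
    <exclusion>
      <groupId>com.amazonaws</groupId>
      <artifactId>aws-java-sdk</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```
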