[
https://issues.apache.org/jira/browse/SPARK-49508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891015#comment-17891015
]
Steve Loughran commented on SPARK-49508:
----------------------------------------
> hadoop aws only requires the use of aws-java-sdk-s3 and aws-java-sdk-dynamodb
It also requires all the dependencies of these artifacts to be in sync with what
the rest of Spark ships, including things like Jackson.
That's the reason hadoop-aws uses the shaded artifacts from AWS. It's not "hey,
someone might also want to use AWS Ground Control to talk to satellite downlinks!";
it is "if we upgrade the AWS SDK then HBase fails. Or Spark breaks".
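To illustrate the sync problem: the unshaded SDK v1 modules pull in Jackson transitively (aws-java-sdk-core depends on jackson-databind and jackson-dataformat-cbor), so a build that swaps the bundle for them would have to pin those artifacts to whatever the rest of the build ships. A minimal sketch, assuming a hypothetical ${jackson.version} property that names that version:

{code:xml}
<dependencyManagement>
  <dependencies>
    <!-- Assumption: ${jackson.version} is a hypothetical property holding
         the Jackson version used by the rest of the build. -->
    <dependency>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-databind</artifactId>
      <version>${jackson.version}</version>
    </dependency>
    <dependency>
      <groupId>com.fasterxml.jackson.dataformat</groupId>
      <artifactId>jackson-dataformat-cbor</artifactId>
      <version>${jackson.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>
{code}

Pinning only forces a single version onto the classpath; it does not guarantee that the SDK release was built or tested against that version, which is the breakage the shaded bundle avoids.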
If someone wants to help produce a "lean AWS SDK" with only those core libs
*and all shaded dependencies*, we'd be happy to switch.
Oh, and Hadoop 3.4.0+ requires the v2 SDK library, "bundle.jar". Same
problem, different filenames.
> Optimized hadoop-aws dependency, aws-java-sdk-bundle jar is too large
> ---------------------------------------------------------------------
>
> Key: SPARK-49508
> URL: https://issues.apache.org/jira/browse/SPARK-49508
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 4.0.0, 3.5.2
> Reporter: melin
> Priority: Major
> Attachments: image-2024-09-06-17-29-33-066.png
>
>
> The aws-java-sdk-bundle jar is too large; the size of the Spark image will
> double. hadoop-aws only requires the use of aws-java-sdk-s3 and
> aws-java-sdk-dynamodb.
>
> {code:java}
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-aws</artifactId>
>   <version>${hadoop.version}</version>
>   <exclusions>
>     <exclusion>
>       <groupId>com.amazonaws</groupId>
>       <artifactId>aws-java-sdk-bundle</artifactId>
>     </exclusion>
>   </exclusions>
> </dependency>
> <dependency>
>   <groupId>com.amazonaws</groupId>
>   <artifactId>aws-java-sdk-s3</artifactId>
>   <version>${awssdk.v1.version}</version>
> </dependency>
> <dependency>
>   <groupId>com.amazonaws</groupId>
>   <artifactId>aws-java-sdk-dynamodb</artifactId>
>   <version>${awssdk.v1.version}</version>
> </dependency>
> {code}