[ 
https://issues.apache.org/jira/browse/SPARK-49508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891015#comment-17891015
 ] 

Steve Loughran commented on SPARK-49508:
----------------------------------------

> hadoop aws only requires the use of aws-java-sdk-s3 and aws-java-sdk-dynamodb

It also requires all the dependencies of these artifacts to be in sync with what 
the rest of Spark ships, including things like Jackson.

That's the reason hadoop-aws uses the shaded artifacts from AWS. It's not "hey, 
someone might also want to use AWS Ground Control to talk to satellite 
downlinks!"; it is "if we upgrade the AWS SDK then HBase fails. Or Spark breaks".

If someone wants to help produce a "lean AWS SDK" with only those core libs 
*and all shaded dependencies*, we'd be happy to switch.

Oh, and hadoop 3.4.0+ requires the v2 SDK library, "bundle.jar". Same problem, 
different filenames.
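
For reference, a rough sketch of how that v2 bundle might be declared in a pom, 
assuming its Maven coordinates are software.amazon.awssdk:bundle. The 
${awssdk.v2.version} property name is hypothetical and is not taken from the 
Spark build:

{code:xml}
<!-- sketch only: the v2 SDK counterpart of aws-java-sdk-bundle -->
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>bundle</artifactId>
    <!-- hypothetical property; keep it in sync with the SDK version your hadoop-aws release expects -->
    <version>${awssdk.v2.version}</version>
</dependency>
{code}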


> Optimized hadoop-aws dependency, aws-java-sdk-bundle jar is too large
> ---------------------------------------------------------------------
>
>                 Key: SPARK-49508
>                 URL: https://issues.apache.org/jira/browse/SPARK-49508
>             Project: Spark
>          Issue Type: Improvement
>          Components: Build
>    Affects Versions: 4.0.0, 3.5.2
>            Reporter: melin
>            Priority: Major
>         Attachments: image-2024-09-06-17-29-33-066.png
>
>
> The aws-java-sdk-bundle jar is too large; the size of the Spark image will 
> double. hadoop aws only requires the use of aws-java-sdk-s3 and 
> aws-java-sdk-dynamodb.
>  
> {code:xml}
> <dependency>
>     <groupId>org.apache.hadoop</groupId>
>     <artifactId>hadoop-aws</artifactId>
>     <version>${hadoop.version}</version>
>     <exclusions>
>         <exclusion>
>             <groupId>com.amazonaws</groupId>
>             <artifactId>aws-java-sdk-bundle</artifactId>
>         </exclusion>
>     </exclusions>
> </dependency>
> <dependency>
>     <groupId>com.amazonaws</groupId>
>     <artifactId>aws-java-sdk-s3</artifactId>
>     <version>${awssdk.v1.version}</version>
> </dependency>
> <dependency>
>     <groupId>com.amazonaws</groupId>
>     <artifactId>aws-java-sdk-dynamodb</artifactId>
>     <version>${awssdk.v1.version}</version>
> </dependency> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

