[
https://issues.apache.org/jira/browse/HADOOP-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17243994#comment-17243994
]
Steve Loughran commented on HADOOP-17337:
-----------------------------------------
Checking up on this. I'd like to make this a blocker for 3.3.1, as it is bad
news for lightweight Docker deployments.
# [~cwensel] do you know anyone who can work on this?
# What do people think would be the best way to do it?
I'm thinking we'd need a PatchTheSocketFactory interface: try to load an
implementation which patches the shaded socket factory and, if that doesn't
load, fall back to one which patches the unshaded one. That way even
deployments using the unshaded aws-s3 libraries would be able to switch to
OpenSSL.
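
A minimal sketch of that load-then-fall-back idea, to make the proposal
concrete. Everything here is hypothetical: the SocketFactoryPatcher interface,
the class names and the describe() method are placeholders, not existing
hadoop-aws or AWS SDK APIs, and the actual OpenSSL binding is left out. The
only point is that each implementation is looked up by name, so a missing
shaded class surfaces as a skipped candidate rather than a
NoClassDefFoundError at S3A initialization.

```java
import java.util.Optional;

/**
 * Sketch of "try the shaded binding, then the unshaded one".
 * All names are hypothetical; none of them exist in hadoop-aws.
 */
class SocketFactoryPatching {

  /** Hypothetical interface; each implementation would link against one HTTP client flavour. */
  interface SocketFactoryPatcher {
    String describe();
  }

  /** Stand-in for an implementation compiled against the unshaded classes. */
  static final class UnshadedPatcher implements SocketFactoryPatcher {
    public UnshadedPatcher() { }

    @Override
    public String describe() {
      return "unshaded";
    }
  }

  /**
   * Try each class name in order and return the first one that loads,
   * instantiates and implements the interface.
   */
  static Optional<SocketFactoryPatcher> loadFirstAvailable(String... classNames) {
    for (String name : classNames) {
      try {
        Class<?> cls = Class.forName(name);
        Object instance = cls.getDeclaredConstructor().newInstance();
        return Optional.of((SocketFactoryPatcher) instance);
      } catch (ReflectiveOperationException | LinkageError | ClassCastException e) {
        // Class absent, failed to link (e.g. a missing shaded dependency),
        // or not a patcher: move on to the next candidate.
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // Placeholder name for a shaded implementation that is not on the classpath.
    String shaded = "example.hypothetical.ShadedPatcher";
    Optional<SocketFactoryPatcher> patcher =
        loadFirstAvailable(shaded, UnshadedPatcher.class.getName());
    System.out.println(patcher.map(SocketFactoryPatcher::describe).orElse("none"));
  }
}
```

In this sketch the shaded candidate is simply absent, so the loader falls
through to the unshaded stand-in; if neither loads, the caller gets an empty
Optional and could stay on the default JSSE socket factory.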
> NetworkBinding has a runtime class dependency on a third-party shaded class
> ---------------------------------------------------------------------------
>
> Key: HADOOP-17337
> URL: https://issues.apache.org/jira/browse/HADOOP-17337
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 3.3.0
> Reporter: Chris Wensel
> Priority: Blocker
> Fix For: 3.3.1
>
>
> The hadoop-aws library has a dependency on
> 'com.amazonaws:aws-java-sdk-bundle', which in turn is a fat jar of all AWS
> SDK libraries and shaded dependencies.
>
> This dependency is 181MB.
>
> Some applications using the S3AFileSystem may be sensitive to having a large
> footprint. For example, an application that uses Parquet and is packaged as
> a Docker image.
>
> Typically, in prior Hadoop versions, the bundle was replaced by the specific
> AWS SDK dependencies, dropping the overall footprint.
>
> In 3.3 (and maybe prior versions) this strategy does not work because of the
> following exception:
> {{java.lang.NoClassDefFoundError: com/amazonaws/thirdparty/apache/http/conn/socket/ConnectionSocketFactory}}
> {{ at org.apache.hadoop.fs.s3a.S3AUtils.initProtocolSettings(S3AUtils.java:1335)}}
> {{ at org.apache.hadoop.fs.s3a.S3AUtils.initConnectionSettings(S3AUtils.java:1290)}}
> {{ at org.apache.hadoop.fs.s3a.S3AUtils.createAwsConf(S3AUtils.java:1247)}}
> {{ at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.createS3Client(DefaultS3ClientFactory.java:61)}}
> {{ at org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:644)}}
> {{ at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:390)}}
> {{ at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3414)}}
> {{ at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:158)}}
> {{ at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3474)}}
> {{ at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3442)}}
> {{ at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:524)}}
>