[ https://issues.apache.org/jira/browse/IMPALA-11125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Rawat updated IMPALA-11125:
------------------------------------
    Priority: Critical  (was: Major)

> Revisit the minimal-s3a-aws-sdk jar
> -----------------------------------
>
>                 Key: IMPALA-11125
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11125
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Infrastructure
>    Affects Versions: Impala 4.1.0
>            Reporter: Joe McDonnell
>            Priority: Critical
>
> The impala-minimal-s3a-aws-sdk jar takes the com.amazonaws 
> aws-java-sdk-bundle and filters out content that Impala does not need. With 
> this filtering, the jar goes from 183MB to 89MB.
> Unpacking the jar shows that it still contains content that can be 
> removed. There are some services that we don't use (which may not have been 
> there when we first did this):
> {noformat}
> $ ls com/amazonaws/services | wc -l
> 116
> $ ls com/amazonaws/services
> accessanalyzer
> acmpca
> apigatewaymanagementapi
> appconfig
> appflow
> applicationinsights
> appregistry
> augmentedairuntime
> ...{noformat}
> Separately, the models directory takes up a lot of space:
> {noformat}
> $ du -ch models
> 807M    models
> 807M    total
> $ ls models | wc -l
> 468
> $ ls models
> a4b-2017-11-09-intermediate.json
> a4b-2017-11-09-model.json
> ...{noformat}
> These are JSON files that compress well, but they still take up space in the jar.
> We should either revisit our exclusions and try to avoid packaging some of 
> these models, or we should try to avoid using aws-java-sdk-bundle and instead 
> pick out individual jars like aws-java-sdk-s3 and aws-java-sdk-dynamodb.
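
To compare the two options above, it can help to see which directories in the jar actually account for the bytes, without unpacking it. The following is a minimal sketch, not part of the Impala build: the function names `summarize_jar` and `print_largest` and the jar path in the usage note are hypothetical. It groups entries by their leading path components and reports both uncompressed and compressed sizes, since the models are large on disk but compress well.

```python
import zipfile
from collections import defaultdict


def summarize_jar(jar_path, depth=1):
    """Group jar entries by their first `depth` directory components.

    Returns {prefix: (file_count, uncompressed_bytes, compressed_bytes)}.
    Hypothetical helper for inspection only.
    """
    totals = defaultdict(lambda: [0, 0, 0])
    with zipfile.ZipFile(jar_path) as jar:
        for info in jar.infolist():
            if info.is_dir():
                continue
            # Drop the file name, then keep at most `depth` directory levels.
            # Files in the jar root are grouped under ".".
            dirs = info.filename.split("/")[:-1]
            prefix = "/".join(dirs[:depth]) or "."
            entry = totals[prefix]
            entry[0] += 1
            entry[1] += info.file_size
            entry[2] += info.compress_size
    return {prefix: tuple(values) for prefix, values in totals.items()}


def print_largest(summary, limit=10):
    """Print the prefixes with the largest compressed footprint first."""
    rows = sorted(summary.items(), key=lambda kv: kv[1][2], reverse=True)
    for prefix, (count, raw, packed) in rows[:limit]:
        print(f"{packed:>12} {raw:>12} {count:>6}  {prefix}")
```

For example, `print_largest(summarize_jar("impala-minimal-s3a-aws-sdk.jar", depth=4))` (the file name is an assumption) would list each `com/amazonaws/services/<name>` directory separately alongside `models`, which gives a rough idea of what each additional exclusion would save.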



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
