[
https://issues.apache.org/jira/browse/HADOOP-19205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17859635#comment-17859635
]
ASF GitHub Bot commented on HADOOP-19205:
-----------------------------------------
steveloughran commented on PR #6892:
URL: https://github.com/apache/hadoop/pull/6892#issuecomment-2186099426
Some more changes are planned in the `LazyAtomicReference` class:
* implement the `Supplier` interface, which invokes `getUnchecked()`
* plus a static `fromSupplier()` constructor
Implementing `Supplier` lets you do something quite special with this: you can
feed it into any Java API which takes a supplier and, if the function
supplied in the constructor is side-effect free and not dependent on external
state, it acts as an on-demand cache of that function's result.
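
To make the idea above concrete, here is a minimal sketch of a lazily evaluated reference that is itself a `Supplier`. It is not the `LazyAtomicReference` code from the PR: the class name `LazyRef`, the `isSet()` helper and the use of a synchronized `get()` are all assumptions made for illustration, and only `fromSupplier()` echoes a name mentioned in the comment.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/**
 * Hypothetical sketch of a lazily evaluated reference.
 * Not the Hadoop class; names and locking strategy are assumptions.
 */
final class LazyRef<T> implements Supplier<T> {

  /** Holds the value once computed; null means "not evaluated yet". */
  private final AtomicReference<T> reference = new AtomicReference<>();

  /** Function invoked on first access; it must return a non-null value. */
  private final Supplier<? extends T> constructor;

  LazyRef(Supplier<? extends T> constructor) {
    this.constructor = Objects.requireNonNull(constructor);
  }

  /** Static factory: wrap any supplier so it is evaluated at most once. */
  static <T> LazyRef<T> fromSupplier(Supplier<? extends T> supplier) {
    return new LazyRef<>(supplier);
  }

  /**
   * Return the cached value, computing it on the first call.
   * Synchronized so that the wrapped function runs at most once.
   */
  @Override
  public synchronized T get() {
    T value = reference.get();
    if (value == null) {
      value = Objects.requireNonNull(constructor.get(), "constructor returned null");
      reference.set(value);
    }
    return value;
  }

  /** True once the wrapped function has been evaluated. */
  boolean isSet() {
    return reference.get() != null;
  }
}
```

Because `LazyRef` is a `Supplier`, an instance can be passed to any API that accepts one, for example `Optional.orElseGet(ref)`, and the wrapped function only runs if and when that API asks for the value. The caching is only safe under the condition stated above: the wrapped function must be side-effect free and independent of external state.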
Surprised this isn't in the JDK already: either the team aren't aware of
parallel graph reduction systems, or they are aware enough of the 1980s designs
(e.g. ALICE) that they chose to avoid them. Things [are
changing](https://www-users.york.ac.uk/~mt540/graceful-ws/slides/Stewart.pdf).
> S3A initialization/close slower than with v1 SDK
> ------------------------------------------------
>
> Key: HADOOP-19205
> URL: https://issues.apache.org/jira/browse/HADOOP-19205
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.4.0
> Reporter: Steve Loughran
> Priority: Major
> Labels: pull-request-available
> Attachments: Screenshot 2024-06-14 at 17.12.59.png, Screenshot
> 2024-06-14 at 17.14.33.png
>
>
> Hive QE have observed a slowdown in LLAP queries due to the time taken to
> create and close s3a filesystem instances. A key aspect of that is they keep
> closing the fs instances (HIVE-27884), but looking at the profiles, the reason
> things seem to have regressed is
> * two s3 clients are being created (sync and async)
> * these seem to take a lot of time scanning the classpath for "global
> interceptors", which is at least an O(jars) operation; the number of index
> entries in the zip files may factor too.
> Proposed:
> * create async client on demand when the transfer manager is invoked
> * look at why passwords are being scanned for when
> InstanceProfileCredentialsProvider is in use; that seems slow too
> SDK wishes
> * could the SDK allow us to turn off that scan for interceptors?
> Attaching screenshots of the profile. storediag snippet:
> {code}
> [001] fs.s3a.access.key = (unset)
> [002] fs.s3a.secret.key = (unset)
> [003] fs.s3a.session.token = (unset)
> [004] fs.s3a.server-side-encryption-algorithm = (unset)
> [005] fs.s3a.server-side-encryption.key = (unset)
> [006] fs.s3a.encryption.algorithm = (unset)
> [007] fs.s3a.encryption.key = (unset)
> [008] fs.s3a.aws.credentials.provider = "com.amazonaws.auth.InstanceProfileCredentialsProvider" [core-site.xml]
> {code}