[
https://issues.apache.org/jira/browse/HADOOP-19272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17883010#comment-17883010
]
ASF GitHub Bot commented on HADOOP-19272:
-----------------------------------------
steveloughran commented on PR #7048:
URL: https://github.com/apache/hadoop/pull/7048#issuecomment-2360899909
> And then testNoisyLogging will start failing. So then do we revert or fix
the test?
The test failure will be how we verify that all is good, at which point we
can modify the test to fail if any warnings come from the manager.
> I think we want to disable transfer manager logs permanently, so it'll
just be fixing the test.
I'm going to propose moving off it entirely
* too opaque
* different client hurts startup and teardown. lazy start is insufficient
* using the existing thread and http pools is more efficient and should be
more responsive
* we can't add audit headers to all requests
* for If-none-match headers, doing it in our own code ensures it is present
*and* we can generate genuine conflicts of file deletion and creation during a
multipart rename.
also I no longer trust the code.
> S3A: AWS SDK 2.25.53 warnings logged about transfer manager not using CRT
> client
> --------------------------------------------------------------------------------
>
> Key: HADOOP-19272
> URL: https://issues.apache.org/jira/browse/HADOOP-19272
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 3.4.0, 3.5.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Major
> Labels: pull-request-available
> Attachments: output.txt
>
>
> When an S3 transfer manager is created for renaming/download a new message is
> logged telling off the caller for not using the CRT client.
> {code}
> 5645:2024-09-13 16:29:17,375 [setup] WARN s3.S3TransferManager
> (LoggerAdapter.java:warn(225)) - The provided S3AsyncClient is an instance of
> MultipartS3AsyncClient, and thus multipart download feature is not enabled.
> To benefit from all features, consider using
> S3AsyncClient.crtBuilder().build() instead.
> {code}
> This is a change in the SDK to tell us developers off -yet it is visible to
> end users who don't benefit from it and for which it only creates confusion.
> It appears to have been downgraded to debug in the AWS trunk code in PR "S3
> Async Client - Multipart download (#5164)" -but:
> * it is too late to upgrade and qualify a new version for 3.4.1; downgrading
> is all we can do
> * there is no guarantee this log message or similar will not reoccur.
> Plan
> 1. Revert from 3.4.1
> 2. lift code from cloudstore library which uses reflection to access and
> manipulate log4j logs where present
> 3. downgrade all transfer manager log levels to NONE.
> 4. File an AWS report about how this is an incompatible regression, identify
> how their process can evolve, particularly in the area of code guidelines
> about safe logging use.
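
Steps 2 and 3 of the plan could look roughly like the sketch below. It is not the cloudstore code; it is a minimal illustration that assumes the log4j 1.x API (org.apache.log4j.Logger and Level) may or may not be on the classpath. The class and method names are hypothetical, and the fully qualified logger name in the usage note is an assumption based on the abbreviated "s3.S3TransferManager" in the log line above. Note that log4j has no level called NONE; the level which silences a logger is OFF.

```java
import java.lang.reflect.Method;

/** Hypothetical helper: change a log4j 1.x logger level without a compile-time dependency. */
class LogLevelSketch {

    /**
     * Try to set the level of a named log4j logger through reflection.
     *
     * @param loggerName name of the logger to change
     * @param levelName  log4j level name, e.g. "OFF" or "ERROR"
     * @return true if the level was set; false if log4j is absent or the call failed
     */
    static boolean setLog4jLevel(String loggerName, String levelName) {
        try {
            Class<?> loggerClass = Class.forName("org.apache.log4j.Logger");
            Class<?> levelClass = Class.forName("org.apache.log4j.Level");
            // Logger.getLogger(String) and Level.toLevel(String) are static methods.
            Method getLogger = loggerClass.getMethod("getLogger", String.class);
            Method toLevel = levelClass.getMethod("toLevel", String.class);
            // setLevel(Level) is an instance method inherited from Category.
            Method setLevel = loggerClass.getMethod("setLevel", levelClass);
            Object logger = getLogger.invoke(null, loggerName);
            Object level = toLevel.invoke(null, levelName);
            setLevel.invoke(logger, level);
            return true;
        } catch (ReflectiveOperationException | LinkageError e) {
            // log4j is not present, or its API differs: leave logging untouched.
            return false;
        }
    }
}
```

A caller might invoke setLog4jLevel("software.amazon.awssdk.transfer.s3.S3TransferManager", "OFF") once before the transfer manager is created, and ignore a false return value when a different logging backend is in use.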
> I also intend to tighten up our review process to support more rigorous
> detection of new .warn() messages in the AWS SDK. I'm going to propose that
> as well as requiring review of our test/CLI output, we require ripgrep scans
> of .warn(/.error( in SDK source and an audit of any new changes. By saving the
> output of the previous iteration, it'll be straightforward to identify new
> changes -but not changes in codepaths which change their frequency of
> appearance.
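
The comparison step of that review could be sketched as below. This is only an illustration of diffing two scans: the class and method names are hypothetical, and it works on lines already read into memory rather than invoking ripgrep itself. As noted above, it only finds call sites which are new; it cannot tell when an existing message starts to appear more often.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: find .warn( / .error( call sites and compare two scans. */
class LogCallScan {

    /** Keep only the source lines containing a .warn( or .error( call, trimmed. */
    static List<String> findLogCalls(List<String> sourceLines) {
        List<String> hits = new ArrayList<>();
        for (String line : sourceLines) {
            if (line.contains(".warn(") || line.contains(".error(")) {
                hits.add(line.trim());
            }
        }
        return hits;
    }

    /** Return the hits in the current scan which were absent from the previous scan. */
    static List<String> newSince(List<String> previous, List<String> current) {
        Set<String> seen = new HashSet<>(previous);
        List<String> added = new ArrayList<>();
        for (String hit : current) {
            if (!seen.contains(hit)) {
                added.add(hit);
            }
        }
        return added;
    }
}
```

Saving the findLogCalls output for each SDK release and passing the previous and current lists to newSince gives the set of call sites to audit.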
> I think we should revisit whether or not to move off the xfer manager.
> We've discussed it in the past, and avoided it just due to maintenance
> costs. However, it is pushing maintenance costs anyway.
> meanwhile: no new AWS SDK updates until we are confident we have our
> processes under control.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)