[
https://issues.apache.org/jira/browse/HADOOP-18183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17607665#comment-17607665
]
Steve Loughran edited comment on HADOOP-18183 at 9/21/22 10:24 AM:
-------------------------------------------------------------------
Of the two-path operations, copy is the only one which exists, and that is all
done in the transfer manager. Even if we did do it ourselves, the range is only
read on the source path, isn't it?
So how about using p1rs and p1re for now? If we ever needed more than one, that
could go in later using the same scheme.
Using short names is important, as we still don't know what the length limit on
the log entries is.
Actually, maybe we should put the range header in the audit log, rather than
start and end? That way, if multiple ranges in a GET are supported, vectorIO can
take advantage of it and there is no need to make any changes in the auditing.
Similarly, a range like "100-", which reads to the EOF, should be recordable.
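A minimal sketch of what recording the header value could look like, assuming
the auditor can see the raw Range header of the GET. The class name, the method
name and the short key "rg" are all hypothetical; nothing here is the agreed
attribute name or the actual LoggingAuditor code. The value is passed through
without being parsed into start/end, so open-ended and multi-range requests
need no special handling.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: turns an HTTP Range header into one audit key/value pair. */
class RangeAuditSketch {

  /** Hypothetical short attribute name; not an agreed key. */
  static final String RANGE_KEY = "rg";

  /** Unit prefix used by byte-range requests. */
  static final String BYTES_PREFIX = "bytes=";

  /**
   * Build the extra audit attributes for a request.
   * @param rangeHeader value of the Range header, or null for a full GET
   * @return a map holding at most one entry; empty if there is no range
   */
  static Map<String, String> rangeAttributes(String rangeHeader) {
    Map<String, String> attrs = new LinkedHashMap<>();
    if (rangeHeader == null) {
      return attrs;
    }
    String value = rangeHeader.trim();
    if (value.isEmpty()) {
      return attrs;
    }
    // Drop the "bytes=" unit to keep the log entry short.
    if (value.startsWith(BYTES_PREFIX)) {
      value = value.substring(BYTES_PREFIX.length());
    }
    // Remove optional whitespace between ranges; the ranges are not parsed.
    attrs.put(RANGE_KEY, value.replace(" ", ""));
    return attrs;
  }

  public static void main(String[] args) {
    System.out.println(rangeAttributes("bytes=100-"));           // open-ended
    System.out.println(rangeAttributes("bytes=0-99, 200-299"));  // multi-range
    System.out.println(rangeAttributes(null));                   // full GET
  }
}
```

A single-valued attribute like this keeps the entry compact, and whoever
analyses the logs can split it into start and end when needed.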
> s3a audit logs to publish range start/end of GET requests in audit header
> -------------------------------------------------------------------------
>
> Key: HADOOP-18183
> URL: https://issues.apache.org/jira/browse/HADOOP-18183
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.3.2
> Reporter: Steve Loughran
> Priority: Minor
>
> We don't get the range of ranged GET requests in S3 server logs, because the
> AWS S3 log doesn't record that information. We can see it's a partial GET
> from the 206 response, but the length of data retrieved is lost.
> LoggingAuditor.beforeExecution() would need to recognise a ranged GET and
> determine the extra key-val pairs for range start and end (rs & re?).
> We might need to modify {{HttpReferrerAuditHeader.buildHttpReferrer()}} to
> take a map of <string, string> so it can dynamically create a header for each
> request; currently that is not in there.
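
The idea of a referrer builder that takes a per-request map could be sketched
roughly as below. This is not the real HttpReferrerAuditHeader API: the class,
the buildReferrer() method, the base URL and the attribute names are all
placeholders chosen for illustration. It only shows fixed attributes being
merged with per-request ones before the query string is built.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Sketch only: builds a referrer string from fixed and per-request attributes. */
class ReferrerSketch {

  /**
   * @param base placeholder base URL of the referrer
   * @param common attributes shared by every request of the operation
   * @param perRequest attributes evaluated for this request only
   * @return the base URL followed by the merged attributes as a query string
   */
  static String buildReferrer(String base,
      Map<String, String> common,
      Map<String, String> perRequest) {
    // Per-request values are added last and override common ones on a key clash.
    Map<String, String> merged = new LinkedHashMap<>(common);
    merged.putAll(perRequest);
    if (merged.isEmpty()) {
      return base;
    }
    StringJoiner query = new StringJoiner("&");
    for (Map.Entry<String, String> e : merged.entrySet()) {
      query.add(encode(e.getKey()) + "=" + encode(e.getValue()));
    }
    return base + "?" + query;
  }

  /** URL-encode a key or value as UTF-8. */
  static String encode(String s) {
    try {
      return URLEncoder.encode(s, "UTF-8");
    } catch (UnsupportedEncodingException ex) {
      // UTF-8 is always available on a compliant JVM.
      throw new IllegalStateException(ex);
    }
  }

  public static void main(String[] args) {
    Map<String, String> common = new LinkedHashMap<>();
    common.put("op", "op_open");
    Map<String, String> perRequest = new LinkedHashMap<>();
    perRequest.put("rg", "100-");   // hypothetical range attribute
    System.out.println(
        buildReferrer("https://audit.example.org/", common, perRequest));
  }
}
```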