steveloughran commented on PR #15171:
URL: https://github.com/apache/iceberg/pull/15171#issuecomment-3993187579
ok, this is what I think would work
A immutable set of the headers not to sign visible to all
```
public static final Set<String> UNSIGNED_HEADERS =
ImmutableSet.of(
"Content-Type",
"amz-sdk-invocation-id",
"amz-sdk-retry",
"if-match",
"if-modified-since",
"if-none-match",
"if-unmodified-since",
"range",
"referer",
"user-agent",
"x-amz-date",
"x-amz-content-sha256"
);
```
(we should check with some s3/aws people about x-amz-content. If it's there
and is a checksum, it signs the payload, so a different payload would fail. But
as the only POST with a payload is bulk DELETE, it's not really an issue)
The key to the cache becomes: request with the unsigned headers stripped
the value in the cache becomes: response with the unsigned headers stripped.
lookup and prepare becomes
1. strip out unsigned headers for lookup
2. look for the cached entry
3. if found, combine the full request with the cached value
4. if not: issue full request
Tests to make sure that none of the stripped headers from request 1 are in
request 2, and that all signed values are in.
(extension 1) Maybe, when a response comes back, client scans the auth
header/uri and if any of these have been signed warns & doesn't cache,
otherwise strips them out from the cache. That's for safe coexistence with
other implementations with a smaller list of headers to strip before signing.
(Or we could ignore this as extra complexity, tests, maintenance?)
(extension 2) I think I'd consider adding an emergency switch to disable
cacheing. I know it hurts the catalog, but what if some corner case does
surface: you don't want to break every client request.
@adutra I could try this tomorrow building my work on top of yours. I've
done that bit of the auth header parsing I propose in extension 1, but it's
untested and will be extra complexity. I'd leave it out of the first iteration
thoughts?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]