danielcweeks commented on PR #15171: URL: https://github.com/apache/iceberg/pull/15171#issuecomment-3998493597
It's been quite a while since this was originally introduced, but I'm a little hesitant to make changes here. There are multiple implementations where this works in practice, so I'd like to look closely at what the real issue is.

Part of the complexity of building the server-side implementation of this is knowing what to sign and what not to sign, but I think that's a pretty general problem (though we don't currently provide any guidance). For example, the `x-amz-content-sha256` header is useful if you have a specific payload, but in streaming implementations, where you want to start the request before the full payload is available, you may choose to use `UNSIGNED-PAYLOAD` (though I believe AWS has since introduced some streaming chunked options as well).

For cases like range-based requests, not signing the range is generally preferred. The reason is that the individual reads coming from Parquet will often skip around or make non-contiguous requests, and we don't want to hit the signing server for each of those requests.

There are scenarios where you do need strict validation of some of the headers. A bulk delete operation is one: the content needs to be signed, which is why we include an optional body in the sign request, because you don't want to authorize a bulk operation that hasn't been fully validated.

Adding more to the cache could dramatically hurt performance in the normal read path.
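To illustrate the range point, here is a minimal Python sketch. It is not Iceberg's signer code: `UNSIGNED_HEADERS`, `headers_to_sign`, `cache_key`, and `CountingSigner` are hypothetical names, and the policy of excluding only `Range` is an assumption made for the example. The idea is that if the range is left out of what gets signed, it can also be left out of the cache key, so many range reads against the same object can share one signing call.

```python
from typing import Dict, Mapping, Tuple

# Hypothetical policy: headers left out of both the signature and the cache key.
UNSIGNED_HEADERS = frozenset({"range"})


def headers_to_sign(headers: Mapping[str, str]) -> Dict[str, str]:
    """Return headers with lower-cased names, dropping those we choose not to sign."""
    return {
        name.lower(): value
        for name, value in headers.items()
        if name.lower() not in UNSIGNED_HEADERS
    }


def cache_key(method: str, uri: str, headers: Mapping[str, str]) -> Tuple:
    """Build a cache key from only the parts of the request that get signed."""
    signed = headers_to_sign(headers)
    return (method.upper(), uri, tuple(sorted(signed.items())))


class CountingSigner:
    """Stand-in for a remote signer client; counts simulated round trips."""

    def __init__(self) -> None:
        self.calls = 0
        self._cache: Dict[Tuple, str] = {}

    def sign(self, method: str, uri: str, headers: Mapping[str, str]) -> str:
        key = cache_key(method, uri, headers)
        if key not in self._cache:
            # A real client would make a network call to the signer here.
            self.calls += 1
            self._cache[key] = f"signature-{self.calls}"
        return self._cache[key]


if __name__ == "__main__":
    signer = CountingSigner()
    uri = "https://bucket.example/data/file.parquet"  # placeholder URI
    common = {"x-amz-content-sha256": "UNSIGNED-PAYLOAD"}
    signer.sign("GET", uri, {**common, "Range": "bytes=0-99"})
    signer.sign("GET", uri, {**common, "Range": "bytes=4096-8191"})
    print(signer.calls)
```

Under this assumed policy, both range reads map to the same key, so the signer is contacted once. Putting `Range` into the key would turn every distinct range into a separate signing call, which is the read-path cost described above.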
