Re: [I] S3 + HTTP/2: "duplicate" header [arrow-rs-object-store]

via GitHub Fri, 21 Mar 2025 02:27:50 -0700


crepererum commented on issue #49:
URL: 
https://github.com/apache/arrow-rs-object-store/issues/49#issuecomment-2742800575


   > As I noted previously, [for AWSv4 signed requests, the data being signed 
differs](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_sigv-create-signed-request.html#create-canonical-request)
 depending on whether it is sent over HTTP/1.1 (`Host` header only) or HTTP/2 
(`:authority` header and maybe _also_ `Host` header).
   
   That's not what the linked docs say. They say:
   
   > You must include the host header (HTTP/1.1) **or** the :authority header 
(HTTP/2)
   
   So I guess it's rather unclear from the AWS side what should happen if you 
send a `host` header in an HTTP/2 request. The fact that the header key/name, 
and not only the value, is included in the signature is a bit weird and leads 
to the problem you're describing regarding the proxy behavior. I assume that 
this is somewhat of an oversight on the AWS side and hasn't been an issue for 
them simply because AWS S3 only supports HTTP/1.1.
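   
   To illustrate why the header name matters, here is a minimal sketch of 
the "canonical headers" and "signed headers" parts of a SigV4 canonical 
request, following the rules in the linked docs (lowercase names, trimmed 
values, sorted by name). The function name, host name and date value are made 
up for the example, this is not the `object_store` implementation, and 
repeated header names are not merged.

```rust
/// Hypothetical helper: builds the CanonicalHeaders and SignedHeaders strings
/// of a SigV4 canonical request from (name, value) pairs.
/// Assumption: every header name appears at most once.
fn canonical_headers(headers: &[(&str, &str)]) -> (String, String) {
    let mut normalized: Vec<(String, String)> = headers
        .iter()
        .map(|(name, value)| {
            // Lowercase the name; trim the value and collapse inner whitespace runs.
            let value = value.split_whitespace().collect::<Vec<_>>().join(" ");
            (name.to_ascii_lowercase(), value)
        })
        .collect();
    // Sort by (lowercased) header name.
    normalized.sort();

    let canonical: String = normalized
        .iter()
        .map(|(name, value)| format!("{name}:{value}\n"))
        .collect();
    let signed = normalized
        .iter()
        .map(|(name, _)| name.as_str())
        .collect::<Vec<_>>()
        .join(";");
    (canonical, signed)
}

fn main() {
    // Same authority and date, only the header *name* differs.
    let http1 = canonical_headers(&[
        ("Host", "bucket.example.com"),
        ("x-amz-date", "20250321T000000Z"),
    ]);
    let http2 = canonical_headers(&[
        (":authority", "bucket.example.com"),
        ("x-amz-date", "20250321T000000Z"),
    ]);

    // The strings that get hashed and signed are different, so a proxy that
    // rewrites `:authority` to `Host` changes what the verifier reconstructs.
    assert_ne!(http1, http2);
    println!("HTTP/1.1 signed headers: {}", http1.1);
    println!("HTTP/2   signed headers: {}", http2.1);
}
```

   Since both strings feed into the hash that is signed, a verifier that only 
sees `Host` cannot reproduce a signature computed over `:authority`.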
   
   > However, as these headers are included in AWSv4 signing, a HTTP/1.1 
upstream implementing AWSv4 signing would need to know that the request was 
originally sent with HTTP/2 **and** whether it originally included a `Host` 
header (as it may have been copied from `:authority`, but must not _differ_ 
when _both_ are provided) in order to accurately reproduce the client's request 
signing process.
   
   As far as I can tell, there's simply NO way you can tell what the original 
protocol version was. The relevant spec is likely 
[RFC 7239](https://datatracker.ietf.org/doc/html/rfc7239), and the `Forwarded` 
header it defines does NOT carry the HTTP version in any of its parameters 
(although it allows for future extensions).
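   
   As a rough sketch of what an upstream actually gets to see, the snippet 
below splits a single `Forwarded` element into its parameters. RFC 7239 
defines `by`, `for`, `host` and `proto`, where `proto` is the scheme 
(`http`/`https`), not the HTTP version. The function name and the sample 
values are hypothetical, and the parser is deliberately simplified: it does 
not handle multiple comma-separated elements or `;` inside quoted strings.

```rust
/// Hypothetical helper: splits one `Forwarded` element into (name, value) pairs.
/// Simplified: no support for several elements or `;` inside quoted values.
fn forwarded_params(element: &str) -> Vec<(String, String)> {
    element
        .split(';')
        .filter_map(|pair| {
            let (name, value) = pair.trim().split_once('=')?;
            Some((
                name.trim().to_ascii_lowercase(),
                value.trim().trim_matches('"').to_string(),
            ))
        })
        .collect()
}

fn main() {
    let params = forwarded_params("for=192.0.2.60;proto=https;host=bucket.example.com");
    for (name, value) in &params {
        println!("{name} = {value}");
    }
    // `proto` only says http vs. https; nothing here says HTTP/1.1 vs. HTTP/2.
    assert!(params.iter().all(|(name, _)| name != "version"));
}
```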
   
   ---
   
   > A simpler alternative is that `arrow-rs` could be configured to send 
requests exclusively using one version of HTTP that is chosen ahead of time by 
the library developer (or a configuration option available in their 
application). The downsides are that if a service starts providing HTTP/2, 
clients will need to update their configuration to take advantage of it, and 
that clients would break if they ever _stopped_ offering a particular HTTP 
version that they were configured to use.
   
   I think that's a good mitigation / config. Most deployments should know 
whether HTTP/2 is available (and preferred) or not, since this also has 
implications for load balancing and potentially the network fabric. I guess 
for most people the rule is simple: use HTTP/1.1 (which is the only version 
supported by AWS) and only switch to HTTP/2 if you have a specific setup that 
allows and demands it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
