brankogrb-db opened a new issue, #747: URL: https://github.com/apache/arrow-rs-object-store/issues/747
**Describe the bug** `object_store::aws::resolve_bucket_region` builds its HeadBucket endpoint using virtual-hosted-style addressing ([src/aws/resolve.rs#L52](https://github.com/apache/arrow-rs-object-store/blob/main/src/aws/resolve.rs)): ```rust let endpoint = format!("https://{bucket}.s3.amazonaws.com"); ``` S3 serves this endpoint with the wildcard certificate `*.s3.amazonaws.com`, and a TLS wildcard matches only a single DNS label. For a bucket name containing dots (e.g. `my.dotted.bucket`), the hostname `my.dotted.bucket.s3.amazonaws.com` has multiple labels under the wildcard, so certificate verification fails and the connection is rejected during the TLS handshake — before any HTTP request is sent: ``` Generic S3 error: Error performing HEAD https://my.dotted.bucket.s3.amazonaws.com in 30ms - HTTP error: error sending request caused by: client error (Connect): invalid peer certificate: certificate not valid for name "my.dotted.bucket.s3.amazonaws.com" ``` This is a documented AWS limitation, not a transient issue. From [Virtual hosting of general purpose buckets](https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html): > When you're using virtual-hosted–style general purpose buckets with SSL, the SSL wildcard certificate matches only buckets that do not contain dots (`.`). Notably, the rest of the crate is unaffected: `AmazonS3Builder` defaults to path-style requests (`virtual_hosted_style_request = false`), so all data operations on dotted buckets work fine. `resolve_bucket_region` is the only code path that unconditionally places the bucket name in the TLS hostname. **To Reproduce** ```rust // Against any real bucket with a dot in its name: object_store::aws::resolve_bucket_region("my.dotted.bucket", &ClientOptions::new()).await // => fails TLS verification; never reaches S3 ``` **Expected behavior** The region resolves successfully. The fix is small: for bucket names containing dots, use path-style addressing for the HeadBucket request: ```rust let endpoint = if bucket.contains('.') { format!("https://s3.amazonaws.com/{bucket}") } else { format!("https://{bucket}.s3.amazonaws.com") }; ``` The rest of the function is unchanged — the `x-amz-bucket-region` header is returned the same way for path-style requests (on 200, 301, and 403 responses; 404 still means the bucket does not exist). We have validated this end-to-end against a real dotted bucket: the path-style HEAD to the global endpoint returns `301 Moved Permanently` with `x-amz-bucket-region` set, exactly as for the virtual-hosted variant. Falling back to path-style only when the name contains dots mirrors what the official AWS SDKs do, and avoids any behavior change for the common case. Regarding path-style deprecation: AWS [indefinitely delayed it](https://aws.amazon.com/blogs/aws/amazon-s3-path-deprecation-plan-the-rest-of-the-story/) precisely because buckets with dots have no virtual-hosted HTTPS alternative, so this fallback does not depend on a deprecated-and-removable behavior. Happy to submit a PR for this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
