plusplusjiajia commented on code in PR #3120:
URL: https://github.com/apache/iceberg-python/pull/3120#discussion_r2969282129
##########
pyiceberg/catalog/rest/__init__.py:
##########
@@ -710,17 +718,27 @@ def add_headers(self, request: PreparedRequest, **kwargs:
Any) -> None: # pylin
# remove the connection header as it will be updated after
signing
if "connection" in request.headers:
del request.headers["connection"]
- # For empty bodies, explicitly set the content hash header to
the SHA256 of an empty string
- if not request.body:
- request.headers["x-amz-content-sha256"] = EMPTY_BODY_SHA256
+
+ # Compute the x-amz-content-sha256 header to match Iceberg
Java SDK:
+ # - empty body → hex (EMPTY_BODY_SHA256)
+ # - non-empty body → base64
+ if request.body:
+ body_bytes = request.body.encode("utf-8") if
isinstance(request.body, str) else request.body
+ content_sha256_header =
base64.b64encode(hashlib.sha256(body_bytes).digest()).decode()
Review Comment:
> i dont think we need to base64 encode this. the [java
implementation](https://github.com/apache/iceberg/blob/c9f6c8423be7cbf52c9cf678896deca478b3b6d1/aws/src/main/java/org/apache/iceberg/aws/RESTSigV4AuthSession.java#L93-L130)
doesnt do this. neither does the [python reference
implementation](https://gist.github.com/marcogrcr/6f0645b20847be4ef9cd6742427fc97b#file-sigv4_using_http_client-py-L35-L43)
Thanks for the review! Really appreciate you taking the time to look through
this carefully.
However, I believe the Java implementation actually does use base64 for
non-empty bodies. We can see
[TestRESTSigV4AuthSession.java#L174](https://github.com/apache/iceberg/blob/a89f1f9aa/aws/src/test/java/org/apache/iceberg/aws/TestRESTSigV4AuthSession.java#L174)
: the expected value for x-amz-content-sha256 is
"yc5oAKPWjHY4sW8XQq0l/3aNrrXJKBycVFNnDEGMfww=" (base64), not hex. This is
because Java uses
[RESTSigV4AuthSession.java#L100-L104](https://github.com/apache/iceberg/blob/fec9800bc/aws/src/main/java/org/apache/iceberg/aws/RESTSigV4AuthSession.java#L100-L104)
which produces base64-encoded checksums. Only empty bodies use hex as a
[RESTSigV4AuthSession.java#L119-L121](https://github.com/apache/iceberg/blob/fec9800bc/aws/src/main/java/org/apache/iceberg/aws/RESTSigV4AuthSession.java#L119-L121)
(see also the
[TestRESTSigV4AuthSession.java#L118](https://github.com/apache/iceberg/blob/a89f1f9aa/aws/src/test/java/org/apache/iceberg/aws/TestRESTSigV4AuthSession.java#L118)).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]