rich7420 opened a new pull request, #10472:
URL: https://github.com/apache/ozone/pull/10472
## What changes were proposed in this pull request?
The S3 Gateway never persisted an object's `Content-Type`. The request
`Content-Type` was only forwarded to `HeaderPreprocessor` for SigV4 signing
(`X-Ozone-Original-Content-Type`) and was never stored in the OM key
metadata. `GetObject` did not set it from the object, and `HeadObject`
hard-coded `binary/octet-stream`. Because the value was never stored,
`CopyObject` / `UploadPartCopy` could not preserve or replace it, so copied
(and plain) objects consistently defaulted to octet-stream.
This is an end-to-end gap, not a copy-only issue. This PR adds object
`Content-Type` support across the read/write/copy paths, aligning with AWS
S3:
* **PutObject / CreateMultipartUpload**: store the request `Content-Type` in
  the OM key metadata, mirroring the existing ETag-in-metadata convention.
* **GetObject / HeadObject**: return the stored `Content-Type`, defaulting to
  `binary/octet-stream` when absent (the `response-content-type` query
  parameter still overrides).
* **CopyObject** metadata directive: `COPY` (default) preserves the source
  object's `Content-Type` (it rides along in the copied source metadata);
  `REPLACE` applies the `Content-Type` from the copy request.
* A user-supplied `x-amz-meta-content-type` is remapped to a dedicated key so
  it does not collide with the object's stored `Content-Type`. The
  reserved-key remap/rebuild that previously existed only for ETag is now
  factored into a shared helper and reused for both.
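
The reserved-key remap described in the last bullet can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: the class name, the method names and both key constants (`CONTENT_TYPE_KEY`, `CUSTOM_CONTENT_TYPE_KEY`) are hypothetical and chosen only for illustration. It assumes user metadata arrives with the `x-amz-meta-` prefix already stripped.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; all names here are hypothetical. */
public final class ReservedMetadataSketch {
  // Hypothetical key under which the object's real Content-Type is stored.
  public static final String CONTENT_TYPE_KEY = "Content-Type";
  // Hypothetical key that a user-supplied "content-type" entry is moved to.
  public static final String CUSTOM_CONTENT_TYPE_KEY =
      "custom-metadata-content-type";

  private ReservedMetadataSketch() { }

  /** Write path: build the metadata map that is stored with the key. */
  public static Map<String, String> toStored(
      Map<String, String> userMeta, String requestContentType) {
    Map<String, String> stored = new HashMap<>();
    for (Map.Entry<String, String> e : userMeta.entrySet()) {
      // A user entry named like the reserved key is moved aside.
      String key = CONTENT_TYPE_KEY.equalsIgnoreCase(e.getKey())
          ? CUSTOM_CONTENT_TYPE_KEY : e.getKey();
      stored.put(key, e.getValue());
    }
    if (requestContentType != null) {
      stored.put(CONTENT_TYPE_KEY, requestContentType);
    }
    return stored;
  }

  /** Read path: rebuild the user metadata returned to the client. */
  public static Map<String, String> toUserMeta(Map<String, String> stored) {
    Map<String, String> user = new HashMap<>();
    for (Map.Entry<String, String> e : stored.entrySet()) {
      if (CONTENT_TYPE_KEY.equals(e.getKey())) {
        continue; // reserved: surfaced as a response header instead
      }
      // Restore the remapped entry under a lower-case user key.
      String key = CUSTOM_CONTENT_TYPE_KEY.equals(e.getKey())
          ? "content-type" : e.getKey();
      user.put(key, e.getValue());
    }
    return user;
  }

  public static void main(String[] args) {
    Map<String, String> userMeta = new HashMap<>();
    userMeta.put("content-type", "user-value");
    Map<String, String> stored = toStored(userMeta, "text/plain");
    System.out.println(stored);
    System.out.println(toUserMeta(stored));
  }
}
```

The same pair of functions could be parameterised by the reserved key to cover ETag as well, which is presumably what the shared helper does; the sketch keeps a single key for brevity.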
Before vs after (response):
| Path | Before | After |
| --- | --- | --- |
| PUT/MPU | Content-Type not persisted | stored in key metadata |
| GET | only set from request header / `response-content-type` | falls back to stored Content-Type |
| HEAD | hard-coded `binary/octet-stream` | stored Content-Type (or default) |
| COPY (COPY) | dest defaults to octet-stream | keeps source Content-Type |
| COPY (REPLACE) | dest defaults to octet-stream | uses request Content-Type |
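
The "After" column can be condensed into two small rules. The sketch below is a hedged illustration rather than the gateway's code: `ContentTypeRulesSketch`, its methods and the `MetadataDirective` enum are hypothetical names. It assumes that a `REPLACE` copy without a request `Content-Type` stores nothing (so reads fall back to the default), and it leaves out the request-header case from the GET row.

```java
/** Illustrative sketch only; all names here are hypothetical. */
public final class ContentTypeRulesSketch {
  public static final String DEFAULT_CONTENT_TYPE = "binary/octet-stream";

  /** Mirrors the two values of the S3 x-amz-metadata-directive header. */
  public enum MetadataDirective { COPY, REPLACE }

  private ContentTypeRulesSketch() { }

  /**
   * GET/HEAD: the response-content-type query parameter wins, then the
   * stored value, then the default.
   */
  public static String responseContentType(String overrideParam,
      String stored) {
    if (overrideParam != null) {
      return overrideParam;
    }
    return stored != null ? stored : DEFAULT_CONTENT_TYPE;
  }

  /**
   * CopyObject: COPY keeps the source value, REPLACE takes the value from
   * the copy request (null here means nothing is stored; an assumption).
   */
  public static String copiedContentType(MetadataDirective directive,
      String sourceStored, String requestContentType) {
    return directive == MetadataDirective.REPLACE
        ? requestContentType : sourceStored;
  }

  public static void main(String[] args) {
    String copied = copiedContentType(
        MetadataDirective.COPY, "image/png", "text/plain");
    System.out.println(responseContentType(null, copied));
  }
}
```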
## What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-15515
## How was this patch tested?
* New unit tests:
  * `TestObjectGet#getAndHeadReturnStoredContentType`: GET and HEAD return
    the stored `Content-Type` (and the `binary/octet-stream` default when
    none was stored).
  * `TestObjectPut#testContentTypeStoredAndCopied`: PUT stores it;
    CopyObject `COPY` keeps the source value; `REPLACE` uses the request
    value.
* Existing `TestObjectGet` (`inheritRequestHeader`, `overrideResponseHeader`)
  and the multipart/copy tests pass unchanged.
* Verified end-to-end against a local docker-compose cluster with boto3: the
  behaviours covered by the Ceph s3-tests
  `test_object_copy_retaining_metadata`,
  `test_object_copy_replacing_metadata` and
  `test_object_copy_verify_contenttype` now pass.