Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3022994769
##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -271,6 +280,38 @@ Response handlePutRequest(ObjectRequestContext context, String keyPath, InputStr
return Response.ok().status(HttpStatus.SC_OK).build();
}
+ String ifNoneMatch = getHeaders().getHeaderString(
Review Comment:
https://issues.apache.org/jira/browse/HDDS-14958
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3022976423
##
hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto:
##
@@ -1176,6 +1185,11 @@ message KeyInfo {
// This allows a key to be created an committed atomically if the original has not
// been modified.
optional uint64 expectedDataGeneration = 22;
+
+// expectedETag, when set, indicates that the existing key must have
+// the given ETag for the operation to succeed. This is used for
+// S3 conditional writes with the If-Match header.
+optional string expectedETag = 23;
Review Comment:
Raised a ticket for this: https://issues.apache.org/jira/browse/HDDS-14957
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on PR #9815:
URL: https://github.com/apache/ozone/pull/9815#issuecomment-4170993997
Thanks @ivandika3, @jojochuang for reviewing!
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
jojochuang commented on PR #9815:
URL: https://github.com/apache/ozone/pull/9815#issuecomment-4170979373
Merged. Thanks @peterxcli and @ivandika3
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
jojochuang merged PR #9815:
URL: https://github.com/apache/ozone/pull/9815
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on PR #9815:
URL: https://github.com/apache/ozone/pull/9815#issuecomment-4168992944
@ivandika3 please take another look, thanks!
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3020567161
##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -271,6 +280,38 @@ Response handlePutRequest(ObjectRequestContext context, String keyPath, InputStr
return Response.ok().status(HttpStatus.SC_OK).build();
}
+ String ifNoneMatch = getHeaders().getHeaderString(
Review Comment:
I’m still thinking about how to refactor this (lines 283-314), as the
conditional request validation logic dominates the function.
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
ivandika3 commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3020529584
##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -281,21 +299,24 @@ Response handlePutRequest(ObjectRequestContext context,
String keyPath, InputStr
getCustomMetadataFromHeaders(getHeaders().getRequestHeaders());
Map tags = getTaggingFromHeaders(getHeaders());
+ boolean hasConditionalHeaders = ifNoneMatch != null || ifMatch != null;
long putLength;
final String md5Hash;
- if (isDatastreamEnabled() && !enableEC && length >
getDatastreamMinLength()) {
+ if (isDatastreamEnabled() && !enableEC
+ && length > getDatastreamMinLength() && !hasConditionalHeaders) {
perf.appendStreamMode();
Pair keyWriteResult = ObjectEndpointStreaming
.put(bucket, keyPath, length, replicationConfig, getChunkSize(),
-customMetadata, tags, multiDigestInputStream, getHeaders(),
signatureInfo.isSignPayload(), perf);
+customMetadata, tags, multiDigestInputStream, getHeaders(),
+signatureInfo.isSignPayload(), perf);
Review Comment:
@peterxcli It seems you need to introduce conditional variants of the stream
APIs (i.e. `createStreamKey`) and either pass the related conditional request
parameters (`ifNoneMatch`, etc.) to `ObjectEndpointStreaming` or reparse them
there.
We also need to find out why the SDK integration test
`TestS3SDKWithRatisStreaming` does not fail for conditional requests when they
are not supported.
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3020538030
##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -281,21 +299,24 @@ Response handlePutRequest(ObjectRequestContext context,
String keyPath, InputStr
getCustomMetadataFromHeaders(getHeaders().getRequestHeaders());
Map tags = getTaggingFromHeaders(getHeaders());
+ boolean hasConditionalHeaders = ifNoneMatch != null || ifMatch != null;
long putLength;
final String md5Hash;
- if (isDatastreamEnabled() && !enableEC && length >
getDatastreamMinLength()) {
+ if (isDatastreamEnabled() && !enableEC
+ && length > getDatastreamMinLength() && !hasConditionalHeaders) {
perf.appendStreamMode();
Pair keyWriteResult = ObjectEndpointStreaming
.put(bucket, keyPath, length, replicationConfig, getChunkSize(),
-customMetadata, tags, multiDigestInputStream, getHeaders(),
signatureInfo.isSignPayload(), perf);
+customMetadata, tags, multiDigestInputStream, getHeaders(),
+signatureInfo.isSignPayload(), perf);
Review Comment:
ok, thanks for the context!
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
ivandika3 commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3020529584
##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -281,21 +299,24 @@ Response handlePutRequest(ObjectRequestContext context,
String keyPath, InputStr
getCustomMetadataFromHeaders(getHeaders().getRequestHeaders());
Map tags = getTaggingFromHeaders(getHeaders());
+ boolean hasConditionalHeaders = ifNoneMatch != null || ifMatch != null;
long putLength;
final String md5Hash;
- if (isDatastreamEnabled() && !enableEC && length >
getDatastreamMinLength()) {
+ if (isDatastreamEnabled() && !enableEC
+ && length > getDatastreamMinLength() && !hasConditionalHeaders) {
perf.appendStreamMode();
Pair keyWriteResult = ObjectEndpointStreaming
.put(bucket, keyPath, length, replicationConfig, getChunkSize(),
-customMetadata, tags, multiDigestInputStream, getHeaders(),
signatureInfo.isSignPayload(), perf);
+customMetadata, tags, multiDigestInputStream, getHeaders(),
+signatureInfo.isSignPayload(), perf);
Review Comment:
@peterxcli It seems you need to introduce conditional variants of the stream
APIs (i.e. `createStreamKey`) and either pass the related conditional request
parameters (`ifNoneMatch`, etc.) to `ObjectEndpointStreaming` or reparse the
headers in `ObjectEndpointStreaming` itself.
We also need to find out why the SDK integration test
`TestS3SDKWithRatisStreaming` does not fail for conditional requests when they
are not supported.
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3019990939
##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -1127,7 +1148,48 @@ private CopyObjectResponse copyObject(OzoneVolume volume,
}
}
}
-
+
+ /**
+ * Opens a key for put, applying conditional write logic based on
+ * If-None-Match and If-Match headers.
+ */
+ @SuppressWarnings("checkstyle:ParameterNumber")
+ private OzoneOutputStream openKeyForPut(String volumeName, String bucketName,
+ OzoneBucket bucket, String keyPath, long length,
+ ReplicationConfig replicationConfig, Map customMetadata,
+ Map tags, String ifNoneMatch, String ifMatch)
+ throws IOException {
+if (ifNoneMatch != null && "*".equals(ifNoneMatch.trim())) {
+ return getClientProtocol().createKeyIfNotExists(
+ volumeName, bucketName, keyPath, length, replicationConfig,
+ customMetadata, tags);
+} else if (ifMatch != null) {
+ String expectedETag = parseETag(ifMatch);
+ return getClientProtocol().rewriteKeyIfMatch(
+ volumeName, bucketName, keyPath, length, expectedETag,
+ replicationConfig, customMetadata, tags);
+} else {
Review Comment:
you're right.
https://github.com/minio/minio/blob/7aac2a2c5b7c882e68c1ce017d8256be2feea27f/cmd/object-handlers-common.go#L343-L350
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
ivandika3 commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3019841629
##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -281,21 +299,24 @@ Response handlePutRequest(ObjectRequestContext context,
String keyPath, InputStr
getCustomMetadataFromHeaders(getHeaders().getRequestHeaders());
Map tags = getTaggingFromHeaders(getHeaders());
+ boolean hasConditionalHeaders = ifNoneMatch != null || ifMatch != null;
long putLength;
final String md5Hash;
- if (isDatastreamEnabled() && !enableEC && length >
getDatastreamMinLength()) {
+ if (isDatastreamEnabled() && !enableEC
+ && length > getDatastreamMinLength() && !hasConditionalHeaders) {
perf.appendStreamMode();
Pair keyWriteResult = ObjectEndpointStreaming
.put(bucket, keyPath, length, replicationConfig, getChunkSize(),
-customMetadata, tags, multiDigestInputStream, getHeaders(),
signatureInfo.isSignPayload(), perf);
+customMetadata, tags, multiDigestInputStream, getHeaders(),
+signatureInfo.isSignPayload(), perf);
Review Comment:
PutObject using streaming write should be able to handle conditional requests,
since the OpenKey and CommitKey metadata operations are the same as in the
non-streaming put (only the data operations differ).
We can then remove the `hasConditionalHeaders` boolean.
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
ivandika3 commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3019861271
##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -1127,7 +1148,43 @@ private CopyObjectResponse copyObject(OzoneVolume volume,
}
}
}
-
+
+ /**
+ * Opens a key for put, applying conditional write logic based on
+ * If-None-Match and If-Match headers.
+ */
+ @SuppressWarnings("checkstyle:ParameterNumber")
+ private OzoneOutputStream openKeyForPut(String volumeName, String
bucketName, String keyPath, long length,
+ ReplicationConfig replicationConfig, Map customMetadata,
+ Map tags, String ifNoneMatch, String ifMatch)
+ throws IOException {
+if (ifNoneMatch != null && "*".equals(stripQuotes(ifNoneMatch.trim()))) {
+ return getClientProtocol().createKeyIfNotExists(
+ volumeName, bucketName, keyPath, length, replicationConfig,
+ customMetadata, tags);
+} else if (ifMatch != null) {
+ String expectedETag = parseETag(ifMatch);
+ return getClientProtocol().rewriteKeyIfMatch(
+ volumeName, bucketName, keyPath, length, expectedETag,
+ replicationConfig, customMetadata, tags);
+} else {
+ return getClientProtocol().createKey(
+ volumeName, bucketName, keyPath, length, replicationConfig,
+ customMetadata, tags);
+}
+ }
+
+ /**
+ * Parses an ETag from a conditional header value, removing surrounding
+ * quotes if present.
+ */
+ static String parseETag(String headerValue) {
+if (headerValue == null) {
+ return null;
+}
+return stripQuotes(headerValue.trim());
+ }
Review Comment:
Can move this to `S3Utils`.
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
ivandika3 commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3019841629
##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -281,21 +299,24 @@ Response handlePutRequest(ObjectRequestContext context,
String keyPath, InputStr
getCustomMetadataFromHeaders(getHeaders().getRequestHeaders());
Map tags = getTaggingFromHeaders(getHeaders());
+ boolean hasConditionalHeaders = ifNoneMatch != null || ifMatch != null;
long putLength;
final String md5Hash;
- if (isDatastreamEnabled() && !enableEC && length >
getDatastreamMinLength()) {
+ if (isDatastreamEnabled() && !enableEC
+ && length > getDatastreamMinLength() && !hasConditionalHeaders) {
perf.appendStreamMode();
Pair keyWriteResult = ObjectEndpointStreaming
.put(bucket, keyPath, length, replicationConfig, getChunkSize(),
-customMetadata, tags, multiDigestInputStream, getHeaders(),
signatureInfo.isSignPayload(), perf);
+customMetadata, tags, multiDigestInputStream, getHeaders(),
+signatureInfo.isSignPayload(), perf);
Review Comment:
PutObject using streaming write should be able to handle conditional requests,
since the OpenKey and CommitKey operations are the same as in the non-streaming
put.
We can then remove the `hasConditionalHeaders` boolean.
##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -271,6 +280,15 @@ Response handlePutRequest(ObjectRequestContext context,
String keyPath, InputStr
return Response.ok().status(HttpStatus.SC_OK).build();
}
+ String ifNoneMatch = getHeaders().getHeaderString(
+ S3Consts.IF_NONE_MATCH_HEADER);
+ String ifMatch = getHeaders().getHeaderString(
+ S3Consts.IF_MATCH_HEADER);
+
+ if (ifNoneMatch != null && ifMatch != null) {
+throw newError(INVALID_REQUEST, keyPath);
+ }
Review Comment:
We can add an error message here, since `INVALID_REQUEST` is used in a lot
of places.
We can also do more validation:
- Fail for a blank header
- Fail if `If-None-Match` is not "*"
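The validation rules suggested above could be sketched roughly as follows. This is only an illustration: `ConditionalPutHeaders`, `Kind`, and the error messages are hypothetical names, not part of the PR, and the sketch assumes a quoted `"*"` should be accepted the same way as a bare `*`.

```java
/** Hypothetical helper for illustration; not part of the Ozone code base. */
final class ConditionalPutHeaders {

  /** Which conditional write, if any, the request asks for. */
  enum Kind { NONE, IF_NONE_MATCH_STAR, IF_MATCH }

  private ConditionalPutHeaders() { }

  /**
   * Validates the raw header values (null means the header is absent).
   * Throws IllegalArgumentException with a specific message; a gateway
   * could translate that into its INVALID_REQUEST error.
   */
  static Kind validate(String ifNoneMatch, String ifMatch) {
    if (ifNoneMatch != null && ifMatch != null) {
      throw new IllegalArgumentException(
          "If-Match and If-None-Match cannot be specified together");
    }
    if (ifNoneMatch != null) {
      // Assumption: a quoted "*" is treated the same as a bare *.
      String value = stripQuotes(ifNoneMatch.trim());
      if (value.isEmpty()) {
        throw new IllegalArgumentException("If-None-Match must not be blank");
      }
      if (!"*".equals(value)) {
        throw new IllegalArgumentException(
            "If-None-Match only supports the value *");
      }
      return Kind.IF_NONE_MATCH_STAR;
    }
    if (ifMatch != null) {
      if (stripQuotes(ifMatch.trim()).isEmpty()) {
        throw new IllegalArgumentException("If-Match must not be blank");
      }
      return Kind.IF_MATCH;
    }
    return Kind.NONE;
  }

  /** Removes one pair of surrounding double quotes, if present. */
  static String stripQuotes(String value) {
    if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
      return value.substring(1, value.length() - 1);
    }
    return value;
  }
}
```

Returning a `Kind` keeps the decision in one place, so the caller can switch on it instead of re-inspecting the raw headers.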
##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -1127,7 +1148,48 @@ private CopyObjectResponse copyObject(OzoneVolume volume,
}
}
}
-
+
+ /**
+ * Opens a key for put, applying conditional write logic based on
+ * If-None-Match and If-Match headers.
+ */
+ @SuppressWarnings("checkstyle:ParameterNumber")
+ private OzoneOutputStream openKeyForPut(String volumeName, String bucketName,
+ OzoneBucket bucket, String keyPath, long length,
+ ReplicationConfig replicationConfig, Map customMetadata,
+ Map tags, String ifNoneMatch, String ifMatch)
+ throws IOException {
+if (ifNoneMatch != null && "*".equals(ifNoneMatch.trim())) {
+ return getClientProtocol().createKeyIfNotExists(
+ volumeName, bucketName, keyPath, length, replicationConfig,
+ customMetadata, tags);
+} else if (ifMatch != null) {
+ String expectedETag = parseETag(ifMatch);
+ return getClientProtocol().rewriteKeyIfMatch(
+ volumeName, bucketName, keyPath, length, expectedETag,
+ replicationConfig, customMetadata, tags);
+} else {
Review Comment:
This seems to be valid behavior according to RFC 7232, section 3.1
(https://datatracker.ietf.org/doc/html/rfc7232#section-3.1). Should we
implement this?
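The quoted diff does not show which case this comment refers to. Assuming it is the `If-Match: *` form, RFC 7232 section 3.1 says the condition holds when the field value is `*` and a current representation exists, or when any listed entity-tag matches under strong comparison. A minimal sketch of that evaluation, with the hypothetical names `IfMatchEvaluator` and `matches`, could look like this:

```java
/** Hypothetical illustration of RFC 7232 section 3.1; not the PR's code. */
final class IfMatchEvaluator {

  private IfMatchEvaluator() { }

  /**
   * @param ifMatchHeader raw If-Match header value (non-null)
   * @param currentETag   unquoted ETag of the existing key, or null if the
   *                      key does not exist
   * @return true if the precondition holds
   */
  static boolean matches(String ifMatchHeader, String currentETag) {
    if (currentETag == null) {
      // No current representation: neither "*" nor any tag can match.
      return false;
    }
    String value = ifMatchHeader.trim();
    if ("*".equals(value)) {
      return true;
    }
    // Simplification: splitting on ',' assumes ETags contain no commas,
    // which holds for hex-digest style ETags but not for every legal tag.
    for (String candidate : value.split(",")) {
      String tag = candidate.trim();
      if (tag.startsWith("W/")) {
        // Weak validators never match under strong comparison.
        continue;
      }
      if (stripQuotes(tag).equals(currentETag)) {
        return true;
      }
    }
    return false;
  }

  /** Removes one pair of surrounding double quotes, if present. */
  static String stripQuotes(String value) {
    if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
      return value.substring(1, value.length() - 1);
    }
    return value;
  }
}
```

Whether the gateway should accept a list of tags at all, or only `*` and a single ETag, is a separate design choice that this sketch does not settle.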
##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -1127,7 +1148,43 @@ private CopyObjectResponse copyObject(OzoneVolume volume,
}
}
}
-
+
+ /**
+ * Opens a key for put, applying conditional write logic based on
+ * If-None-Match and If-Match headers.
+ */
+ @SuppressWarnings("checkstyle:ParameterNumber")
+ private OzoneOutputStream openKeyForPut(String volumeName, String
bucketName, String keyPath, long length,
+ ReplicationConfig replicationConfig, Map customMetadata,
+ Map tags, String ifNoneMatch, String ifMatch)
+ throws IOException {
+if (ifNoneMatch != null && "*".equals(stripQuotes(ifNoneMatch.trim()))) {
+ return getClientProtocol().createKeyIfNotExists(
+ volumeName, bucketName, keyPath, length, replicationConfig,
+ customMetadata, tags);
+} else if (ifMatch != null) {
+ String expectedETag = parseETag(ifMatch);
+ return getClientProtocol().rewriteKeyIfMatch(
+ volumeName, bucketName, keyPath, length, expectedETag,
+ replicationConfig, customMetadata, tags);
+} else {
+ return getClientProtocol().createKey(
+ volumeName, bucketName, keyPath, length, replicationConfig,
+ customMetadata, tags);
+}
+ }
+
+ /**
+ * Parses an ETag from a conditional
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3019238708
##
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##
@@ -1435,6 +1435,68 @@ public OzoneOutputStream rewriteKey(String volumeName,
String bucketName, String
return createOutputStream(openKey);
}
+ @Override
+ public OzoneOutputStream createKeyIfNotExists(String volumeName,
+ String bucketName, String keyName, long size,
+ ReplicationConfig replicationConfig, Map metadata,
+ Map tags) throws IOException {
+if (omVersion.compareTo(OzoneManagerVersion.ATOMIC_REWRITE_KEY) < 0) {
+ throw new IOException(
+ "OzoneManager does not support atomic key creation.");
+}
+
+createKeyPreChecks(volumeName, bucketName, keyName, replicationConfig);
+
+OmKeyArgs.Builder builder = new OmKeyArgs.Builder()
+.setVolumeName(volumeName)
+.setBucketName(bucketName)
+.setKeyName(keyName)
+.setDataSize(size)
+.setReplicationConfig(replicationConfig)
+.addAllMetadataGdpr(metadata)
+.addAllTags(tags)
+.setLatestVersionLocation(getLatestVersionLocation)
+.setExpectedDataGeneration(
+OzoneConsts.EXPECTED_GEN_CREATE_IF_NOT_EXISTS);
+
+OpenKeySession openKey = ozoneManagerClient.openKey(builder.build());
+if (isS3GRequest.get() && size == 0) {
+ openKey.getKeyInfo().setDataSize(0);
+}
+return createOutputStream(openKey);
+ }
+
+ @Override
+ @SuppressWarnings("checkstyle:parameternumber")
+ public OzoneOutputStream rewriteKeyIfMatch(String volumeName,
Review Comment:
I refactored the create/rewrite key requests to share the same keyArgs
builder, but I kept the RpcClient interface methods separate by responsibility,
e.g. `rewriteKey()` for generation-based CAS and `rewriteKeyIfMatch()` for
ETag-based CAS.
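The shape of that refactoring could be sketched as below: one private builder shared by four thin public entry points. `KeyArgsSketch` is a stand-in for the real `OmKeyArgs`, and `CREATE_IF_NOT_EXISTS` is a placeholder sentinel rather than the actual `OzoneConsts` value, which is not shown in this thread.

```java
/** Stand-in for OmKeyArgs; every name here is hypothetical. */
final class KeyArgsSketch {

  // Placeholder sentinel; the real constant's value is not shown here.
  static final long CREATE_IF_NOT_EXISTS = -1L;

  final String keyName;
  final long size;
  final Long expectedDataGeneration; // null = no generation condition
  final String expectedETag;         // null = no ETag condition

  private KeyArgsSketch(String keyName, long size,
      Long expectedDataGeneration, String expectedETag) {
    this.keyName = keyName;
    this.size = size;
    this.expectedDataGeneration = expectedDataGeneration;
    this.expectedETag = expectedETag;
  }

  /** The single shared builder that every entry point goes through. */
  private static KeyArgsSketch build(String keyName, long size,
      Long expectedDataGeneration, String expectedETag) {
    return new KeyArgsSketch(keyName, size, expectedDataGeneration,
        expectedETag);
  }

  /** Unconditional create. */
  static KeyArgsSketch createKey(String keyName, long size) {
    return build(keyName, size, null, null);
  }

  /** Create only if the key is absent (If-None-Match: *). */
  static KeyArgsSketch createKeyIfNotExists(String keyName, long size) {
    return build(keyName, size, CREATE_IF_NOT_EXISTS, null);
  }

  /** Generation-based compare-and-swap. */
  static KeyArgsSketch rewriteKey(String keyName, long size,
      long generation) {
    return build(keyName, size, generation, null);
  }

  /** ETag-based compare-and-swap (If-Match). */
  static KeyArgsSketch rewriteKeyIfMatch(String keyName, long size,
      String expectedETag) {
    return build(keyName, size, null, expectedETag);
  }
}
```

Keeping the public methods separate preserves a clear contract per condition type, while the shared builder removes the duplicated argument plumbing the reviewers pointed out.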
##
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##
@@ -1435,6 +1435,68 @@ public OzoneOutputStream rewriteKey(String volumeName,
String bucketName, String
return createOutputStream(openKey);
}
+ @Override
+ public OzoneOutputStream createKeyIfNotExists(String volumeName,
Review Comment:
https://github.com/apache/ozone/pull/9815#discussion_r3019238708
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3017185075
##
hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto:
##
@@ -1176,6 +1185,11 @@ message KeyInfo {
// This allows a key to be created an committed atomically if the original has not
// been modified.
optional uint64 expectedDataGeneration = 22;
+
+// expectedETag, when set, indicates that the existing key must have
+// the given ETag for the operation to succeed. This is used for
+// S3 conditional writes with the If-Match header.
+optional string expectedETag = 23;
Review Comment:
Good point. Let's refactor `expectedDataGeneration` and `expectedETag`
into a dedicated proto message as a follow-up.
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
ivandika3 commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3013645782
##
hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto:
##
@@ -1176,6 +1185,11 @@ message KeyInfo {
// This allows a key to be created an committed atomically if the original has not
// been modified.
optional uint64 expectedDataGeneration = 22;
+
+// expectedETag, when set, indicates that the existing key must have
+// the given ETag for the operation to succeed. This is used for
+// S3 conditional writes with the If-Match header.
+optional string expectedETag = 23;
Review Comment:
Hm, I don't recall that we need to add `expectedDataGeneration` and
`expectedETag` to `KeyInfo`. Adding the two fields to `KeyInfo` requires
additional logic to null them during commit to prevent space overhead, and
carrying two fields that will always be null in the `keyTable` does not seem
right. However, I don't think we can avoid it: without setting the fields in
the open key, concurrent PutObjects might violate the serial consistency
guarantee. Ideally, we should use a separate `OpenKeyInfo` for the
`openKeyTable` to differentiate it from the final `KeyInfo` in the `keyTable`,
but I guess that ship sailed a long time ago.
Please let me know what you think. I'm OK with this if there is no other way.
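To illustrate the concurrency concern, here is a toy model in which the expectation is stored with the open key and re-checked at commit time. It assumes the check happens at commit; the thread does not spell out exactly where the Ozone Manager enforces it, and `CommitTimeCheckSketch`, `OpenKey`, and the map-based key table are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of an open-key/commit flow; not the Ozone Manager's code. */
final class CommitTimeCheckSketch {

  /** What an open key entry would need to remember until commit. */
  static final class OpenKey {
    final String keyName;
    final String newETag;
    final String expectedETag; // null = unconditional write

    OpenKey(String keyName, String newETag, String expectedETag) {
      this.keyName = keyName;
      this.newETag = newETag;
      this.expectedETag = expectedETag;
    }
  }

  // key name -> ETag of the committed version
  private final Map<String, String> keyTable = new HashMap<>();

  /** Opening records the expectation but does not change the key table. */
  OpenKey open(String keyName, String newETag, String expectedETag) {
    return new OpenKey(keyName, newETag, expectedETag);
  }

  /** Commit succeeds only if the stored expectation still holds. */
  synchronized boolean commit(OpenKey open) {
    String current = keyTable.get(open.keyName);
    if (open.expectedETag != null && !open.expectedETag.equals(current)) {
      return false; // another writer committed first
    }
    keyTable.put(open.keyName, open.newETag);
    return true;
  }

  synchronized String currentETag(String keyName) {
    return keyTable.get(keyName);
  }
}
```

If two writers both open the key expecting ETag `v1`, only the first commit succeeds. Had the expectation not been carried in the open key, the second commit would have had nothing to compare against and would silently overwrite the first.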
##
hadoop-ozone/integration-test-s3/src/test/java/org/apache/hadoop/ozone/s3/awssdk/v1/AbstractS3SDKV1Tests.java:
##
@@ -377,6 +377,103 @@ public void testPutObject() {
assertEquals("37b51d194a7513e45b56f6524f2d51f2",
putObjectResult.getETag());
}
+ @Test
+ public void testPutObjectIfNoneMatch() {
+final String bucketName = getBucketName();
+final String keyName = getKeyName();
+final String content = "bar";
+s3Client.createBucket(bucketName);
+
+InputStream is = new
ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
+ObjectMetadata metadata = new ObjectMetadata();
+metadata.setHeader("If-None-Match", "*");
+
+PutObjectResult putObjectResult = s3Client.putObject(bucketName, keyName,
is, metadata);
+assertEquals("37b51d194a7513e45b56f6524f2d51f2",
putObjectResult.getETag());
+ }
+
+ @Test
+ public void testPutObjectIfNoneMatchFail() {
+final String bucketName = getBucketName();
+final String keyName = getKeyName();
+final String content = "bar";
+s3Client.createBucket(bucketName);
+
+InputStream is = new
ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
+s3Client.putObject(bucketName, keyName, is, new ObjectMetadata());
+
+InputStream is2 = new
ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
+ObjectMetadata metadata = new ObjectMetadata();
+metadata.setHeader("If-None-Match", "*");
+
+AmazonServiceException ase = assertThrows(AmazonServiceException.class,
+() -> s3Client.putObject(bucketName, keyName, is2, metadata));
+
+assertEquals(ErrorType.Client, ase.getErrorType());
+assertEquals(412, ase.getStatusCode());
+assertEquals("PreconditionFailed", ase.getErrorCode());
+ }
Review Comment:
Nit: Please also add the post-validation suggested in the acceptance
tests to the relevant integration test methods (V1 and V2). Thanks.
##
hadoop-ozone/dist/src/main/smoketest/s3/conditionalput.robot:
##
@@ -0,0 +1,77 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+*** Settings ***
+Documentation       S3 Conditional Put (If-None-Match / If-Match) tests
+Library             OperatingSystem
+Library             String
+Library             Process
+Resource            ../commonlib.robot
+Resource            ./commonawslib.robot
+Test Timeout        5 minutes
+Suite Setup         Setup s3 tests
+
+*** Variables ***
+${ENDPOINT_URL}     http://s3g:9878
+${BUCKET}           generated
+
+*** Test Cases ***
+
+Conditional Put If-None-Match Star Creates New Key
+    [Documentation]    If-None-Match: * should succeed when key does not exist
+    ${key} =           Set Variable    condput-ifnonematch-new
+    Execute            echo "test-content" > /tmp/${key}
+${r
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
jojochuang commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3012857605
##
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##
@@ -1435,6 +1435,68 @@ public OzoneOutputStream rewriteKey(String volumeName,
String bucketName, String
return createOutputStream(openKey);
}
+ @Override
+ public OzoneOutputStream createKeyIfNotExists(String volumeName,
+ String bucketName, String keyName, long size,
+ ReplicationConfig replicationConfig, Map metadata,
+ Map tags) throws IOException {
+if (omVersion.compareTo(OzoneManagerVersion.ATOMIC_REWRITE_KEY) < 0) {
+ throw new IOException(
+ "OzoneManager does not support atomic key creation.");
+}
+
+createKeyPreChecks(volumeName, bucketName, keyName, replicationConfig);
+
+OmKeyArgs.Builder builder = new OmKeyArgs.Builder()
+.setVolumeName(volumeName)
+.setBucketName(bucketName)
+.setKeyName(keyName)
+.setDataSize(size)
+.setReplicationConfig(replicationConfig)
+.addAllMetadataGdpr(metadata)
+.addAllTags(tags)
+.setLatestVersionLocation(getLatestVersionLocation)
+.setExpectedDataGeneration(
+OzoneConsts.EXPECTED_GEN_CREATE_IF_NOT_EXISTS);
+
+OpenKeySession openKey = ozoneManagerClient.openKey(builder.build());
+if (isS3GRequest.get() && size == 0) {
+ openKey.getKeyInfo().setDataSize(0);
+}
+return createOutputStream(openKey);
+ }
+
+ @Override
+ @SuppressWarnings("checkstyle:parameternumber")
+ public OzoneOutputStream rewriteKeyIfMatch(String volumeName,
Review Comment:
looks like we can refactor & combine rewriteKey() and rewriteKeyIfMatch()
##
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##
@@ -1435,6 +1435,68 @@ public OzoneOutputStream rewriteKey(String volumeName,
String bucketName, String
return createOutputStream(openKey);
}
+ @Override
+ public OzoneOutputStream createKeyIfNotExists(String volumeName,
Review Comment:
this method looks very much the same as createKey(), except for the addition of
the expectedDataGeneration field. We should refactor & merge them into the same
method.
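A rough sketch of what such a merge could look like, using toy stand-in types rather than the real `OmKeyArgs`/`RpcClient` classes. Every name below (`KeyOpenSketch`, `Request`, `open`) is hypothetical; only the `-1` marker value comes from the design doc. The public variants differ only in the condition they pass to one shared code path:

```java
import java.util.OptionalLong;

/** Toy stand-in for the client's open-key path; all names are hypothetical. */
final class KeyOpenSketch {
  /** Marker value for create-if-not-exists, as described in the design doc. */
  static final long CREATE_IF_NOT_EXISTS = -1L;

  /** Immutable description of one open-key request. */
  static final class Request {
    final String keyName;
    final long size;
    final OptionalLong expectedGeneration; // empty: no generation condition
    final String expectedETag;             // null: no ETag condition

    Request(String keyName, long size, OptionalLong expectedGeneration,
        String expectedETag) {
      this.keyName = keyName;
      this.size = size;
      this.expectedGeneration = expectedGeneration;
      this.expectedETag = expectedETag;
    }
  }

  private KeyOpenSketch() { }

  /** Single shared path: pre-checks and request building live here only. */
  private static Request open(String keyName, long size,
      OptionalLong expectedGeneration, String expectedETag) {
    if (keyName == null || keyName.isEmpty()) {
      throw new IllegalArgumentException("key name is required");
    }
    return new Request(keyName, size, expectedGeneration, expectedETag);
  }

  static Request createKey(String keyName, long size) {
    return open(keyName, size, OptionalLong.empty(), null);
  }

  static Request createKeyIfNotExists(String keyName, long size) {
    return open(keyName, size, OptionalLong.of(CREATE_IF_NOT_EXISTS), null);
  }

  static Request rewriteKey(String keyName, long size, long generation) {
    return open(keyName, size, OptionalLong.of(generation), null);
  }

  static Request rewriteKeyIfMatch(String keyName, long size, String eTag) {
    return open(keyName, size, OptionalLong.empty(), eTag);
  }

  public static void main(String[] args) {
    Request r = createKeyIfNotExists("k1", 10);
    System.out.println(r.keyName + " gen=" + r.expectedGeneration.getAsLong());
  }
}
```

The same shape would also cover the rewriteKey()/rewriteKeyIfMatch() pair mentioned above; whether the real methods can share pre-checks this cleanly is an assumption.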
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
Copilot commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3012836701
##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -1127,7 +1148,48 @@ private CopyObjectResponse copyObject(OzoneVolume volume,
}
}
}
-
+
+ /**
+ * Opens a key for put, applying conditional write logic based on
+ * If-None-Match and If-Match headers.
+ */
+ @SuppressWarnings("checkstyle:ParameterNumber")
+ private OzoneOutputStream openKeyForPut(String volumeName, String bucketName,
+ OzoneBucket bucket, String keyPath, long length,
+ ReplicationConfig replicationConfig, Map<String, String> customMetadata,
+ Map<String, String> tags, String ifNoneMatch, String ifMatch)
+ throws IOException {
+if (ifNoneMatch != null && "*".equals(ifNoneMatch.trim())) {
+ return getClientProtocol().createKeyIfNotExists(
+ volumeName, bucketName, keyPath, length, replicationConfig,
+ customMetadata, tags);
Review Comment:
If-None-Match handling only recognizes an unquoted `*` and otherwise
silently falls back to an unconditional createKey(). Per the design doc,
clients may send `If-None-Match: "*"`; in that case this code would incorrectly
allow overwrites. Consider normalizing/parsing the header value (similar to
parseETag) and rejecting unsupported If-None-Match values instead of ignoring
them.
```suggestion
    if (ifNoneMatch != null) {
      // Normalize the If-None-Match header value, accepting both quoted and
      // unquoted "*" and rejecting any other value instead of silently
      // falling back to an unconditional createKey().
      String normalizedIfNoneMatch = parseETag(ifNoneMatch);
      if ("*".equals(normalizedIfNoneMatch)) {
        return getClientProtocol().createKeyIfNotExists(
            volumeName, bucketName, keyPath, length, replicationConfig,
            customMetadata, tags);
      } else {
        throw new OS3Exception(
            PRECOND_FAILED,
            "Unsupported If-None-Match header value",
            HttpStatus.SC_PRECONDITION_FAILED);
      }
```
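For illustration, the normalization described above can be tried in isolation. This is a standalone sketch with hypothetical names (`IfNoneMatchParser`, `isCreateOnly`); it does not use the gateway's `OS3Exception` or `S3Utils` and throws a plain `IllegalArgumentException` instead. The length check is an addition of this sketch so that a value consisting of a single `"` character is left alone rather than passed to `substring(1, 0)`:

```java
/** Hypothetical helper; names are illustrative and not part of the PR. */
final class IfNoneMatchParser {
  private IfNoneMatchParser() { }

  /** Trims the value and strips one pair of surrounding double quotes. */
  static String stripQuotes(String value) {
    String v = value.trim();
    if (v.length() >= 2 && v.startsWith("\"") && v.endsWith("\"")) {
      return v.substring(1, v.length() - 1);
    }
    return v;
  }

  /**
   * Returns true when the header asks for create-only semantics, false when
   * the header is absent, and throws for any other value so that it is never
   * silently treated as an unconditional put.
   */
  static boolean isCreateOnly(String ifNoneMatch) {
    if (ifNoneMatch == null) {
      return false;
    }
    if ("*".equals(stripQuotes(ifNoneMatch))) {
      return true;
    }
    throw new IllegalArgumentException(
        "Unsupported If-None-Match value: " + ifNoneMatch);
  }

  public static void main(String[] args) {
    System.out.println(isCreateOnly("*"));
    System.out.println(isCreateOnly(" \"*\" "));
    System.out.println(isCreateOnly(null));
  }
}
```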
##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -1127,7 +1148,48 @@ private CopyObjectResponse copyObject(OzoneVolume volume,
}
}
}
-
+
+ /**
+ * Opens a key for put, applying conditional write logic based on
+ * If-None-Match and If-Match headers.
+ */
+ @SuppressWarnings("checkstyle:ParameterNumber")
+ private OzoneOutputStream openKeyForPut(String volumeName, String bucketName,
+ OzoneBucket bucket, String keyPath, long length,
Review Comment:
The openKeyForPut(...) helper takes an OzoneBucket parameter but does not
use it. Removing the unused parameter will simplify the signature and avoid
confusion about whether bucket-specific info is required for conditional puts.
```suggestion
String keyPath, long length,
```
##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -1127,7 +1148,48 @@ private CopyObjectResponse copyObject(OzoneVolume volume,
}
}
}
-
+
+ /**
+ * Opens a key for put, applying conditional write logic based on
+ * If-None-Match and If-Match headers.
+ */
+ @SuppressWarnings("checkstyle:ParameterNumber")
+ private OzoneOutputStream openKeyForPut(String volumeName, String bucketName,
+ OzoneBucket bucket, String keyPath, long length,
+ ReplicationConfig replicationConfig, Map<String, String> customMetadata,
+ Map<String, String> tags, String ifNoneMatch, String ifMatch)
+ throws IOException {
+if (ifNoneMatch != null && "*".equals(ifNoneMatch.trim())) {
+ return getClientProtocol().createKeyIfNotExists(
+ volumeName, bucketName, keyPath, length, replicationConfig,
+ customMetadata, tags);
+} else if (ifMatch != null) {
+ String expectedETag = parseETag(ifMatch);
+ return getClientProtocol().rewriteKeyIfMatch(
+ volumeName, bucketName, keyPath, length, expectedETag,
+ replicationConfig, customMetadata, tags);
+} else {
+ return getClientProtocol().createKey(
+ volumeName, bucketName, keyPath, length, replicationConfig,
+ customMetadata, tags);
+}
+ }
+
+ /**
+ * Parses an ETag from a conditional header value, removing surrounding
+ * quotes if present.
+ */
+ static String parseETag(String headerValue) {
+if (headerValue == null) {
+ return null;
+}
+String etag = headerValue.trim();
+if (etag.startsWith("\"") && etag.endsWith("\"")) {
+ return etag.substring(1, etag.length() - 1);
+}
+return etag;
Review Comment:
parseETag duplicates quote-stripping logic that already exists as
S3Utils.stripQuotes (and is statically imported in this class). Consider
reusing stripQuotes(headerValue.trim()) to keep ETag n
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
jojochuang commented on PR #9815:
URL: https://github.com/apache/ozone/pull/9815#issuecomment-4144195448
Please rebase @peterxcli
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r2999725804
##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -187,8 +188,18 @@ public Response put(
throw newError(S3ErrorTable.NO_SUCH_BUCKET, bucketName, ex);
} else if (ex.getResult() == ResultCodes.FILE_ALREADY_EXISTS) {
throw newError(S3ErrorTable.NO_OVERWRITE, keyPath, ex);
+ } else if (ex.getResult() == ResultCodes.KEY_ALREADY_EXISTS) {
+throw newError(PRECOND_FAILED, keyPath, ex);
Review Comment:
I think the `ex` we are using here to create the new error already contains
the error info from OM.
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r2986933629
##
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##
@@ -1428,6 +1428,68 @@ public OzoneOutputStream rewriteKey(String volumeName,
String bucketName, String
return createOutputStream(openKey);
}
+ @Override
+ public OzoneOutputStream createKeyIfNotExists(String volumeName,
+ String bucketName, String keyName, long size,
+ ReplicationConfig replicationConfig, Map<String, String> metadata,
+ Map<String, String> tags) throws IOException {
+if (omVersion.compareTo(OzoneManagerVersion.ATOMIC_REWRITE_KEY) < 0) {
+ throw new IOException(
+ "OzoneManager does not support atomic key creation.");
+}
+
+createKeyPreChecks(volumeName, bucketName, keyName, replicationConfig);
+
+OmKeyArgs.Builder builder = new OmKeyArgs.Builder()
+.setVolumeName(volumeName)
+.setBucketName(bucketName)
+.setKeyName(keyName)
+.setDataSize(size)
+.setReplicationConfig(replicationConfig)
+.addAllMetadataGdpr(metadata)
+.addAllTags(tags)
+.setLatestVersionLocation(getLatestVersionLocation)
+.setExpectedDataGeneration(
+OzoneConsts.EXPECTED_GEN_CREATE_IF_NOT_EXISTS);
+
+OpenKeySession openKey = ozoneManagerClient.openKey(builder.build());
+if (isS3GRequest.get() && size == 0) {
+ openKey.getKeyInfo().setDataSize(0);
+}
+return createOutputStream(openKey);
+ }
+
+ @Override
+ @SuppressWarnings("checkstyle:parameternumber")
+ public OzoneOutputStream rewriteKeyIfMatch(String volumeName,
+ String bucketName, String keyName, long size, String expectedETag,
+ ReplicationConfig replicationConfig, Map<String, String> metadata,
+ Map<String, String> tags) throws IOException {
+if (omVersion.compareTo(OzoneManagerVersion.ATOMIC_REWRITE_KEY) < 0) {
+ throw new IOException(
+ "OzoneManager does not support conditional key rewrite.");
+}
+
+createKeyPreChecks(volumeName, bucketName, keyName, replicationConfig);
+
+OmKeyArgs.Builder builder = new OmKeyArgs.Builder()
+.setVolumeName(volumeName)
+.setBucketName(bucketName)
+.setKeyName(keyName)
+.setDataSize(size)
+.setReplicationConfig(replicationConfig)
+.addAllMetadataGdpr(metadata)
+.addAllTags(tags)
+.setLatestVersionLocation(getLatestVersionLocation)
+.setExpectedETag(expectedETag);
Review Comment:
no, please take a look at the design doc:
https://github.com/apache/ozone/blob/5e5243eca2a28ba5127be5e10bba97a99adf9d52/hadoop-hdds/docs/content/design/s3-conditional-requests.md?plain=1#L144-L167
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
jojochuang commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r2898576271
##
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/exceptions/OMException.java:
##
@@ -267,13 +267,17 @@ public enum ResultCodes {
UNAUTHORIZED,
S3_SECRET_ALREADY_EXISTS,
-
+
Review Comment:
please do not commit unrelated changes.
##
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##
@@ -1428,6 +1428,68 @@ public OzoneOutputStream rewriteKey(String volumeName,
String bucketName, String
return createOutputStream(openKey);
}
+ @Override
+ public OzoneOutputStream createKeyIfNotExists(String volumeName,
+ String bucketName, String keyName, long size,
+ ReplicationConfig replicationConfig, Map<String, String> metadata,
+ Map<String, String> tags) throws IOException {
+if (omVersion.compareTo(OzoneManagerVersion.ATOMIC_REWRITE_KEY) < 0) {
+ throw new IOException(
+ "OzoneManager does not support atomic key creation.");
+}
+
+createKeyPreChecks(volumeName, bucketName, keyName, replicationConfig);
+
+OmKeyArgs.Builder builder = new OmKeyArgs.Builder()
+.setVolumeName(volumeName)
+.setBucketName(bucketName)
+.setKeyName(keyName)
+.setDataSize(size)
+.setReplicationConfig(replicationConfig)
+.addAllMetadataGdpr(metadata)
+.addAllTags(tags)
+.setLatestVersionLocation(getLatestVersionLocation)
+.setExpectedDataGeneration(
+OzoneConsts.EXPECTED_GEN_CREATE_IF_NOT_EXISTS);
+
+OpenKeySession openKey = ozoneManagerClient.openKey(builder.build());
+if (isS3GRequest.get() && size == 0) {
+ openKey.getKeyInfo().setDataSize(0);
+}
+return createOutputStream(openKey);
+ }
+
+ @Override
+ @SuppressWarnings("checkstyle:parameternumber")
+ public OzoneOutputStream rewriteKeyIfMatch(String volumeName,
+ String bucketName, String keyName, long size, String expectedETag,
+ ReplicationConfig replicationConfig, Map<String, String> metadata,
+ Map<String, String> tags) throws IOException {
+if (omVersion.compareTo(OzoneManagerVersion.ATOMIC_REWRITE_KEY) < 0) {
+ throw new IOException(
+ "OzoneManager does not support conditional key rewrite.");
+}
+
+createKeyPreChecks(volumeName, bucketName, keyName, replicationConfig);
+
+OmKeyArgs.Builder builder = new OmKeyArgs.Builder()
+.setVolumeName(volumeName)
+.setBucketName(bucketName)
+.setKeyName(keyName)
+.setDataSize(size)
+.setReplicationConfig(replicationConfig)
+.addAllMetadataGdpr(metadata)
+.addAllTags(tags)
+.setLatestVersionLocation(getLatestVersionLocation)
+.setExpectedETag(expectedETag);
Review Comment:
Does this need to call setExpectedDataGeneration here?
##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -187,8 +188,18 @@ public Response put(
throw newError(S3ErrorTable.NO_SUCH_BUCKET, bucketName, ex);
} else if (ex.getResult() == ResultCodes.FILE_ALREADY_EXISTS) {
throw newError(S3ErrorTable.NO_OVERWRITE, keyPath, ex);
+ } else if (ex.getResult() == ResultCodes.KEY_ALREADY_EXISTS) {
+throw newError(PRECOND_FAILED, keyPath, ex);
Review Comment:
would you like to consider having different error messages for different
cases?
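One possible shape for per-case messages, as a standalone sketch. `KEY_ALREADY_EXISTS` appears in the diff above; `ETAG_MISMATCH`, the class name, and the message wording are hypothetical and only illustrate the idea. All three cases would still map to `412 Precondition Failed`, as the design doc describes:

```java
/** Hypothetical mapping from a failure cause to a client-facing message. */
final class ConditionalPutErrors {
  /** KEY_ALREADY_EXISTS is from the diff; ETAG_MISMATCH is an assumed name. */
  enum Result { KEY_ALREADY_EXISTS, ETAG_MISMATCH, KEY_NOT_FOUND }

  /** Every conditional-write failure is reported as 412. */
  static final int HTTP_PRECONDITION_FAILED = 412;

  private ConditionalPutErrors() { }

  /** Builds a distinct message per failure cause for the same status code. */
  static String message(Result result, String keyPath) {
    switch (result) {
      case KEY_ALREADY_EXISTS:
        return "If-None-Match: * failed, key already exists: " + keyPath;
      case ETAG_MISMATCH:
        return "If-Match failed, ETag does not match for key: " + keyPath;
      case KEY_NOT_FOUND:
        return "If-Match failed, key does not exist: " + keyPath;
      default:
        throw new IllegalArgumentException("Unexpected result: " + result);
    }
  }

  public static void main(String[] args) {
    for (Result r : Result.values()) {
      System.out.println(HTTP_PRECONDITION_FAILED + " " + message(r, "dir/key1"));
    }
  }
}
```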
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
github-actions[bot] commented on PR #9334:
URL: https://github.com/apache/ozone/pull/9334#issuecomment-3726444723
This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days.
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
ivandika3 commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2629162910
##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. This includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+
+- **Atomic key rewrites**: Prevent race conditions when updating existing objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing store
+
+### Conditional Reads
+
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+
+- **Atomic copy operations**: Copy only if source/destination meets specific conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## Specification
+
+### AWS S3 Conditional Write Specification
+
+ If-None-Match Header
+
+```
+If-None-Match: "*"
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+ If-Match Header
+
+```
+If-Match: ""
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+ Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### AWS S3 Conditional Read Specification
+
+TODO
+
+### AWS S3 Conditional Copy Specification
+
+TODO
+
+## Implementation
+
+### AWS S3 Conditional Write Implementation
+
+The implementation aims to minimize Redundant RPCs (RTT) while ensuring strict atomicity for conditional operations.
+
+- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability ([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963 "null")).
+- **If-Match** optimizes the happy path by pushing ETag validation directly into the Ozone Manager's write path, avoiding preliminary read operations.
+
+ If-None-Match Implementation
+
+This implementation ensures strict create-only semantics by utilizing a specific generation ID marker.
+
+In `OzoneConsts.java`, add the `-1` as a constant for readability:
+```java
+/**
+ * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" semantics.
+ * When used with If-None-Match conditional requests, this ensures atomicity:
+ * if a concurrent write commits between Create and Commit phases, the commit
+ * fails the validation check, preserving strict create-if-not-exists semantics.
+ */
+public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L;
+```
+
+# S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+3. Call `RpcClient.rewriteKey()`.
+
+# OM Create Phase
+
+1. OM receives request with `expectedDataGeneration == OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+2. **Pre-check**: If key is already in the OpenKeyTable or KeyTable, throw `KEY_ALREADY_EXISTS`.
+3. If not exists, proceed to create the open key entry.
+
+# OM Commit Phase (Atomicity)
+
+1. During the commit phase (or strict atomic create), the OM validates that the key still does not exist.
+2. If a concurrent client created the key between the Create and Commit phases, the transaction fails with `KET_GENERATION_MISMATCH`.
+
+# Race Condition Handling
+
+Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures atomicity. If a concurrent write (Client B) commits between Client A's Create and Commit,
+Client A's commit fails the `CREATE IF NOT EXISTS` validation check, preserving strict create-if-not-exists
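
The create/commit race described in the quoted design doc can be pictured with a small in-memory model. This is a toy sketch under stated assumptions, not the Ozone Manager code: the class and method names are hypothetical, and the open step deliberately checks only the committed table (the doc also checks the OpenKeyTable) so that two clients can both reach the commit phase and the commit-time validation becomes visible:

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of create-if-not-exists with commit-time validation. */
final class CreateIfNotExistsSketch {
  private final Map<String, String> keyTable = new HashMap<>();
  private final Map<Long, String> openKeys = new HashMap<>(); // session -> key
  private long nextSession = 1;

  /** Create phase: rejects the request if the key is already committed. */
  synchronized long open(String key) {
    if (keyTable.containsKey(key)) {
      throw new IllegalStateException("KEY_ALREADY_EXISTS: " + key);
    }
    long session = nextSession++;
    openKeys.put(session, key);
    return session;
  }

  /**
   * Commit phase: re-validates that the key still does not exist.
   * Returns false when another client committed first.
   */
  synchronized boolean commit(long session, String data) {
    String key = openKeys.remove(session);
    if (key == null) {
      throw new IllegalArgumentException("Unknown session: " + session);
    }
    if (keyTable.containsKey(key)) {
      return false; // lost the race: a concurrent writer committed first
    }
    keyTable.put(key, data);
    return true;
  }

  /** Returns the committed value, or null if the key is absent. */
  synchronized String get(String key) {
    return keyTable.get(key);
  }

  public static void main(String[] args) {
    CreateIfNotExistsSketch om = new CreateIfNotExistsSketch();
    long clientA = om.open("key1");
    long clientB = om.open("key1");
    System.out.println("B commit: " + om.commit(clientB, "from-B"));
    System.out.println("A commit: " + om.commit(clientA, "from-A"));
    System.out.println("stored: " + om.get("key1"));
  }
}
```

In this model Client B commits first, so Client A's commit is rejected and the stored value stays Client B's.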
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
ivandika3 commented on code in PR #9334: URL: https://github.com/apache/ozone/pull/9334#discussion_r2629162910 ## hadoop-hdds/docs/content/design/s3-conditional-requests.md: ## @@ -0,0 +1,194 @@ +--- +title: "S3 Conditional Requests" +summary: Design to support S3 conditional requests for atomic operations. +date: 2025-11-20 +jira: HDDS-13117 +status: draft +author: Chu Cheng Li +--- + + +# S3 Conditional Requests Design + +## Background + +AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. This includes: + +- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations +- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation +- **Conditional Copy** (CopyObject): Conditions on both source and destination objects + +### Current State + +- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration` +- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater) +- S3 gateway doesn't expose conditional headers to OM layer + +## Use Cases + +### Conditional Writes + +- **Atomic key rewrites**: Prevent race conditions when updating existing objects +- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`) +- **Optimistic locking**: Enable concurrent access with conflict detection +- **Leader election**: Implement distributed coordination using S3 as backing store + +### Conditional Reads + +- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified) +- **HTTP caching**: Support standard browser/CDN caching semantics +- **Conditional processing**: Only process objects that meet specific criteria + +### Conditional Copy + +- **Atomic copy operations**: Copy only if source/destination meets specific conditions +- **Prevent overwrite**: Copy only if destination doesn't exist + 
+## Specification + +### AWS S3 Conditional Write Specification + + If-None-Match Header + +``` +If-None-Match: "*" +``` + +- Succeeds only if object does NOT exist +- Returns `412 Precondition Failed` if object exists +- Primary use case: Create-only semantics + + If-Match Header + +``` +If-Match: "" +``` + +- Succeeds only if object EXISTS and ETag matches +- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches +- Primary use case: Atomic updates (compare-and-swap) + + Restrictions + +- Cannot use both headers together in same request +- No additional charges for failed conditional requests + +### AWS S3 Conditional Read Specification + +TODO + +### AWS S3 Conditional Copy Specification + +TODO + +## Implementation + +### AWS S3 Conditional Write Implementation + +The implementation aims to minimize Redundant RPCs (RTT) while ensuring strict atomicity for conditional operations. + +- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability ([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963 "null")). +- **If-Match** optimizes the happy path by pushing ETag validation directly into the Ozone Manager's write path, avoiding preliminary read operations. + + If-None-Match Implementation + +This implementation ensures strict create-only semantics by utilizing a specific generation ID marker. + +In `OzoneConsts.java`, add the `-1` as a constant for readability: +```java +/** + * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" semantics. + * When used with If-None-Match conditional requests, this ensures atomicity: + * if a concurrent write commits between Create and Commit phases, the commit + * fails the validation check, preserving strict create-if-not-exists semantics. + */ +public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L; +``` + +# S3 Gateway Layer + +1. Parse `If-None-Match: *`. +2. 
Set `existingKeyGeneration = OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`. +3. Call `RpcClient.rewriteKey()`. + +# OM Create Phase + +1. OM receives request with `expectedDataGeneration == OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`. +2. **Pre-check**: If key is already in the OpenKeyTable or KeyTable, throw `KEY_ALREADY_EXISTS`. +3. If not exists, proceed to create the open key entry. + +# OM Commit Phase (Atomicity) + +1. During the commit phase (or strict atomic create), the OM validates that the key still does not exist. +2. If a concurrent client created the key between the Create and Commit phases, the transaction fails with `KET_GENERATION_MISMATCH`. + +# Race Condition Handling + +Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures atomicity. If a concurrent write (Client B) commits between Client A's Create and Commit, +Client A's commit fails the `CREATE IF NOT EXISTS` validation check, preserving strict create-if-not-exists
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
ivandika3 commented on code in PR #9334: URL: https://github.com/apache/ozone/pull/9334#discussion_r2629162910 ## hadoop-hdds/docs/content/design/s3-conditional-requests.md: ## @@ -0,0 +1,194 @@ +--- +title: "S3 Conditional Requests" +summary: Design to support S3 conditional requests for atomic operations. +date: 2025-11-20 +jira: HDDS-13117 +status: draft +author: Chu Cheng Li +--- + + +# S3 Conditional Requests Design + +## Background + +AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. This includes: + +- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations +- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation +- **Conditional Copy** (CopyObject): Conditions on both source and destination objects + +### Current State + +- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration` +- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater) +- S3 gateway doesn't expose conditional headers to OM layer + +## Use Cases + +### Conditional Writes + +- **Atomic key rewrites**: Prevent race conditions when updating existing objects +- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`) +- **Optimistic locking**: Enable concurrent access with conflict detection +- **Leader election**: Implement distributed coordination using S3 as backing store + +### Conditional Reads + +- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified) +- **HTTP caching**: Support standard browser/CDN caching semantics +- **Conditional processing**: Only process objects that meet specific criteria + +### Conditional Copy + +- **Atomic copy operations**: Copy only if source/destination meets specific conditions +- **Prevent overwrite**: Copy only if destination doesn't exist + 
+## Specification + +### AWS S3 Conditional Write Specification + + If-None-Match Header + +``` +If-None-Match: "*" +``` + +- Succeeds only if object does NOT exist +- Returns `412 Precondition Failed` if object exists +- Primary use case: Create-only semantics + + If-Match Header + +``` +If-Match: "" +``` + +- Succeeds only if object EXISTS and ETag matches +- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches +- Primary use case: Atomic updates (compare-and-swap) + + Restrictions + +- Cannot use both headers together in same request +- No additional charges for failed conditional requests + +### AWS S3 Conditional Read Specification + +TODO + +### AWS S3 Conditional Copy Specification + +TODO + +## Implementation + +### AWS S3 Conditional Write Implementation + +The implementation aims to minimize Redundant RPCs (RTT) while ensuring strict atomicity for conditional operations. + +- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability ([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963 "null")). +- **If-Match** optimizes the happy path by pushing ETag validation directly into the Ozone Manager's write path, avoiding preliminary read operations. + + If-None-Match Implementation + +This implementation ensures strict create-only semantics by utilizing a specific generation ID marker. + +In `OzoneConsts.java`, add the `-1` as a constant for readability: +```java +/** + * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" semantics. + * When used with If-None-Match conditional requests, this ensures atomicity: + * if a concurrent write commits between Create and Commit phases, the commit + * fails the validation check, preserving strict create-if-not-exists semantics. + */ +public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L; +``` + +# S3 Gateway Layer + +1. Parse `If-None-Match: *`. +2. 
Set `existingKeyGeneration = OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`. +3. Call `RpcClient.rewriteKey()`. + +# OM Create Phase + +1. OM receives request with `expectedDataGeneration == OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`. +2. **Pre-check**: If key is already in the OpenKeyTable or KeyTable, throw `KEY_ALREADY_EXISTS`. +3. If not exists, proceed to create the open key entry. + +# OM Commit Phase (Atomicity) + +1. During the commit phase (or strict atomic create), the OM validates that the key still does not exist. +2. If a concurrent client created the key between the Create and Commit phases, the transaction fails with `KET_GENERATION_MISMATCH`. + +# Race Condition Handling + +Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures atomicity. If a concurrent write (Client B) commits between Client A's Create and Commit, +Client A's commit fails the `CREATE IF NOT EXISTS` validation check, preserving strict create-if-not-exists
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
ivandika3 commented on code in PR #9334: URL: https://github.com/apache/ozone/pull/9334#discussion_r2629162910 ## hadoop-hdds/docs/content/design/s3-conditional-requests.md: ## @@ -0,0 +1,194 @@ +--- +title: "S3 Conditional Requests" +summary: Design to support S3 conditional requests for atomic operations. +date: 2025-11-20 +jira: HDDS-13117 +status: draft +author: Chu Cheng Li +--- + + +# S3 Conditional Requests Design + +## Background + +AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. This includes: + +- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations +- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation +- **Conditional Copy** (CopyObject): Conditions on both source and destination objects + +### Current State + +- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration` +- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater) +- S3 gateway doesn't expose conditional headers to OM layer + +## Use Cases + +### Conditional Writes + +- **Atomic key rewrites**: Prevent race conditions when updating existing objects +- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`) +- **Optimistic locking**: Enable concurrent access with conflict detection +- **Leader election**: Implement distributed coordination using S3 as backing store + +### Conditional Reads + +- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified) +- **HTTP caching**: Support standard browser/CDN caching semantics +- **Conditional processing**: Only process objects that meet specific criteria + +### Conditional Copy + +- **Atomic copy operations**: Copy only if source/destination meets specific conditions +- **Prevent overwrite**: Copy only if destination doesn't exist + 
+## Specification + +### AWS S3 Conditional Write Specification + + If-None-Match Header + +``` +If-None-Match: "*" +``` + +- Succeeds only if object does NOT exist +- Returns `412 Precondition Failed` if object exists +- Primary use case: Create-only semantics + + If-Match Header + +``` +If-Match: "" +``` + +- Succeeds only if object EXISTS and ETag matches +- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches +- Primary use case: Atomic updates (compare-and-swap) + + Restrictions + +- Cannot use both headers together in the same request +- No additional charges for failed conditional requests + +### AWS S3 Conditional Read Specification + +TODO + +### AWS S3 Conditional Copy Specification + +TODO + +## Implementation + +### AWS S3 Conditional Write Implementation + +The implementation aims to minimize redundant RPCs (round trips) while ensuring strict atomicity for conditional operations. + +- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability ([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963)). +- **If-Match** optimizes the happy path by pushing ETag validation directly into the Ozone Manager's write path, avoiding preliminary read operations. + + If-None-Match Implementation + +This implementation ensures strict create-only semantics by utilizing a specific generation ID marker. + +In `OzoneConsts.java`, add `-1` as a named constant for readability: +```java +/** + * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" semantics. + * When used with If-None-Match conditional requests, this ensures atomicity: + * if a concurrent write commits between Create and Commit phases, the commit + * fails the validation check, preserving strict create-if-not-exists semantics. + */ +public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L; +``` + +# S3 Gateway Layer + +1. Parse `If-None-Match: *`. +2. 
Set `existingKeyGeneration = OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`. +3. Call `RpcClient.rewriteKey()`. + +# OM Create Phase + +1. OM receives a request with `expectedDataGeneration == OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`. +2. **Pre-check**: If key is already in the OpenKeyTable or KeyTable, throw `KEY_ALREADY_EXISTS`. +3. If the key does not exist, proceed to create the open key entry. + +# OM Commit Phase (Atomicity) + +1. During the commit phase (or strict atomic create), the OM validates that the key still does not exist. +2. If a concurrent client created the key between the Create and Commit phases, the transaction fails with `KEY_GENERATION_MISMATCH`. + +# Race Condition Handling + +Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures atomicity. If a concurrent write (Client B) commits between Client A's Create and Commit, +Client A's commit fails the `CREATE IF NOT EXISTS` validation check, preserving strict create-if-not-exists
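The Create and Commit phases quoted above can be sketched as a toy in-memory model. This is an illustration only, not Ozone Manager code: the class `CreateIfNotExistsSketch`, its two maps, the `Result` enum, and the use of a `null` expected generation to mean "unconditional write" are all assumptions made for the example. Only the constant name and its `-1` value come from the quoted design.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the create-if-not-exists flow; hypothetical, not Ozone code. */
public class CreateIfNotExistsSketch {
  // Mirrors the constant proposed in the quoted design doc.
  static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L;

  /** Hypothetical stand-in for OM result codes. */
  enum Result { OK, KEY_ALREADY_EXISTS, KEY_GENERATION_MISMATCH }

  // Hypothetical stand-in for the KeyTable: key name -> committed value.
  private final Map<String, String> keyTable = new HashMap<>();
  // Hypothetical stand-in for the OpenKeyTable:
  // key name -> (client id -> expected generation, null when unconditional).
  private final Map<String, Map<Long, Long>> openKeyTable = new HashMap<>();

  private static boolean isCreateOnly(Long expectedGeneration) {
    return expectedGeneration != null
        && expectedGeneration == EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS;
  }

  /** Create phase: pre-check, then record an open key entry. */
  synchronized Result createKey(String key, long clientId, Long expectedGeneration) {
    boolean exists = keyTable.containsKey(key) || openKeyTable.containsKey(key);
    if (isCreateOnly(expectedGeneration) && exists) {
      return Result.KEY_ALREADY_EXISTS;
    }
    openKeyTable.computeIfAbsent(key, k -> new HashMap<>())
        .put(clientId, expectedGeneration);
    return Result.OK;
  }

  /** Commit phase: re-validate that the key still does not exist. */
  synchronized Result commitKey(String key, long clientId, String value) {
    Map<Long, Long> sessions = openKeyTable.get(key);
    if (sessions == null || !sessions.containsKey(clientId)) {
      throw new IllegalStateException("no open key for client " + clientId);
    }
    Long expected = sessions.remove(clientId);
    if (sessions.isEmpty()) {
      openKeyTable.remove(key);
    }
    if (isCreateOnly(expected) && keyTable.containsKey(key)) {
      return Result.KEY_GENERATION_MISMATCH; // a concurrent writer won the race
    }
    keyTable.put(key, value);
    return Result.OK;
  }

  synchronized String read(String key) {
    return keyTable.get(key);
  }

  public static void main(String[] args) {
    CreateIfNotExistsSketch om = new CreateIfNotExistsSketch();
    // Client A (id 1) opens a create-only write; client B (id 2) writes unconditionally.
    System.out.println(om.createKey("k", 1L, EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS));
    System.out.println(om.createKey("k", 2L, null));
    System.out.println(om.commitKey("k", 2L, "from-B")); // B commits first
    System.out.println(om.commitKey("k", 1L, "from-A")); // A fails the commit check
    System.out.println(om.read("k"));                    // B's value survives
  }
}
```

In this model both phases run under one lock, which loosely stands in for the single applier thread mentioned in the quoted "Current State" list. The commit-time check is what closes the window between Create and Commit.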
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
sodonnel commented on PR #9334: URL: https://github.com/apache/ozone/pull/9334#issuecomment-3667108609 @errose28 and @kerneltime You guys had some interest in a fuller implementation of the "conditional write" api when I was doing the atomic rewrite change. Would be good to get your thoughts on this design doc too. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on code in PR #9334: URL: https://github.com/apache/ozone/pull/9334#discussion_r2627158712 ## hadoop-hdds/docs/content/design/s3-conditional-requests.md: ## @@ -0,0 +1,194 @@ +--- +title: "S3 Conditional Requests" +summary: Design to support S3 conditional requests for atomic operations. +date: 2025-11-20 +jira: HDDS-13117 +status: draft +author: Chu Cheng Li +--- + + +# S3 Conditional Requests Design + +## Background + +AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. This includes: + +- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations +- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation +- **Conditional Copy** (CopyObject): Conditions on both source and destination objects + +### Current State + +- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration` +- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater) +- S3 gateway doesn't expose conditional headers to OM layer + +## Use Cases + +### Conditional Writes + +- **Atomic key rewrites**: Prevent race conditions when updating existing objects +- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`) +- **Optimistic locking**: Enable concurrent access with conflict detection +- **Leader election**: Implement distributed coordination using S3 as backing store + +### Conditional Reads + +- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified) +- **HTTP caching**: Support standard browser/CDN caching semantics +- **Conditional processing**: Only process objects that meet specific criteria + +### Conditional Copy + +- **Atomic copy operations**: Copy only if source/destination meets specific conditions +- **Prevent overwrite**: Copy only if destination doesn't exist + 
+## Specification + +### AWS S3 Conditional Write Specification + + If-None-Match Header + +``` +If-None-Match: "*" +``` + +- Succeeds only if object does NOT exist +- Returns `412 Precondition Failed` if object exists +- Primary use case: Create-only semantics + + If-Match Header + +``` +If-Match: "" +``` + +- Succeeds only if object EXISTS and ETag matches +- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches +- Primary use case: Atomic updates (compare-and-swap) + + Restrictions + +- Cannot use both headers together in the same request +- No additional charges for failed conditional requests + +### AWS S3 Conditional Read Specification + +TODO + +### AWS S3 Conditional Copy Specification + +TODO + +## Implementation + +### AWS S3 Conditional Write Implementation + +The implementation aims to minimize redundant RPCs (round trips) while ensuring strict atomicity for conditional operations. + +- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability ([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963)). +- **If-Match** optimizes the happy path by pushing ETag validation directly into the Ozone Manager's write path, avoiding preliminary read operations. + + If-None-Match Implementation + +This implementation ensures strict create-only semantics by utilizing a specific generation ID marker. + +In `OzoneConsts.java`, add `-1` as a named constant for readability: +```java +/** + * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" semantics. + * When used with If-None-Match conditional requests, this ensures atomicity: + * if a concurrent write commits between Create and Commit phases, the commit + * fails the validation check, preserving strict create-if-not-exists semantics. + */ +public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L; +``` + +# S3 Gateway Layer + +1. Parse `If-None-Match: *`. +2. 
Set `existingKeyGeneration = OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`. +3. Call `RpcClient.rewriteKey()`. + +# OM Create Phase + +1. OM receives a request with `expectedDataGeneration == OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`. +2. **Pre-check**: If key is already in the OpenKeyTable or KeyTable, throw `KEY_ALREADY_EXISTS`. +3. If the key does not exist, proceed to create the open key entry. + +# OM Commit Phase (Atomicity) + +1. During the commit phase (or strict atomic create), the OM validates that the key still does not exist. +2. If a concurrent client created the key between the Create and Commit phases, the transaction fails with `KEY_GENERATION_MISMATCH`. + +# Race Condition Handling + +Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures atomicity. If a concurrent write (Client B) commits between Client A's Create and Commit, +Client A's commit fails the `CREATE IF NOT EXISTS` validation check, preserving strict create-if-not-exists
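The header semantics listed in the quoted specification section (create-only for `If-None-Match: *`, compare-and-swap for `If-Match`, and no mixing of the two) can be written down as a small decision function. This is a sketch of the semantics only. It deliberately takes the current ETag as an input, whereas the quoted design pushes that validation into the OM write path to avoid a preliminary read. The class `ConditionalWriteCheck`, the `Outcome` enum, and the choice to report a request carrying both headers as `INVALID_REQUEST` are assumptions for illustration; the quoted text does not say which status code that case maps to.

```java
/** Hypothetical helper illustrating conditional-write semantics; not s3gateway code. */
public class ConditionalWriteCheck {

  /** PRECONDITION_FAILED corresponds to HTTP 412 in the quoted specification. */
  enum Outcome { PROCEED, PRECONDITION_FAILED, INVALID_REQUEST }

  /**
   * @param ifNoneMatch raw If-None-Match header value, or null if absent
   * @param ifMatch     raw If-Match header value, or null if absent
   * @param currentETag ETag of the existing object, or null if it does not exist
   */
  static Outcome evaluate(String ifNoneMatch, String ifMatch, String currentETag) {
    if (ifNoneMatch != null && ifMatch != null) {
      return Outcome.INVALID_REQUEST; // the two headers cannot be combined
    }
    if (ifNoneMatch != null) {
      // Assumption: only the wildcard form is meaningful for writes.
      if (!"*".equals(stripQuotes(ifNoneMatch))) {
        return Outcome.INVALID_REQUEST;
      }
      return currentETag == null ? Outcome.PROCEED : Outcome.PRECONDITION_FAILED;
    }
    if (ifMatch != null) {
      boolean matches = currentETag != null
          && stripQuotes(ifMatch).equals(stripQuotes(currentETag));
      return matches ? Outcome.PROCEED : Outcome.PRECONDITION_FAILED;
    }
    return Outcome.PROCEED; // unconditional write
  }

  /** Removes one pair of surrounding double quotes, if present. */
  static String stripQuotes(String value) {
    String trimmed = value.trim();
    if (trimmed.length() >= 2 && trimmed.startsWith("\"") && trimmed.endsWith("\"")) {
      return trimmed.substring(1, trimmed.length() - 1);
    }
    return trimmed;
  }

  public static void main(String[] args) {
    System.out.println(evaluate("*", null, null));              // create-only, key absent
    System.out.println(evaluate("*", null, "\"abc\""));         // create-only, key present
    System.out.println(evaluate(null, "\"abc\"", "\"abc\""));   // ETag matches
    System.out.println(evaluate(null, "\"abc\"", null));        // key absent
    System.out.println(evaluate("*", "\"abc\"", "\"abc\""));    // both headers set
  }
}
```

A check like this done at the gateway is not atomic on its own, since another writer can change the object between the read and the write. That gap is the reason the quoted design validates inside the OM transaction instead.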
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on code in PR #9334: URL: https://github.com/apache/ozone/pull/9334#discussion_r2627142467 ## hadoop-hdds/docs/content/design/s3-conditional-requests.md: ## @@ -0,0 +1,194 @@ +--- +title: "S3 Conditional Requests" +summary: Design to support S3 conditional requests for atomic operations. +date: 2025-11-20 +jira: HDDS-13117 +status: draft +author: Chu Cheng Li +--- + + +# S3 Conditional Requests Design + +## Background + +AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. This includes: + +- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations +- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation +- **Conditional Copy** (CopyObject): Conditions on both source and destination objects + +### Current State + +- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration` +- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater) +- S3 gateway doesn't expose conditional headers to OM layer + +## Use Cases + +### Conditional Writes + +- **Atomic key rewrites**: Prevent race conditions when updating existing objects +- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`) +- **Optimistic locking**: Enable concurrent access with conflict detection +- **Leader election**: Implement distributed coordination using S3 as backing store + +### Conditional Reads + +- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified) +- **HTTP caching**: Support standard browser/CDN caching semantics +- **Conditional processing**: Only process objects that meet specific criteria + +### Conditional Copy + +- **Atomic copy operations**: Copy only if source/destination meets specific conditions +- **Prevent overwrite**: Copy only if destination doesn't exist + 
+## Specification + +### AWS S3 Conditional Write Specification + + If-None-Match Header + +``` +If-None-Match: "*" +``` + +- Succeeds only if object does NOT exist +- Returns `412 Precondition Failed` if object exists +- Primary use case: Create-only semantics + + If-Match Header + +``` +If-Match: "" +``` + +- Succeeds only if object EXISTS and ETag matches +- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches +- Primary use case: Atomic updates (compare-and-swap) + + Restrictions + +- Cannot use both headers together in the same request +- No additional charges for failed conditional requests + +### AWS S3 Conditional Read Specification + +TODO + +### AWS S3 Conditional Copy Specification + +TODO + +## Implementation + +### AWS S3 Conditional Write Implementation + +The implementation aims to minimize redundant RPCs (round trips) while ensuring strict atomicity for conditional operations. + +- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability ([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963)). +- **If-Match** optimizes the happy path by pushing ETag validation directly into the Ozone Manager's write path, avoiding preliminary read operations. + + If-None-Match Implementation + +This implementation ensures strict create-only semantics by utilizing a specific generation ID marker. + +In `OzoneConsts.java`, add `-1` as a named constant for readability: +```java +/** + * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" semantics. + * When used with If-None-Match conditional requests, this ensures atomicity: + * if a concurrent write commits between Create and Commit phases, the commit + * fails the validation check, preserving strict create-if-not-exists semantics. + */ +public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L; +``` + +# S3 Gateway Layer + +1. Parse `If-None-Match: *`. +2. 
Set `existingKeyGeneration = OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`. +3. Call `RpcClient.rewriteKey()`. + +# OM Create Phase + +1. OM receives a request with `expectedDataGeneration == OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`. +2. **Pre-check**: If key is already in the OpenKeyTable or KeyTable, throw `KEY_ALREADY_EXISTS`. +3. If the key does not exist, proceed to create the open key entry. + +# OM Commit Phase (Atomicity) + +1. During the commit phase (or strict atomic create), the OM validates that the key still does not exist. +2. If a concurrent client created the key between the Create and Commit phases, the transaction fails with `KEY_GENERATION_MISMATCH`. + +# Race Condition Handling + +Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures atomicity. If a concurrent write (Client B) commits between Client A's Create and Commit, +Client A's commit fails the `CREATE IF NOT EXISTS` validation check, preserving strict create-if-not-exists
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on code in PR #9334: URL: https://github.com/apache/ozone/pull/9334#discussion_r2627135904 ## hadoop-hdds/docs/content/design/s3-conditional-requests.md: ## @@ -0,0 +1,194 @@ +--- +title: "S3 Conditional Requests" +summary: Design to support S3 conditional requests for atomic operations. +date: 2025-11-20 +jira: HDDS-13117 +status: draft +author: Chu Cheng Li +--- + + +# S3 Conditional Requests Design + +## Background + +AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. This includes: + +- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations +- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation +- **Conditional Copy** (CopyObject): Conditions on both source and destination objects + +### Current State + +- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration` +- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater) +- S3 gateway doesn't expose conditional headers to OM layer + +## Use Cases + +### Conditional Writes + +- **Atomic key rewrites**: Prevent race conditions when updating existing objects +- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`) +- **Optimistic locking**: Enable concurrent access with conflict detection +- **Leader election**: Implement distributed coordination using S3 as backing store + +### Conditional Reads + +- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified) +- **HTTP caching**: Support standard browser/CDN caching semantics +- **Conditional processing**: Only process objects that meet specific criteria + +### Conditional Copy + +- **Atomic copy operations**: Copy only if source/destination meets specific conditions +- **Prevent overwrite**: Copy only if destination doesn't exist + 
+## Specification + +### AWS S3 Conditional Write Specification + + If-None-Match Header + +``` +If-None-Match: "*" +``` + +- Succeeds only if object does NOT exist +- Returns `412 Precondition Failed` if object exists +- Primary use case: Create-only semantics + + If-Match Header + +``` +If-Match: "" +``` + +- Succeeds only if object EXISTS and ETag matches +- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches +- Primary use case: Atomic updates (compare-and-swap) + + Restrictions + +- Cannot use both headers together in the same request +- No additional charges for failed conditional requests + +### AWS S3 Conditional Read Specification + +TODO + +### AWS S3 Conditional Copy Specification + +TODO + +## Implementation + +### AWS S3 Conditional Write Implementation + +The implementation aims to minimize redundant RPCs (round trips) while ensuring strict atomicity for conditional operations. + +- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability ([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963)). +- **If-Match** optimizes the happy path by pushing ETag validation directly into the Ozone Manager's write path, avoiding preliminary read operations. + + If-None-Match Implementation + +This implementation ensures strict create-only semantics by utilizing a specific generation ID marker. + +In `OzoneConsts.java`, add `-1` as a named constant for readability: +```java +/** + * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" semantics. + * When used with If-None-Match conditional requests, this ensures atomicity: + * if a concurrent write commits between Create and Commit phases, the commit + * fails the validation check, preserving strict create-if-not-exists semantics. + */ +public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L; +``` + +# S3 Gateway Layer + +1. Parse `If-None-Match: *`. +2. 
Set `existingKeyGeneration = OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`. +3. Call `RpcClient.rewriteKey()`. + +# OM Create Phase + +1. OM receives a request with `expectedDataGeneration == OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`. +2. **Pre-check**: If key is already in the OpenKeyTable or KeyTable, throw `KEY_ALREADY_EXISTS`. +3. If the key does not exist, proceed to create the open key entry. + +# OM Commit Phase (Atomicity) + +1. During the commit phase (or strict atomic create), the OM validates that the key still does not exist. +2. If a concurrent client created the key between the Create and Commit phases, the transaction fails with `KEY_GENERATION_MISMATCH`. + +# Race Condition Handling + +Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures atomicity. If a concurrent write (Client B) commits between Client A's Create and Commit, +Client A's commit fails the `CREATE IF NOT EXISTS` validation check, preserving strict create-if-not-exists
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on code in PR #9334: URL: https://github.com/apache/ozone/pull/9334#discussion_r2627120613 ## hadoop-hdds/docs/content/design/s3-conditional-requests.md: ## @@ -0,0 +1,194 @@ +--- +title: "S3 Conditional Requests" +summary: Design to support S3 conditional requests for atomic operations. +date: 2025-11-20 +jira: HDDS-13117 +status: draft +author: Chu Cheng Li +--- + + +# S3 Conditional Requests Design + +## Background + +AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. This includes: + +- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations +- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation +- **Conditional Copy** (CopyObject): Conditions on both source and destination objects + +### Current State + +- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration` +- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater) +- S3 gateway doesn't expose conditional headers to OM layer + +## Use Cases + +### Conditional Writes + +- **Atomic key rewrites**: Prevent race conditions when updating existing objects +- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`) +- **Optimistic locking**: Enable concurrent access with conflict detection +- **Leader election**: Implement distributed coordination using S3 as backing store + +### Conditional Reads + +- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified) +- **HTTP caching**: Support standard browser/CDN caching semantics +- **Conditional processing**: Only process objects that meet specific criteria + +### Conditional Copy + +- **Atomic copy operations**: Copy only if source/destination meets specific conditions +- **Prevent overwrite**: Copy only if destination doesn't exist + 
+## Specification + +### AWS S3 Conditional Write Specification + + If-None-Match Header + +``` +If-None-Match: "*" +``` + +- Succeeds only if object does NOT exist +- Returns `412 Precondition Failed` if object exists +- Primary use case: Create-only semantics + + If-Match Header + +``` +If-Match: "" +``` + +- Succeeds only if object EXISTS and ETag matches +- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches +- Primary use case: Atomic updates (compare-and-swap) + + Restrictions + +- Cannot use both headers together in the same request +- No additional charges for failed conditional requests + +### AWS S3 Conditional Read Specification + +TODO + +### AWS S3 Conditional Copy Specification + +TODO + +## Implementation + +### AWS S3 Conditional Write Implementation + +The implementation aims to minimize redundant RPCs (round trips) while ensuring strict atomicity for conditional operations. + +- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability ([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963)). +- **If-Match** optimizes the happy path by pushing ETag validation directly into the Ozone Manager's write path, avoiding preliminary read operations. + + If-None-Match Implementation + +This implementation ensures strict create-only semantics by utilizing a specific generation ID marker. + +In `OzoneConsts.java`, add `-1` as a named constant for readability: +```java +/** + * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" semantics. + * When used with If-None-Match conditional requests, this ensures atomicity: + * if a concurrent write commits between Create and Commit phases, the commit + * fails the validation check, preserving strict create-if-not-exists semantics. + */ +public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L; +``` + +# S3 Gateway Layer + +1. Parse `If-None-Match: *`. +2. 
Set `existingKeyGeneration = OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`. +3. Call `RpcClient.rewriteKey()`. + +# OM Create Phase + +1. OM receives a request with `expectedDataGeneration == OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`. +2. **Pre-check**: If key is already in the OpenKeyTable or KeyTable, throw `KEY_ALREADY_EXISTS`. +3. If the key does not exist, proceed to create the open key entry. + +# OM Commit Phase (Atomicity) + +1. During the commit phase (or strict atomic create), the OM validates that the key still does not exist. +2. If a concurrent client created the key between the Create and Commit phases, the transaction fails with `KEY_GENERATION_MISMATCH`. + +# Race Condition Handling + +Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures atomicity. If a concurrent write (Client B) commits between Client A's Create and Commit, +Client A's commit fails the `CREATE IF NOT EXISTS` validation check, preserving strict create-if-not-exists
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2627120613
##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. This includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+
+- **Atomic key rewrites**: Prevent race conditions when updating existing objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing store
+
+### Conditional Reads
+
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+
+- **Atomic copy operations**: Copy only if source/destination meets specific conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## Specification
+
+### AWS S3 Conditional Write Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: "*"
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: ""
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### AWS S3 Conditional Read Specification
+
+TODO
+
+### AWS S3 Conditional Copy Specification
+
+TODO
+
+## Implementation
+
+### AWS S3 Conditional Write Implementation
+
+The implementation aims to minimize redundant RPCs (RTT) while ensuring strict atomicity for conditional operations.
+
+- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability ([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963)).
+- **If-Match** optimizes the happy path by pushing ETag validation directly into the Ozone Manager's write path, avoiding preliminary read operations.
+
+#### If-None-Match Implementation
+
+This implementation ensures strict create-only semantics by utilizing a specific generation ID marker.
+
+In `OzoneConsts.java`, add the `-1` as a constant for readability:
+```java
+/**
+ * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" semantics.
+ * When used with If-None-Match conditional requests, this ensures atomicity:
+ * if a concurrent write commits between Create and Commit phases, the commit
+ * fails the validation check, preserving strict create-if-not-exists semantics.
+ */
+public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L;
+```
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. OM receives request with `expectedDataGeneration == OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+2. **Pre-check**: If key is already in the OpenKeyTable or KeyTable, throw `KEY_ALREADY_EXISTS`.
+3. If not exists, proceed to create the open key entry.
+
+##### OM Commit Phase (Atomicity)
+
+1. During the commit phase (or strict atomic create), the OM validates that the key still does not exist.
+2. If a concurrent client created the key between the Create and Commit phases, the transaction fails with `KEY_GENERATION_MISMATCH`.
+
+##### Race Condition Handling
+
+Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures atomicity. If a concurrent write (Client B) commits between Client A's Create and Commit,
+Client A's commit fails the `CREATE IF NOT EXISTS` validation check, preserving strict create-if-not-exists
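
The commit-time check described in the quoted design can be sketched with an in-memory stand-in. This is only an illustration under stated assumptions: `CreateIfNotExistsSketch`, its `commit` method, and the `HashMap` used as a key table are hypothetical and are not Ozone classes; only the `-1` sentinel value is taken from the quoted doc. The real OM applies transactions through Ratis, which a `synchronized` method merely approximates here.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the OM commit path; not Ozone code. */
final class CreateIfNotExistsSketch {
  // Same value as the constant proposed in the quoted design doc.
  static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L;

  // Maps a key name to its current generation (assumed stand-in for the update ID).
  private final Map<String, Long> keyTable = new HashMap<>();
  private long nextGeneration = 1L;

  /**
   * Applies a commit and returns true, or returns false when the precondition fails.
   * A null expectedGeneration means an unconditional write.
   */
  synchronized boolean commit(String key, Long expectedGeneration) {
    Long existing = keyTable.get(key);
    if (expectedGeneration != null) {
      if (expectedGeneration == EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS) {
        // Create-if-not-exists: any committed key, including one written
        // concurrently after the create phase, fails the check.
        if (existing != null) {
          return false;
        }
      } else if (existing == null || !existing.equals(expectedGeneration)) {
        // Atomic rewrite: the key must exist with the expected generation.
        return false;
      }
    }
    keyTable.put(key, nextGeneration++);
    return true;
  }
}
```

In this sketch, two clients that both pass the sentinel for the same key cannot both succeed: whichever commit is applied second sees the existing entry and is rejected, which the gateway would presumably surface as `412 Precondition Failed`.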
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
sodonnel commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2626769568
##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
sodonnel commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2626759201
##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2625676245
##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2621708465
##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
ivandika3 commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2621425881
##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on PR #9334:
URL: https://github.com/apache/ozone/pull/9334#issuecomment-3650374609

> Thanks for iterating for this, the direction is good. Should we separate the design docs and the actual implementations?

Sounds good. I'll revert the code changes first, so that we can merge this design doc on its own.

> Let's be more permissive and not fail the precondition if ETag metadata does not exist

Agreed. I will update the doc accordingly.
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
ivandika3 commented on code in PR #9334: URL: https://github.com/apache/ozone/pull/9334#discussion_r2616667929 ## hadoop-hdds/docs/content/design/s3-conditional-requests.md: ## @@ -0,0 +1,180 @@ +--- +title: "S3 Conditional Requests" +summary: Design to support S3 conditional requests for atomic operations. +date: 2025-11-20 +jira: HDDS-13117 +status: draft +author: Chu Cheng Li +--- + +# S3 Conditional Requests Design + +## Background + +AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. This includes: + +- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations +- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation +- **Conditional Copy** (CopyObject): Conditions on both source and destination objects + +### Current State + +- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration` +- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater) +- S3 gateway doesn't expose conditional headers to OM layer + +## Use Cases + +### Conditional Writes + +- **Atomic key rewrites**: Prevent race conditions when updating existing objects +- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`) +- **Optimistic locking**: Enable concurrent access with conflict detection +- **Leader election**: Implement distributed coordination using S3 as backing store + +### Conditional Reads + +- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified) +- **HTTP caching**: Support standard browser/CDN caching semantics +- **Conditional processing**: Only process objects that meet specific criteria + +### Conditional Copy + +- **Atomic copy operations**: Copy only if source/destination meets specific conditions +- **Prevent overwrite**: Copy only if destination doesn't exist + +## 
Specification + +### AWS S3 Conditional Write Specification + + If-None-Match Header + +``` +If-None-Match: "*" +``` + +- Succeeds only if object does NOT exist +- Returns `412 Precondition Failed` if object exists +- Primary use case: Create-only semantics + + If-Match Header + +``` +If-Match: "" +``` + +- Succeeds only if object EXISTS and ETag matches +- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches +- Primary use case: Atomic updates (compare-and-swap) + + Restrictions + +- Cannot use both headers together in same request +- No additional charges for failed conditional requests + +### AWS S3 Conditional Read Specification + +TODO + +### AWS S3 Conditional Copy Specification + +TODO + +## Implementation + +### AWS S3 Conditional Write Implementation + +The implementation aims to minimize Redundant RPCs (RTT) while ensuring strict atomicity for conditional operations. + +- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability ([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963 "null")). +- **If-Match** optimizes the happy path by pushing ETag validation directly into the Ozone Manager's write path, avoiding preliminary read operations. + + If-None-Match Implementation + +This implementation ensures strict create-only semantics by utilizing a specific generation ID marker. + +In `OzoneConsts.java`, add the `-1` as a constant for readability: +```java +/** + * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" semantics. + * When used with If-None-Match conditional requests, this ensures atomicity: + * if a concurrent write commits between Create and Commit phases, the commit + * fails the validation check, preserving strict create-if-not-exists semantics. + */ +public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L; +``` + +# S3 Gateway Layer + +1. Parse `If-None-Match: *`. +2. 
Set `existingKeyGeneration = OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`. +3. Call `RpcClient.rewriteKey()`. + +# OM Create Phase + +1. OM receives request with `expectedDataGeneration == OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`. +2. **Pre-check**: If key is already in the OpenKeyTable or KeyTable, throw `KEY_ALREADY_EXISTS`. +3. If not exists, proceed to create the open key entry. + +# OM Commit Phase (Atomicity) + +1. During the commit phase (or strict atomic create), the OM validates that the key still does not exist. +2. If a concurrent client created the key between the Create and Commit phases, the transaction fails with `KEY_ALREADY_EXISTS`. + +# Race Condition Handling + +Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures atomicity. If a concurrent write (Client B) commits between Client A's Create and Commit, +Client A's commit fails the `CREATE IF NOT EXISTS` validation check, preserving strict create-if-not-exists semanti
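The Create/Commit race described in the quoted design can be sketched as follows. This is a minimal in-memory model, not Ozone code: the class `CreateIfNotExistsSketch`, its `open`/`commit` methods, the two maps, and the use of `IllegalStateException` messages in place of OM result codes are all assumptions made for illustration. For brevity the create-phase pre-check here looks only at committed keys, whereas the design also mentions the OpenKeyTable.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the OM key tables; not Ozone code. */
class CreateIfNotExistsSketch {
  /** Sentinel from the design doc: "the key must not exist". */
  static final long CREATE_IF_NOT_EXISTS = -1L;

  private final Map<String, Long> keyTable = new HashMap<>();     // committed key -> generation
  private final Map<String, Long> openKeyTable = new HashMap<>(); // open key id -> expected generation
  private long nextGeneration = 1L;

  /** Create phase: fail fast if the key is already committed, else record the open key. */
  synchronized String open(String key, String clientId, long expectedGeneration) {
    if (expectedGeneration == CREATE_IF_NOT_EXISTS && keyTable.containsKey(key)) {
      throw new IllegalStateException("KEY_ALREADY_EXISTS");
    }
    String openKeyId = key + "/" + clientId;
    openKeyTable.put(openKeyId, expectedGeneration);
    return openKeyId;
  }

  /** Commit phase: re-check, since another writer may have committed since open(). */
  synchronized long commit(String key, String openKeyId) {
    Long expected = openKeyTable.remove(openKeyId);
    if (expected == null) {
      throw new IllegalStateException("KEY_NOT_FOUND");
    }
    if (expected == CREATE_IF_NOT_EXISTS && keyTable.containsKey(key)) {
      throw new IllegalStateException("KEY_ALREADY_EXISTS");
    }
    long generation = nextGeneration++;
    keyTable.put(key, generation);
    return generation;
  }

  /** Runs the Client A / Client B race; returns the error seen by the losing commit. */
  static String raceDemo() {
    CreateIfNotExistsSketch om = new CreateIfNotExistsSketch();
    String a = om.open("bucket/key", "clientA", CREATE_IF_NOT_EXISTS);
    String b = om.open("bucket/key", "clientB", CREATE_IF_NOT_EXISTS);
    om.commit("bucket/key", b);   // Client B commits first
    try {
      om.commit("bucket/key", a); // Client A's commit must now be rejected
      return "COMMITTED";
    } catch (IllegalStateException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(raceDemo());
  }
}
```

Both clients pass the create-phase check because nothing is committed yet; only the commit-phase re-check catches the conflict, which is why the sentinel has to be stored with the open key.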
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on PR #9334:
URL: https://github.com/apache/ozone/pull/9334#issuecomment-3592409850

@ivandika3 @chungen0126 I've refined the design; please take another look.

---

Regarding the TODO: I plan to evolve the design and code together across patches:

1) Initial patch: introduce the design, fully detail "conditional write," and outline high-level approaches for get/copy.
2) Conditional get: complete the remaining design details for conditional get and include the corresponding code changes.
3) Conditional copy: complete the remaining design details for conditional copy and include the corresponding code changes.

Let me know if this workflow sounds feasible.
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
chungen0126 commented on code in PR #9334: URL: https://github.com/apache/ozone/pull/9334#discussion_r2564347509 ## hadoop-hdds/docs/content/design/s3-conditional-requests.md: ## @@ -0,0 +1,149 @@ +--- +title: "S3 Conditional Requests" +summary: Design to support S3 conditional requests for atomic operations. +date: 2025-11-20 +jira: HDDS-13117 +status: draft +author: Chu Cheng Li +--- + +# S3 Conditional Requests Design + +## Background + +AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. This includes: + +- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations +- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation +- **Conditional Copy** (CopyObject): Conditions on both source and destination objects + +### Current State + +- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration` +- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater) +- S3 gateway doesn't expose conditional headers to OM layer + +## Use Cases + +### Conditional Writes +- **Atomic key rewrites**: Prevent race conditions when updating existing objects +- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`) +- **Optimistic locking**: Enable concurrent access with conflict detection +- **Leader election**: Implement distributed coordination using S3 as backing store + +### Conditional Reads +- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified) +- **HTTP caching**: Support standard browser/CDN caching semantics +- **Conditional processing**: Only process objects that meet specific criteria + +### Conditional Copy +- **Atomic copy operations**: Copy only if source/destination meets specific conditions +- **Prevent overwrite**: Copy only if destination doesn't exist + +## AWS 
S3 Conditional Write + +### Specification + + If-None-Match Header + +``` +If-None-Match: "*" +``` + +- Succeeds only if object does NOT exist +- Returns `412 Precondition Failed` if object exists +- Primary use case: Create-only semantics + + If-Match Header + +``` +If-Match: "" +``` + +- Succeeds only if object EXISTS and ETag matches +- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches +- Primary use case: Atomic updates (compare-and-swap) + + Restrictions + +- Cannot use both headers together in same request +- No additional charges for failed conditional requests + +### Implementation + + Architecture Overview + + If-None-Match Implementation + +# S3 Gateway Layer + +1. Parse `If-None-Match: *`. +2. Set `existingKeyGeneration = -1`. +3. Call `RpcClient.rewriteKey()`. + +# OM Create Phase + +1. Validate `expectedDataGeneration == -1`. +2. If key exists → throw `KEY_ALREADY_EXISTS`. +3. Store `-1` in open key metadata. + +# OM Commit Phase + +1. Check `expectedDataGeneration == -1` from open key. +2. If key now exists (race condition) → throw `KEY_ALREADY_EXISTS`. +3. Commit key. + +# Race Condition Handling + +Using `-1` ensures atomicity. If a concurrent write (Client B) commits between Client A's Create and Commit, Client A's commit fails the `-1` validation check (key now exists), preserving strict create-if-not-exists semantics. + + If-Match Implementation + +Leverages existing `expectedDataGeneration` from HDDS-10656: + +# S3 Gateway Layer + +1. Parse `If-Match: ""` header +2. Look up existing key via `getS3KeyDetails()` +3. Validate ETag matches, else throw `PRECOND_FAILED` (412) +4. Extract `expectedGeneration` from existing key +5. Pass `expectedGeneration` to RpcClient Review Comment: > preExecute can be called in parallel (in multiple OM handler threads), so we should instead verify the ETag in validateAndUpdateCache instead to ensure atomicity (i.e. 
if there are two identical "If-Match" requests with the same ETag, only one will succeed). Note that permission check was put to preExecute for performance reasons and the community discussed that consistency tradeoff is acceptable.

I see. Initially, I added the pre-check to `preExecute` to handle S3 `412 Precondition Failed` scenarios explicitly. However, I realized that this leads to redundant read operations. I agree with your suggestion to consolidate all the verification logic into `validateAndUpdateCache()`.
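The atomicity argument above (two identical `If-Match` requests, only one may succeed) can be illustrated with a small model. This is a sketch under stated assumptions: a single-threaded `ExecutorService` stands in for the single applier thread, and the class `IfMatchApplierSketch` and its methods are hypothetical names, not the Ozone or Ratis API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical model of a single applier thread; not the Ozone or Ratis API. */
class IfMatchApplierSketch {
  private final Map<String, String> etags = new HashMap<>(); // key -> current ETag
  private final ExecutorService applier = Executors.newSingleThreadExecutor();

  IfMatchApplierSketch(String key, String initialETag) {
    etags.put(key, initialETag);
  }

  /** Check and update run as one task on the applier thread, so they cannot interleave. */
  Future<Boolean> putIfMatch(String key, String expectedETag, String newETag) {
    return applier.submit(() -> {
      if (!expectedETag.equals(etags.get(key))) {
        return false; // would surface to the S3 client as 412 Precondition Failed
      }
      etags.put(key, newETag);
      return true;
    });
  }

  /** Two requests carry the same ETag; returns how many of them succeeded. */
  static int twoIdenticalRequests() {
    IfMatchApplierSketch om = new IfMatchApplierSketch("bucket/key", "etag-v1");
    try {
      Future<Boolean> first = om.putIfMatch("bucket/key", "etag-v1", "etag-v2");
      Future<Boolean> second = om.putIfMatch("bucket/key", "etag-v1", "etag-v3");
      return (first.get() ? 1 : 0) + (second.get() ? 1 : 0);
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      om.applier.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println(twoIdenticalRequests());
  }
}
```

Because the comparison and the update happen inside the same serialized task, the second request always sees the ETag written by the first and is rejected. A check performed earlier on parallel handler threads would let both requests observe `etag-v1`.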
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
ivandika3 commented on code in PR #9334: URL: https://github.com/apache/ozone/pull/9334#discussion_r2564207037 ## hadoop-hdds/docs/content/design/s3-conditional-requests.md: ## @@ -0,0 +1,149 @@ +--- +title: "S3 Conditional Requests" +summary: Design to support S3 conditional requests for atomic operations. +date: 2025-11-20 +jira: HDDS-13117 +status: draft +author: Chu Cheng Li +--- + +# S3 Conditional Requests Design + +## Background + +AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. This includes: + +- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations +- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation +- **Conditional Copy** (CopyObject): Conditions on both source and destination objects + +### Current State + +- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration` +- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater) +- S3 gateway doesn't expose conditional headers to OM layer + +## Use Cases + +### Conditional Writes +- **Atomic key rewrites**: Prevent race conditions when updating existing objects +- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`) +- **Optimistic locking**: Enable concurrent access with conflict detection +- **Leader election**: Implement distributed coordination using S3 as backing store + +### Conditional Reads +- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified) +- **HTTP caching**: Support standard browser/CDN caching semantics +- **Conditional processing**: Only process objects that meet specific criteria + +### Conditional Copy +- **Atomic copy operations**: Copy only if source/destination meets specific conditions +- **Prevent overwrite**: Copy only if destination doesn't exist + +## AWS S3 
Conditional Write + +### Specification + + If-None-Match Header + +``` +If-None-Match: "*" +``` + +- Succeeds only if object does NOT exist +- Returns `412 Precondition Failed` if object exists +- Primary use case: Create-only semantics + + If-Match Header + +``` +If-Match: "" +``` + +- Succeeds only if object EXISTS and ETag matches +- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches +- Primary use case: Atomic updates (compare-and-swap) + + Restrictions + +- Cannot use both headers together in same request +- No additional charges for failed conditional requests + +### Implementation + + Architecture Overview + + If-None-Match Implementation + +# S3 Gateway Layer + +1. Parse `If-None-Match: *`. +2. Set `existingKeyGeneration = -1`. +3. Call `RpcClient.rewriteKey()`. + +# OM Create Phase + +1. Validate `expectedDataGeneration == -1`. +2. If key exists → throw `KEY_ALREADY_EXISTS`. +3. Store `-1` in open key metadata. + +# OM Commit Phase + +1. Check `expectedDataGeneration == -1` from open key. +2. If key now exists (race condition) → throw `KEY_ALREADY_EXISTS`. +3. Commit key. + +# Race Condition Handling + +Using `-1` ensures atomicity. If a concurrent write (Client B) commits between Client A's Create and Commit, Client A's commit fails the `-1` validation check (key now exists), preserving strict create-if-not-exists semantics. + + If-Match Implementation + +Leverages existing `expectedDataGeneration` from HDDS-10656: + +# S3 Gateway Layer + +1. Parse `If-Match: ""` header +2. Look up existing key via `getS3KeyDetails()` +3. Validate ETag matches, else throw `PRECOND_FAILED` (412) +4. Extract `expectedGeneration` from existing key +5. Pass `expectedGeneration` to RpcClient Review Comment: > Note that verifying ETag during the preexecute phase does not increase the overhead of writing to the Raft log, so we don't need to worry about that. 
`preExecute` can be called in parallel (in multiple OM handler threads), so we should verify the ETag in `validateAndUpdateCache` instead to ensure atomicity (i.e. if there are two identical "If-Match" requests with the same ETag, only one will succeed). Note that the permission check was put in `preExecute` for performance reasons, and the community discussed that the consistency tradeoff is acceptable.

> agree that we need to reduce the RTT for If-Match request, my original thinking is that I want to avoid the "concepts of S3" appear in Ozone Manager, but seems there are already a lots of them, I think it's ok to do so, plus the performance would be better.

Yes, we already have multipart uploads and `OmKeyInfo.tags`, which are used only for the S3 use case.

> So If-Match request dont need the atomic key rewrite anymore. But how about we keep the if-none-match request to use the atomic with extended "CREATE IF NOT EXIST" capability, which will be added in https://github.
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
ivandika3 commented on code in PR #9334: URL: https://github.com/apache/ozone/pull/9334#discussion_r2564207037 ## hadoop-hdds/docs/content/design/s3-conditional-requests.md: ## @@ -0,0 +1,149 @@ +--- +title: "S3 Conditional Requests" +summary: Design to support S3 conditional requests for atomic operations. +date: 2025-11-20 +jira: HDDS-13117 +status: draft +author: Chu Cheng Li +--- + +# S3 Conditional Requests Design + +## Background + +AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. This includes: + +- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations +- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation +- **Conditional Copy** (CopyObject): Conditions on both source and destination objects + +### Current State + +- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration` +- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater) +- S3 gateway doesn't expose conditional headers to OM layer + +## Use Cases + +### Conditional Writes +- **Atomic key rewrites**: Prevent race conditions when updating existing objects +- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`) +- **Optimistic locking**: Enable concurrent access with conflict detection +- **Leader election**: Implement distributed coordination using S3 as backing store + +### Conditional Reads +- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified) +- **HTTP caching**: Support standard browser/CDN caching semantics +- **Conditional processing**: Only process objects that meet specific criteria + +### Conditional Copy +- **Atomic copy operations**: Copy only if source/destination meets specific conditions +- **Prevent overwrite**: Copy only if destination doesn't exist + +## AWS S3 
Conditional Write + +### Specification + + If-None-Match Header + +``` +If-None-Match: "*" +``` + +- Succeeds only if object does NOT exist +- Returns `412 Precondition Failed` if object exists +- Primary use case: Create-only semantics + + If-Match Header + +``` +If-Match: "" +``` + +- Succeeds only if object EXISTS and ETag matches +- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches +- Primary use case: Atomic updates (compare-and-swap) + + Restrictions + +- Cannot use both headers together in same request +- No additional charges for failed conditional requests + +### Implementation + + Architecture Overview + + If-None-Match Implementation + +# S3 Gateway Layer + +1. Parse `If-None-Match: *`. +2. Set `existingKeyGeneration = -1`. +3. Call `RpcClient.rewriteKey()`. + +# OM Create Phase + +1. Validate `expectedDataGeneration == -1`. +2. If key exists → throw `KEY_ALREADY_EXISTS`. +3. Store `-1` in open key metadata. + +# OM Commit Phase + +1. Check `expectedDataGeneration == -1` from open key. +2. If key now exists (race condition) → throw `KEY_ALREADY_EXISTS`. +3. Commit key. + +# Race Condition Handling + +Using `-1` ensures atomicity. If a concurrent write (Client B) commits between Client A's Create and Commit, Client A's commit fails the `-1` validation check (key now exists), preserving strict create-if-not-exists semantics. + + If-Match Implementation + +Leverages existing `expectedDataGeneration` from HDDS-10656: + +# S3 Gateway Layer + +1. Parse `If-Match: ""` header +2. Look up existing key via `getS3KeyDetails()` +3. Validate ETag matches, else throw `PRECOND_FAILED` (412) +4. Extract `expectedGeneration` from existing key +5. Pass `expectedGeneration` to RpcClient Review Comment: > Note that verifying ETag during the preexecute phase does not increase the overhead of writing to the Raft log, so we don't need to worry about that. 
`preExecute` can be called in parallel (in multiple OM handler threads), so we should verify the ETag in `validateAndUpdateCache` instead to ensure atomicity. Note that the permission check was put in `preExecute` for performance reasons, and the community discussed that the consistency tradeoff is acceptable.

> agree that we need to reduce the RTT for If-Match request, my original thinking is that I want to avoid the "concepts of S3" appear in Ozone Manager, but seems there are already a lots of them, I think it's ok to do so, plus the performance would be better.

Yes, we already have multipart uploads and `OmKeyInfo.tags`, which are used only for the S3 use case.

> So If-Match request dont need the atomic key rewrite anymore. But how about we keep the if-none-match request to use the atomic with extended "CREATE IF NOT EXIST" capability, which will be added in https://github.com/apache/ozone/pull/9332

Yes, I'm OK with reusing atomic rewrite for "if-none-match" so w
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on code in PR #9334: URL: https://github.com/apache/ozone/pull/9334#discussion_r2563334450 ## hadoop-hdds/docs/content/design/s3-conditional-requests.md: ## @@ -0,0 +1,149 @@ +--- +title: "S3 Conditional Requests" +summary: Design to support S3 conditional requests for atomic operations. +date: 2025-11-20 +jira: HDDS-13117 +status: draft +author: Chu Cheng Li +--- + +# S3 Conditional Requests Design + +## Background + +AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. This includes: + +- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations +- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation +- **Conditional Copy** (CopyObject): Conditions on both source and destination objects + +### Current State + +- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration` +- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater) +- S3 gateway doesn't expose conditional headers to OM layer + +## Use Cases + +### Conditional Writes +- **Atomic key rewrites**: Prevent race conditions when updating existing objects +- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`) +- **Optimistic locking**: Enable concurrent access with conflict detection +- **Leader election**: Implement distributed coordination using S3 as backing store + +### Conditional Reads +- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified) +- **HTTP caching**: Support standard browser/CDN caching semantics +- **Conditional processing**: Only process objects that meet specific criteria + +### Conditional Copy +- **Atomic copy operations**: Copy only if source/destination meets specific conditions +- **Prevent overwrite**: Copy only if destination doesn't exist + +## AWS S3 
Conditional Write + +### Specification + + If-None-Match Header + +``` +If-None-Match: "*" +``` + +- Succeeds only if object does NOT exist +- Returns `412 Precondition Failed` if object exists +- Primary use case: Create-only semantics + + If-Match Header + +``` +If-Match: "" +``` + +- Succeeds only if object EXISTS and ETag matches +- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches +- Primary use case: Atomic updates (compare-and-swap) + + Restrictions + +- Cannot use both headers together in same request +- No additional charges for failed conditional requests + +### Implementation + + Architecture Overview + + If-None-Match Implementation + +# S3 Gateway Layer + +1. Parse `If-None-Match: *`. +2. Set `existingKeyGeneration = -1`. +3. Call `RpcClient.rewriteKey()`. + +# OM Create Phase + +1. Validate `expectedDataGeneration == -1`. +2. If key exists → throw `KEY_ALREADY_EXISTS`. +3. Store `-1` in open key metadata. + +# OM Commit Phase + +1. Check `expectedDataGeneration == -1` from open key. +2. If key now exists (race condition) → throw `KEY_ALREADY_EXISTS`. +3. Commit key. + +# Race Condition Handling + +Using `-1` ensures atomicity. If a concurrent write (Client B) commits between Client A's Create and Commit, Client A's commit fails the `-1` validation check (key now exists), preserving strict create-if-not-exists semantics. + + If-Match Implementation + +Leverages existing `expectedDataGeneration` from HDDS-10656: + +# S3 Gateway Layer + +1. Parse `If-Match: ""` header +2. Look up existing key via `getS3KeyDetails()` +3. Validate ETag matches, else throw `PRECOND_FAILED` (412) +4. Extract `expectedGeneration` from existing key +5. Pass `expectedGeneration` to RpcClient Review Comment: @ivandika3 @chungen0126 Thanks for reviewing. 1. 
I agree that we need to reduce the RTT for `If-Match` requests. My original thinking was to keep S3 concepts out of Ozone Manager, but there are already a lot of them there, so I think it's OK to do so, and the performance would be better. 2. So `If-Match` requests don't need the atomic key rewrite anymore. But how about we keep `If-None-Match` requests on the atomic rewrite with the extended "CREATE IF NOT EXIST" capability, which will be added in https://github.com/apache/ozone/pull/9332? cc @sodonnel BTW, sorry, I'm a little busy these days; I'll refine the design ASAP.
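As an editorial illustration of the split discussed in this reply, here is a minimal Java sketch of how a gateway might classify the two headers before talking to OM. It is not code from the PR: `ConditionalPutSketch`, `Kind`, `Condition`, `parse`, and `unquote` are hypothetical names, and treating any `If-None-Match` value other than `*` as unsupported is an assumption of this sketch.

```java
/** Hypothetical sketch: classify the conditional headers of a PutObject request. */
final class ConditionalPutSketch {

  /** CREATE_IF_ABSENT corresponds to expectedDataGeneration = -1 in the quoted design. */
  enum Kind { UNCONDITIONAL, CREATE_IF_ABSENT, MATCH_ETAG }

  /** Immutable result of parsing the two headers. */
  static final class Condition {
    final Kind kind;
    final String expectedETag; // only set for MATCH_ETAG

    Condition(Kind kind, String expectedETag) {
      this.kind = kind;
      this.expectedETag = expectedETag;
    }
  }

  /** Either argument may be null when the header is absent. */
  static Condition parse(String ifMatch, String ifNoneMatch) {
    if (ifMatch != null && ifNoneMatch != null) {
      // The quoted design forbids using both headers in one request.
      throw new IllegalArgumentException("If-Match and If-None-Match cannot be combined");
    }
    if (ifNoneMatch != null) {
      if (!"*".equals(unquote(ifNoneMatch))) {
        // Assumption of this sketch: only the wildcard form is modeled.
        throw new IllegalArgumentException("only If-None-Match: * is modeled here");
      }
      return new Condition(Kind.CREATE_IF_ABSENT, null);
    }
    if (ifMatch != null) {
      return new Condition(Kind.MATCH_ETAG, unquote(ifMatch));
    }
    return new Condition(Kind.UNCONDITIONAL, null);
  }

  /** Trims whitespace and removes one pair of surrounding double quotes, if present. */
  static String unquote(String value) {
    String v = value.trim();
    if (v.length() >= 2 && v.startsWith("\"") && v.endsWith("\"")) {
      return v.substring(1, v.length() - 1);
    }
    return v;
  }
}
```

With this shape, a `MATCH_ETAG` result would carry the ETag to OM in a single write request, while `CREATE_IF_ABSENT` would keep using the generation-based rewrite path.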
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
chungen0126 commented on code in PR #9334: URL: https://github.com/apache/ozone/pull/9334#discussion_r2559512853 ## hadoop-hdds/docs/content/design/s3-conditional-requests.md: ## @@ -0,0 +1,149 @@ +--- +title: "S3 Conditional Requests" +summary: Design to support S3 conditional requests for atomic operations. +date: 2025-11-20 +jira: HDDS-13117 +status: draft +author: Chu Cheng Li +--- + +# S3 Conditional Requests Design + +## Background + +AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. This includes: + +- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations +- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation +- **Conditional Copy** (CopyObject): Conditions on both source and destination objects + +### Current State + +- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration` +- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater) +- S3 gateway doesn't expose conditional headers to OM layer + +## Use Cases + +### Conditional Writes +- **Atomic key rewrites**: Prevent race conditions when updating existing objects +- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`) +- **Optimistic locking**: Enable concurrent access with conflict detection +- **Leader election**: Implement distributed coordination using S3 as backing store + +### Conditional Reads +- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified) +- **HTTP caching**: Support standard browser/CDN caching semantics +- **Conditional processing**: Only process objects that meet specific criteria + +### Conditional Copy +- **Atomic copy operations**: Copy only if source/destination meets specific conditions +- **Prevent overwrite**: Copy only if destination doesn't exist + +## AWS 
S3 Conditional Write + +### Specification + + If-None-Match Header + +``` +If-None-Match: "*" +``` + +- Succeeds only if object does NOT exist +- Returns `412 Precondition Failed` if object exists +- Primary use case: Create-only semantics + + If-Match Header + +``` +If-Match: "" +``` + +- Succeeds only if object EXISTS and ETag matches +- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches +- Primary use case: Atomic updates (compare-and-swap) + + Restrictions + +- Cannot use both headers together in same request +- No additional charges for failed conditional requests + +### Implementation + + Architecture Overview + + If-None-Match Implementation + +# S3 Gateway Layer + +1. Parse `If-None-Match: *`. +2. Set `existingKeyGeneration = -1`. +3. Call `RpcClient.rewriteKey()`. + +# OM Create Phase + +1. Validate `expectedDataGeneration == -1`. +2. If key exists → throw `KEY_ALREADY_EXISTS`. +3. Store `-1` in open key metadata. + +# OM Commit Phase + +1. Check `expectedDataGeneration == -1` from open key. +2. If key now exists (race condition) → throw `KEY_ALREADY_EXISTS`. +3. Commit key. + +# Race Condition Handling + +Using `-1` ensures atomicity. If a concurrent write (Client B) commits between Client A's Create and Commit, Client A's commit fails the `-1` validation check (key now exists), preserving strict create-if-not-exists semantics. + + If-Match Implementation + +Leverages existing `expectedDataGeneration` from HDDS-10656: + +# S3 Gateway Layer + +1. Parse `If-Match: ""` header +2. Look up existing key via `getS3KeyDetails()` +3. Validate ETag matches, else throw `PRECOND_FAILED` (412) +4. Extract `expectedGeneration` from existing key +5. Pass `expectedGeneration` to RpcClient Review Comment: I guess we are trying to minimize changes, which is why we are adopting the existing object store APIs. But you're right, reducing redundant RPCs is a great idea. 
Note that verifying ETag during the preexecute phase does not increase the overhead of writing to the Raft log, so we don't need to worry about that.
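To make the point about the pre-execute check concrete, here is a small in-memory Java model, offered as an editorial sketch only. It does not use Ozone's request classes: `ETagPrecheckSketch`, `seed`, `conditionalPut`, and the `logEntries` counter are hypothetical, and treating a key without an ETag as a mismatch is an assumption. The model shows that a mismatch detected up front appends nothing to the log, while the apply step still re-checks the condition.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model; not Ozone's actual request classes. */
final class ETagPrecheckSketch {
  private final Map<String, String> committedETags = new HashMap<>();
  private int logEntries = 0;

  /** Test helper: install a committed key with the given ETag. */
  void seed(String key, String etag) {
    committedETags.put(key, etag);
  }

  int logEntries() {
    return logEntries;
  }

  String currentETag(String key) {
    return committedETags.get(key);
  }

  /** Read-only check; a missing key or missing ETag counts as a mismatch (assumption). */
  private boolean matches(String key, String expectedETag) {
    String current = committedETags.get(key);
    return current != null && current.equals(expectedETag);
  }

  /** Returns true if the write was committed, false if the precondition failed. */
  boolean conditionalPut(String key, String expectedETag, String newETag) {
    // Pre-execute: reject early, before anything is appended to the log.
    if (!matches(key, expectedETag)) {
      return false;
    }
    logEntries++; // stands in for replicating the request
    // Apply: re-check, since another writer may have committed in between.
    if (!matches(key, expectedETag)) {
      return false;
    }
    committedETags.put(key, newETag);
    return true;
  }
}
```

In this single-threaded model the second check never fails; it is kept to show where the authoritative validation would sit once requests from several clients interleave.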
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
peterxcli commented on code in PR #9334: URL: https://github.com/apache/ozone/pull/9334#discussion_r2558648096 ## hadoop-hdds/docs/content/design/s3-conditional-requests.md: ## @@ -0,0 +1,149 @@ +--- +title: "S3 Conditional Requests" +summary: Design to support S3 conditional requests for atomic operations. +date: 2025-11-20 +jira: HDDS-13117 +status: draft +author: Chu Cheng Li +--- + +# S3 Conditional Requests Design + +## Background + +AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. This includes: + +- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations +- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation +- **Conditional Copy** (CopyObject): Conditions on both source and destination objects + +### Current State + +- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration` +- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater) +- S3 gateway doesn't expose conditional headers to OM layer + +## Use Cases + +### Conditional Writes +- **Atomic key rewrites**: Prevent race conditions when updating existing objects +- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`) +- **Optimistic locking**: Enable concurrent access with conflict detection +- **Leader election**: Implement distributed coordination using S3 as backing store + +### Conditional Reads +- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified) +- **HTTP caching**: Support standard browser/CDN caching semantics +- **Conditional processing**: Only process objects that meet specific criteria + +### Conditional Copy +- **Atomic copy operations**: Copy only if source/destination meets specific conditions +- **Prevent overwrite**: Copy only if destination doesn't exist + +## AWS S3 
Conditional Write + +### Specification + + If-None-Match Header + +``` +If-None-Match: "*" +``` + +- Succeeds only if object does NOT exist +- Returns `412 Precondition Failed` if object exists +- Primary use case: Create-only semantics + + If-Match Header + +``` +If-Match: "" +``` + +- Succeeds only if object EXISTS and ETag matches +- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches +- Primary use case: Atomic updates (compare-and-swap) + + Restrictions + +- Cannot use both headers together in same request +- No additional charges for failed conditional requests + +### Implementation + + Architecture Overview + + If-None-Match Implementation + +# S3 Gateway Layer + +1. Parse `If-None-Match: *`. +2. Set `existingKeyGeneration = -1`. +3. Call `RpcClient.rewriteKey()`. + +# OM Create Phase + +1. Validate `expectedDataGeneration == -1`. +2. If key exists → throw `KEY_ALREADY_EXISTS`. +3. Store `-1` in open key metadata. Review Comment: You're right. Thanks for catching that! I'll remove it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
ivandika3 commented on code in PR #9334: URL: https://github.com/apache/ozone/pull/9334#discussion_r2558578333 ## hadoop-hdds/docs/content/design/s3-conditional-requests.md: ## @@ -0,0 +1,149 @@ +--- +title: "S3 Conditional Requests" +summary: Design to support S3 conditional requests for atomic operations. +date: 2025-11-20 +jira: HDDS-13117 +status: draft +author: Chu Cheng Li +--- + +# S3 Conditional Requests Design + +## Background + +AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. This includes: + +- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations +- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation +- **Conditional Copy** (CopyObject): Conditions on both source and destination objects + +### Current State + +- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration` +- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater) +- S3 gateway doesn't expose conditional headers to OM layer + +## Use Cases + +### Conditional Writes +- **Atomic key rewrites**: Prevent race conditions when updating existing objects +- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`) +- **Optimistic locking**: Enable concurrent access with conflict detection +- **Leader election**: Implement distributed coordination using S3 as backing store + +### Conditional Reads +- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified) +- **HTTP caching**: Support standard browser/CDN caching semantics +- **Conditional processing**: Only process objects that meet specific criteria + +### Conditional Copy +- **Atomic copy operations**: Copy only if source/destination meets specific conditions +- **Prevent overwrite**: Copy only if destination doesn't exist + +## AWS S3 
Conditional Write + +### Specification + + If-None-Match Header + +``` +If-None-Match: "*" +``` + +- Succeeds only if object does NOT exist +- Returns `412 Precondition Failed` if object exists +- Primary use case: Create-only semantics + + If-Match Header + +``` +If-Match: "" +``` + +- Succeeds only if object EXISTS and ETag matches +- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches +- Primary use case: Atomic updates (compare-and-swap) + + Restrictions + +- Cannot use both headers together in same request +- No additional charges for failed conditional requests + +### Implementation + + Architecture Overview + + If-None-Match Implementation + +# S3 Gateway Layer + +1. Parse `If-None-Match: *`. +2. Set `existingKeyGeneration = -1`. +3. Call `RpcClient.rewriteKey()`. + +# OM Create Phase + +1. Validate `expectedDataGeneration == -1`. +2. If key exists → throw `KEY_ALREADY_EXISTS`. +3. Store `-1` in open key metadata. + +# OM Commit Phase + +1. Check `expectedDataGeneration == -1` from open key. +2. If key now exists (race condition) → throw `KEY_ALREADY_EXISTS`. +3. Commit key. + +# Race Condition Handling + +Using `-1` ensures atomicity. If a concurrent write (Client B) commits between Client A's Create and Commit, Client A's commit fails the `-1` validation check (key now exists), preserving strict create-if-not-exists semantics. + + If-Match Implementation + +Leverages existing `expectedDataGeneration` from HDDS-10656: + +# S3 Gateway Layer + +1. Parse `If-Match: ""` header +2. Look up existing key via `getS3KeyDetails()` +3. Validate ETag matches, else throw `PRECOND_FAILED` (412) +4. Extract `expectedGeneration` from existing key +5. Pass `expectedGeneration` to RpcClient Review Comment: Just for my understanding, the reason of calling `getS3KeyDetails` is to not send a write request if precondition failed and therefore there is no Raft log and will not block applier thread? 
I think the tradeoff is whether we want to optimize latency for the happy path (precondition passes) or for the failure path (precondition fails). IMO, in normal workloads (and under optimistic concurrency control), we assume the happy path happens more often, so we can validate the ETag in the key metadata during the key write. This will add another optional field to KeyArgs (e.g. `expectedETag`), but I think that's fine. Please also note that not all Ozone keys have an ETag (e.g. keys uploaded using the OFS protocol), so we might want to specify whether we 1) skip the keys without ETag metadata or 2) calculate the ETag on the spot. I prefer (1) since it's the most lightweight implementation. Approach (2) might justify your approach of loading the key and calculating the ETag in S3G instead of in the OM applier thread, but there might be some overhead and also for MPU key the calculation of ETag is more c
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
ivandika3 commented on PR #9334: URL: https://github.com/apache/ozone/pull/9334#issuecomment-3573833368 @hevinhsu Could you help to take a look as well? Thanks.
Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]
ivandika3 commented on code in PR #9334: URL: https://github.com/apache/ozone/pull/9334#discussion_r2558498272 ## hadoop-hdds/docs/content/design/s3-conditional-requests.md: ## @@ -0,0 +1,149 @@ +--- +title: "S3 Conditional Requests" +summary: Design to support S3 conditional requests for atomic operations. +date: 2025-11-20 +jira: HDDS-13117 +status: draft +author: Chu Cheng Li +--- + +# S3 Conditional Requests Design + +## Background + +AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. This includes: + +- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations +- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation +- **Conditional Copy** (CopyObject): Conditions on both source and destination objects + +### Current State + +- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration` +- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater) +- S3 gateway doesn't expose conditional headers to OM layer + +## Use Cases + +### Conditional Writes +- **Atomic key rewrites**: Prevent race conditions when updating existing objects +- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`) +- **Optimistic locking**: Enable concurrent access with conflict detection +- **Leader election**: Implement distributed coordination using S3 as backing store + +### Conditional Reads +- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified) +- **HTTP caching**: Support standard browser/CDN caching semantics +- **Conditional processing**: Only process objects that meet specific criteria + +### Conditional Copy +- **Atomic copy operations**: Copy only if source/destination meets specific conditions +- **Prevent overwrite**: Copy only if destination doesn't exist + +## AWS S3 
Conditional Write + +### Specification + + If-None-Match Header + +``` +If-None-Match: "*" +``` + +- Succeeds only if object does NOT exist +- Returns `412 Precondition Failed` if object exists +- Primary use case: Create-only semantics + + If-Match Header + +``` +If-Match: "" +``` + +- Succeeds only if object EXISTS and ETag matches +- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches +- Primary use case: Atomic updates (compare-and-swap) + + Restrictions + +- Cannot use both headers together in same request +- No additional charges for failed conditional requests + +### Implementation + + Architecture Overview + + If-None-Match Implementation + +# S3 Gateway Layer + +1. Parse `If-None-Match: *`. +2. Set `existingKeyGeneration = -1`. +3. Call `RpcClient.rewriteKey()`. + +# OM Create Phase + +1. Validate `expectedDataGeneration == -1`. +2. If key exists → throw `KEY_ALREADY_EXISTS`. +3. Store `-1` in open key metadata. Review Comment: Is it necessary to store `-1` on the open key metadata? I think we can set expectedGeneration in `BlockOutputStreamEntryPool.keyArgs` which will be checked by the `OmKeyCommitRequest`. ## hadoop-hdds/docs/content/design/s3-conditional-requests.md: ## @@ -0,0 +1,149 @@ +--- +title: "S3 Conditional Requests" +summary: Design to support S3 conditional requests for atomic operations. +date: 2025-11-20 +jira: HDDS-13117 +status: draft +author: Chu Cheng Li +--- + +# S3 Conditional Requests Design + +## Background + +AWS S3 supports conditional requests using HTTP conditional headers, enabling atomic operations, cache optimization, and preventing race conditions. 
This includes: + +- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers for atomic operations +- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, `If-Modified-Since`, `If-Unmodified-Since` for cache validation +- **Conditional Copy** (CopyObject): Conditions on both source and destination objects + +### Current State + +- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration` +- OM HA uses single Raft group with single applier thread (Ratis StateMachineUpdater) +- S3 gateway doesn't expose conditional headers to OM layer + +## Use Cases + +### Conditional Writes +- **Atomic key rewrites**: Prevent race conditions when updating existing objects +- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`) +- **Optimistic locking**: Enable concurrent access with conflict detection +- **Leader election**: Implement distributed coordination using S3 as backing store + +### Conditional Reads +- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not Modified) +- **HTTP caching**: Support standard browser/CDN caching semantics +- **Conditional processing**: Only process objects that meet specific criteria + +### Conditional Copy +- **Atomic copy operati
