Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-04-01 Thread via GitHub


peterxcli commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3022994769


##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -271,6 +280,38 @@ Response handlePutRequest(ObjectRequestContext context, 
String keyPath, InputStr
 return Response.ok().status(HttpStatus.SC_OK).build();
   }
 
+  String ifNoneMatch = getHeaders().getHeaderString(

Review Comment:
   https://issues.apache.org/jira/browse/HDDS-14958



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-04-01 Thread via GitHub


peterxcli commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3022976423


##
hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto:
##
@@ -1176,6 +1185,11 @@ message KeyInfo {
   // This allows a key to be created an committed atomically if the original 
has not
   // been modified.
 optional uint64 expectedDataGeneration = 22;
+
+// expectedETag, when set, indicates that the existing key must have
+// the given ETag for the operation to succeed. This is used for
+// S3 conditional writes with the If-Match header.
+optional string expectedETag = 23;

Review Comment:
   Raised a ticket for this: https://issues.apache.org/jira/browse/HDDS-14957






Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-04-01 Thread via GitHub


peterxcli commented on PR #9815:
URL: https://github.com/apache/ozone/pull/9815#issuecomment-4170993997

   Thanks @ivandika3, @jojochuang for reviewing!





Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-04-01 Thread via GitHub


jojochuang commented on PR #9815:
URL: https://github.com/apache/ozone/pull/9815#issuecomment-4170979373

   Merged. Thanks @peterxcli and @ivandika3 





Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-04-01 Thread via GitHub


jojochuang merged PR #9815:
URL: https://github.com/apache/ozone/pull/9815





Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-04-01 Thread via GitHub


peterxcli commented on PR #9815:
URL: https://github.com/apache/ozone/pull/9815#issuecomment-4168992944

   @ivandika3 please take another look, Thanks!





Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-04-01 Thread via GitHub


peterxcli commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3020567161


##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -271,6 +280,38 @@ Response handlePutRequest(ObjectRequestContext context, 
String keyPath, InputStr
 return Response.ok().status(HttpStatus.SC_OK).build();
   }
 
+  String ifNoneMatch = getHeaders().getHeaderString(

Review Comment:
   I’m still thinking about how to refactor this (lines 283-314), as the 
conditional request validation logic dominates the function.












Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-04-01 Thread via GitHub


peterxcli commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3020538030


##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -281,21 +299,24 @@ Response handlePutRequest(ObjectRequestContext context, 
String keyPath, InputStr
   getCustomMetadataFromHeaders(getHeaders().getRequestHeaders());
   Map tags = getTaggingFromHeaders(getHeaders());
 
+  boolean hasConditionalHeaders = ifNoneMatch != null || ifMatch != null;
   long putLength;
   final String md5Hash;
-  if (isDatastreamEnabled() && !enableEC && length > 
getDatastreamMinLength()) {
+  if (isDatastreamEnabled() && !enableEC
+  && length > getDatastreamMinLength() && !hasConditionalHeaders) {
 perf.appendStreamMode();
 Pair keyWriteResult = ObjectEndpointStreaming
 .put(bucket, keyPath, length, replicationConfig, getChunkSize(),
-customMetadata, tags, multiDigestInputStream, getHeaders(), 
signatureInfo.isSignPayload(), perf);
+customMetadata, tags, multiDigestInputStream, getHeaders(),
+signatureInfo.isSignPayload(), perf);

Review Comment:
   ok, thanks for the context!






Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-04-01 Thread via GitHub


ivandika3 commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3020529584


##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -281,21 +299,24 @@ Response handlePutRequest(ObjectRequestContext context, 
String keyPath, InputStr
   getCustomMetadataFromHeaders(getHeaders().getRequestHeaders());
   Map tags = getTaggingFromHeaders(getHeaders());
 
+  boolean hasConditionalHeaders = ifNoneMatch != null || ifMatch != null;
   long putLength;
   final String md5Hash;
-  if (isDatastreamEnabled() && !enableEC && length > 
getDatastreamMinLength()) {
+  if (isDatastreamEnabled() && !enableEC
+  && length > getDatastreamMinLength() && !hasConditionalHeaders) {
 perf.appendStreamMode();
 Pair keyWriteResult = ObjectEndpointStreaming
 .put(bucket, keyPath, length, replicationConfig, getChunkSize(),
-customMetadata, tags, multiDigestInputStream, getHeaders(), 
signatureInfo.isSignPayload(), perf);
+customMetadata, tags, multiDigestInputStream, getHeaders(),
+signatureInfo.isSignPayload(), perf);

Review Comment:
   @peterxcli It seems you need to introduce conditional variants of the stream 
APIs (i.e. `createStreamKey`) and either pass the related conditional request 
parameters (ifNoneMatch, etc.) to `ObjectEndpointStreaming` or reparse the 
headers in `ObjectEndpointStreaming` itself.
   
   We also need to find out why the SDK integration test 
`TestS3SDKWithRatisStreaming` does not fail conditional requests when they 
are not supported.






Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-03-31 Thread via GitHub


peterxcli commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3019990939


##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -1127,7 +1148,48 @@ private CopyObjectResponse copyObject(OzoneVolume volume,
   }
 }
   }
-  
+
+  /**
+   * Opens a key for put, applying conditional write logic based on
+   * If-None-Match and If-Match headers.
+   */
+  @SuppressWarnings("checkstyle:ParameterNumber")
+  private OzoneOutputStream openKeyForPut(String volumeName, String bucketName,
+  OzoneBucket bucket, String keyPath, long length,
+  ReplicationConfig replicationConfig, Map customMetadata,
+  Map tags, String ifNoneMatch, String ifMatch)
+  throws IOException {
+if (ifNoneMatch != null && "*".equals(ifNoneMatch.trim())) {
+  return getClientProtocol().createKeyIfNotExists(
+  volumeName, bucketName, keyPath, length, replicationConfig,
+  customMetadata, tags);
+} else if (ifMatch != null) {
+  String expectedETag = parseETag(ifMatch);
+  return getClientProtocol().rewriteKeyIfMatch(
+  volumeName, bucketName, keyPath, length, expectedETag,
+  replicationConfig, customMetadata, tags);
+} else {

Review Comment:
   you're right.
   
   
https://github.com/minio/minio/blob/7aac2a2c5b7c882e68c1ce017d8256be2feea27f/cmd/object-handlers-common.go#L343-L350






Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-03-31 Thread via GitHub


ivandika3 commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3019841629


##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -281,21 +299,24 @@ Response handlePutRequest(ObjectRequestContext context, 
String keyPath, InputStr
   getCustomMetadataFromHeaders(getHeaders().getRequestHeaders());
   Map tags = getTaggingFromHeaders(getHeaders());
 
+  boolean hasConditionalHeaders = ifNoneMatch != null || ifMatch != null;
   long putLength;
   final String md5Hash;
-  if (isDatastreamEnabled() && !enableEC && length > 
getDatastreamMinLength()) {
+  if (isDatastreamEnabled() && !enableEC
+  && length > getDatastreamMinLength() && !hasConditionalHeaders) {
 perf.appendStreamMode();
 Pair keyWriteResult = ObjectEndpointStreaming
 .put(bucket, keyPath, length, replicationConfig, getChunkSize(),
-customMetadata, tags, multiDigestInputStream, getHeaders(), 
signatureInfo.isSignPayload(), perf);
+customMetadata, tags, multiDigestInputStream, getHeaders(),
+signatureInfo.isSignPayload(), perf);

Review Comment:
   PutObject using streaming write should be able to handle conditional 
requests, since the OpenKey and CommitKey metadata operations are the same as 
in the non-streaming put (only the data operations differ).
   
   We can then remove the `hasConditionalHeaders` boolean.






Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-03-31 Thread via GitHub


ivandika3 commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3019861271


##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -1127,7 +1148,43 @@ private CopyObjectResponse copyObject(OzoneVolume volume,
   }
 }
   }
-  
+
+  /**
+   * Opens a key for put, applying conditional write logic based on
+   * If-None-Match and If-Match headers.
+   */
+  @SuppressWarnings("checkstyle:ParameterNumber")
+  private OzoneOutputStream openKeyForPut(String volumeName, String 
bucketName, String keyPath, long length,
+  ReplicationConfig replicationConfig, Map customMetadata,
+  Map tags, String ifNoneMatch, String ifMatch)
+  throws IOException {
+if (ifNoneMatch != null && "*".equals(stripQuotes(ifNoneMatch.trim()))) {
+  return getClientProtocol().createKeyIfNotExists(
+  volumeName, bucketName, keyPath, length, replicationConfig,
+  customMetadata, tags);
+} else if (ifMatch != null) {
+  String expectedETag = parseETag(ifMatch);
+  return getClientProtocol().rewriteKeyIfMatch(
+  volumeName, bucketName, keyPath, length, expectedETag,
+  replicationConfig, customMetadata, tags);
+} else {
+  return getClientProtocol().createKey(
+  volumeName, bucketName, keyPath, length, replicationConfig,
+  customMetadata, tags);
+}
+  }
+
+  /**
+   * Parses an ETag from a conditional header value, removing surrounding
+   * quotes if present.
+   */
+  static String parseETag(String headerValue) {
+if (headerValue == null) {
+  return null;
+}
+return stripQuotes(headerValue.trim());
+  }

Review Comment:
   Can move this to `S3Utils`.
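
A minimal sketch of how the helper might look once it lives in a shared utility class. The class name `S3UtilsSketch` and the body of `stripQuotes` are assumptions made for illustration; the real `S3Utils` and `stripQuotes` in Ozone may behave differently.

```java
/** Illustrative stand-in for a shared utility class; not the actual Ozone S3Utils. */
final class S3UtilsSketch {
  private S3UtilsSketch() { }

  /** Removes one pair of surrounding double quotes, if present (assumed behaviour). */
  static String stripQuotes(String value) {
    if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
      return value.substring(1, value.length() - 1);
    }
    return value;
  }

  /** Parses an ETag from a conditional header value; a null header yields null. */
  static String parseETag(String headerValue) {
    if (headerValue == null) {
      return null;
    }
    return stripQuotes(headerValue.trim());
  }
}
```

Keeping the method static and free of endpoint state is what makes the move mechanical: callers in `ObjectEndpoint` would only need to change the qualifier.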






Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-03-31 Thread via GitHub


ivandika3 commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3019841629


##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -281,21 +299,24 @@ Response handlePutRequest(ObjectRequestContext context, 
String keyPath, InputStr
   getCustomMetadataFromHeaders(getHeaders().getRequestHeaders());
   Map tags = getTaggingFromHeaders(getHeaders());
 
+  boolean hasConditionalHeaders = ifNoneMatch != null || ifMatch != null;
   long putLength;
   final String md5Hash;
-  if (isDatastreamEnabled() && !enableEC && length > 
getDatastreamMinLength()) {
+  if (isDatastreamEnabled() && !enableEC
+  && length > getDatastreamMinLength() && !hasConditionalHeaders) {
 perf.appendStreamMode();
 Pair keyWriteResult = ObjectEndpointStreaming
 .put(bucket, keyPath, length, replicationConfig, getChunkSize(),
-customMetadata, tags, multiDigestInputStream, getHeaders(), 
signatureInfo.isSignPayload(), perf);
+customMetadata, tags, multiDigestInputStream, getHeaders(),
+signatureInfo.isSignPayload(), perf);

Review Comment:
   PutObject using streaming write should be able to handle conditional 
requests, since the OpenKey and CommitKey operations are the same as in the 
non-streaming put.
   
   We can then remove the `hasConditionalHeaders` boolean.



##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -271,6 +280,15 @@ Response handlePutRequest(ObjectRequestContext context, 
String keyPath, InputStr
 return Response.ok().status(HttpStatus.SC_OK).build();
   }
 
+  String ifNoneMatch = getHeaders().getHeaderString(
+  S3Consts.IF_NONE_MATCH_HEADER);
+  String ifMatch = getHeaders().getHeaderString(
+  S3Consts.IF_MATCH_HEADER);
+
+  if (ifNoneMatch != null && ifMatch != null) {
+throw newError(INVALID_REQUEST, keyPath);
+  }

Review Comment:
   Please add an error message here, since `INVALID_REQUEST` is used in a lot 
of places.
   
   We can also do more validation:
   - Fail for a blank header
   - Fail if `If-None-Match` is not "*"
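
The checks listed above can be sketched as a small standalone validator. This is only an illustration: the class name, method name, and error messages are hypothetical, and the sketch returns a message instead of throwing the gateway's own error type. It also does not strip quotes from `If-None-Match`, which the PR may choose to do.

```java
/** Hypothetical validator for PutObject conditional headers; names are illustrative. */
final class ConditionalPutValidator {
  private ConditionalPutValidator() { }

  /** Returns null when the headers are acceptable, otherwise a descriptive message. */
  static String validate(String ifNoneMatch, String ifMatch) {
    if (ifNoneMatch != null && ifMatch != null) {
      return "If-None-Match and If-Match cannot be specified together";
    }
    if (ifNoneMatch != null) {
      String value = ifNoneMatch.trim();
      if (value.isEmpty()) {
        return "If-None-Match must not be blank";
      }
      if (!"*".equals(value)) {
        return "If-None-Match only supports \"*\" for PutObject";
      }
    }
    if (ifMatch != null && ifMatch.trim().isEmpty()) {
      return "If-Match must not be blank";
    }
    return null;
  }
}
```

Returning a distinct message per failure is what lets the caller attach it to `INVALID_REQUEST`, so the client can tell which precondition header was rejected.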



##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -1127,7 +1148,48 @@ private CopyObjectResponse copyObject(OzoneVolume volume,
   }
 }
   }
-  
+
+  /**
+   * Opens a key for put, applying conditional write logic based on
+   * If-None-Match and If-Match headers.
+   */
+  @SuppressWarnings("checkstyle:ParameterNumber")
+  private OzoneOutputStream openKeyForPut(String volumeName, String bucketName,
+  OzoneBucket bucket, String keyPath, long length,
+  ReplicationConfig replicationConfig, Map customMetadata,
+  Map tags, String ifNoneMatch, String ifMatch)
+  throws IOException {
+if (ifNoneMatch != null && "*".equals(ifNoneMatch.trim())) {
+  return getClientProtocol().createKeyIfNotExists(
+  volumeName, bucketName, keyPath, length, replicationConfig,
+  customMetadata, tags);
+} else if (ifMatch != null) {
+  String expectedETag = parseETag(ifMatch);
+  return getClientProtocol().rewriteKeyIfMatch(
+  volumeName, bucketName, keyPath, length, expectedETag,
+  replicationConfig, customMetadata, tags);
+} else {

Review Comment:
   This seems to be valid behavior according to RFC 7232, section 3.1 
(https://datatracker.ietf.org/doc/html/rfc7232#section-3.1). Should we 
implement it?
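
The comment does not spell out which behavior is meant. Assuming it refers to the `If-Match` forms described in RFC 7232 section 3.1 (`*`, or a list of entity-tags compared with the strong comparison function), a rough sketch of the evaluation could look like the following. The class and method names are hypothetical, and the comma split is a simplification.

```java
/** Hypothetical evaluator for the If-Match precondition; not Ozone code. */
final class IfMatchSketch {
  private IfMatchSketch() { }

  /**
   * @param ifMatch     raw If-Match header value: "*" or a list of quoted entity-tags
   * @param currentETag unquoted ETag of the existing key, or null if the key is absent
   * @return true when the precondition holds and the write may proceed
   */
  static boolean isSatisfied(String ifMatch, String currentETag) {
    if (currentETag == null) {
      // No current representation: both "*" and any tag list evaluate to false.
      return false;
    }
    String value = ifMatch.trim();
    if ("*".equals(value)) {
      return true;
    }
    // Naive split: assumes entity-tags do not themselves contain commas.
    for (String candidate : value.split(",")) {
      String tag = candidate.trim();
      if (tag.startsWith("W/")) {
        // Weak tags never match under the strong comparison used by If-Match.
        continue;
      }
      if (tag.length() >= 2 && tag.startsWith("\"") && tag.endsWith("\"")) {
        tag = tag.substring(1, tag.length() - 1);
      }
      if (tag.equals(currentETag)) {
        return true;
      }
    }
    return false;
  }
}
```

When the precondition is false the RFC calls for a 412 (Precondition Failed) response, which matches the status code asserted in the integration tests in this PR.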



##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -1127,7 +1148,43 @@ private CopyObjectResponse copyObject(OzoneVolume volume,
   }
 }
   }
-  
+
+  /**
+   * Opens a key for put, applying conditional write logic based on
+   * If-None-Match and If-Match headers.
+   */
+  @SuppressWarnings("checkstyle:ParameterNumber")
+  private OzoneOutputStream openKeyForPut(String volumeName, String 
bucketName, String keyPath, long length,
+  ReplicationConfig replicationConfig, Map customMetadata,
+  Map tags, String ifNoneMatch, String ifMatch)
+  throws IOException {
+if (ifNoneMatch != null && "*".equals(stripQuotes(ifNoneMatch.trim()))) {
+  return getClientProtocol().createKeyIfNotExists(
+  volumeName, bucketName, keyPath, length, replicationConfig,
+  customMetadata, tags);
+} else if (ifMatch != null) {
+  String expectedETag = parseETag(ifMatch);
+  return getClientProtocol().rewriteKeyIfMatch(
+  volumeName, bucketName, keyPath, length, expectedETag,
+  replicationConfig, customMetadata, tags);
+} else {
+  return getClientProtocol().createKey(
+  volumeName, bucketName, keyPath, length, replicationConfig,
+  customMetadata, tags);
+}
+  }
+
+  /**
+   * Parses an ETag from a conditional 

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-03-31 Thread via GitHub


peterxcli commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3019238708


##
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##
@@ -1435,6 +1435,68 @@ public OzoneOutputStream rewriteKey(String volumeName, 
String bucketName, String
 return createOutputStream(openKey);
   }
 
+  @Override
+  public OzoneOutputStream createKeyIfNotExists(String volumeName,
+  String bucketName, String keyName, long size,
+  ReplicationConfig replicationConfig, Map metadata,
+  Map tags) throws IOException {
+if (omVersion.compareTo(OzoneManagerVersion.ATOMIC_REWRITE_KEY) < 0) {
+  throw new IOException(
+  "OzoneManager does not support atomic key creation.");
+}
+
+createKeyPreChecks(volumeName, bucketName, keyName, replicationConfig);
+
+OmKeyArgs.Builder builder = new OmKeyArgs.Builder()
+.setVolumeName(volumeName)
+.setBucketName(bucketName)
+.setKeyName(keyName)
+.setDataSize(size)
+.setReplicationConfig(replicationConfig)
+.addAllMetadataGdpr(metadata)
+.addAllTags(tags)
+.setLatestVersionLocation(getLatestVersionLocation)
+.setExpectedDataGeneration(
+OzoneConsts.EXPECTED_GEN_CREATE_IF_NOT_EXISTS);
+
+OpenKeySession openKey = ozoneManagerClient.openKey(builder.build());
+if (isS3GRequest.get() && size == 0) {
+  openKey.getKeyInfo().setDataSize(0);
+}
+return createOutputStream(openKey);
+  }
+
+  @Override
+  @SuppressWarnings("checkstyle:parameternumber")
+  public OzoneOutputStream rewriteKeyIfMatch(String volumeName,

Review Comment:
   I refactored the create/rewrite key requests to share the same keyArgs 
builder, but kept the RpcClient interface methods separate so that each has a 
single responsibility, e.g. `rewriteKey()` for generation-based CAS and 
`rewriteKeyIfMatch()` for ETag-based CAS.
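
As a rough illustration of that shape (not the code in the PR), the sketch below keeps four public entry points that all funnel into one private builder. `KeyArgsSketch`, `RpcClientSketch`, and the `CREATE_IF_NOT_EXISTS` placeholder value are assumptions; the real `OmKeyArgs.Builder` carries many more fields, and the actual value of `OzoneConsts.EXPECTED_GEN_CREATE_IF_NOT_EXISTS` is not shown in this thread.

```java
/** Simplified stand-in for OmKeyArgs; only the fields relevant to conditional writes. */
final class KeyArgsSketch {
  final String keyName;
  final Long expectedDataGeneration; // null when no generation check is requested
  final String expectedETag;         // null when no ETag check is requested

  KeyArgsSketch(String keyName, Long expectedDataGeneration, String expectedETag) {
    this.keyName = keyName;
    this.expectedDataGeneration = expectedDataGeneration;
    this.expectedETag = expectedETag;
  }
}

/** Hypothetical client facade: separate entry points, one shared builder. */
final class RpcClientSketch {
  /** Placeholder sentinel; the real constant's value is an assumption here. */
  static final long CREATE_IF_NOT_EXISTS = -1L;

  private RpcClientSketch() { }

  /** Single place where common arguments are assembled. */
  private static KeyArgsSketch buildKeyArgs(String keyName, Long expectedGeneration,
      String expectedETag) {
    return new KeyArgsSketch(keyName, expectedGeneration, expectedETag);
  }

  static KeyArgsSketch createKey(String keyName) {
    return buildKeyArgs(keyName, null, null);
  }

  static KeyArgsSketch createKeyIfNotExists(String keyName) {
    return buildKeyArgs(keyName, CREATE_IF_NOT_EXISTS, null);
  }

  /** Generation-based compare-and-swap. */
  static KeyArgsSketch rewriteKey(String keyName, long existingGeneration) {
    return buildKeyArgs(keyName, existingGeneration, null);
  }

  /** ETag-based compare-and-swap. */
  static KeyArgsSketch rewriteKeyIfMatch(String keyName, String expectedETag) {
    return buildKeyArgs(keyName, null, expectedETag);
  }
}
```

The duplication the reviewers pointed out collapses into `buildKeyArgs`, while callers still pick the method whose name states the precondition they want.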



##
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##
@@ -1435,6 +1435,68 @@ public OzoneOutputStream rewriteKey(String volumeName, 
String bucketName, String
 return createOutputStream(openKey);
   }
 
+  @Override
+  public OzoneOutputStream createKeyIfNotExists(String volumeName,

Review Comment:
   https://github.com/apache/ozone/pull/9815#discussion_r3019238708






Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-03-31 Thread via GitHub


peterxcli commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3017185075


##
hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto:
##
@@ -1176,6 +1185,11 @@ message KeyInfo {
   // This allows a key to be created an committed atomically if the original 
has not
   // been modified.
 optional uint64 expectedDataGeneration = 22;
+
+// expectedETag, when set, indicates that the existing key must have
+// the given ETag for the operation to succeed. This is used for
+// S3 conditional writes with the If-Match header.
+optional string expectedETag = 23;

Review Comment:
   Good point. Let's refactor `expectedDataGeneration` and `expectedETag` 
into a dedicated proto message as a follow-up.






Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-03-30 Thread via GitHub


ivandika3 commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3013645782


##
hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto:
##
@@ -1176,6 +1185,11 @@ message KeyInfo {
   // This allows a key to be created an committed atomically if the original 
has not
   // been modified.
 optional uint64 expectedDataGeneration = 22;
+
+// expectedETag, when set, indicates that the existing key must have
+// the given ETag for the operation to succeed. This is used for
+// S3 conditional writes with the If-Match header.
+optional string expectedETag = 23;

Review Comment:
   Hm, I don't recall that we need to add `expectedDataGeneration` and 
`expectedETag` to `KeyInfo`. Adding the two fields to `KeyInfo` requires 
additional logic to null them during commit to prevent space overhead, and 
adding two fields that will always be null in the `keyTable` does not seem 
right. However, I don't think we can avoid it: without setting the fields in 
the open key, concurrent PutObject requests might violate the serial 
consistency guarantee. Ideally, we should use a separate `OpenKeyInfo` for the 
`openKeyTable` to differentiate it from the final `KeyInfo` in the `keyTable`, 
but I guess that ship sailed a long time ago.
   
   Please let me know what you think. I'm OK with this if there is no other way.
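
To make the trade-off concrete, here is a toy sketch of the commit step being described: the expectation travels with the open key, is checked against the existing key at commit, and is then cleared so it is never persisted in the key table. All names are hypothetical and the real Ozone Manager commit path is considerably more involved.

```java
/** Toy model of a key entry; expectedETag is only meaningful while the key is open. */
final class KeyInfoSketch {
  final String keyName;
  final String eTag;
  final String expectedETag; // null once committed

  KeyInfoSketch(String keyName, String eTag, String expectedETag) {
    this.keyName = keyName;
    this.eTag = eTag;
    this.expectedETag = expectedETag;
  }
}

/** Hypothetical commit logic: validate the precondition, then drop it. */
final class CommitSketch {
  private CommitSketch() { }

  /**
   * @param open     entry from the open-key table, possibly carrying an expectation
   * @param existing current entry in the key table, or null if the key does not exist
   * @return the entry to store in the key table, with the expectation cleared
   */
  static KeyInfoSketch commit(KeyInfoSketch open, KeyInfoSketch existing) {
    if (open.expectedETag != null
        && (existing == null || !open.expectedETag.equals(existing.eTag))) {
      throw new IllegalStateException("ETag precondition failed for " + open.keyName);
    }
    // Null the open-key-only field so it adds no overhead to the committed entry.
    return new KeyInfoSketch(open.keyName, open.eTag, null);
  }
}
```

Because the check happens at commit against whatever is in the key table at that moment, a concurrent writer that committed first causes the second commit to fail rather than silently overwrite.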



##
hadoop-ozone/integration-test-s3/src/test/java/org/apache/hadoop/ozone/s3/awssdk/v1/AbstractS3SDKV1Tests.java:
##
@@ -377,6 +377,103 @@ public void testPutObject() {
 assertEquals("37b51d194a7513e45b56f6524f2d51f2", 
putObjectResult.getETag());
   }
 
+  @Test
+  public void testPutObjectIfNoneMatch() {
+final String bucketName = getBucketName();
+final String keyName = getKeyName();
+final String content = "bar";
+s3Client.createBucket(bucketName);
+
+InputStream is = new 
ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
+ObjectMetadata metadata = new ObjectMetadata();
+metadata.setHeader("If-None-Match", "*");
+
+PutObjectResult putObjectResult = s3Client.putObject(bucketName, keyName, 
is, metadata);
+assertEquals("37b51d194a7513e45b56f6524f2d51f2", 
putObjectResult.getETag());
+  }
+
+  @Test
+  public void testPutObjectIfNoneMatchFail() {
+final String bucketName = getBucketName();
+final String keyName = getKeyName();
+final String content = "bar";
+s3Client.createBucket(bucketName);
+
+InputStream is = new 
ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
+s3Client.putObject(bucketName, keyName, is, new ObjectMetadata());
+
+InputStream is2 = new 
ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
+ObjectMetadata metadata = new ObjectMetadata();
+metadata.setHeader("If-None-Match", "*");
+
+AmazonServiceException ase = assertThrows(AmazonServiceException.class,
+() -> s3Client.putObject(bucketName, keyName, is2, metadata));
+
+assertEquals(ErrorType.Client, ase.getErrorType());
+assertEquals(412, ase.getStatusCode());
+assertEquals("PreconditionFailed", ase.getErrorCode());
+  }

Review Comment:
   Nit: Please also add the post-validation suggested in the acceptance 
tests to the relevant integration test methods (V1 and V2). Thanks.



##
hadoop-ozone/dist/src/main/smoketest/s3/conditionalput.robot:
##
@@ -0,0 +1,77 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+*** Settings ***
+Documentation   S3 Conditional Put (If-None-Match / If-Match) tests
+Library OperatingSystem
+Library String
+Library Process
+Resource../commonlib.robot
+Resource./commonawslib.robot
+Test Timeout5 minutes
+Suite Setup Setup s3 tests
+
+*** Variables ***
+${ENDPOINT_URL}   http://s3g:9878
+${BUCKET} generated
+
+*** Test Cases ***
+
+Conditional Put If-None-Match Star Creates New Key
+[Documentation]If-None-Match: * should succeed when key does not exist
+${key} =   Set Variablecondput-ifnonematch-new
+   Execute echo "test-content" > /tmp/${key}
+${r

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-03-30 Thread via GitHub


jojochuang commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3012857605


##
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##
@@ -1435,6 +1435,68 @@ public OzoneOutputStream rewriteKey(String volumeName, 
String bucketName, String
 return createOutputStream(openKey);
   }
 
+  @Override
+  public OzoneOutputStream createKeyIfNotExists(String volumeName,
+  String bucketName, String keyName, long size,
+  ReplicationConfig replicationConfig, Map<String, String> metadata,
+  Map<String, String> tags) throws IOException {
+if (omVersion.compareTo(OzoneManagerVersion.ATOMIC_REWRITE_KEY) < 0) {
+  throw new IOException(
+  "OzoneManager does not support atomic key creation.");
+}
+
+createKeyPreChecks(volumeName, bucketName, keyName, replicationConfig);
+
+OmKeyArgs.Builder builder = new OmKeyArgs.Builder()
+.setVolumeName(volumeName)
+.setBucketName(bucketName)
+.setKeyName(keyName)
+.setDataSize(size)
+.setReplicationConfig(replicationConfig)
+.addAllMetadataGdpr(metadata)
+.addAllTags(tags)
+.setLatestVersionLocation(getLatestVersionLocation)
+.setExpectedDataGeneration(
+OzoneConsts.EXPECTED_GEN_CREATE_IF_NOT_EXISTS);
+
+OpenKeySession openKey = ozoneManagerClient.openKey(builder.build());
+if (isS3GRequest.get() && size == 0) {
+  openKey.getKeyInfo().setDataSize(0);
+}
+return createOutputStream(openKey);
+  }
+
+  @Override
+  @SuppressWarnings("checkstyle:parameternumber")
+  public OzoneOutputStream rewriteKeyIfMatch(String volumeName,

Review Comment:
   Looks like we can refactor and combine rewriteKey() and rewriteKeyIfMatch().



##
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##
@@ -1435,6 +1435,68 @@ public OzoneOutputStream rewriteKey(String volumeName, 
String bucketName, String
 return createOutputStream(openKey);
   }
 
+  @Override
+  public OzoneOutputStream createKeyIfNotExists(String volumeName,

Review Comment:
   This method looks almost identical to createKey(), except for the added 
expectedDataGeneration field. We should refactor and merge them into a single 
method.
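
A rough sketch of what such a merge could look like, using simplified stand-in types rather than the real `OmKeyArgs` / `RpcClient`. The names `CreateKeySketch`, `KeyArgs`, and `buildArgs` are hypothetical and exist only for illustration; the `-1` marker mirrors the value described in the design doc.

```java
import java.util.OptionalLong;

/** Hypothetical sketch; KeyArgs is a simplified stand-in for OmKeyArgs. */
final class CreateKeySketch {

  /** Create-if-not-exists marker, mirroring the -1 value in the design doc. */
  static final long CREATE_IF_NOT_EXISTS = -1L;

  /** Minimal stand-in for the arguments sent to OM. */
  static final class KeyArgs {
    final String keyName;
    final long size;
    final OptionalLong expectedDataGeneration;

    KeyArgs(String keyName, long size, OptionalLong expectedDataGeneration) {
      this.keyName = keyName;
      this.size = size;
      this.expectedDataGeneration = expectedDataGeneration;
    }
  }

  private CreateKeySketch() { }

  /** Unconditional create: no expected generation is set. */
  static KeyArgs createKey(String keyName, long size) {
    return buildArgs(keyName, size, OptionalLong.empty());
  }

  /** Conditional create: same path, plus the create-if-not-exists marker. */
  static KeyArgs createKeyIfNotExists(String keyName, long size) {
    return buildArgs(keyName, size, OptionalLong.of(CREATE_IF_NOT_EXISTS));
  }

  /** Single shared builder, so the two entry points cannot drift apart. */
  private static KeyArgs buildArgs(String keyName, long size,
      OptionalLong expectedDataGeneration) {
    return new KeyArgs(keyName, size, expectedDataGeneration);
  }
}
```

With this shape, both public methods stay as thin wrappers and any future field only has to be added in one place.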






Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-03-30 Thread via GitHub


Copilot commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r3012836701


##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -1127,7 +1148,48 @@ private CopyObjectResponse copyObject(OzoneVolume volume,
   }
 }
   }
-  
+
+  /**
+   * Opens a key for put, applying conditional write logic based on
+   * If-None-Match and If-Match headers.
+   */
+  @SuppressWarnings("checkstyle:ParameterNumber")
+  private OzoneOutputStream openKeyForPut(String volumeName, String bucketName,
+  OzoneBucket bucket, String keyPath, long length,
+  ReplicationConfig replicationConfig, Map<String, String> customMetadata,
+  Map<String, String> tags, String ifNoneMatch, String ifMatch)
+  throws IOException {
+if (ifNoneMatch != null && "*".equals(ifNoneMatch.trim())) {
+  return getClientProtocol().createKeyIfNotExists(
+  volumeName, bucketName, keyPath, length, replicationConfig,
+  customMetadata, tags);

Review Comment:
   If-None-Match handling only recognizes an unquoted `*` and otherwise 
silently falls back to an unconditional createKey(). Per the design doc, 
clients may send `If-None-Match: "*"`; in that case this code would incorrectly 
allow overwrites. Consider normalizing/parsing the header value (similar to 
parseETag) and rejecting unsupported If-None-Match values instead of ignoring 
them.
   ```suggestion
   if (ifNoneMatch != null) {
 // Normalize the If-None-Match header value, accepting both quoted and
 // unquoted "*" and rejecting any other value instead of silently
 // falling back to an unconditional createKey().
 String normalizedIfNoneMatch = parseETag(ifNoneMatch);
 if ("*".equals(normalizedIfNoneMatch)) {
   return getClientProtocol().createKeyIfNotExists(
   volumeName, bucketName, keyPath, length, replicationConfig,
   customMetadata, tags);
 } else {
   throw new OS3Exception(
   PRECOND_FAILED,
   "Unsupported If-None-Match header value",
   HttpStatus.SC_PRECONDITION_FAILED);
 }
   ```
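
For illustration only, a standalone sketch of the normalization idea from the comment above. `IfNoneMatchSketch`, its `Kind` enum, and `classify` are hypothetical names, not part of the PR or of `S3Utils`. The sketch strips one pair of surrounding quotes and classifies the value instead of silently ignoring it. It also guards the length, since a value consisting of a single `"` would make a plain `substring(1, length - 1)` throw.

```java
/** Hypothetical helper, not part of the PR: classifies an If-None-Match value. */
final class IfNoneMatchSketch {

  /** Outcome of inspecting the header value. */
  enum Kind { ABSENT, CREATE_IF_NOT_EXISTS, UNSUPPORTED }

  private IfNoneMatchSketch() { }

  /** Trims the value and removes one pair of surrounding double quotes. */
  static String stripQuotes(String value) {
    String v = value.trim();
    // Require at least two characters so a lone quote is left untouched.
    if (v.length() >= 2 && v.startsWith("\"") && v.endsWith("\"")) {
      return v.substring(1, v.length() - 1);
    }
    return v;
  }

  /** Accepts both * and "*"; any other non-null value is reported as unsupported. */
  static Kind classify(String headerValue) {
    if (headerValue == null) {
      return Kind.ABSENT;
    }
    return "*".equals(stripQuotes(headerValue))
        ? Kind.CREATE_IF_NOT_EXISTS
        : Kind.UNSUPPORTED;
  }
}
```

A caller could then map `UNSUPPORTED` to an error response and `ABSENT` to the unconditional path, which keeps the decision in one place.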



##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -1127,7 +1148,48 @@ private CopyObjectResponse copyObject(OzoneVolume volume,
   }
 }
   }
-  
+
+  /**
+   * Opens a key for put, applying conditional write logic based on
+   * If-None-Match and If-Match headers.
+   */
+  @SuppressWarnings("checkstyle:ParameterNumber")
+  private OzoneOutputStream openKeyForPut(String volumeName, String bucketName,
+  OzoneBucket bucket, String keyPath, long length,

Review Comment:
   The openKeyForPut(...) helper takes an OzoneBucket parameter but does not 
use it. Removing the unused parameter will simplify the signature and avoid 
confusion about whether bucket-specific info is required for conditional puts.
   ```suggestion
 String keyPath, long length,
   ```



##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -1127,7 +1148,48 @@ private CopyObjectResponse copyObject(OzoneVolume volume,
   }
 }
   }
-  
+
+  /**
+   * Opens a key for put, applying conditional write logic based on
+   * If-None-Match and If-Match headers.
+   */
+  @SuppressWarnings("checkstyle:ParameterNumber")
+  private OzoneOutputStream openKeyForPut(String volumeName, String bucketName,
+  OzoneBucket bucket, String keyPath, long length,
+  ReplicationConfig replicationConfig, Map<String, String> customMetadata,
+  Map<String, String> tags, String ifNoneMatch, String ifMatch)
+  throws IOException {
+if (ifNoneMatch != null && "*".equals(ifNoneMatch.trim())) {
+  return getClientProtocol().createKeyIfNotExists(
+  volumeName, bucketName, keyPath, length, replicationConfig,
+  customMetadata, tags);
+} else if (ifMatch != null) {
+  String expectedETag = parseETag(ifMatch);
+  return getClientProtocol().rewriteKeyIfMatch(
+  volumeName, bucketName, keyPath, length, expectedETag,
+  replicationConfig, customMetadata, tags);
+} else {
+  return getClientProtocol().createKey(
+  volumeName, bucketName, keyPath, length, replicationConfig,
+  customMetadata, tags);
+}
+  }
+
+  /**
+   * Parses an ETag from a conditional header value, removing surrounding
+   * quotes if present.
+   */
+  static String parseETag(String headerValue) {
+if (headerValue == null) {
+  return null;
+}
+String etag = headerValue.trim();
+if (etag.startsWith("\"") && etag.endsWith("\"")) {
+  return etag.substring(1, etag.length() - 1);
+}
+return etag;

Review Comment:
   parseETag duplicates quote-stripping logic that already exists as 
S3Utils.stripQuotes (and is statically imported in this class). Consider 
reusing stripQuotes(headerValue.trim()) to keep ETag n

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-03-27 Thread via GitHub


jojochuang commented on PR #9815:
URL: https://github.com/apache/ozone/pull/9815#issuecomment-4144195448

   Please rebase @peterxcli 





Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-03-27 Thread via GitHub


peterxcli commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r2999725804


##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -187,8 +188,18 @@ public Response put(
 throw newError(S3ErrorTable.NO_SUCH_BUCKET, bucketName, ex);
   } else if (ex.getResult() == ResultCodes.FILE_ALREADY_EXISTS) {
 throw newError(S3ErrorTable.NO_OVERWRITE, keyPath, ex);
+  } else if (ex.getResult() == ResultCodes.KEY_ALREADY_EXISTS) {
+throw newError(PRECOND_FAILED, keyPath, ex);

Review Comment:
   I think the `ex` we use here to create the new error already contains 
the error info from OM.






Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-03-25 Thread via GitHub


peterxcli commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r2986933629


##
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##
@@ -1428,6 +1428,68 @@ public OzoneOutputStream rewriteKey(String volumeName, 
String bucketName, String
 return createOutputStream(openKey);
   }
 
+  @Override
+  public OzoneOutputStream createKeyIfNotExists(String volumeName,
+  String bucketName, String keyName, long size,
+  ReplicationConfig replicationConfig, Map<String, String> metadata,
+  Map<String, String> tags) throws IOException {
+if (omVersion.compareTo(OzoneManagerVersion.ATOMIC_REWRITE_KEY) < 0) {
+  throw new IOException(
+  "OzoneManager does not support atomic key creation.");
+}
+
+createKeyPreChecks(volumeName, bucketName, keyName, replicationConfig);
+
+OmKeyArgs.Builder builder = new OmKeyArgs.Builder()
+.setVolumeName(volumeName)
+.setBucketName(bucketName)
+.setKeyName(keyName)
+.setDataSize(size)
+.setReplicationConfig(replicationConfig)
+.addAllMetadataGdpr(metadata)
+.addAllTags(tags)
+.setLatestVersionLocation(getLatestVersionLocation)
+.setExpectedDataGeneration(
+OzoneConsts.EXPECTED_GEN_CREATE_IF_NOT_EXISTS);
+
+OpenKeySession openKey = ozoneManagerClient.openKey(builder.build());
+if (isS3GRequest.get() && size == 0) {
+  openKey.getKeyInfo().setDataSize(0);
+}
+return createOutputStream(openKey);
+  }
+
+  @Override
+  @SuppressWarnings("checkstyle:parameternumber")
+  public OzoneOutputStream rewriteKeyIfMatch(String volumeName,
+  String bucketName, String keyName, long size, String expectedETag,
+  ReplicationConfig replicationConfig, Map<String, String> metadata,
+  Map<String, String> tags) throws IOException {
+if (omVersion.compareTo(OzoneManagerVersion.ATOMIC_REWRITE_KEY) < 0) {
+  throw new IOException(
+  "OzoneManager does not support conditional key rewrite.");
+}
+
+createKeyPreChecks(volumeName, bucketName, keyName, replicationConfig);
+
+OmKeyArgs.Builder builder = new OmKeyArgs.Builder()
+.setVolumeName(volumeName)
+.setBucketName(bucketName)
+.setKeyName(keyName)
+.setDataSize(size)
+.setReplicationConfig(replicationConfig)
+.addAllMetadataGdpr(metadata)
+.addAllTags(tags)
+.setLatestVersionLocation(getLatestVersionLocation)
+.setExpectedETag(expectedETag);

Review Comment:
   No, please take a look at the design doc:
   
https://github.com/apache/ozone/blob/5e5243eca2a28ba5127be5e10bba97a99adf9d52/hadoop-hdds/docs/content/design/s3-conditional-requests.md?plain=1#L144-L167






Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-03-06 Thread via GitHub


jojochuang commented on code in PR #9815:
URL: https://github.com/apache/ozone/pull/9815#discussion_r2898576271


##
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/exceptions/OMException.java:
##
@@ -267,13 +267,17 @@ public enum ResultCodes {
 UNAUTHORIZED,
 
 S3_SECRET_ALREADY_EXISTS,
-
+

Review Comment:
   Please do not commit unrelated changes.



##
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##
@@ -1428,6 +1428,68 @@ public OzoneOutputStream rewriteKey(String volumeName, 
String bucketName, String
 return createOutputStream(openKey);
   }
 
+  @Override
+  public OzoneOutputStream createKeyIfNotExists(String volumeName,
+  String bucketName, String keyName, long size,
+  ReplicationConfig replicationConfig, Map<String, String> metadata,
+  Map<String, String> tags) throws IOException {
+if (omVersion.compareTo(OzoneManagerVersion.ATOMIC_REWRITE_KEY) < 0) {
+  throw new IOException(
+  "OzoneManager does not support atomic key creation.");
+}
+
+createKeyPreChecks(volumeName, bucketName, keyName, replicationConfig);
+
+OmKeyArgs.Builder builder = new OmKeyArgs.Builder()
+.setVolumeName(volumeName)
+.setBucketName(bucketName)
+.setKeyName(keyName)
+.setDataSize(size)
+.setReplicationConfig(replicationConfig)
+.addAllMetadataGdpr(metadata)
+.addAllTags(tags)
+.setLatestVersionLocation(getLatestVersionLocation)
+.setExpectedDataGeneration(
+OzoneConsts.EXPECTED_GEN_CREATE_IF_NOT_EXISTS);
+
+OpenKeySession openKey = ozoneManagerClient.openKey(builder.build());
+if (isS3GRequest.get() && size == 0) {
+  openKey.getKeyInfo().setDataSize(0);
+}
+return createOutputStream(openKey);
+  }
+
+  @Override
+  @SuppressWarnings("checkstyle:parameternumber")
+  public OzoneOutputStream rewriteKeyIfMatch(String volumeName,
+  String bucketName, String keyName, long size, String expectedETag,
+  ReplicationConfig replicationConfig, Map<String, String> metadata,
+  Map<String, String> tags) throws IOException {
+if (omVersion.compareTo(OzoneManagerVersion.ATOMIC_REWRITE_KEY) < 0) {
+  throw new IOException(
+  "OzoneManager does not support conditional key rewrite.");
+}
+
+createKeyPreChecks(volumeName, bucketName, keyName, replicationConfig);
+
+OmKeyArgs.Builder builder = new OmKeyArgs.Builder()
+.setVolumeName(volumeName)
+.setBucketName(bucketName)
+.setKeyName(keyName)
+.setDataSize(size)
+.setReplicationConfig(replicationConfig)
+.addAllMetadataGdpr(metadata)
+.addAllTags(tags)
+.setLatestVersionLocation(getLatestVersionLocation)
+.setExpectedETag(expectedETag);

Review Comment:
   Does it call setExpectedDataGeneration here?



##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -187,8 +188,18 @@ public Response put(
 throw newError(S3ErrorTable.NO_SUCH_BUCKET, bucketName, ex);
   } else if (ex.getResult() == ResultCodes.FILE_ALREADY_EXISTS) {
 throw newError(S3ErrorTable.NO_OVERWRITE, keyPath, ex);
+  } else if (ex.getResult() == ResultCodes.KEY_ALREADY_EXISTS) {
+throw newError(PRECOND_FAILED, keyPath, ex);

Review Comment:
   Would you consider using different error messages for the different 
cases?
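
One possible shape for this, as a hedged sketch: keep the single 412 status but derive the message from the failure cause. The `Cause` enum and `describe` method are hypothetical. Only `KEY_ALREADY_EXISTS` appears in the diff above; the other two causes are assumptions based on the `If-Match` rules in the design doc.

```java
/** Hypothetical sketch: one 412 status, a distinct message per failure cause. */
final class PreconditionMessageSketch {

  /** Illustrative causes; only KEY_ALREADY_EXISTS appears in the diff. */
  enum Cause { KEY_ALREADY_EXISTS, KEY_NOT_FOUND, ETAG_MISMATCH }

  /** HTTP status shared by all conditional-write failures. */
  static final int SC_PRECONDITION_FAILED = 412;

  private PreconditionMessageSketch() { }

  /** Returns a client-facing message that names the condition that failed. */
  static String describe(Cause cause, String keyPath) {
    switch (cause) {
    case KEY_ALREADY_EXISTS:
      return "If-None-Match: * failed, key already exists: " + keyPath;
    case KEY_NOT_FOUND:
      return "If-Match failed, key does not exist: " + keyPath;
    case ETAG_MISMATCH:
      return "If-Match failed, ETag does not match: " + keyPath;
    default:
      throw new IllegalArgumentException("Unknown cause: " + cause);
    }
  }
}
```

Whether this is needed depends on how much detail the wrapped OM exception already carries.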






Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2026-01-08 Thread via GitHub


github-actions[bot] commented on PR #9334:
URL: https://github.com/apache/ozone/pull/9334#issuecomment-3726444723

   This PR has been marked as stale due to 21 days of inactivity. Please 
comment or remove the stale label to keep it open. Otherwise, it will be 
automatically closed in 7 days.





Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-12-17 Thread via GitHub


ivandika3 commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2629162910


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## Specification
+
+### AWS S3 Conditional Write Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: "*"
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: "<etag>"
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### AWS S3 Conditional Read Specification
+
+TODO
+
+### AWS S3 Conditional Copy Specification
+
+TODO
+
+## Implementation
+
+### AWS S3 Conditional Write Implementation
+
+The implementation aims to minimize Redundant RPCs (RTT) while ensuring strict 
atomicity for conditional operations.
+
+- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability 
([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963 "null")).
+- **If-Match** optimizes the happy path by pushing ETag validation directly 
into the Ozone Manager's write path, avoiding preliminary read operations.
+
+#### If-None-Match Implementation
+
+This implementation ensures strict create-only semantics by utilizing a 
specific generation ID marker.
+
+In `OzoneConsts.java`, add the `-1` as a constant for readability:
+```java
+/**
+ * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" 
semantics.
+ * When used with If-None-Match conditional requests, this ensures atomicity:
+ * if a concurrent write commits between Create and Commit phases, the commit
+ * fails the validation check, preserving strict create-if-not-exists 
semantics.
+ */
+public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L;
+```
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = 
OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. OM receives request with `expectedDataGeneration == 
OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+2. **Pre-check**: If key is already in the OpenKeyTable or KeyTable, throw 
`KEY_ALREADY_EXISTS`.
+3. If not exists, proceed to create the open key entry.
+
+##### OM Commit Phase (Atomicity)
+
+1. During the commit phase (or strict atomic create), the OM validates that 
the key still does not exist.
+2. If a concurrent client created the key between the Create and Commit 
phases, the transaction fails with `KEY_GENERATION_MISMATCH`.
+
+##### Race Condition Handling
+
+Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures 
atomicity. If a concurrent write (Client B) commits between Client A's Create 
and Commit,
+Client A's commit fails the `CREATE IF NOT EXISTS` validation check, 
preserving strict create-if-not-exists 

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-12-17 Thread via GitHub


ivandika3 commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2629162910


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## Specification
+
+### AWS S3 Conditional Write Specification
+
+ If-None-Match Header
+
+```
+If-None-Match: "*"
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+ If-Match Header
+
+```
+If-Match: ""
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+ Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
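
The header rules listed above map onto a small status decision. The sketch below assumes a hypothetical helper class `ConditionalWriteCheck`; it is not the gateway's actual code. Two choices are assumptions rather than something the design states: returning 400 when both headers are present, and refusing `If-None-Match` values other than `*`. The sketch shows only the status mapping and does nothing for atomicity, which the design handles inside the Ozone Manager write path.

```java
/** Hypothetical helper mapping conditional write headers to an HTTP status. */
class ConditionalWriteCheck {
  static final int SC_OK = 200;
  static final int SC_BAD_REQUEST = 400; // assumed status for conflicting headers
  static final int SC_PRECONDITION_FAILED = 412;

  /** Removes one pair of surrounding double quotes, if present. */
  static String stripQuotes(String value) {
    String trimmed = value.trim();
    if (trimmed.length() >= 2 && trimmed.startsWith("\"") && trimmed.endsWith("\"")) {
      return trimmed.substring(1, trimmed.length() - 1);
    }
    return trimmed;
  }

  /**
   * @param ifNoneMatch  value of the If-None-Match header, or null if absent
   * @param ifMatch      value of the If-Match header, or null if absent
   * @param existingETag ETag of the stored object, or null if it does not exist
   */
  static int evaluate(String ifNoneMatch, String ifMatch, String existingETag) {
    if (ifNoneMatch != null && ifMatch != null) {
      return SC_BAD_REQUEST; // both headers together are not allowed
    }
    if (ifNoneMatch != null) {
      if (!"*".equals(stripQuotes(ifNoneMatch))) {
        // Only the "*" form is covered by the write specification above.
        throw new UnsupportedOperationException("only If-None-Match: * is sketched");
      }
      return existingETag == null ? SC_OK : SC_PRECONDITION_FAILED;
    }
    if (ifMatch != null) {
      boolean matches = existingETag != null
          && stripQuotes(ifMatch).equals(stripQuotes(existingETag));
      return matches ? SC_OK : SC_PRECONDITION_FAILED;
    }
    return SC_OK; // unconditional write
  }
}
```

For example, `evaluate("*", null, null)` yields 200 because the object is absent, while `evaluate(null, "\"abc\"", null)` yields 412 because `If-Match` requires an existing object.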
+
+### AWS S3 Conditional Read Specification
+
+TODO
+
+### AWS S3 Conditional Copy Specification
+
+TODO
+
+## Implementation
+
+### AWS S3 Conditional Write Implementation
+
+The implementation aims to minimize Redundant RPCs (RTT) while ensuring strict 
atomicity for conditional operations.
+
+- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability 
([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963 "null")).
+- **If-Match** optimizes the happy path by pushing ETag validation directly 
into the Ozone Manager's write path, avoiding preliminary read operations.
+
+ If-None-Match Implementation
+
+This implementation ensures strict create-only semantics by utilizing a 
specific generation ID marker.
+
+In `OzoneConsts.java`, add the `-1` as a constant for readability:
+```java
+/**
+ * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" 
semantics.
+ * When used with If-None-Match conditional requests, this ensures atomicity:
+ * if a concurrent write commits between Create and Commit phases, the commit
+ * fails the validation check, preserving strict create-if-not-exists 
semantics.
+ */
+public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L;
+```
+
+# S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = 
OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+3. Call `RpcClient.rewriteKey()`.
+
+# OM Create Phase
+
+1. OM receives request with `expectedDataGeneration == 
OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+2. **Pre-check**: If key is already in the OpenKeyTable or KeyTable, throw 
`KEY_ALREADY_EXISTS`.
+3. If not exists, proceed to create the open key entry.
+
+# OM Commit Phase (Atomicity)
+
+1. During the commit phase (or strict atomic create), the OM validates that 
the key still does not exist.
+2. If a concurrent client created the key between the Create and Commit 
phases, the transaction fails with `KEY_GENERATION_MISMATCH`.
+
+# Race Condition Handling
+
+Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures 
atomicity. If a concurrent write (Client B) commits between Client A's Create 
and Commit,
+Client A's commit fails the `CREATE IF NOT EXISTS` validation check, 
preserving strict create-if-not-exists 

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-12-17 Thread via GitHub


sodonnel commented on PR #9334:
URL: https://github.com/apache/ozone/pull/9334#issuecomment-3667108609

   @errose28 and @kerneltime You guys had some interest in a fuller 
implementation of the "conditional write" api when I was doing the atomic 
rewrite change. Would be good to get your thoughts on this design doc too.



Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-12-17 Thread via GitHub


peterxcli commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2627158712


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## Specification
+
+### AWS S3 Conditional Write Specification
+
+ If-None-Match Header
+
+```
+If-None-Match: "*"
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+ If-Match Header
+
+```
+If-Match: ""
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+ Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### AWS S3 Conditional Read Specification
+
+TODO
+
+### AWS S3 Conditional Copy Specification
+
+TODO
+
+## Implementation
+
+### AWS S3 Conditional Write Implementation
+
+The implementation aims to minimize Redundant RPCs (RTT) while ensuring strict 
atomicity for conditional operations.
+
+- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability 
([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963 "null")).
+- **If-Match** optimizes the happy path by pushing ETag validation directly 
into the Ozone Manager's write path, avoiding preliminary read operations.
+
+ If-None-Match Implementation
+
+This implementation ensures strict create-only semantics by utilizing a 
specific generation ID marker.
+
+In `OzoneConsts.java`, add the `-1` as a constant for readability:
+```java
+/**
+ * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" 
semantics.
+ * When used with If-None-Match conditional requests, this ensures atomicity:
+ * if a concurrent write commits between Create and Commit phases, the commit
+ * fails the validation check, preserving strict create-if-not-exists 
semantics.
+ */
+public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L;
+```
+
+# S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = 
OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+3. Call `RpcClient.rewriteKey()`.
+
+# OM Create Phase
+
+1. OM receives request with `expectedDataGeneration == 
OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+2. **Pre-check**: If key is already in the OpenKeyTable or KeyTable, throw 
`KEY_ALREADY_EXISTS`.
+3. If not exists, proceed to create the open key entry.
+
+# OM Commit Phase (Atomicity)
+
+1. During the commit phase (or strict atomic create), the OM validates that 
the key still does not exist.
+2. If a concurrent client created the key between the Create and Commit 
phases, the transaction fails with `KEY_GENERATION_MISMATCH`.
+
+# Race Condition Handling
+
+Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures 
atomicity. If a concurrent write (Client B) commits between Client A's Create 
and Commit,
+Client A's commit fails the `CREATE IF NOT EXISTS` validation check, 
preserving strict create-if-not-exists 

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-12-17 Thread via GitHub


peterxcli commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2627142467


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## Specification
+
+### AWS S3 Conditional Write Specification
+
+ If-None-Match Header
+
+```
+If-None-Match: "*"
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+ If-Match Header
+
+```
+If-Match: ""
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+ Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### AWS S3 Conditional Read Specification
+
+TODO
+
+### AWS S3 Conditional Copy Specification
+
+TODO
+
+## Implementation
+
+### AWS S3 Conditional Write Implementation
+
+The implementation aims to minimize Redundant RPCs (RTT) while ensuring strict 
atomicity for conditional operations.
+
+- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability 
([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963 "null")).
+- **If-Match** optimizes the happy path by pushing ETag validation directly 
into the Ozone Manager's write path, avoiding preliminary read operations.
+
+ If-None-Match Implementation
+
+This implementation ensures strict create-only semantics by utilizing a 
specific generation ID marker.
+
+In `OzoneConsts.java`, add the `-1` as a constant for readability:
+```java
+/**
+ * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" 
semantics.
+ * When used with If-None-Match conditional requests, this ensures atomicity:
+ * if a concurrent write commits between Create and Commit phases, the commit
+ * fails the validation check, preserving strict create-if-not-exists 
semantics.
+ */
+public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L;
+```
+
+# S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = 
OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+3. Call `RpcClient.rewriteKey()`.
+
+# OM Create Phase
+
+1. OM receives request with `expectedDataGeneration == 
OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+2. **Pre-check**: If key is already in the OpenKeyTable or KeyTable, throw 
`KEY_ALREADY_EXISTS`.
+3. If not exists, proceed to create the open key entry.
+
+# OM Commit Phase (Atomicity)
+
+1. During the commit phase (or strict atomic create), the OM validates that 
the key still does not exist.
+2. If a concurrent client created the key between the Create and Commit 
phases, the transaction fails with `KEY_GENERATION_MISMATCH`.
+
+# Race Condition Handling
+
+Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures 
atomicity. If a concurrent write (Client B) commits between Client A's Create 
and Commit,
+Client A's commit fails the `CREATE IF NOT EXISTS` validation check, 
preserving strict create-if-not-exists 

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-12-17 Thread via GitHub


peterxcli commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2627135904


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## Specification
+
+### AWS S3 Conditional Write Specification
+
+ If-None-Match Header
+
+```
+If-None-Match: "*"
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+ If-Match Header
+
+```
+If-Match: ""
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+ Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### AWS S3 Conditional Read Specification
+
+TODO
+
+### AWS S3 Conditional Copy Specification
+
+TODO
+
+## Implementation
+
+### AWS S3 Conditional Write Implementation
+
+The implementation aims to minimize Redundant RPCs (RTT) while ensuring strict 
atomicity for conditional operations.
+
+- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability 
([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963 "null")).
+- **If-Match** optimizes the happy path by pushing ETag validation directly 
into the Ozone Manager's write path, avoiding preliminary read operations.
+
+ If-None-Match Implementation
+
+This implementation ensures strict create-only semantics by utilizing a 
specific generation ID marker.
+
+In `OzoneConsts.java`, add the `-1` as a constant for readability:
+```java
+/**
+ * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" 
semantics.
+ * When used with If-None-Match conditional requests, this ensures atomicity:
+ * if a concurrent write commits between Create and Commit phases, the commit
+ * fails the validation check, preserving strict create-if-not-exists 
semantics.
+ */
+public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L;
+```
+
+# S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = 
OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+3. Call `RpcClient.rewriteKey()`.
+
+# OM Create Phase
+
+1. OM receives request with `expectedDataGeneration == 
OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+2. **Pre-check**: If key is already in the OpenKeyTable or KeyTable, throw 
`KEY_ALREADY_EXISTS`.
+3. If not exists, proceed to create the open key entry.
+
+# OM Commit Phase (Atomicity)
+
+1. During the commit phase (or strict atomic create), the OM validates that 
the key still does not exist.
+2. If a concurrent client created the key between the Create and Commit 
phases, the transaction fails with `KEY_GENERATION_MISMATCH`.
+
+# Race Condition Handling
+
+Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures 
atomicity. If a concurrent write (Client B) commits between Client A's Create 
and Commit,
+Client A's commit fails the `CREATE IF NOT EXISTS` validation check, 
preserving strict create-if-not-exists 

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-12-17 Thread via GitHub


peterxcli commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2627120613


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## Specification
+
+### AWS S3 Conditional Write Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: *
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: "<etag>"
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
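The header rules listed above can be sketched as a small parser. This is an illustrative sketch only, not code from the PR: `ConditionalWriteHeaders`, `Condition`, and `Kind` are hypothetical names, and signalling invalid input with `IllegalArgumentException` is an assumption (a real gateway would translate it into an S3 error response).

```java
/** Hypothetical helper (not part of Ozone): maps PutObject conditional headers to a write condition. */
final class ConditionalWriteHeaders {

  /** The three kinds of write a PutObject request can ask for. */
  enum Kind { UNCONDITIONAL, CREATE_IF_NOT_EXISTS, MATCH_ETAG }

  /** Parsed condition; expectedETag is non-null only for MATCH_ETAG. */
  static final class Condition {
    final Kind kind;
    final String expectedETag;

    Condition(Kind kind, String expectedETag) {
      this.kind = kind;
      this.expectedETag = expectedETag;
    }
  }

  /** Either argument may be null when the header is absent. */
  static Condition parse(String ifNoneMatch, String ifMatch) {
    boolean hasNoneMatch = ifNoneMatch != null && !ifNoneMatch.trim().isEmpty();
    boolean hasMatch = ifMatch != null && !ifMatch.trim().isEmpty();
    if (hasNoneMatch && hasMatch) {
      // Restriction from the list above: the two headers are mutually exclusive.
      throw new IllegalArgumentException("If-Match and If-None-Match cannot be used together");
    }
    if (hasNoneMatch) {
      // Assumption: only the wildcard form is accepted for conditional writes.
      if (!"*".equals(stripQuotes(ifNoneMatch))) {
        throw new IllegalArgumentException("If-None-Match must be * for conditional writes");
      }
      return new Condition(Kind.CREATE_IF_NOT_EXISTS, null);
    }
    if (hasMatch) {
      return new Condition(Kind.MATCH_ETAG, stripQuotes(ifMatch));
    }
    return new Condition(Kind.UNCONDITIONAL, null);
  }

  /** Removes one pair of surrounding double quotes, if present. */
  private static String stripQuotes(String value) {
    String v = value.trim();
    if (v.length() >= 2 && v.startsWith("\"") && v.endsWith("\"")) {
      return v.substring(1, v.length() - 1);
    }
    return v;
  }

  public static void main(String[] args) {
    System.out.println(parse("*", null).kind);                  // CREATE_IF_NOT_EXISTS
    System.out.println(parse(null, "\"abc123\"").expectedETag); // abc123
  }
}
```

Keeping the parsing separate from the write path means the gateway can reject a malformed combination before it issues any RPC to the OM.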
+
+### AWS S3 Conditional Read Specification
+
+TODO
+
+### AWS S3 Conditional Copy Specification
+
+TODO
+
+## Implementation
+
+### AWS S3 Conditional Write Implementation
+
+The implementation aims to minimize redundant RPC round trips (RTT) while ensuring strict atomicity for conditional operations.
+
+- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability ([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963)).
+- **If-Match** optimizes the happy path by pushing ETag validation directly into the Ozone Manager's write path, avoiding preliminary read operations.
+
+#### If-None-Match Implementation
+
+This implementation ensures strict create-only semantics by utilizing a specific generation ID marker.
+
+In `OzoneConsts.java`, add `-1` as a named constant for readability:
+```java
+/**
+ * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" 
semantics.
+ * When used with If-None-Match conditional requests, this ensures atomicity:
+ * if a concurrent write commits between Create and Commit phases, the commit
+ * fails the validation check, preserving strict create-if-not-exists 
semantics.
+ */
+public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L;
+```
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. OM receives request with `expectedDataGeneration == OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+2. **Pre-check**: If the key is already in the OpenKeyTable or KeyTable, throw `KEY_ALREADY_EXISTS`.
+3. If the key does not exist, proceed to create the open key entry.
+
+##### OM Commit Phase (Atomicity)
+
+1. During the commit phase (or strict atomic create), the OM validates that the key still does not exist.
+2. If a concurrent client created the key between the Create and Commit phases, the transaction fails with `KEY_GENERATION_MISMATCH`.
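The create and commit phases above can be illustrated with a toy in-memory model. This is a sketch under stated assumptions, not Ozone code: `ToyKeyStore`, `open`, `commit`, and `get` are hypothetical names, only the committed key table is checked (the open-key-table pre-check is left out for brevity), and `synchronized` stands in for the single applier thread mentioned under Current State.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (hypothetical, not Ozone code) of a two-phase write with a create-if-not-exists sentinel. */
final class ToyKeyStore {
  /** Mirrors the -1 sentinel described in the design. */
  static final long CREATE_IF_NOT_EXISTS = -1L;

  /** An open (uncommitted) write; expectedGeneration is null for unconditional writes. */
  private static final class OpenKey {
    final String key;
    final Long expectedGeneration;

    OpenKey(String key, Long expectedGeneration) {
      this.key = key;
      this.expectedGeneration = expectedGeneration;
    }
  }

  private final Map<String, String> keyTable = new HashMap<>(); // committed keys
  private final Map<Long, OpenKey> openKeys = new HashMap<>();  // session id -> open write
  private long nextSessionId = 1L;

  /** Create phase: fail fast if a create-only write targets an existing key. */
  synchronized long open(String key, Long expectedGeneration) {
    if (isCreateOnly(expectedGeneration) && keyTable.containsKey(key)) {
      throw new IllegalStateException("key already exists: " + key);
    }
    long sessionId = nextSessionId++;
    openKeys.put(sessionId, new OpenKey(key, expectedGeneration));
    return sessionId;
  }

  /** Commit phase: re-validate, because another writer may have committed in between. */
  synchronized void commit(long sessionId, String data) {
    OpenKey open = openKeys.remove(sessionId);
    if (open == null) {
      throw new IllegalArgumentException("unknown session: " + sessionId);
    }
    if (isCreateOnly(open.expectedGeneration) && keyTable.containsKey(open.key)) {
      throw new IllegalStateException("commit rejected, key was created concurrently: " + open.key);
    }
    keyTable.put(open.key, data);
  }

  synchronized String get(String key) {
    return keyTable.get(key);
  }

  private static boolean isCreateOnly(Long expectedGeneration) {
    return expectedGeneration != null && expectedGeneration == CREATE_IF_NOT_EXISTS;
  }

  public static void main(String[] args) {
    ToyKeyStore store = new ToyKeyStore();
    long a = store.open("k", CREATE_IF_NOT_EXISTS); // Client A: create-only
    long b = store.open("k", null);                 // Client B: unconditional
    store.commit(b, "from-B");                      // B commits first
    try {
      store.commit(a, "from-A");                    // A must now fail
    } catch (IllegalStateException e) {
      System.out.println("A rejected");
    }
    System.out.println(store.get("k"));             // from-B
  }
}
```

The check at commit time is what closes the window between the two phases. The check at open time only avoids wasted work in the common case.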
+
+##### Race Condition Handling
+
+Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures 
atomicity. If a concurrent write (Client B) commits between Client A's Create 
and Commit,
+Client A's commit fails the `CREATE IF NOT EXISTS` validation check, 
preserving strict create-if-not-exists 

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-12-17 Thread via GitHub


sodonnel commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2626769568


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## Specification
+
+### AWS S3 Conditional Write Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: *
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: "<etag>"
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### AWS S3 Conditional Read Specification
+
+TODO
+
+### AWS S3 Conditional Copy Specification
+
+TODO
+
+## Implementation
+
+### AWS S3 Conditional Write Implementation
+
+The implementation aims to minimize redundant RPC round trips (RTT) while ensuring strict atomicity for conditional operations.
+
+- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability ([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963)).
+- **If-Match** optimizes the happy path by pushing ETag validation directly into the Ozone Manager's write path, avoiding preliminary read operations.
+
+#### If-None-Match Implementation
+
+This implementation ensures strict create-only semantics by utilizing a specific generation ID marker.
+
+In `OzoneConsts.java`, add `-1` as a named constant for readability:
+```java
+/**
+ * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" 
semantics.
+ * When used with If-None-Match conditional requests, this ensures atomicity:
+ * if a concurrent write commits between Create and Commit phases, the commit
+ * fails the validation check, preserving strict create-if-not-exists 
semantics.
+ */
+public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L;
+```
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. OM receives request with `expectedDataGeneration == OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+2. **Pre-check**: If the key is already in the OpenKeyTable or KeyTable, throw `KEY_ALREADY_EXISTS`.
+3. If the key does not exist, proceed to create the open key entry.
+
+##### OM Commit Phase (Atomicity)
+
+1. During the commit phase (or strict atomic create), the OM validates that the key still does not exist.
+2. If a concurrent client created the key between the Create and Commit phases, the transaction fails with `KEY_GENERATION_MISMATCH`.
+
+##### Race Condition Handling
+
+Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures 
atomicity. If a concurrent write (Client B) commits between Client A's Create 
and Commit,
+Client A's commit fails the `CREATE IF NOT EXISTS` validation check, 
preserving strict create-if-not-exists s

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-12-17 Thread via GitHub


sodonnel commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2626759201


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## Specification
+
+### AWS S3 Conditional Write Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: *
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: "<etag>"
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### AWS S3 Conditional Read Specification
+
+TODO
+
+### AWS S3 Conditional Copy Specification
+
+TODO
+
+## Implementation
+
+### AWS S3 Conditional Write Implementation
+
+The implementation aims to minimize redundant RPC round trips (RTT) while ensuring strict atomicity for conditional operations.
+
+- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability ([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963)).
+- **If-Match** optimizes the happy path by pushing ETag validation directly into the Ozone Manager's write path, avoiding preliminary read operations.
+
+#### If-None-Match Implementation
+
+This implementation ensures strict create-only semantics by utilizing a specific generation ID marker.
+
+In `OzoneConsts.java`, add `-1` as a named constant for readability:
+```java
+/**
+ * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" 
semantics.
+ * When used with If-None-Match conditional requests, this ensures atomicity:
+ * if a concurrent write commits between Create and Commit phases, the commit
+ * fails the validation check, preserving strict create-if-not-exists 
semantics.
+ */
+public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L;
+```
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. OM receives request with `expectedDataGeneration == OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+2. **Pre-check**: If the key is already in the OpenKeyTable or KeyTable, throw `KEY_ALREADY_EXISTS`.
+3. If the key does not exist, proceed to create the open key entry.
+
+##### OM Commit Phase (Atomicity)
+
+1. During the commit phase (or strict atomic create), the OM validates that the key still does not exist.
+2. If a concurrent client created the key between the Create and Commit phases, the transaction fails with `KEY_GENERATION_MISMATCH`.
+
+##### Race Condition Handling
+
+Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures 
atomicity. If a concurrent write (Client B) commits between Client A's Create 
and Commit,
+Client A's commit fails the `CREATE IF NOT EXISTS` validation check, 
preserving strict create-if-not-exists s

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-12-16 Thread via GitHub


peterxcli commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2625676245


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## Specification
+
+### AWS S3 Conditional Write Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: *
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: "<etag>"
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### AWS S3 Conditional Read Specification
+
+TODO
+
+### AWS S3 Conditional Copy Specification
+
+TODO
+
+## Implementation
+
+### AWS S3 Conditional Write Implementation
+
+The implementation aims to minimize redundant RPC round trips (RTT) while ensuring strict atomicity for conditional operations.
+
+- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability ([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963)).
+- **If-Match** optimizes the happy path by pushing ETag validation directly into the Ozone Manager's write path, avoiding preliminary read operations.
+
+#### If-None-Match Implementation
+
+This implementation ensures strict create-only semantics by utilizing a specific generation ID marker.
+
+In `OzoneConsts.java`, add `-1` as a named constant for readability:
+```java
+/**
+ * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" 
semantics.
+ * When used with If-None-Match conditional requests, this ensures atomicity:
+ * if a concurrent write commits between Create and Commit phases, the commit
+ * fails the validation check, preserving strict create-if-not-exists 
semantics.
+ */
+public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L;
+```
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. OM receives request with `expectedDataGeneration == OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+2. **Pre-check**: If the key is already in the OpenKeyTable or KeyTable, throw `KEY_ALREADY_EXISTS`.
+3. If the key does not exist, proceed to create the open key entry.
+
+##### OM Commit Phase (Atomicity)
+
+1. During the commit phase (or strict atomic create), the OM validates that the key still does not exist.
+2. If a concurrent client created the key between the Create and Commit phases, the transaction fails with `KEY_GENERATION_MISMATCH`.
+
+##### Race Condition Handling
+
+Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures 
atomicity. If a concurrent write (Client B) commits between Client A's Create 
and Commit,
+Client A's commit fails the `CREATE IF NOT EXISTS` validation check, 
preserving strict create-if-not-exists 

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-12-15 Thread via GitHub


peterxcli commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2621708465


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## Specification
+
+### AWS S3 Conditional Write Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: *
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: "<etag>"
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### AWS S3 Conditional Read Specification
+
+TODO
+
+### AWS S3 Conditional Copy Specification
+
+TODO
+
+## Implementation
+
+### AWS S3 Conditional Write Implementation
+
+The implementation aims to minimize redundant RPC round trips (RTT) while ensuring strict atomicity for conditional operations.
+
+- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability ([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963)).
+- **If-Match** optimizes the happy path by pushing ETag validation directly into the Ozone Manager's write path, avoiding preliminary read operations.
+
+#### If-None-Match Implementation
+
+This implementation ensures strict create-only semantics by utilizing a specific generation ID marker.
+
+In `OzoneConsts.java`, add `-1` as a named constant for readability:
+```java
+/**
+ * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" 
semantics.
+ * When used with If-None-Match conditional requests, this ensures atomicity:
+ * if a concurrent write commits between Create and Commit phases, the commit
+ * fails the validation check, preserving strict create-if-not-exists 
semantics.
+ */
+public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L;
+```
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. OM receives request with `expectedDataGeneration == OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+2. **Pre-check**: If the key is already in the OpenKeyTable or KeyTable, throw `KEY_ALREADY_EXISTS`.
+3. If the key does not exist, proceed to create the open key entry.
+
+##### OM Commit Phase (Atomicity)
+
+1. During the commit phase (or strict atomic create), the OM validates that the key still does not exist.
+2. If a concurrent client created the key between the Create and Commit phases, the transaction fails with `KEY_GENERATION_MISMATCH`.
+
+##### Race Condition Handling
+
+Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures 
atomicity. If a concurrent write (Client B) commits between Client A's Create 
and Commit,
+Client A's commit fails the `CREATE IF NOT EXISTS` validation check, 
preserving strict create-if-not-exists 

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-12-15 Thread via GitHub


ivandika3 commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2621425881


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## Specification
+
+### AWS S3 Conditional Write Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: *
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: "<etag>"
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### AWS S3 Conditional Read Specification
+
+TODO
+
+### AWS S3 Conditional Copy Specification
+
+TODO
+
+## Implementation
+
+### AWS S3 Conditional Write Implementation
+
+The implementation aims to minimize redundant RPC round trips (RTT) while ensuring strict atomicity for conditional operations.
+
+- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability ([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963)).
+- **If-Match** optimizes the happy path by pushing ETag validation directly into the Ozone Manager's write path, avoiding preliminary read operations.
+
+#### If-None-Match Implementation
+
+This implementation ensures strict create-only semantics by utilizing a specific generation ID marker.
+
+In `OzoneConsts.java`, add `-1` as a named constant for readability:
+```java
+/**
+ * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" 
semantics.
+ * When used with If-None-Match conditional requests, this ensures atomicity:
+ * if a concurrent write commits between Create and Commit phases, the commit
+ * fails the validation check, preserving strict create-if-not-exists 
semantics.
+ */
+public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L;
+```
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = 
OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. OM receives request with `expectedDataGeneration == 
OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+2. **Pre-check**: If key is already in the OpenKeyTable or KeyTable, throw 
`KEY_ALREADY_EXISTS`.
+3. If not exists, proceed to create the open key entry.
+
+##### OM Commit Phase (Atomicity)
+
+1. During the commit phase (or strict atomic create), the OM validates that 
the key still does not exist.
+2. If a concurrent client created the key between the Create and Commit 
phases, the transaction fails with `KEY_GENERATION_MISMATCH`.
+
+##### Race Condition Handling
+
+Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures 
atomicity. If a concurrent write (Client B) commits between Client A's Create 
and Commit,
+Client A's commit fails the `CREATE IF NOT EXISTS` validation check, 
preserving strict create-if-not-exists 

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-12-15 Thread via GitHub


ivandika3 commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2621425881


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,194 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## Specification
+
+### AWS S3 Conditional Write Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: "*"
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: ""
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### AWS S3 Conditional Read Specification
+
+TODO
+
+### AWS S3 Conditional Copy Specification
+
+TODO
+
+## Implementation
+
+### AWS S3 Conditional Write Implementation
+
+The implementation aims to minimize Redundant RPCs (RTT) while ensuring strict 
atomicity for conditional operations.
+
+- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability 
([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963 "null")).
+- **If-Match** optimizes the happy path by pushing ETag validation directly 
into the Ozone Manager's write path, avoiding preliminary read operations.
+
+#### If-None-Match Implementation
+
+This implementation ensures strict create-only semantics by utilizing a 
specific generation ID marker.
+
+In `OzoneConsts.java`, add the `-1` as a constant for readability:
+```java
+/**
+ * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" 
semantics.
+ * When used with If-None-Match conditional requests, this ensures atomicity:
+ * if a concurrent write commits between Create and Commit phases, the commit
+ * fails the validation check, preserving strict create-if-not-exists 
semantics.
+ */
+public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L;
+```
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = 
OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. OM receives request with `expectedDataGeneration == 
OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+2. **Pre-check**: If key is already in the OpenKeyTable or KeyTable, throw 
`KEY_ALREADY_EXISTS`.
+3. If not exists, proceed to create the open key entry.
+
+##### OM Commit Phase (Atomicity)
+
+1. During the commit phase (or strict atomic create), the OM validates that 
the key still does not exist.
+2. If a concurrent client created the key between the Create and Commit 
phases, the transaction fails with `KEY_GENERATION_MISMATCH`.
+
+##### Race Condition Handling
+
+Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures 
atomicity. If a concurrent write (Client B) commits between Client A's Create 
and Commit,
+Client A's commit fails the `CREATE IF NOT EXISTS` validation check, 
preserving strict create-if-not-exists 

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-12-13 Thread via GitHub


peterxcli commented on PR #9334:
URL: https://github.com/apache/ozone/pull/9334#issuecomment-3650374609

   > Thanks for iterating for this, the direction is good. Should we separate 
the design docs and the actual implementations?
   
   Sounds good. I’ll first revert the code changes, then we can merge this 
design doc first.
   
   > Let's be more permissive and not fail the precondition if ETag metadata 
does not exist
   
   Agreed. I will update the doc accordingly.
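
A minimal sketch of the permissive check agreed on above, assuming the key itself has already been found and only its ETag metadata may be missing. `IfMatchCheck` and its methods are hypothetical names used for illustration, not Ozone APIs:

```java
/** Hypothetical helper for illustration; not part of the Ozone code base. */
final class IfMatchCheck {
  private IfMatchCheck() { }

  /** Strips one pair of surrounding double quotes, since ETags are quoted in HTTP headers. */
  static String unquote(String etag) {
    if (etag.length() >= 2 && etag.startsWith("\"") && etag.endsWith("\"")) {
      return etag.substring(1, etag.length() - 1);
    }
    return etag;
  }

  /**
   * @param storedETag  ETag recorded in the key metadata, or null if the key has none
   * @param headerValue raw value of the If-Match header
   * @return true if the conditional write may proceed
   */
  static boolean isSatisfied(String storedETag, String headerValue) {
    if (storedETag == null) {
      // Permissive choice (assumption from the thread): a key without
      // ETag metadata does not fail the precondition.
      return true;
    }
    return unquote(storedETag).equals(unquote(headerValue));
  }

  public static void main(String[] args) {
    System.out.println(isSatisfied(null, "\"abc\""));  // true: no ETag metadata
    System.out.println(isSatisfied("abc", "\"abc\"")); // true: ETag matches
    System.out.println(isSatisfied("abc", "\"def\"")); // false: ETag differs
  }
}
```

Whether a missing key should be handled before this check, and with which status code, is left out of the sketch on purpose.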





Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-12-13 Thread via GitHub


ivandika3 commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2616667929


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,180 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## Specification
+
+### AWS S3 Conditional Write Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: "*"
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: ""
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### AWS S3 Conditional Read Specification
+
+TODO
+
+### AWS S3 Conditional Copy Specification
+
+TODO
+
+## Implementation
+
+### AWS S3 Conditional Write Implementation
+
+The implementation aims to minimize Redundant RPCs (RTT) while ensuring strict 
atomicity for conditional operations.
+
+- **If-None-Match** utilizes the atomic "Create-If-Not-Exists" capability 
([HDDS-13963](https://issues.apache.org/jira/browse/HDDS-13963 "null")).
+- **If-Match** optimizes the happy path by pushing ETag validation directly 
into the Ozone Manager's write path, avoiding preliminary read operations.
+
+#### If-None-Match Implementation
+
+This implementation ensures strict create-only semantics by utilizing a 
specific generation ID marker.
+
+In `OzoneConsts.java`, add the `-1` as a constant for readability:
+```java
+/**
+ * Special value for expectedDataGeneration to indicate "Create-If-Not-Exists" 
semantics.
+ * When used with If-None-Match conditional requests, this ensures atomicity:
+ * if a concurrent write commits between Create and Commit phases, the commit
+ * fails the validation check, preserving strict create-if-not-exists 
semantics.
+ */
+public static final long EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1L;
+```
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = 
OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. OM receives request with `expectedDataGeneration == 
OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS`.
+2. **Pre-check**: If key is already in the OpenKeyTable or KeyTable, throw 
`KEY_ALREADY_EXISTS`.
+3. If not exists, proceed to create the open key entry.
+
+##### OM Commit Phase (Atomicity)
+
+1. During the commit phase (or strict atomic create), the OM validates that 
the key still does not exist.
+2. If a concurrent client created the key between the Create and Commit 
phases, the transaction fails with `KEY_ALREADY_EXISTS`.
+
+##### Race Condition Handling
+
+Using `OzoneConsts.EXPECTED_DATA_GENERATION_CREATE_IF_NOT_EXISTS = -1` ensures 
atomicity. If a concurrent write (Client B) commits between Client A's Create 
and Commit,
+Client A's commit fails the `CREATE IF NOT EXISTS` validation check, 
preserving strict create-if-not-exists semanti

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-11-30 Thread via GitHub


peterxcli commented on PR #9334:
URL: https://github.com/apache/ozone/pull/9334#issuecomment-3592409850

   @ivandika3 @chungen0126 I’ve refined the design—please take another look.
   
   ---
   Regarding the TODO: I plan to evolve the design and code together across 
patches:
   1) Initial patch: introduce the design, fully detail “conditional write,” 
and outline high-level approaches for get/copy.
   2) Conditional get: complete the remaining design details for conditional 
get and include the corresponding code changes.
   3) Conditional copy: complete the remaining design details for conditional 
copy and include the corresponding code changes.
   
   Let me know if this workflow sounds feasible.





Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-11-26 Thread via GitHub


chungen0126 commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2564347509


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,149 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## AWS S3 Conditional Write
+
+### Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: "*"
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: ""
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### Implementation
+
+#### Architecture Overview
+
+#### If-None-Match Implementation
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = -1`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. Validate `expectedDataGeneration == -1`.
+2. If key exists → throw `KEY_ALREADY_EXISTS`.
+3. Store `-1` in open key metadata.
+
+##### OM Commit Phase
+
+1. Check `expectedDataGeneration == -1` from open key.
+2. If key now exists (race condition) → throw `KEY_ALREADY_EXISTS`.
+3. Commit key.
+
+##### Race Condition Handling
+
+Using `-1` ensures atomicity. If a concurrent write (Client B) commits between 
Client A's Create and Commit, Client A's commit fails the `-1` validation check 
(key now exists), preserving strict create-if-not-exists semantics.
+
+#### If-Match Implementation
+
+Leverages existing `expectedDataGeneration` from HDDS-10656:
+
+##### S3 Gateway Layer
+
+1. Parse `If-Match: ""` header
+2. Look up existing key via `getS3KeyDetails()`
+3. Validate ETag matches, else throw `PRECOND_FAILED` (412)
+4. Extract `expectedGeneration` from existing key
+5. Pass `expectedGeneration` to RpcClient

Review Comment:
   > preExecute can be called in parallel (in multiple OM handler threads), so 
we should verify the ETag in validateAndUpdateCache instead to ensure 
atomicity (i.e. if there are two identical "If-Match" requests with the same 
ETag, only one will succeed). Note that permission check was put to preExecute 
for performance reasons and the community discussed that consistency tradeoff 
is acceptable.
   
   I see. Initially, I added the pre-check to `preExecute` specifically to 
handle `S3 412 Precondition Failed` scenarios explicitly. 
   
   However, I realized that this leads to redundant read operations. I agree 
with your suggestion to consolidate all the verification logic into 
`validateAndUpdateCache()`.
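
To illustrate the atomicity argument above, here is a toy in-memory model, not Ozone code. `ToyKeyTable` and `putIfMatch` are hypothetical names; the `synchronized` method stands in for the single-threaded apply step in which `validateAndUpdateCache()` runs, so the ETag comparison and the update cannot be interleaved with another writer:

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for the OM key table; every name here is hypothetical. */
final class ToyKeyTable {
  private final Map<String, String> etagByKey = new HashMap<>();

  /** Unconditional write, used only to seed the initial state. */
  synchronized void put(String key, String etag) {
    etagByKey.put(key, etag);
  }

  /**
   * Compare-and-swap on the stored ETag. The check and the update happen
   * under one lock, mirroring a check done in the serialized apply step
   * rather than in a phase that may run on several handler threads.
   */
  synchronized boolean putIfMatch(String key, String expectedETag, String newETag) {
    String current = etagByKey.get(key);
    if (current == null || !current.equals(expectedETag)) {
      return false; // would surface to the S3 client as 412 Precondition Failed
    }
    etagByKey.put(key, newETag);
    return true;
  }

  public static void main(String[] args) {
    ToyKeyTable table = new ToyKeyTable();
    table.put("k", "v1");
    // Two identical If-Match requests carrying the same ETag "v1":
    System.out.println(table.putIfMatch("k", "v1", "v2")); // true
    System.out.println(table.putIfMatch("k", "v1", "v3")); // false
  }
}
```

The first request changes the stored ETag, so the second identical request fails the comparison. If the comparison ran earlier, in a step that can execute in parallel, both requests could observe `v1` before either update was applied.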




Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-11-26 Thread via GitHub


ivandika3 commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2564207037


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,149 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## AWS S3 Conditional Write
+
+### Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: "*"
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: ""
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### Implementation
+
+#### Architecture Overview
+
+#### If-None-Match Implementation
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = -1`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. Validate `expectedDataGeneration == -1`.
+2. If key exists → throw `KEY_ALREADY_EXISTS`.
+3. Store `-1` in open key metadata.
+
+##### OM Commit Phase
+
+1. Check `expectedDataGeneration == -1` from open key.
+2. If key now exists (race condition) → throw `KEY_ALREADY_EXISTS`.
+3. Commit key.
+
+##### Race Condition Handling
+
+Using `-1` ensures atomicity. If a concurrent write (Client B) commits between 
Client A's Create and Commit, Client A's commit fails the `-1` validation check 
(key now exists), preserving strict create-if-not-exists semantics.
+
+#### If-Match Implementation
+
+Leverages existing `expectedDataGeneration` from HDDS-10656:
+
+##### S3 Gateway Layer
+
+1. Parse `If-Match: ""` header
+2. Look up existing key via `getS3KeyDetails()`
+3. Validate ETag matches, else throw `PRECOND_FAILED` (412)
+4. Extract `expectedGeneration` from existing key
+5. Pass `expectedGeneration` to RpcClient

Review Comment:
   > Note that verifying ETag during the preexecute phase does not increase the 
overhead of writing to the Raft log, so we don't need to worry about that.
   
   `preExecute` can be called in parallel (in multiple OM handler threads), so we 
should verify the ETag in `validateAndUpdateCache` instead to ensure 
atomicity (i.e. if there are two identical "If-Match" requests with the same 
ETag, only one will succeed). Note that the permission check was put in 
`preExecute` for performance reasons, and the community discussed and accepted 
that consistency tradeoff.
   
   > Agreed that we need to reduce the RTT for If-Match requests. My original 
thinking was that I wanted to avoid "concepts of S3" appearing in Ozone Manager, 
but it seems there are already a lot of them, so I think it's OK to do so, and 
the performance would be better.
   
   Yes, we already have multipart uploads and `OmKeyInfo.tags`, which are used 
only for the S3 use case.
   
   > So If-Match requests don't need the atomic key rewrite anymore. But how 
about we keep the if-none-match request on the atomic rewrite with the extended 
"CREATE IF NOT EXIST" capability, which will be added in 
https://github.

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-11-26 Thread via GitHub


ivandika3 commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2564207037


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,149 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## AWS S3 Conditional Write
+
+### Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: "*"
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: ""
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### Implementation
+
+#### Architecture Overview
+
+#### If-None-Match Implementation
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = -1`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. Validate `expectedDataGeneration == -1`.
+2. If key exists → throw `KEY_ALREADY_EXISTS`.
+3. Store `-1` in open key metadata.
+
+##### OM Commit Phase
+
+1. Check `expectedDataGeneration == -1` from open key.
+2. If key now exists (race condition) → throw `KEY_ALREADY_EXISTS`.
+3. Commit key.
+
+##### Race Condition Handling
+
+Using `-1` ensures atomicity. If a concurrent write (Client B) commits between 
Client A's Create and Commit, Client A's commit fails the `-1` validation check 
(key now exists), preserving strict create-if-not-exists semantics.
+
+#### If-Match Implementation
+
+Leverages existing `expectedDataGeneration` from HDDS-10656:
+
+##### S3 Gateway Layer
+
+1. Parse `If-Match: ""` header
+2. Look up existing key via `getS3KeyDetails()`
+3. Validate ETag matches, else throw `PRECOND_FAILED` (412)
+4. Extract `expectedGeneration` from existing key
+5. Pass `expectedGeneration` to RpcClient

Review Comment:
   > Note that verifying ETag during the preexecute phase does not increase the 
overhead of writing to the Raft log, so we don't need to worry about that.
   
   preExecute can be called in parallel (in multiple OM handler threads), so we 
should verify the ETag in `validateAndUpdateCache` instead to ensure 
atomicity. Note that the permission check was put in preExecute for performance 
reasons, and the community discussed it and considered the consistency tradeoff acceptable.
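
To make the atomicity point concrete, here is a minimal sketch, not Ozone code. `ETagCasSketch`, `seed`, `commitIfMatch`, and `etagOf` are hypothetical names, and a `synchronized` method stands in for the serialized `validateAndUpdateCache` step. The comparison and the update share one critical section, so two writers holding the same stale ETag cannot both succeed.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the OM key table; not Ozone's real classes. */
final class ETagCasSketch {
  private final Map<String, String> etagByKey = new HashMap<>();

  /** Seeds a key, standing in for an earlier unconditional PutObject. */
  synchronized void seed(String key, String etag) {
    etagByKey.put(key, etag);
  }

  /**
   * Compares and updates in one critical section, analogous to doing the
   * check in validateAndUpdateCache rather than in preExecute.
   * Returns true on success, false where S3 would answer 412.
   */
  synchronized boolean commitIfMatch(String key, String expectedETag,
      String newETag) {
    String current = etagByKey.get(key);
    if (current == null || !current.equals(expectedETag)) {
      return false; // key missing, or ETag changed since the client read it
    }
    etagByKey.put(key, newETag);
    return true;
  }

  synchronized String etagOf(String key) {
    return etagByKey.get(key);
  }
}
```

With a key seeded at ETag `v1`, the first `commitIfMatch("k", "v1", "v2")` succeeds and a second call that still passes `v1` is rejected. A check done earlier, outside the critical section, could let both through.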
   
   > I agree that we need to reduce the RTT for If-Match requests. My original 
thinking was to keep S3 concepts out of Ozone Manager, but there seem to be a 
lot of them already, so I think it's OK to do so, and the performance would be 
better.
   
   Yes, we already have multipart uploads and `OmKeyInfo.tags`, which are used 
only for the S3 use case.
   
   > So If-Match requests don't need the atomic key rewrite anymore. But how 
about we keep the if-none-match request on the atomic rewrite with the extended 
"CREATE IF NOT EXIST" capability, which will be added in 
https://github.com/apache/ozone/pull/9332
   
   Yes, I'm OK with reusing atomic rewrite for "if-none-match" so w

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-11-26 Thread via GitHub


ivandika3 commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2564207037


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,149 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## AWS S3 Conditional Write
+
+### Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: "*"
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: ""
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### Implementation
+
+#### Architecture Overview
+
+#### If-None-Match Implementation
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = -1`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. Validate `expectedDataGeneration == -1`.
+2. If key exists → throw `KEY_ALREADY_EXISTS`.
+3. Store `-1` in open key metadata.
+
+##### OM Commit Phase
+
+1. Check `expectedDataGeneration == -1` from open key.
+2. If key now exists (race condition) → throw `KEY_ALREADY_EXISTS`.
+3. Commit key.
+
+##### Race Condition Handling
+
+Using `-1` ensures atomicity. If a concurrent write (Client B) commits between 
Client A's Create and Commit, Client A's commit fails the `-1` validation check 
(key now exists), preserving strict create-if-not-exists semantics.
+
+#### If-Match Implementation
+
+Leverages existing `expectedDataGeneration` from HDDS-10656:
+
+##### S3 Gateway Layer
+
+1. Parse `If-Match: ""` header
+2. Look up existing key via `getS3KeyDetails()`
+3. Validate ETag matches, else throw `PRECOND_FAILED` (412)
+4. Extract `expectedGeneration` from existing key
+5. Pass `expectedGeneration` to RpcClient

Review Comment:
   > Note that verifying ETag during the preexecute phase does not increase the 
overhead of writing to the Raft log, so we don't need to worry about that.
   
   preExecute can be called in parallel (in multiple OM handler threads), so we 
should verify the ETag in `validateAndUpdateCache` instead to ensure 
atomicity. Note that the permission check was put in preExecute for performance 
reasons, and the community discussed it and considered the consistency tradeoff acceptable.
   
   > I agree that we need to reduce the RTT for If-Match requests. My original 
thinking was to keep S3 concepts out of Ozone Manager, but there seem to be a 
lot of them already, so I think it's OK to do so, and the performance would be 
better.
   
   Yes, we already have multipart uploads and `OmKeyInfo.tags`, which are used 
only for the S3 use case.
   
   > So If-Match requests don't need the atomic key rewrite anymore. But how 
about we keep the if-none-match request on the atomic rewrite with the extended 
"CREATE IF NOT EXIST" capability, which will be added in 
https://github.com/apache/ozone/pull/9332
   
   Yes, I'm OK with reusing atomic rewrite for "if-none-match" so w

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-11-25 Thread via GitHub


peterxcli commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2563334450


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,149 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## AWS S3 Conditional Write
+
+### Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: "*"
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: ""
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### Implementation
+
+#### Architecture Overview
+
+#### If-None-Match Implementation
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = -1`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. Validate `expectedDataGeneration == -1`.
+2. If key exists → throw `KEY_ALREADY_EXISTS`.
+3. Store `-1` in open key metadata.
+
+##### OM Commit Phase
+
+1. Check `expectedDataGeneration == -1` from open key.
+2. If key now exists (race condition) → throw `KEY_ALREADY_EXISTS`.
+3. Commit key.
+
+##### Race Condition Handling
+
+Using `-1` ensures atomicity. If a concurrent write (Client B) commits between 
Client A's Create and Commit, Client A's commit fails the `-1` validation check 
(key now exists), preserving strict create-if-not-exists semantics.
+
+#### If-Match Implementation
+
+Leverages existing `expectedDataGeneration` from HDDS-10656:
+
+##### S3 Gateway Layer
+
+1. Parse `If-Match: ""` header
+2. Look up existing key via `getS3KeyDetails()`
+3. Validate ETag matches, else throw `PRECOND_FAILED` (412)
+4. Extract `expectedGeneration` from existing key
+5. Pass `expectedGeneration` to RpcClient

Review Comment:
   @ivandika3 @chungen0126 Thanks for reviewing.
   1. I agree that we need to reduce the RTT for `If-Match` requests. My original 
thinking was to keep S3 concepts out of Ozone Manager, but there seem to be a 
lot of them already, so I think it's OK to do so, and the performance would be 
better.
   2. So `If-Match` requests don't need the atomic key rewrite anymore. But how 
about we keep the `If-None-Match` request on the atomic rewrite with the extended 
"CREATE IF NOT EXIST" capability? It will be added in 
https://github.com/apache/ozone/pull/9332 cc @sodonnel 
   
   BTW, sorry, I'm a bit busy these days. I'll refine the design ASAP.




Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-11-25 Thread via GitHub


chungen0126 commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2559512853


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,149 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## AWS S3 Conditional Write
+
+### Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: "*"
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: ""
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### Implementation
+
+#### Architecture Overview
+
+#### If-None-Match Implementation
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = -1`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. Validate `expectedDataGeneration == -1`.
+2. If key exists → throw `KEY_ALREADY_EXISTS`.
+3. Store `-1` in open key metadata.
+
+##### OM Commit Phase
+
+1. Check `expectedDataGeneration == -1` from open key.
+2. If key now exists (race condition) → throw `KEY_ALREADY_EXISTS`.
+3. Commit key.
+
+##### Race Condition Handling
+
+Using `-1` ensures atomicity. If a concurrent write (Client B) commits between 
Client A's Create and Commit, Client A's commit fails the `-1` validation check 
(key now exists), preserving strict create-if-not-exists semantics.
+
+#### If-Match Implementation
+
+Leverages existing `expectedDataGeneration` from HDDS-10656:
+
+##### S3 Gateway Layer
+
+1. Parse `If-Match: ""` header
+2. Look up existing key via `getS3KeyDetails()`
+3. Validate ETag matches, else throw `PRECOND_FAILED` (412)
+4. Extract `expectedGeneration` from existing key
+5. Pass `expectedGeneration` to RpcClient

Review Comment:
   I guess we are trying to minimize changes, which is why we are adopting the 
existing object store APIs.
   
   But you're right, reducing redundant RPCs is a great idea.
   
   Note that verifying ETag during the preexecute phase does not increase the 
overhead of writing to the Raft log, so we don't need to worry about that.






Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-11-24 Thread via GitHub


peterxcli commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2558648096


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,149 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## AWS S3 Conditional Write
+
+### Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: "*"
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: ""
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### Implementation
+
+#### Architecture Overview
+
+#### If-None-Match Implementation
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = -1`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. Validate `expectedDataGeneration == -1`.
+2. If key exists → throw `KEY_ALREADY_EXISTS`.
+3. Store `-1` in open key metadata.

Review Comment:
   You're right. Thanks for catching that! I'll remove it.






Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-11-24 Thread via GitHub


ivandika3 commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2558578333


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,149 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## AWS S3 Conditional Write
+
+### Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: "*"
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: ""
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### Implementation
+
+#### Architecture Overview
+
+#### If-None-Match Implementation
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = -1`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. Validate `expectedDataGeneration == -1`.
+2. If key exists → throw `KEY_ALREADY_EXISTS`.
+3. Store `-1` in open key metadata.
+
+##### OM Commit Phase
+
+1. Check `expectedDataGeneration == -1` from open key.
+2. If key now exists (race condition) → throw `KEY_ALREADY_EXISTS`.
+3. Commit key.
+
+##### Race Condition Handling
+
+Using `-1` ensures atomicity. If a concurrent write (Client B) commits between 
Client A's Create and Commit, Client A's commit fails the `-1` validation check 
(key now exists), preserving strict create-if-not-exists semantics.
+
+#### If-Match Implementation
+
+Leverages existing `expectedDataGeneration` from HDDS-10656:
+
+##### S3 Gateway Layer
+
+1. Parse `If-Match: ""` header
+2. Look up existing key via `getS3KeyDetails()`
+3. Validate ETag matches, else throw `PRECOND_FAILED` (412)
+4. Extract `expectedGeneration` from existing key
+5. Pass `expectedGeneration` to RpcClient

Review Comment:
   Just for my understanding, is the reason for calling `getS3KeyDetails` to avoid 
sending a write request when the precondition fails, so that no Raft log entry is 
written and the applier thread is not blocked? 
   
   I think the tradeoff is whether we want to optimize the latency of the happy 
path (precondition passes) or of the failure path (precondition fails). IMO, in 
normal workloads (and under optimistic concurrency control), we assume that the 
happy path happens more often, and therefore we can validate the ETag key 
metadata during the key write. This will add another optional field to KeyArgs 
(e.g. `expectedETag`), but I think it's fine.
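
As a rough sketch of what validating an optional `expectedETag` during the key write could look like (illustrative only: `IfMatchDecisionSketch`, `Outcome`, and `decide` are hypothetical names, `null` models an unset optional field, and rejecting a key that has no stored ETag is an assumption made here, not a decision from this thread):

```java
/** Illustrative decision logic only; these types are not part of Ozone. */
final class IfMatchDecisionSketch {
  enum Outcome { PROCEED, PRECONDITION_FAILED }

  /**
   * @param expectedETag value of the hypothetical optional field, or null
   *                     when the write is unconditional
   * @param keyExists    whether the target key currently exists
   * @param storedETag   ETag metadata of the existing key, or null if the
   *                     key has none (assumed here to count as a mismatch)
   */
  static Outcome decide(String expectedETag, boolean keyExists,
      String storedETag) {
    if (expectedETag == null) {
      return Outcome.PROCEED; // plain PutObject, nothing to validate
    }
    if (!keyExists || storedETag == null) {
      return Outcome.PRECONDITION_FAILED;
    }
    return expectedETag.equals(storedETag)
        ? Outcome.PROCEED
        : Outcome.PRECONDITION_FAILED;
  }
}
```

On the happy path this costs one string comparison inside the write itself, so no separate lookup RPC is needed before the write.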
   
   Please also note that not all Ozone keys will have an ETag (e.g. keys uploaded 
using the OFS protocol), so we might want to specify whether we want to 1) skip the 
keys without ETag metadata or 2) calculate the ETag on the spot. I prefer (1) 
since it's the most lightweight implementation. Approach (2) might justify your 
approach of loading the key and calculating the ETag in S3G instead of in the OM 
applier thread, but there might be some overhead, and also for an MPU key the 
calculation of ETag is more c

Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-11-24 Thread via GitHub


ivandika3 commented on PR #9334:
URL: https://github.com/apache/ozone/pull/9334#issuecomment-3573833368

   @hevinhsu Could you help to take a look as well? Thanks.





Re: [PR] HDDS-13919. S3 Conditional Writes (PutObject) [ozone]

2025-11-24 Thread via GitHub


ivandika3 commented on code in PR #9334:
URL: https://github.com/apache/ozone/pull/9334#discussion_r2558498272


##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,149 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+- **Atomic copy operations**: Copy only if source/destination meets specific 
conditions
+- **Prevent overwrite**: Copy only if destination doesn't exist
+
+## AWS S3 Conditional Write
+
+### Specification
+
+#### If-None-Match Header
+
+```
+If-None-Match: "*"
+```
+
+- Succeeds only if object does NOT exist
+- Returns `412 Precondition Failed` if object exists
+- Primary use case: Create-only semantics
+
+#### If-Match Header
+
+```
+If-Match: ""
+```
+
+- Succeeds only if object EXISTS and ETag matches
+- Returns `412 Precondition Failed` if object doesn't exist or ETag mismatches
+- Primary use case: Atomic updates (compare-and-swap)
+
+#### Restrictions
+
+- Cannot use both headers together in same request
+- No additional charges for failed conditional requests
+
+### Implementation
+
+#### Architecture Overview
+
+#### If-None-Match Implementation
+
+##### S3 Gateway Layer
+
+1. Parse `If-None-Match: *`.
+2. Set `existingKeyGeneration = -1`.
+3. Call `RpcClient.rewriteKey()`.
+
+##### OM Create Phase
+
+1. Validate `expectedDataGeneration == -1`.
+2. If key exists → throw `KEY_ALREADY_EXISTS`.
+3. Store `-1` in open key metadata.

Review Comment:
   Is it necessary to store `-1` in the open key metadata? I think we can set 
`expectedGeneration` in `BlockOutputStreamEntryPool.keyArgs`, which will be 
checked by `OmKeyCommitRequest`.
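
A minimal sketch of that idea, assuming the expected generation simply travels with the commit request. `CommitGenerationSketch`, `commit`, and `generationOf` are hypothetical names, not Ozone APIs, and the generation counter is a simplified stand-in for the real update ID. Only the `-1` sentinel comes from the design text.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of a commit-time generation check; not Ozone's classes. */
final class CommitGenerationSketch {
  /** Sentinel taken from the design text: the key must not exist yet. */
  static final long MUST_NOT_EXIST = -1L;

  private final Map<String, Long> generationByKey = new HashMap<>();
  private long nextGeneration = 1L;

  /**
   * Commits a key. expectedGeneration arrives with the commit request
   * (null means an unconditional write), so nothing is stored on the open key.
   * Returns false where the real request would fail the precondition.
   */
  synchronized boolean commit(String key, Long expectedGeneration) {
    Long current = generationByKey.get(key);
    if (expectedGeneration != null) {
      if (expectedGeneration == MUST_NOT_EXIST) {
        if (current != null) {
          return false; // another writer committed first
        }
      } else if (!expectedGeneration.equals(current)) {
        return false; // key missing or rewritten since it was read
      }
    }
    generationByKey.put(key, nextGeneration++);
    return true;
  }

  synchronized Long generationOf(String key) {
    return generationByKey.get(key);
  }
}
```

If two clients both commit with `MUST_NOT_EXIST`, only the first commit succeeds. The second sees the key at commit time and is rejected, which is the create-if-not-exists behaviour described in the doc, without any state on the open key.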



##
hadoop-hdds/docs/content/design/s3-conditional-requests.md:
##
@@ -0,0 +1,149 @@
+---
+title: "S3 Conditional Requests"
+summary: Design to support S3 conditional requests for atomic operations.
+date: 2025-11-20
+jira: HDDS-13117
+status: draft
+author: Chu Cheng Li
+---
+
+# S3 Conditional Requests Design
+
+## Background
+
+AWS S3 supports conditional requests using HTTP conditional headers, enabling 
atomic operations, cache optimization, and preventing race conditions. This 
includes:
+
+- **Conditional Writes** (PutObject): `If-Match` and `If-None-Match` headers 
for atomic operations
+- **Conditional Reads** (GetObject, HeadObject): `If-Match`, `If-None-Match`, 
`If-Modified-Since`, `If-Unmodified-Since` for cache validation
+- **Conditional Copy** (CopyObject): Conditions on both source and destination 
objects
+
+### Current State
+
+- HDDS-10656 implemented atomic rewrite using `expectedDataGeneration`
+- OM HA uses single Raft group with single applier thread (Ratis 
StateMachineUpdater)
+- S3 gateway doesn't expose conditional headers to OM layer
+
+## Use Cases
+
+### Conditional Writes
+- **Atomic key rewrites**: Prevent race conditions when updating existing 
objects
+- **Create-only semantics**: Prevent accidental overwrites (`If-None-Match: *`)
+- **Optimistic locking**: Enable concurrent access with conflict detection
+- **Leader election**: Implement distributed coordination using S3 as backing 
store
+
+### Conditional Reads
+- **Bandwidth optimization**: Avoid downloading unchanged objects (304 Not 
Modified)
+- **HTTP caching**: Support standard browser/CDN caching semantics
+- **Conditional processing**: Only process objects that meet specific criteria
+
+### Conditional Copy
+- **Atomic copy operati