rdblue commented on a change in pull request #3199:
URL: https://github.com/apache/iceberg/pull/3199#discussion_r718939818
##########
File path: api/src/main/java/org/apache/iceberg/OverwriteFiles.java
##########
@@ -145,4 +151,50 @@
*/
@Deprecated
OverwriteFiles validateNoConflictingAppends(Long readSnapshotId, Expression conflictDetectionFilter);
+
+ /**
+ * Sets a conflict detection filter used to validate concurrently added data and delete files.
+ * <p>
+ * If not called, a true literal will be used as the conflict detection filter.
+ *
+ * @param conflictDetectionFilter an expression on rows in the table
+ * @return this for method chaining
+ */
+ OverwriteFiles conflictDetectionFilter(Expression conflictDetectionFilter);
+
+ /**
+ * Enables validation that data files added concurrently do not conflict with this commit's operation.
+ * <p>
+ * This method should be called while committing non-idempotent overwrite operations.
+ * If a concurrent operation commits a new file after the data was read and that file might
+ * contain rows matching the specified conflict detection filter, the overwrite operation
+ * will detect this during retries and fail.
+ * <p>
+ * Calling this method with a correct conflict detection filter is required to maintain
+ * serializable isolation for overwrite operations. Otherwise, the isolation level
+ * will be snapshot isolation.
+ * <p>
+ * Validation uses the conflict detection filter passed to {@link #conflictDetectionFilter(Expression)} and
+ * applies to operations that happened after the snapshot passed to {@link #validateFromSnapshot(long)}.
+ *
+ * @return this for method chaining
+ */
+ OverwriteFiles validateNoConflictingDataFiles();
+
+ /**
+ * Enables validation that delete files added concurrently do not conflict with this commit's operation.
+ * <p>
+ * Validating concurrently added delete files is required during non-idempotent overwrite operations.
+ * If a concurrent operation adds a new delete file that applies to one of the data files being overwritten,
+ * the overwrite operation must be aborted as it may undelete rows that were removed concurrently.
+ * <p>
+ * Calling this method with a correct conflict detection filter is required to maintain
+ * serializable isolation for overwrite operations.
Review comment:
Serializable and snapshot isolation? Maybe just say "required to
maintain isolation" for non-idempotent commits.
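
For context, the check that the quoted Javadoc describes can be sketched without
any Iceberg classes. The toy model below is only an illustration of the idea, not
the PR's implementation: `ConflictCheckSketch`, `AddedFile`, `hasConflict`, and
`validateNoConflictingDataFiles` (as written here) are hypothetical names, commit
order is assumed to be a simple increasing sequence number, and the conflict
detection filter is modeled as a plain `Predicate` rather than an `Expression`.

```java
import java.util.List;
import java.util.function.Predicate;

public class ConflictCheckSketch {

  // Hypothetical stand-in for a data file added by some commit; not an Iceberg type.
  public static final class AddedFile {
    public final long sequence;    // assumed: larger value means committed later
    public final String path;
    public final String partition;

    public AddedFile(long sequence, String path, String partition) {
      this.sequence = sequence;
      this.path = path;
      this.partition = partition;
    }
  }

  // True if any file committed after the read point might match the filter.
  public static boolean hasConflict(
      List<AddedFile> committed, long readSequence, Predicate<AddedFile> filter) {
    return committed.stream()
        .anyMatch(file -> file.sequence > readSequence && filter.test(file));
  }

  // Fails the commit when a concurrently added file could contain matching rows.
  // IllegalStateException is a stand-in for whatever exception a real table would throw.
  public static void validateNoConflictingDataFiles(
      List<AddedFile> committed, long readSequence, Predicate<AddedFile> filter) {
    if (hasConflict(committed, readSequence, filter)) {
      throw new IllegalStateException("Found conflicting files added after the read point");
    }
  }

  public static void main(String[] args) {
    List<AddedFile> committed = List.of(
        new AddedFile(1, "a.parquet", "2021-09-01"),
        new AddedFile(3, "b.parquet", "2021-09-02"));

    // The overwrite read the table at sequence 2 and rewrites partition 2021-09-02.
    Predicate<AddedFile> filter = file -> file.partition.equals("2021-09-02");

    try {
      validateNoConflictingDataFiles(committed, 2, filter);
      System.out.println("no conflict, commit may proceed");
    } catch (IllegalStateException e) {
      System.out.println("commit rejected: " + e.getMessage());
    }
  }
}
```

Skipping the check corresponds to the weaker guarantee the Javadoc mentions: the
overwrite would still commit against its own snapshot, but rows added concurrently
to the overwritten range would go unnoticed.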