On Thu, Aug 1, 2024 at 5:23 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Thu, Aug 1, 2024 at 2:26 PM Hayato Kuroda (Fujitsu)
> <kuroda.hay...@fujitsu.com> wrote:
> >
> > 04. general
> >
> > According to the documentation [1], there is another constraint "exclude", 
> > which
> > can cause another type of conflict. But this pattern cannot be logged in 
> > detail.
> >
>
> As per docs, "exclusion constraints can specify constraints that are
> more general than simple equality", so I don't think it satisfies the
> kind of conflicts we are trying to LOG and then in the future patch
> allows automatic resolution for the same. For example, when we have
> last_update_wins strategy, we will replace the rows with remote rows
> when the key column values match which shouldn't be true in general
> for exclusion constraints. Similarly, we don't want to consider other
> constraint violations like CHECK to consider as conflicts. We can
> always extend the basic functionality for more conflicts if required
> but let's go with reporting straight-forward stuff first.
>

It is better to document that exclusion constraints won't be
supported. We can even write a comment in the code and or commit
message that we can extend it in the future.

*
+ * Return true if the commit timestamp data was found, false otherwise.
+ */
+bool
+GetTupleCommitTs(TupleTableSlot *localslot, TransactionId *xmin,
+ RepOriginId *localorigin, TimestampTz *localts)

This API returns both xmin and commit timestamp, so wouldn't it be
better to name it as GetTupleTransactionInfo or something like that?

I have made several changes in the attached top-up patch. These
include changes in the comments, docs, function names, etc.

-- 
With Regards,
Amit Kapila.
diff --git a/doc/src/sgml/logical-replication.sgml 
b/doc/src/sgml/logical-replication.sgml
index c1f6f1aaa8..dfbff3de02 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -1585,17 +1585,17 @@ test_sub=# SELECT * FROM t1 ORDER BY id;
 
   <para>
    Additional logging is triggered in the following 
<firstterm>conflict</firstterm>
-   scenarios:
+   cases:
    <variablelist>
     <varlistentry>
      <term><literal>insert_exists</literal></term>
      <listitem>
       <para>
        Inserting a row that violates a <literal>NOT DEFERRABLE</literal>
-       unique constraint. Note that to obtain the origin and commit
-       timestamp details of the conflicting key in the log, ensure that
+       unique constraint. Note that to log the origin and commit
+       timestamp details of the conflicting key,
        <link 
linkend="guc-track-commit-timestamp"><varname>track_commit_timestamp</varname></link>
-       is enabled. In this scenario, an error will be raised until the
+       should be enabled. In this case, an error will be raised until the
        conflict is resolved manually.
       </para>
      </listitem>
@@ -1605,10 +1605,10 @@ test_sub=# SELECT * FROM t1 ORDER BY id;
      <listitem>
       <para>
        The updated value of a row violates a <literal>NOT DEFERRABLE</literal>
-       unique constraint. Note that to obtain the origin and commit
-       timestamp details of the conflicting key in the log, ensure that
+       unique constraint. Note that to log the origin and commit
+       timestamp details of the conflicting key,
        <link 
linkend="guc-track-commit-timestamp"><varname>track_commit_timestamp</varname></link>
-       is enabled. In this scenario, an error will be raised until the
+       is enabled. In this case, an error will be raised until the
        conflict is resolved manually. Note that when updating a
        partitioned table, if the updated row value satisfies another
        partition constraint resulting in the row being inserted into a
@@ -1720,10 +1720,10 @@ CONTEXT:  processing remote data for replication origin 
"pg_16395" during "INSER
    commit timestamp can be seen in the <literal>DETAIL</literal> line of the
    log. But note that this information is only available when
    <link 
linkend="guc-track-commit-timestamp"><varname>track_commit_timestamp</varname></link>
-   is enabled. Users can use these information to make decisions on whether to
-   retain the local change or adopt the remote alteration. For instance, the
-   <literal>DETAIL</literal> line in above log indicates that the existing row 
was
-   modified by a local change, users can manually perform a remote-change-win
+   is enabled. Users can use this information to decide whether to retain the
+   local change or adopt the remote alteration. For instance, the
+   <literal>DETAIL</literal> line in the above log indicates that the existing
+   row was modified locally. Users can manually perform a remote-change-win
    resolution by deleting the local row.
   </para>
 
diff --git a/src/backend/executor/execReplication.c 
b/src/backend/executor/execReplication.c
index ad3eda1459..34050a3bae 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -475,13 +475,13 @@ retry:
 }
 
 /*
- * Find the tuple that violates the passed in unique index constraint
- * (conflictindex).
+ * Find the tuple that violates the passed unique index (conflictindex).
  *
- * If no conflict is found, return false and set *conflictslot to NULL.
- * Otherwise return true, and the conflicting tuple is locked and returned in
- * *conflictslot. The lock is essential to allow retrying to find the
- * conflicting tuple in case the tuple is concurrently deleted.
+ * If the conflicting tuple is found return true, otherwise false.
+ *
+ * We lock the tuple to avoid getting it deleted before the caller can fetch
+ * the required information. Note that if the tuple is deleted before a lock
+ * is acquired, we will retry to find the conflicting tuple again.
  */
 static bool
 FindConflictTuple(ResultRelInfo *resultRelInfo, EState *estate,
@@ -528,25 +528,15 @@ retry:
 }
 
 /*
- * Re-check all the unique indexes in 'recheckIndexes' to see if there are
- * potential conflicts with the tuple in 'slot'.
- *
- * This function is invoked after inserting or updating a tuple that detected
- * potential conflict tuples. It attempts to find the tuple that conflicts with
- * the provided tuple. This operation may seem redundant with the unique
- * violation check of indexam, but since we call this function only when we are
- * detecting conflict in logical replication and encountering potential
- * conflicts with any unique index constraints (which should not be frequent),
- * so it's ok. Moreover, upon detecting a conflict, we will report an ERROR and
- * restart the logical replication, so the additional cost of finding the tuple
- * should be acceptable.
+ * Check all the unique indexes in 'recheckIndexes' for conflict with the
+ * tuple in 'slot' and report if found.
  */
 static void
-ReCheckConflictIndexes(ResultRelInfo *resultRelInfo, EState *estate,
+CheckAndReportConflict(ResultRelInfo *resultRelInfo, EState *estate,
                                           ConflictType type, List 
*recheckIndexes,
                                           TupleTableSlot *slot)
 {
-       /* Re-check all the unique indexes for potential conflicts */
+       /* Check all the unique indexes for a conflict */
        foreach_oid(uniqueidx, resultRelInfo->ri_onConflictArbiterIndexes)
        {
                TupleTableSlot *conflictslot;
@@ -622,17 +612,22 @@ ExecSimpleRelationInsert(ResultRelInfo *resultRelInfo,
                                                                                
                   conflictindexes, false);
 
                /*
-                * Rechecks the conflict indexes to fetch the conflicting local 
tuple
+                * Checks the conflict indexes to fetch the conflicting local 
tuple
                 * and reports the conflict. We perform this check here, 
instead of
-                * perform an additional index scan before the actual insertion 
and
+                * performing an additional index scan before the actual 
insertion and
                 * reporting the conflict if any conflicting tuples are found. 
This is
                 * to avoid the overhead of executing the extra scan for each 
INSERT
                 * operation, even when no conflict arises, which could 
introduce
                 * significant overhead to replication, particularly in cases 
where
                 * conflicts are rare.
+                *
+                * XXX OTOH, this could lead to clean-up effort for dead tuples 
added
+                * in heap and index in case of conflicts. But as conflicts 
shouldn't
+                * be a frequent thing so we preferred to save the performance 
overhead
+                * of extra scan before each insertion.
                 */
                if (conflict)
-                       ReCheckConflictIndexes(resultRelInfo, estate, 
CT_INSERT_EXISTS,
+                       CheckAndReportConflict(resultRelInfo, estate, 
CT_INSERT_EXISTS,
                                                                   
recheckIndexes, slot);
 
                /* AFTER ROW INSERT Triggers */
@@ -710,12 +705,12 @@ ExecSimpleRelationUpdate(ResultRelInfo *resultRelInfo,
                                                                                
                   (update_indexes == TU_Summarizing));
 
                /*
-                * Refer to the comments above the call to 
ReCheckConflictIndexes() in
+                * Refer to the comments above the call to 
CheckAndReportConflict() in
                 * ExecSimpleRelationInsert to understand why this check is 
done at
                 * this point.
                 */
                if (conflict)
-                       ReCheckConflictIndexes(resultRelInfo, estate, 
CT_UPDATE_EXISTS,
+                       CheckAndReportConflict(resultRelInfo, estate, 
CT_UPDATE_EXISTS,
                                                                   
recheckIndexes, slot);
 
                /* AFTER ROW UPDATE Triggers */
diff --git a/src/backend/replication/logical/worker.c 
b/src/backend/replication/logical/worker.c
index b1dc04ec7e..bff98a236d 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -2668,8 +2668,7 @@ apply_handle_update_internal(ApplyExecutionData *edata,
                TimestampTz localts;
 
                /*
-                * Check whether the local tuple was modified by a different 
origin.
-                * If detected, report the conflict.
+                * Report the conflict if the tuple was modified by a different 
origin.
                 */
                if (GetTupleCommitTs(localslot, &localxmin, &localorigin, 
&localts) &&
                        localorigin != replorigin_session_origin)
@@ -2825,8 +2824,7 @@ apply_handle_delete_internal(ApplyExecutionData *edata,
                TimestampTz localts;
 
                /*
-                * Check whether the local tuple was modified by a different 
origin.
-                * If detected, report the conflict.
+                * Report the conflict if the tuple was modified by a different 
origin.
                 */
                if (GetTupleCommitTs(localslot, &localxmin, &localorigin, 
&localts) &&
                        localorigin != replorigin_session_origin)
@@ -3038,8 +3036,7 @@ apply_handle_tuple_routing(ApplyExecutionData *edata,
                                }
 
                                /*
-                                * Check whether the local tuple was modified 
by a different
-                                * origin. If detected, report the conflict.
+                                * Report the conflict if the tuple was 
modified by a different origin.
                                 */
                                if (GetTupleCommitTs(localslot, &localxmin, 
&localorigin, &localts) &&
                                        localorigin != 
replorigin_session_origin)

Reply via email to