On Wed, Aug 11, 2021 at 10:30 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Tue, Aug 10, 2021 at 8:08 PM Alvaro Herrera <alvhe...@alvh.no-ip.org> 
> wrote:
> >
> > On 2021-Jul-30, Amit Kapila wrote:
> >
> > Reading Dilip's last posted patch that day, I had some reservations
> > about the API of ExtractReplicaIdentity.  The new argument makes for a
> > very strange to explain behavior "return the key values if they are
> > unchanged, *or* if they are toasted" ... ???
> >
>
> I think we can say it as "Return the key values if they are changed
> *or* if they are toasted". Currently, we have an exception for Deletes
> where the caller always passed key_changed as true, so maybe we can
> have a similar exception when the tuple has toasted values. Can we
> think of changing the flag to "key_required" instead of "key_changed"
> and let the caller identify and set its value? For Deletes, it will
> work the same but for Updates, the caller needs to compute it by
> checking if any of the key columns are modified or has a toast value.
> We can try to see if the caller can identify it cheaply along with
> determining the modified_attrs as at that time we will anyway check
> replica key attrs.

Right

>
> Currently, in proposed patch first, we check that the tuple has any
> toast values and then it deforms and forms the new key tuple. After
> that, it checks if the key has any toast values and then only decides
> to return the tuple. If as described in the previous paragraph, we can
> cheaply identify whether the key has toasted values, then we can avoid
> deform/form cost in some cases. Also, I think we need to change the
> Replica Identity description in the docs[1].

Yeah we can avoid that by detecting any toasted replica identity key
in HeapDetermineModifiedColumns, check the attached patch.



--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From 3879809bf1a995995dccf5064eb9282f72af0412 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Wed, 11 Aug 2021 18:09:25 +0530
Subject: [PATCH v4] Extract unchanged external replica identity key

If replica identity is set to key and the key is not modified we don't log
key seperately because it should be logged along with the updated tuple.  But
if the key is stored externally we must have to detoast and log it separately.
---
 contrib/test_decoding/expected/toast.out |  2 +-
 src/backend/access/heap/heapam.c         | 51 ++++++++++++++++++++++----------
 2 files changed, 37 insertions(+), 16 deletions(-)

diff --git a/contrib/test_decoding/expected/toast.out b/contrib/test_decoding/expected/toast.out
index 75c4d22..769ab0e 100644
--- a/contrib/test_decoding/expected/toast.out
+++ b/contrib/test_decoding/expected/toast.out
@@ -77,7 +77,7 @@ SELECT substr(data, 1, 200) FROM pg_logical_slot_get_changes('regression_slot',
  table public.toasted_key: INSERT: id[integer]:1 toasted_key[text]:'1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123
  COMMIT
  BEGIN
- table public.toasted_key: UPDATE: id[integer]:1 toasted_key[text]:unchanged-toast-datum toasted_col1[text]:unchanged-toast-datum toasted_col2[text]:'987654321098765432109876543210987654321098765432109
+ table public.toasted_key: UPDATE: old-key: toasted_key[text]:'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678
  COMMIT
  BEGIN
  table public.toasted_key: UPDATE: old-key: toasted_key[text]:'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 2433998..e788926 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -80,7 +80,9 @@ static XLogRecPtr log_heap_update(Relation reln, Buffer oldbuf,
 								  bool all_visible_cleared, bool new_all_visible_cleared);
 static Bitmapset *HeapDetermineModifiedColumns(Relation relation,
 											   Bitmapset *interesting_cols,
-											   HeapTuple oldtup, HeapTuple newtup);
+											   Bitmapset *check_external_attr,
+											   HeapTuple oldtup, HeapTuple newtup,
+											   bool *id_has_external);
 static bool heap_acquire_tuplock(Relation relation, ItemPointer tid,
 								 LockTupleMode mode, LockWaitPolicy wait_policy,
 								 bool *have_tuple_lock);
@@ -106,7 +108,7 @@ static bool ConditionalMultiXactIdWait(MultiXactId multi, MultiXactStatus status
 static void index_delete_sort(TM_IndexDeleteOp *delstate);
 static int	bottomup_sort_and_shrink(TM_IndexDeleteOp *delstate);
 static XLogRecPtr log_heap_new_cid(Relation relation, HeapTuple tup);
-static HeapTuple ExtractReplicaIdentity(Relation rel, HeapTuple tup, bool key_changed,
+static HeapTuple ExtractReplicaIdentity(Relation rel, HeapTuple tup, bool key_required,
 										bool *copy);
 
 
@@ -3185,6 +3187,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
 	bool		all_visible_cleared_new = false;
 	bool		checked_lockers;
 	bool		locker_remains;
+	bool		id_has_external = false;
 	TransactionId xmax_new_tuple,
 				xmax_old_tuple;
 	uint16		infomask_old_tuple,
@@ -3282,7 +3285,8 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
 
 	/* Determine columns modified by the update. */
 	modified_attrs = HeapDetermineModifiedColumns(relation, interesting_attrs,
-												  &oldtup, newtup);
+												  id_attrs, &oldtup,
+												  newtup, &id_has_external);
 
 	/*
 	 * If we're not updating any "key" column, we can grab a weaker lock type.
@@ -3883,10 +3887,12 @@ l2:
 	 * Compute replica identity tuple before entering the critical section so
 	 * we don't PANIC upon a memory allocation failure.
 	 * ExtractReplicaIdentity() will return NULL if nothing needs to be
-	 * logged.
+	 * logged.  Pass old key required as true only if the replica identity key
+	 * columns are modified or it has external data.
 	 */
 	old_key_tuple = ExtractReplicaIdentity(relation, &oldtup,
-										   bms_overlap(modified_attrs, id_attrs),
+										   bms_overlap(modified_attrs, id_attrs) ||
+										   id_has_external,
 										   &old_key_copied);
 
 	/* NO EREPORT(ERROR) from here till changes are logged */
@@ -4047,7 +4053,8 @@ l2:
  */
 static bool
 heap_tuple_attr_equals(TupleDesc tupdesc, int attrnum,
-					   HeapTuple tup1, HeapTuple tup2)
+					   HeapTuple tup1, HeapTuple tup2,
+					   bool *old_has_external)
 {
 	Datum		value1,
 				value2;
@@ -4083,6 +4090,13 @@ heap_tuple_attr_equals(TupleDesc tupdesc, int attrnum,
 	value1 = heap_getattr(tup1, attrnum, tupdesc, &isnull1);
 	value2 = heap_getattr(tup2, attrnum, tupdesc, &isnull2);
 
+	/* check whether the value is stored externally or not and set the flag */
+	if (!isnull1 && TupleDescAttr(tupdesc, attrnum - 1)->attlen == -1 &&
+		VARATT_IS_EXTERNAL((struct varlena *) DatumGetPointer(value1)))
+		*old_has_external = true;
+	else
+		*old_has_external = false;
+
 	/*
 	 * If one value is NULL and other is not, then they are certainly not
 	 * equal
@@ -4129,19 +4143,25 @@ heap_tuple_attr_equals(TupleDesc tupdesc, int attrnum,
  */
 static Bitmapset *
 HeapDetermineModifiedColumns(Relation relation, Bitmapset *interesting_cols,
-							 HeapTuple oldtup, HeapTuple newtup)
+							 Bitmapset *check_external_attr,
+							 HeapTuple oldtup, HeapTuple newtup,
+							 bool *id_has_external)
 {
 	int			attnum;
 	Bitmapset  *modified = NULL;
+	bool		has_external = false;
 
 	while ((attnum = bms_first_member(interesting_cols)) >= 0)
 	{
 		attnum += FirstLowInvalidHeapAttributeNumber;
 
 		if (!heap_tuple_attr_equals(RelationGetDescr(relation),
-									attnum, oldtup, newtup))
+									attnum, oldtup, newtup, &has_external))
 			modified = bms_add_member(modified,
 									  attnum - FirstLowInvalidHeapAttributeNumber);
+		if (has_external && bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber,
+										  check_external_attr))
+			*id_has_external = true;
 	}
 
 	return modified;
@@ -8300,14 +8320,15 @@ log_heap_new_cid(Relation relation, HeapTuple tup)
  * Returns NULL if there's no need to log an identity or if there's no suitable
  * key defined.
  *
- * key_changed should be false if caller knows that no replica identity
- * columns changed value.  It's always true in the DELETE case.
+ * key_required should be false if caller knows that no replica identity
+ * columns changed value and it doesn't has any external data.
+ * It's always true in the DELETE case.
  *
  * *copy is set to true if the returned tuple is a modified copy rather than
  * the same tuple that was passed in.
  */
 static HeapTuple
-ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed,
+ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_required,
 					   bool *copy)
 {
 	TupleDesc	desc = RelationGetDescr(relation);
@@ -8340,7 +8361,7 @@ ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed,
 	}
 
 	/* if the key hasn't changed and we're only logging the key, we're done */
-	if (!key_changed)
+	if (!key_required)
 		return NULL;
 
 	/* find out the replica identity columns */
@@ -8348,10 +8369,10 @@ ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed,
 										 INDEX_ATTR_BITMAP_IDENTITY_KEY);
 
 	/*
-	 * If there's no defined replica identity columns, treat as !key_changed.
+	 * If there's no defined replica identity columns, treat as !key_required.
 	 * (This case should not be reachable from heap_update, since that should
-	 * calculate key_changed accurately.  But heap_delete just passes constant
-	 * true for key_changed, so we can hit this case in deletes.)
+	 * calculate key_required accurately.  But heap_delete just passes constant
+	 * true for key_required, so we can hit this case in deletes.)
 	 */
 	if (bms_is_empty(idattrs))
 		return NULL;
-- 
1.8.3.1

Reply via email to