On 10/17/25 12:32, Tomas Vondra wrote:
> 
> 
> On 10/17/25 10:31, Heikki Linnakangas wrote:
>> On 17/10/2025 06:13, Chao Li wrote:
>>> ```
>>> @@ -250,12 +257,21 @@ ResourceOwnerAddToHash(ResourceOwner owner,
>>> Datum value, const ResourceOwnerDesc
>>>       idx = hash_resource_elem(value, kind) & mask;
>>>       for (;;)
>>>       {
>>> +        /* found an exact match - just increment the counter */
>>> +        if ((owner->hash[idx].kind == kind) &&
>>> +            (owner->hash[idx].item == value))
>>> +        {
>>> +            owner->hash[idx].count += count;
>>> +            return;
>>> +        }
>>> +
>>>           if (owner->hash[idx].kind == NULL)
>>>               break;                /* found a free slot */
>>>           idx = (idx + 1) & mask;
>>>       }
>>> ```
>>>
>>> I think this is the “trade-off” you mention in your email: if a free
>>> slot found earlier, then still gets duplicated entries. I have no
>>> concern to this “trade-off”, but please add a comment here, otherwise
>>> future readers may be confused, and might potentially consider that
>>> were a bug, without reading your original design email.
>>
>> +1 on such a comment. Here or maybe on the ResourceElem struct itself.
>>
> 
> Will do. It's probably appropriate to mention this deduplication in the
> comment at the beginning of the file.
> 

I added a comment to the file comment, and clarified the comment when
adding the hash table entry. But I tried not to be too verbose there.

>>>  typedef struct ResourceElem
>>>  {
>>>      Datum        item;
>>> +    uint32        count;                /* number of occurrences */
>>>      const ResourceOwnerDesc *kind;    /* NULL indicates a free hash
>>> table slot */
>>>  } ResourceElem;
>>
>> Hmm, the 'count' is not used when the entry is stored in the array.
>> Perhaps we should have a separate struct for array and hash elements
>> now. Keeping the array small helps it to fit in CPU caches.
>>
> 
> Agreed. I had the same idea yesterday, but I haven't done it yet.
> 

The attached v2 does that - it adds a separate ResourceHashElem for the
has table, and it works. But I'm not sure I like it very much, because
there are two places that relied on both the array and hash table using
the same struct to "walk" it the same way.

For ResourceOwnerSort() it's not too bad, but ResourceOwnerReleaseAll()
now duplicates most of the code. It's not terrible, but also not pretty.
I can't think of a better way, though.

>>> /*
>>>  * Forget that an object is owned by a ResourceOwner
>>>  *
>>>  * Note: If same resource ID is associated with the ResourceOwner more
>>> than
>>>  * once, one instance is removed.
>>>  *
>>>  * Note: Forgetting a resource does not guarantee that there is room to
>>>  * remember a new resource.  One exception is when you forget the most
>>>  * recently remembered resource; that does make room for a new
>>> remember call.
>>>  * Some code callers rely on that exception.
>>>  */
>>> void
>>> ResourceOwnerForget(ResourceOwner owner, Datum value, const
>>> ResourceOwnerDesc *kind)
>>
>> Does this break the exception noted in the above comment? I guess it
>> still holds because we don't deduplicate entries in the array. That's
>> very subtle, needs a comment at least.
>>
> 
> Right, it doesn't break the exception - as you say, it only applies to
> the hash, not to the array. I'll add a comment.
> 

Comment added.


regards

-- 
Tomas Vondra
From bbb37408e5496fd011cf460279cf9ba2533c4fd6 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <[email protected]>
Date: Thu, 16 Oct 2025 13:27:25 +0200
Subject: [PATCH v2] Deduplicate entries in ResourceOwner

We may track the same resource many times, but the current ResourceOwner
does not handle that efficiently. Each reference is a separate entry,
which leads to long linear searches in the simple hash table.

This adds a simple opportunistic deduplication. Entries get a counter,
tracking how many references they represent. It only applies to the
"hash" phase, not to the initial array. And when adding an entry to the
hash finds an empty slot first, it uses it (instead of continuing to
look for a possible match).

This means the deduplication is not perfect - there may be multiple
entries for the same value. But in practice it seems like a good trade
off. It won't affect cases with no duplicates, and it will help cases
with many of them.

The code also continues to grow the hash table as before. In some cases
it might be enough to deduplicate the contents. But it's hard to say in
advance, and it's not much cheaper as it still requires walking the
current hash table. So there's a risk of attempting deduplication, only
to find the hash still needs to be enlarged, doubling the cost. So we
just grow the hash table. In the worst case the memory usage is not
worse than without deduplication. If deduplication is effective, the
hash table grows later or not at all.
---
 src/backend/utils/resowner/resowner.c | 215 ++++++++++++++++++++------
 1 file changed, 166 insertions(+), 49 deletions(-)

diff --git a/src/backend/utils/resowner/resowner.c b/src/backend/utils/resowner/resowner.c
index fca84ded6dd..055426c7761 100644
--- a/src/backend/utils/resowner/resowner.c
+++ b/src/backend/utils/resowner/resowner.c
@@ -15,6 +15,12 @@
  * a particular reference, you need to search both the array and the hash
  * table.
  *
+ * The hash table deduplicates entries in an opportunistic way. If it finds
+ * an entry for the same resource (same kind), it increments a counter instead.
+ * But if it finds an empty slot first, it uses the slot. If the table gets
+ * enlarged, those entries will be merged, but it's possible to have multiple
+ * entries for the same resource. The fixed-size array does not deduplicate.
+ *
  * The most frequent usage is that a resource is remembered, and forgotten
  * shortly thereafter.  For example, pin a buffer, read one tuple from it,
  * release the pin.  Linearly scanning the small array handles that case
@@ -67,6 +73,18 @@ typedef struct ResourceElem
 	const ResourceOwnerDesc *kind;	/* NULL indicates a free hash table slot */
 } ResourceElem;
 
+/*
+ * ResourceHashElem represents a reference associated with a resource owner,
+ * stored in the hash table. Similar to ResourceElem, but has a counter field
+ * tracking how many references it represents, to allow deduplication.
+ */
+typedef struct ResourceHashElem
+{
+ 	Datum		item;
+	uint32		count;				/* number of references */
+	const ResourceOwnerDesc *kind;	/* NULL indicates a free hash table slot */
+} ResourceHashElem;
+
 /*
  * Size of the fixed-size array to hold most-recently remembered resources.
  */
@@ -151,7 +169,7 @@ struct ResourceOwnerData
 	 * release priority.  The first 'nhash' elements are occupied, the rest
 	 * are empty.
 	 */
-	ResourceElem *hash;
+	ResourceHashElem *hash;
 	uint32		capacity;		/* allocated length of hash[] */
 	uint32		grow_at;		/* grow hash when reach this */
 
@@ -198,7 +216,8 @@ static ResourceReleaseCallbackItem *ResourceRelease_callbacks = NULL;
 /* Internal routines */
 static inline uint32 hash_resource_elem(Datum value, const ResourceOwnerDesc *kind);
 static void ResourceOwnerAddToHash(ResourceOwner owner, Datum value,
-								   const ResourceOwnerDesc *kind);
+								   const ResourceOwnerDesc *kind,
+								   uint32 count);
 static int	resource_priority_cmp(const void *a, const void *b);
 static void ResourceOwnerSort(ResourceOwner owner);
 static void ResourceOwnerReleaseAll(ResourceOwner owner,
@@ -239,7 +258,8 @@ hash_resource_elem(Datum value, const ResourceOwnerDesc *kind)
  * Adds 'value' of given 'kind' to the ResourceOwner's hash table
  */
 static void
-ResourceOwnerAddToHash(ResourceOwner owner, Datum value, const ResourceOwnerDesc *kind)
+ResourceOwnerAddToHash(ResourceOwner owner, Datum value, const ResourceOwnerDesc *kind,
+					   uint32 count)
 {
 	uint32		mask = owner->capacity - 1;
 	uint32		idx;
@@ -250,12 +270,25 @@ ResourceOwnerAddToHash(ResourceOwner owner, Datum value, const ResourceOwnerDesc
 	idx = hash_resource_elem(value, kind) & mask;
 	for (;;)
 	{
+		/*
+		 * Increment counter for a matching entry or use an empty slot,
+		 * whichever we find first.
+		 */
+
+		if ((owner->hash[idx].kind == kind) &&
+			(owner->hash[idx].item == value))
+		{
+			owner->hash[idx].count += count;
+			return;
+		}
+
 		if (owner->hash[idx].kind == NULL)
 			break;				/* found a free slot */
 		idx = (idx + 1) & mask;
 	}
 	owner->hash[idx].item = value;
 	owner->hash[idx].kind = kind;
+	owner->hash[idx].count = count;
 	owner->nhash++;
 }
 
@@ -277,6 +310,24 @@ resource_priority_cmp(const void *a, const void *b)
 		return 1;
 }
 
+/*
+ * Comparison function to sort by release phase and priority
+ */
+static int
+resource_priority_hash_cmp(const void *a, const void *b)
+{
+	const ResourceHashElem *ra = (const ResourceHashElem *) a;
+	const ResourceHashElem *rb = (const ResourceHashElem *) b;
+
+	/* Note: reverse order */
+	if (ra->kind->release_phase == rb->kind->release_phase)
+		return pg_cmp_u32(rb->kind->release_priority, ra->kind->release_priority);
+	else if (ra->kind->release_phase > rb->kind->release_phase)
+		return -1;
+	else
+		return 1;
+}
+
 /*
  * Sort resources in reverse release priority.
  *
@@ -288,16 +339,21 @@ resource_priority_cmp(const void *a, const void *b)
 static void
 ResourceOwnerSort(ResourceOwner owner)
 {
-	ResourceElem *items;
-	uint32		nitems;
-
 	if (owner->nhash == 0)
 	{
+		ResourceElem *items;
+		uint32		nitems;
+
 		items = owner->arr;
 		nitems = owner->narr;
+
+		qsort(items, nitems, sizeof(ResourceElem), resource_priority_cmp);
 	}
 	else
 	{
+		ResourceHashElem   *items;
+		uint32				nitems;
+
 		/*
 		 * Compact the hash table, so that all the elements are in the
 		 * beginning of the 'hash' array, with no empty elements.
@@ -324,7 +380,9 @@ ResourceOwnerSort(ResourceOwner owner)
 		Assert(dst + owner->narr <= owner->capacity);
 		for (int idx = 0; idx < owner->narr; idx++)
 		{
-			owner->hash[dst] = owner->arr[idx];
+			owner->hash[dst].item = owner->arr[idx].item;
+			owner->hash[dst].kind = owner->arr[idx].kind;
+			owner->hash[dst].count = 1;
 			dst++;
 		}
 		Assert(dst == owner->nhash + owner->narr);
@@ -333,9 +391,9 @@ ResourceOwnerSort(ResourceOwner owner)
 
 		items = owner->hash;
 		nitems = owner->nhash;
-	}
 
-	qsort(items, nitems, sizeof(ResourceElem), resource_priority_cmp);
+		qsort(items, nitems, sizeof(ResourceHashElem), resource_priority_hash_cmp);
+	}
 }
 
 /*
@@ -345,9 +403,6 @@ static void
 ResourceOwnerReleaseAll(ResourceOwner owner, ResourceReleasePhase phase,
 						bool printLeakWarnings)
 {
-	ResourceElem *items;
-	uint32		nitems;
-
 	/*
 	 * ResourceOwnerSort must've been called already.  All the resources are
 	 * either in the array or the hash.
@@ -356,49 +411,93 @@ ResourceOwnerReleaseAll(ResourceOwner owner, ResourceReleasePhase phase,
 	Assert(owner->sorted);
 	if (owner->nhash == 0)
 	{
+		ResourceElem *items;
+		uint32		nitems;
+
 		items = owner->arr;
 		nitems = owner->narr;
+
+		/*
+		 * The resources are sorted in reverse priority order.  Release them
+		 * starting from the end, until we hit the end of the phase that we are
+		 * releasing now.  We will continue from there when called again for the
+		 * next phase.
+		 */
+		while (nitems > 0)
+		{
+			uint32		idx = nitems - 1;
+			Datum		value = items[idx].item;
+			const ResourceOwnerDesc *kind = items[idx].kind;
+
+			if (kind->release_phase > phase)
+				break;
+			Assert(kind->release_phase == phase);
+
+			if (printLeakWarnings)
+			{
+				char	   *res_str;
+
+				res_str = kind->DebugPrint ?
+					kind->DebugPrint(value)
+					: psprintf("%s %p", kind->name, DatumGetPointer(value));
+				pfree(res_str);
+			}
+
+			kind->ReleaseResource(value);
+
+			nitems--;
+		}
+
+		owner->narr = nitems;
 	}
 	else
 	{
+		ResourceHashElem *items;
+		uint32		nitems;
+
 		Assert(owner->narr == 0);
 		items = owner->hash;
 		nitems = owner->nhash;
-	}
 
-	/*
-	 * The resources are sorted in reverse priority order.  Release them
-	 * starting from the end, until we hit the end of the phase that we are
-	 * releasing now.  We will continue from there when called again for the
-	 * next phase.
-	 */
-	while (nitems > 0)
-	{
-		uint32		idx = nitems - 1;
-		Datum		value = items[idx].item;
-		const ResourceOwnerDesc *kind = items[idx].kind;
+		/*
+		 * The resources are sorted in reverse priority order.  Release them
+		 * starting from the end, until we hit the end of the phase that we are
+		 * releasing now.  We will continue from there when called again for the
+		 * next phase.
+		 */
+		while (nitems > 0)
+		{
+			uint32		idx = nitems - 1;
+			Datum		value = items[idx].item;
+			const ResourceOwnerDesc *kind = items[idx].kind;
+			uint32		count = items[idx].count;
 
-		if (kind->release_phase > phase)
-			break;
-		Assert(kind->release_phase == phase);
+			if (kind->release_phase > phase)
+				break;
+			Assert(kind->release_phase == phase);
 
-		if (printLeakWarnings)
-		{
-			char	   *res_str;
+			if (printLeakWarnings)
+			{
+				char	   *res_str;
+
+				res_str = kind->DebugPrint ?
+					kind->DebugPrint(value)
+					: psprintf("%s %p", kind->name, DatumGetPointer(value));
+				pfree(res_str);
+			}
+
+			/* invoke the release callback for each reference */
+			while (count > 0)
+			{
+				kind->ReleaseResource(value);
+				count--;
+			}
 
-			res_str = kind->DebugPrint ?
-				kind->DebugPrint(value)
-				: psprintf("%s %p", kind->name, DatumGetPointer(value));
-			elog(WARNING, "resource was not closed: %s", res_str);
-			pfree(res_str);
+			nitems--;
 		}
-		kind->ReleaseResource(value);
-		nitems--;
-	}
-	if (owner->nhash == 0)
-		owner->narr = nitems;
-	else
+
 		owner->nhash = nitems;
+	}
 }
 
 
@@ -466,16 +565,16 @@ ResourceOwnerEnlarge(ResourceOwner owner)
 		uint32		i,
 					oldcap,
 					newcap;
-		ResourceElem *oldhash;
-		ResourceElem *newhash;
+		ResourceHashElem *oldhash;
+		ResourceHashElem *newhash;
 
 		oldhash = owner->hash;
 		oldcap = owner->capacity;
 
 		/* Double the capacity (it must stay a power of 2!) */
 		newcap = (oldcap > 0) ? oldcap * 2 : RESOWNER_HASH_INIT_SIZE;
-		newhash = (ResourceElem *) MemoryContextAllocZero(TopMemoryContext,
-														  newcap * sizeof(ResourceElem));
+		newhash = (ResourceHashElem *) MemoryContextAllocZero(TopMemoryContext,
+														  newcap * sizeof(ResourceHashElem));
 
 		/*
 		 * We assume we can't fail below this point, so OK to scribble on the
@@ -496,7 +595,8 @@ ResourceOwnerEnlarge(ResourceOwner owner)
 			for (i = 0; i < oldcap; i++)
 			{
 				if (oldhash[i].kind != NULL)
-					ResourceOwnerAddToHash(owner, oldhash[i].item, oldhash[i].kind);
+					ResourceOwnerAddToHash(owner, oldhash[i].item, oldhash[i].kind,
+										   oldhash[i].count);
 			}
 
 			/* And release old hash table. */
@@ -506,7 +606,7 @@ ResourceOwnerEnlarge(ResourceOwner owner)
 
 	/* Move items from the array to the hash */
 	for (int i = 0; i < owner->narr; i++)
-		ResourceOwnerAddToHash(owner, owner->arr[i].item, owner->arr[i].kind);
+		ResourceOwnerAddToHash(owner, owner->arr[i].item, owner->arr[i].kind, 1);
 	owner->narr = 0;
 
 	Assert(owner->nhash <= owner->grow_at);
@@ -555,7 +655,8 @@ ResourceOwnerRemember(ResourceOwner owner, Datum value, const ResourceOwnerDesc
  * Note: Forgetting a resource does not guarantee that there is room to
  * remember a new resource.  One exception is when you forget the most
  * recently remembered resource; that does make room for a new remember call.
- * Some code callers rely on that exception.
+ * Some code callers rely on that exception. The deduplication does not
+ * change this, as it only applies to the hash table.
  */
 void
 ResourceOwnerForget(ResourceOwner owner, Datum value, const ResourceOwnerDesc *kind)
@@ -597,8 +698,17 @@ ResourceOwnerForget(ResourceOwner owner, Datum value, const ResourceOwnerDesc *k
 			if (owner->hash[idx].item == value &&
 				owner->hash[idx].kind == kind)
 			{
+				/* decrement the count for entry for multiple references */
+				if (owner->hash[idx].count > 1)
+				{
+					owner->hash[idx].count--;
+					return;
+				}
+
+				/* remove the entry entirely */
 				owner->hash[idx].item = (Datum) 0;
 				owner->hash[idx].kind = NULL;
+				owner->hash[idx].count = 0;
 				owner->nhash--;
 
 #ifdef RESOWNER_STATS
@@ -847,12 +957,19 @@ ResourceOwnerReleaseAllOfKind(ResourceOwner owner, const ResourceOwnerDesc *kind
 		if (owner->hash[i].kind == kind)
 		{
 			Datum		value = owner->hash[i].item;
+			uint32		count = owner->hash[i].count;
 
 			owner->hash[i].item = (Datum) 0;
 			owner->hash[i].kind = NULL;
+			owner->hash[i].count = 0;
 			owner->nhash--;
 
-			kind->ReleaseResource(value);
+			/* invoke the release callback once for each reference */
+			while (count > 0)
+			{
+				kind->ReleaseResource(value);
+				count--;
+			}
 		}
 	}
 	owner->releasing = false;
-- 
2.51.0

Reply via email to