Re: Optimizing ResouceOwner to speed up COPY

2025-10-27 Thread Tomas Vondra
On 10/27/25 16:14, Tomas Vondra wrote: > On 10/21/25 16:43, Tomas Vondra wrote: >> ... >> >> The results seem fairly stable, and the overall trend is clear. It'd be >> great if there were no regressions, but considering how narrow this >> microbenchmark is (and considering the benefits for practica

Re: Optimizing ResouceOwner to speed up COPY

2025-10-27 Thread Tomas Vondra
On 10/21/25 16:43, Tomas Vondra wrote: > ... > > The results seem fairly stable, and the overall trend is clear. It'd be > great if there were no regressions, but considering how narrow this > microbenchmark is (and considering the benefits for practical COPY runs), > I'd say it's probably OK. >

Re: Optimizing ResouceOwner to speed up COPY

2025-10-21 Thread Tomas Vondra
On 10/21/25 09:10, Heikki Linnakangas wrote: > On 18/10/2025 01:49, Tomas Vondra wrote: >> On 10/17/25 12:32, Tomas Vondra wrote: >>> >>> >>> On 10/17/25 10:31, Heikki Linnakangas wrote: >   typedef struct ResourceElem >   { >   Datum    item; > +    uint32    count;  

Re: Optimizing ResouceOwner to speed up COPY

2025-10-21 Thread Heikki Linnakangas
On 18/10/2025 01:49, Tomas Vondra wrote: On 10/17/25 12:32, Tomas Vondra wrote: On 10/17/25 10:31, Heikki Linnakangas wrote:  typedef struct ResourceElem  { Datum    item; +    uint32    count;    /* number of occurrences */ const ResourceOwnerDesc *kind;   

Re: Optimizing ResouceOwner to speed up COPY

2025-10-18 Thread Tomas Vondra
On 10/16/25 21:28, Tom Lane wrote: > Tomas Vondra writes: >> On 10/16/25 20:12, Tom Lane wrote: >>> Can you find evidence of this change being helpful for anything >>> except this specific scenario in COPY? > >> I went through the ResourceOwnerRemember() calls, looking for other >> cases that mig

Re: Optimizing ResouceOwner to speed up COPY

2025-10-18 Thread Tom Lane
Tomas Vondra writes: > On 10/16/25 21:28, Tom Lane wrote: >> I was thinking of adding some temporary instrumentation, like >> just elog'ing whenever the count goes above 1, and seeing where >> you get hits during the regression tests. I'm prepared to believe >> this is worth doing, but it'd be ni

Re: Optimizing ResouceOwner to speed up COPY

2025-10-18 Thread Heikki Linnakangas
On 17/10/2025 06:13, Chao Li wrote: ``` @@ -250,12 +257,21 @@ ResourceOwnerAddToHash(ResourceOwner owner, Datum value, const ResourceOwnerDesc idx = hash_resource_elem(value, kind) & mask; for (;;) { + /* found an exact match - just increment the counter */

Re: Optimizing ResouceOwner to speed up COPY

2025-10-18 Thread Tom Lane
Tomas Vondra writes: > The reason is pretty simple - ResourceOwner tracks the resources in a > very simple hash table, with O(n^2) behavior with duplicates. This > happens with COPY, because COPY creates an array of 1000 tuple slots, > and each slot references the same tuple descriptor. And the
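
The quadratic behavior described in the quoted text can be reproduced with a toy model. The sketch below is not the PostgreSQL code: `ToyElem`, `ToyTable`, `toy_hash` and `toy_add` are hypothetical names, the table is a fixed-size open-addressing table with linear probing, and the `count` field only mirrors the idea of the `count` member proposed in the thread. It counts how many slots each insert has to probe, with and without merging exact duplicates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CAPACITY 64			/* hypothetical size, must be a power of two */

/* Hypothetical stand-in for a hash entry; kind == NULL marks an empty slot. */
typedef struct ToyElem
{
	uintptr_t	item;			/* stand-in for the Datum value */
	const void *kind;			/* stand-in for the resource kind pointer */
	uint32_t	count;			/* number of occurrences of (item, kind) */
} ToyElem;

typedef struct ToyTable
{
	ToyElem		slots[TOY_CAPACITY];
	uint32_t	nused;			/* number of occupied slots */
} ToyTable;

/* Arbitrary mixing function, an assumption made only for this sketch. */
static uint32_t
toy_hash(uintptr_t item, const void *kind)
{
	uint64_t	h = (uint64_t) item ^ ((uint64_t) (uintptr_t) kind << 1);

	h *= UINT64_C(0x9E3779B97F4A7C15);
	return (uint32_t) (h >> 32);
}

/*
 * Insert (item, kind) and return the number of slots probed.
 *
 * With merge_duplicates == 0 every insert claims a fresh slot, so the k-th
 * identical insert has to walk past the k-1 earlier copies. With
 * merge_duplicates != 0 an exact match only bumps the counter.
 */
static uint32_t
toy_add(ToyTable *t, uintptr_t item, const void *kind, int merge_duplicates)
{
	uint32_t	mask = TOY_CAPACITY - 1;
	uint32_t	idx = toy_hash(item, kind) & mask;
	uint32_t	probes = 1;

	assert(kind != NULL);
	assert(t->nused < TOY_CAPACITY);	/* an empty slot must exist */

	for (;;)
	{
		ToyElem    *e = &t->slots[idx];

		if (e->kind == NULL)
		{
			/* empty slot: store a new entry */
			e->item = item;
			e->kind = kind;
			e->count = 1;
			t->nused++;
			return probes;
		}
		if (merge_duplicates && e->item == item && e->kind == kind)
		{
			/* exact match: just increment the counter */
			e->count++;
			return probes;
		}
		idx = (idx + 1) & mask;
		probes++;
	}
}
```

Inserting the same (item, kind) pair n times costs 1 + 2 + ... + n probes in total without merging, and n probes with merging, which is the difference between the O(n^2) and linear behavior discussed in the thread.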

Re: Optimizing ResouceOwner to speed up COPY

2025-10-18 Thread Chao Li
> On Oct 17, 2025, at 01:46, Tomas Vondra wrote: > > -- > Tomas Vondra Nice patch! I eyeballed the patch and have only a few small comments: 1 ``` @@ -250,12 +257,21 @@ ResourceOwnerAddToHash(ResourceOwner owner, Datum value, const ResourceOwnerDesc idx = hash_resource_elem(valu

Re: Optimizing ResouceOwner to speed up COPY

2025-10-18 Thread Tomas Vondra
On 10/17/25 10:31, Heikki Linnakangas wrote: > On 17/10/2025 06:13, Chao Li wrote: >> ``` >> @@ -250,12 +257,21 @@ ResourceOwnerAddToHash(ResourceOwner owner, >> Datum value, const ResourceOwnerDesc >>   idx = hash_resource_elem(value, kind) & mask; >>   for (;;) >>   { >> +    /

Optimizing ResouceOwner to speed up COPY

2025-10-17 Thread Tomas Vondra
Hi, While reviewing and testing a nearby patch (using COPY for batching in postgres_fdw), I noticed some of the COPY queries are spending a substantial amount of time in ResourceOwnerAddToHash(). The exact figure depends on the amount of data in the COPY, but it was often close to 50% (according to pe

Re: Optimizing ResouceOwner to speed up COPY

2025-10-17 Thread Tomas Vondra
On 10/17/25 12:32, Tomas Vondra wrote: > > > On 10/17/25 10:31, Heikki Linnakangas wrote: >> On 17/10/2025 06:13, Chao Li wrote: >>> ``` >>> @@ -250,12 +257,21 @@ ResourceOwnerAddToHash(ResourceOwner owner, >>> Datum value, const ResourceOwnerDesc >>>   idx = hash_resource_elem(value, kind) &

Re: Optimizing ResouceOwner to speed up COPY

2025-10-17 Thread Tomas Vondra
On 10/16/25 20:12, Tom Lane wrote: > Tomas Vondra writes: >> The reason is pretty simple - ResourceOwner tracks the resources in a >> very simple hash table, with O(n^2) behavior with duplicates. This >> happens with COPY, because COPY creates an array of 1000 tuple slots, >> and each slot refer

Re: Optimizing ResouceOwner to speed up COPY

2025-10-17 Thread Tom Lane
Tomas Vondra writes: > On 10/16/25 20:12, Tom Lane wrote: >> Can you find evidence of this change being helpful for anything >> except this specific scenario in COPY? > I went through the ResourceOwnerRemember() calls, looking for other > cases that might create a lot of duplicates, similar to th

Re: Optimizing ResouceOwner to speed up COPY

2025-10-16 Thread Tomas Vondra
On 10/17/25 00:17, Tom Lane wrote: > Tomas Vondra writes: >> On 10/16/25 21:28, Tom Lane wrote: >>> I was thinking of adding some temporary instrumentation, like >>> just elog'ing whenever the count goes above 1, and seeing where >>> you get hits during the regression tests. I'm prepared to belie