> On Sat, Feb 25, 2023 at 01:59:04PM +0000, Imseih (AWS), Sami wrote:> > > The overhead of storing this additional private data for the life of the > query > > execution may not be desirable.
> Okay, but why? Additional memory to maintain the JumbleState data in other structs, and the additional copy operations. > This seems like the key point to me here. If we copy more information > into the Query structures, then we basically have no need for sticky > entries, which could be an advantage on its own as it simplifies the > deallocation and lookup logic. Removing the sticky entry logic will be a big plus, and of course we can eliminate a query not showing up properly normalized. > For a DML or a SELECT, the manipulation of the hash table would still > be a three-step process (post-analyze, planner and execution end), but > the first step would have no need to use an exclusive lock on the hash > table because we could just read and copy over the Query the > normalized query if an entry exists, meaning that we could actually > relax things a bit? No lock is held while normalizing, and a shared lock is held while storing, so there is no apparent benefit from that aspect. > This relaxation has as cost the extra memory used > to store more data to allow the insertion to use a proper state of the > Query[Desc] coming from the JumbleState (this extra data has no need > to be JumbleState, just the results we generate from it aka the > normalized query). Wouldn't be less in terms of memory usage to just store the required JumbleState fields in Query[Desc]? clocations_count, highest_extern_param_id, clocations_locations, clocations_length > > In v14, we added a dealloc metric to pg_stat_statements_info, which is > helpful. > > However, this only deals with the pgss_hash entry deallocation. > > I think we should also add a metric for the text file garbage collection. > This sounds like a good idea on its own. I can create a separate patch for this. Regards, -- Sami Imseih Amazon Web services