Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-07-16 Thread Lukas Fittl
On Tue, Jul 15, 2025 at 11:20 PM Alexander Kukushkin wrote: > However, we regularly hear from many different customers that they *don't > control queries* sent by application or *can't modify these queries*. > Such kinds of workloads are also not that uncommon and this change makes > it impossibl

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-07-15 Thread Alexander Kukushkin
On Wed, 16 Jul 2025 at 01:39, Michael Paquier wrote: > > > create schema s1; > > create table s1.t as select id from generate_series(1, 10) as id; > > create schema s2; > > create table s1.t as select id from generate_series(1, 100) as id; > > I suspect that you mean s2.t and not s1.t here. >

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-07-15 Thread Michael Paquier
On Tue, Jul 15, 2025 at 04:48:05PM +0200, Alexander Kukushkin wrote: > I totally understand the wish to make pg_stat_statements useful for > workloads that create/drop a ton of temporary tables. > However, when pursuing this goal we impacted other types of totally valid > workloads when tables with

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-07-15 Thread Alexander Kukushkin
Hi, I totally understand the wish to make pg_stat_statements useful for workloads that create/drop a ton of temporary tables. However, when pursuing this goal we impacted other types of totally valid workloads when tables with the same name exist in different schemas. Example: create schema s1; cr

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-04-04 Thread Michael Paquier
On Sat, Mar 22, 2025 at 10:43:00AM +0100, Christoph Berg wrote: > Are we at the point where the patch is already Ready for Committer? I'll think a bit more about how to document all that. Anyway, yes, I'm OK with the per-field custom_query_jumble, so let's move on with that, so I will do somethin

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-25 Thread Michael Paquier
On Tue, Mar 25, 2025 at 07:56:29PM -0500, Sami Imseih wrote: > FWIW, the pg_stat_statements docs in a few places refer to > queries that may look different but have the same meaning > as “semantically equivalent”, this is why I used the same > terminology here. But, I have no issue with the sim

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-25 Thread Sami Imseih
> If I get the difference right, semantically would apply to concepts > related to linguistics, but that's not what we have here, so you are > using a more general term. FWIW, the pg_stat_statements docs in a few places refer to queries that may look different but have the same meaning as “sem

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-25 Thread Michael Paquier
On Tue, Mar 25, 2025 at 07:24:20PM -0400, Tom Lane wrote: > fails to honor $query_jumble_ignore as the other if-branches do. > Perhaps it's unlikely that a node would have both query_jumble_custom > and query_jumble_ignore annotations, but still, the script would emit > completely incorrect code if

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-25 Thread Tom Lane
Michael Paquier writes: > [ v6-0001-Add-custom-query-jumble-function-for-RangeTblEntr.patch ] Couple of minor review comments ... * In 5ac462e2b, this bit: # node type - if (($t =~ /^(\w+)\*$/ or $t =~ /^struct\s+(\w+)\*$/) + if ($query_jumble_custom) + { + #

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-25 Thread Michael Paquier
On Tue, Mar 25, 2025 at 11:58:21AM -0500, Sami Imseih wrote: > "Since the queryid hash value is computed on the post-parse-analysis > representation of the queries, the opposite is also possible: queries with > identical texts might appear as separate entries, if they have different > meanings as a

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-25 Thread Sami Imseih
> > Attached is the second one, with more tests coverage with attribute > > aliases (these being ignored exists in stable branches, but why not > > while on it) and table aliases, and the fixes for the issues pointed > > out by Christoph. I'll double-check all that again tomorrow. Please > > find

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-25 Thread Christoph Berg
Re: Michael Paquier > Attached is the second one, with more tests coverage with attribute > aliases (these being ignored exists in stable branches, but why not > while on it) and table aliases, and the fixes for the issues pointed > out by Christoph. I'll double-check all that again tomorrow. Plea

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-25 Thread Sami Imseih
> In my experience these often not work well with pg_stat_statements today > because > of their own bloat problem, just like with temp tables. You quickly have way > too many > unique entries, and your query text file accumulates a lot of duplicative > entries > (since the same query text gets r

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-25 Thread Michael Paquier
On Mon, Mar 24, 2025 at 11:09:06PM -0700, Lukas Fittl wrote: > For what its worth, +1 on the current proposal in this thread (and doing it > without a GUC), i.e. merging a query that references the same table alias, > ignoring different schemas. Thanks for the feedback. I have looked again at the

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-24 Thread Lukas Fittl
On Mon, Mar 24, 2025 at 8:51 PM Michael Paquier wrote: > On Mon, Mar 24, 2025 at 09:38:35PM -0500, Sami Imseih wrote: > > > select * from foo s1; > > > select * from foo s2; > > > > ah, thanks for pointing this out. Not as good of a workaround as > > a comment since one must change aliases, but a

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-24 Thread Julien Rouhaud
On Tue, Mar 25, 2025 at 01:17:22AM -0400, Tom Lane wrote: > Julien Rouhaud writes: > > On Tue, Mar 25, 2025 at 12:32:05AM -0400, Tom Lane wrote: > >> 2. Tools that are not entitled to set the value of the GUC are forced > >> to be prepared to cope with any setting. That can be anywhere from > >>

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-24 Thread Tom Lane
Julien Rouhaud writes: > On Tue, Mar 25, 2025 at 12:32:05AM -0400, Tom Lane wrote: >> 2. Tools that are not entitled to set the value of the GUC are forced >> to be prepared to cope with any setting. That can be anywhere from >> painful to impossible. > Didn't that ship already sailed in pg14 wh

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-24 Thread Julien Rouhaud
Hi, On Tue, Mar 25, 2025 at 12:32:05AM -0400, Tom Lane wrote: > > I'm not opining one way or the other on whether squashing an IN list > is desirable. My point is that a GUC is a poor way to make --- or > really, avoid making --- such decisions. The reasons we took away > from previous experimen

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-24 Thread Sami Imseih
> Exactly. Like Tom I'm not really worried about the proposal, but of > course I could prove to be wrong. I am ready to assume that bloat in > pgss entries caused by temp tables is a more common case. I ran installcheck-parallel with the patch and without the patch 3x, and the reduction in pg_s_

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-24 Thread Tom Lane
Sami Imseih writes: >> I don't like that GUC and I would not like this one either. We >> learned years ago that GUCs that change query semantics are a bad >> idea, but apparently now we have hackers who need to relearn that >> lesson the hard way. > query_id_squash_values has a much weaker argum

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-24 Thread Michael Paquier
On Mon, Mar 24, 2025 at 10:30:59PM -0500, Sami Imseih wrote: >> Sami Imseih writes: >>> I agree that some may want stats to merge for the same tables, >>> and others may not. I will suggest this with some hesitation, but why not >>> make this behavior configurable via a GUC? >>> We recently introd

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-24 Thread Michael Paquier
On Mon, Mar 24, 2025 at 09:38:35PM -0500, Sami Imseih wrote: > > select * from foo s1; > > select * from foo s2; > > ah, thanks for pointing this out. Not as good of a workaround as > a comment since one must change aliases, but at least there is > a workaround... Exactly. Like Tom I'm not reall

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-24 Thread Michael Paquier
On Mon, Mar 24, 2025 at 04:41:35PM +0100, Christoph Berg wrote: > Re: Michael Paquier >> +++ b/src/backend/nodes/queryjumblefuncs.c >> @@ -33,6 +33,7 @@ >> #include "postgres.h" >> >> #include "access/transam.h" >> +#include "catalog/namespace.h" > > No longer needed. Indeed. >> +++ b/contri

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-24 Thread Sami Imseih
> Sami Imseih writes: > > I agree that some may want stats to merge for the same tables, > > and others may not. I will suggest this with some hesitation, but why not > > make this behavior configurable via a GUC? > > We recently introduced query_id_squash_values for controlling > > the merge of a

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-24 Thread Tom Lane
Sami Imseih writes: > I agree that some may want stats to merge for the same tables, > and others may not. I will suggest this with some hesitation, but why not > make this behavior configurable via a GUC? > We recently introduced query_id_squash_values for controlling > the merge of an IN list, m

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-24 Thread Sami Imseih
> select * from foo s1; > select * from foo s2; ah, thanks for pointing this out. Not as good of a workaround as a comment since one must change aliases, but at least there is a workaround... > As against this, there is probably also a set of people who would > *like* identical queries on identic

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-24 Thread Tom Lane
Sami Imseih writes: > For example, I have seen users add comments to SQLs to differentiate > similar SQLs coming from different tenants. This patch makes this no longer a > somewhat decent workaround to overcome the fact that pg_stat_statements > does not track schemas or search path. Well, the

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-24 Thread Sami Imseih
> So your idea to use the relation name in eref while skipping the > column list looks kind of promising. Per se the attached. Thoughts? I feel really uneasy about this behavior becoming the default. I can bet there are some users which run common queries across different schemas ( e.g. multi-te

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-24 Thread Christoph Berg
Re: Michael Paquier > So your idea to use the relation name in eref while skipping the > column list looks kind of promising. Per se the attached. Thoughts? Makes sense to me, thanks for digging into it. > +++ b/src/backend/nodes/queryjumblefuncs.c > @@ -33,6 +33,7 @@ > #include "postgres.h" >

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-22 Thread Michael Paquier
On Sat, Mar 22, 2025 at 09:04:19PM -0400, Tom Lane wrote: > Right. I'm arguing that that's good. The proposed patch already > obscures the difference between similar table names in different > (temp) schemas, and I'm suggesting that taking that a bit further > would be fine. > > Note that if the

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-22 Thread Michael Paquier
On Sat, Mar 22, 2025 at 12:24:43PM -0400, Tom Lane wrote: > I experimented with this trivial fix (shown in-line to keep the cfbot > from thinking this is the patch-of-record): > > What's happening there is that there's an ALTER TABLE ADD COLUMN in > the test, so the executions after the first one

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-22 Thread Tom Lane
Michael Paquier writes: > Alias.aliasname is not qualified, so it means that we'd begin to > assign the same query ID even if using two relations from two schemas > depending on what search_path assigns, no? Right. I'm arguing that that's good. The proposed patch already obscures the difference

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-22 Thread Tom Lane
I wrote: > So my feeling is: if we think this is the behavior we want, let's do > it across the board. I suggest that we simply drop the relid from the > jumble and use the table alias (that is, eref->aliasname) instead. I experimented with this trivial fix (shown in-line to keep the cfbot from t

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-22 Thread Tom Lane
Michael Paquier writes: > On Sat, Mar 22, 2025 at 10:43:00AM +0100, Christoph Berg wrote: >> Are we at the point where the patch is already Ready for Committer? > I'll think a bit more about how to document all that. Anyway, yes, > I'm OK with the per-field custom_query_jumble, so let's move on

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-22 Thread Christoph Berg
Re: To Michael Paquier > > >> +#define JUMBLE_CUSTOM(nodetype, item) \ > > >> +_jumble##nodetype##_##item(jstate, expr, expr->item) > > > > In this one, I want to mean that we require a custom per-field > > function to look like that: > > _jumbleNodefoo_field(JumbleState *jstate, NodeFoo *

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-22 Thread Christoph Berg
Re: Michael Paquier > >> + * Note that the argument types are enforced for the per-field custom > >> + * functions. > >> + */ > >> +#define JUMBLE_CUSTOM(nodetype, item) \ > >> + _jumble##nodetype##_##item(jstate, expr, expr->item) > > In this one, I want to mean that we require a custom per-fiel

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-21 Thread Michael Paquier
On Fri, Mar 21, 2025 at 05:26:20PM +0100, Christoph Berg wrote: > Just one minor thing, I don't understand what you are trying to say in > this comment: > >> +/* >> + * Note that the argument types are enforced for the per-field custom >> + * functions. >> + */ >> +#define JUMBLE_CUSTOM(nodetype,

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-21 Thread Christoph Berg
Re: Michael Paquier > I have been thinking about this patch for a couple of days. What > makes me unhappy in this proposal is that RangeTblEntry is a large and > complicated Node, and we only want to force a custom computation for > the "relid" portion of the node, while being able to also take so

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-20 Thread Michael Paquier
On Wed, Mar 19, 2025 at 01:02:54PM +0100, Christoph Berg wrote: > You are of course right, that one-line comment was just snakeoil :D. > Now there are proper ones, thanks. I have been thinking about this patch for a couple of days. What makes me unhappy in this proposal is that RangeTblEntry is a

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-19 Thread Christoph Berg
Re: Michael Paquier > Fine by me as well to keep a dependency based on the fact that the > structure is rather complicated, but I'd rather document that as well > in parsenodes.h with a slightly fatter comment. What do you think? You are of course right, that one-line comment was just snakeoil :D

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-18 Thread Michael Paquier
On Tue, Mar 18, 2025 at 05:51:54PM +0100, Christoph Berg wrote: > I had thought about it, but figured that integers and strings are > already separate namespaces, so hashing them shouldn't have any > conflicts. But it's more clear to do that, so added in the new > version: > >AppendJumble(

query_id: jumble names of temp tables for better pg_stat_statement UX

2025-03-18 Thread Christoph Berg
Re: Michael Paquier > This is OK on its own, still feels a bit incomplete, as the relid also > includes an assumption about the namespace. I would suggested to add > a hardcoded "pg_temp" here, to keep track of this assumption, at > least. I had thought about it, but figured that integers and str