On Tue, Jul 15, 2025 at 11:20 PM Alexander Kukushkin
wrote:
> However, we regularly hear from many different customers that they *don't
> control queries* sent by application or *can't modify these queries*.
> Such kinds of workloads are also not that uncommon and this change makes
> it impossibl
On Wed, 16 Jul 2025 at 01:39, Michael Paquier wrote:
>
> > create schema s1;
> > create table s1.t as select id from generate_series(1, 10) as id;
> > create schema s2;
> > create table s1.t as select id from generate_series(1, 100) as id;
>
> I suspect that you mean s2.t and not s1.t here.
>
On Tue, Jul 15, 2025 at 04:48:05PM +0200, Alexander Kukushkin wrote:
> I totally understand the wish to make pg_stat_statements useful for
> workloads that create/drop a ton of temporary tables.
> However, when pursuing this goal we impacted other types of totally valid
> workloads when tables with
Hi,
I totally understand the wish to make pg_stat_statements useful for
workloads that create/drop a ton of temporary tables.
However, when pursuing this goal we impacted other types of totally valid
workloads when tables with the same name exist in different schemas.
Example:
create schema s1;
cr
On Sat, Mar 22, 2025 at 10:43:00AM +0100, Christoph Berg wrote:
> Are we at the point where the patch is already Ready for Committer?
I'll think a bit more about how to document all that. Anyway, yes,
I'm OK with the per-field custom_query_jumble, so let's move on with
that, so I will do somethin
On Tue, Mar 25, 2025 at 07:56:29PM -0500, Sami Imseih wrote:
> FWIW, the pg_stat_statements docs in a few places refer to
> queries that may look different but have the same meaning
> as “semantically equivalent”, this is why I used the same
> terminology here. But, I have no issue with the sim
> If I get the difference right, semantically would apply to concepts
> related to linguistics, but that's not what we have here, so you are
> using a more general term.
FWIW, the pg_stat_statements docs in a few places refer to
queries that may look different but have the same meaning
as “sem
On Tue, Mar 25, 2025 at 07:24:20PM -0400, Tom Lane wrote:
> fails to honor $query_jumble_ignore as the other if-branches do.
> Perhaps it's unlikely that a node would have both query_jumble_custom
> and query_jumble_ignore annotations, but still, the script would emit
> completely incorrect code if
Michael Paquier writes:
> [ v6-0001-Add-custom-query-jumble-function-for-RangeTblEntr.patch ]
Couple of minor review comments ...
* In 5ac462e2b, this bit:
# node type
- if (($t =~ /^(\w+)\*$/ or $t =~ /^struct\s+(\w+)\*$/)
+ if ($query_jumble_custom)
+ {
+ #
On Tue, Mar 25, 2025 at 11:58:21AM -0500, Sami Imseih wrote:
> "Since the queryid hash value is computed on the post-parse-analysis
> representation of the queries, the opposite is also possible: queries with
> identical texts might appear as separate entries, if they have different
> meanings as a
> > Attached is the second one, with more tests coverage with attribute
> > aliases (these being ignored exists in stable branches, but why not
> > while on it) and table aliases, and the fixes for the issues pointed
> > out by Christoph. I'll double-check all that again tomorrow. Please
> > find
Re: Michael Paquier
> Attached is the second one, with more tests coverage with attribute
> aliases (these being ignored exists in stable branches, but why not
> while on it) and table aliases, and the fixes for the issues pointed
> out by Christoph. I'll double-check all that again tomorrow. Plea
> In my experience these often not work well with pg_stat_statements today
> because
> of their own bloat problem, just like with temp tables. You quickly have way
> too many
> unique entries, and your query text file accumulates a lot of duplicative
> entries
> (since the same query text gets r
On Mon, Mar 24, 2025 at 11:09:06PM -0700, Lukas Fittl wrote:
> For what its worth, +1 on the current proposal in this thread (and doing it
> without a GUC), i.e. merging a query that references the same table alias,
> ignoring different schemas.
Thanks for the feedback. I have looked again at the
On Mon, Mar 24, 2025 at 8:51 PM Michael Paquier wrote:
> On Mon, Mar 24, 2025 at 09:38:35PM -0500, Sami Imseih wrote:
> > > select * from foo s1;
> > > select * from foo s2;
> >
> > ah, thanks for pointing this out. Not as good of a workaround as
> > a comment since one must change aliases, but a
On Tue, Mar 25, 2025 at 01:17:22AM -0400, Tom Lane wrote:
> Julien Rouhaud writes:
> > On Tue, Mar 25, 2025 at 12:32:05AM -0400, Tom Lane wrote:
> >> 2. Tools that are not entitled to set the value of the GUC are forced
> >> to be prepared to cope with any setting. That can be anywhere from
> >>
Julien Rouhaud writes:
> On Tue, Mar 25, 2025 at 12:32:05AM -0400, Tom Lane wrote:
>> 2. Tools that are not entitled to set the value of the GUC are forced
>> to be prepared to cope with any setting. That can be anywhere from
>> painful to impossible.
> Didn't that ship already sailed in pg14 wh
Hi,
On Tue, Mar 25, 2025 at 12:32:05AM -0400, Tom Lane wrote:
>
> I'm not opining one way or the other on whether squashing an IN list
> is desirable. My point is that a GUC is a poor way to make --- or
> really, avoid making --- such decisions. The reasons we took away
> from previous experimen
> Exactly. Like Tom I'm not really worried about the proposal, but of
> course I could prove to be wrong. I am ready to assume that bloat in
> pgss entries caused by temp tables is a more common case.
I ran installcheck-parallel with the patch and without the patch
3x, and the reduction in pg_s_
Sami Imseih writes:
>> I don't like that GUC and I would not like this one either. We
>> learned years ago that GUCs that change query semantics are a bad
>> idea, but apparently now we have hackers who need to relearn that
>> lesson the hard way.
> query_id_squash_values has a much weaker argum
On Mon, Mar 24, 2025 at 10:30:59PM -0500, Sami Imseih wrote:
>> Sami Imseih writes:
>>> I agree that some may want stats to merge for the same tables,
>>> and others may not. I will suggest this with some hesitation, but why not
>>> make this behavior configurable via a GUC?
>>> We recently introd
On Mon, Mar 24, 2025 at 09:38:35PM -0500, Sami Imseih wrote:
> > select * from foo s1;
> > select * from foo s2;
>
> ah, thanks for pointing this out. Not as good of a workaround as
> a comment since one must change aliases, but at least there is
> a workaround...
Exactly. Like Tom I'm not reall
On Mon, Mar 24, 2025 at 04:41:35PM +0100, Christoph Berg wrote:
> Re: Michael Paquier
>> +++ b/src/backend/nodes/queryjumblefuncs.c
>> @@ -33,6 +33,7 @@
>> #include "postgres.h"
>>
>> #include "access/transam.h"
>> +#include "catalog/namespace.h"
>
> No longer needed.
Indeed.
>> +++ b/contri
> Sami Imseih writes:
> > I agree that some may want stats to merge for the same tables,
> > and others may not. I will suggest this with some hesitation, but why not
> > make this behavior configurable via a GUC?
> > We recently introduced query_id_squash_values for controlling
> > the merge of a
Sami Imseih writes:
> I agree that some may want stats to merge for the same tables,
> and others may not. I will suggest this with some hesitation, but why not
> make this behavior configurable via a GUC?
> We recently introduced query_id_squash_values for controlling
> the merge of an IN list, m
> select * from foo s1;
> select * from foo s2;
ah, thanks for pointing this out. Not as good of a workaround as
a comment since one must change aliases, but at least there is
a workaround...
> As against this, there is probably also a set of people who would
> *like* identical queries on identic
Sami Imseih writes:
> For example, I have seen users add comments to SQLs to differentiate
> similar SQLs coming from different tenants. This patch makes this no longer a
> somewhat decent workaround to overcome the fact that pg_stat_statements
> does not track schemas or search path.
Well, the
> So your idea to use the relation name in eref while skipping the
> column list looks kind of promising. Per se the attached. Thoughts?
I feel really uneasy about this behavior becoming the default.
I can bet there are some users which run common queries across
different schemas ( e.g. multi-te
Re: Michael Paquier
> So your idea to use the relation name in eref while skipping the
> column list looks kind of promising. Per se the attached. Thoughts?
Makes sense to me, thanks for digging into it.
> +++ b/src/backend/nodes/queryjumblefuncs.c
> @@ -33,6 +33,7 @@
> #include "postgres.h"
>
On Sat, Mar 22, 2025 at 09:04:19PM -0400, Tom Lane wrote:
> Right. I'm arguing that that's good. The proposed patch already
> obscures the difference between similar table names in different
> (temp) schemas, and I'm suggesting that taking that a bit further
> would be fine.
>
> Note that if the
On Sat, Mar 22, 2025 at 12:24:43PM -0400, Tom Lane wrote:
> I experimented with this trivial fix (shown in-line to keep the cfbot
> from thinking this is the patch-of-record):
>
> What's happening there is that there's an ALTER TABLE ADD COLUMN in
> the test, so the executions after the first one
Michael Paquier writes:
> Alias.aliasname is not qualified, so it means that we'd begin to
> assign the same query ID even if using two relations from two schemas
> depending on what search_path assigns, no?
Right. I'm arguing that that's good. The proposed patch already
obscures the difference
I wrote:
> So my feeling is: if we think this is the behavior we want, let's do
> it across the board. I suggest that we simply drop the relid from the
> jumble and use the table alias (that is, eref->aliasname) instead.
I experimented with this trivial fix (shown in-line to keep the cfbot
from t
Michael Paquier writes:
> On Sat, Mar 22, 2025 at 10:43:00AM +0100, Christoph Berg wrote:
>> Are we at the point where the patch is already Ready for Committer?
> I'll think a bit more about how to document all that. Anyway, yes,
> I'm OK with the per-field custom_query_jumble, so let's move on
Re: To Michael Paquier
> > >> +#define JUMBLE_CUSTOM(nodetype, item) \
> > >> +_jumble##nodetype##_##item(jstate, expr, expr->item)
> >
> > In this one, I want to mean that we require a custom per-field
> > function to look like that:
> > _jumbleNodefoo_field(JumbleState *jstate, NodeFoo *
Re: Michael Paquier
> >> + * Note that the argument types are enforced for the per-field custom
> >> + * functions.
> >> + */
> >> +#define JUMBLE_CUSTOM(nodetype, item) \
> >> + _jumble##nodetype##_##item(jstate, expr, expr->item)
>
> In this one, I want to mean that we require a custom per-fiel
On Fri, Mar 21, 2025 at 05:26:20PM +0100, Christoph Berg wrote:
> Just one minor thing, I don't understand what you are trying to say in
> this comment:
>
>> +/*
>> + * Note that the argument types are enforced for the per-field custom
>> + * functions.
>> + */
>> +#define JUMBLE_CUSTOM(nodetype,
Re: Michael Paquier
> I have been thinking about this patch for a couple of days. What
> makes me unhappy in this proposal is that RangeTblEntry is a large and
> complicated Node, and we only want to force a custom computation for
> the "relid" portion of the node, while being able to also take so
On Wed, Mar 19, 2025 at 01:02:54PM +0100, Christoph Berg wrote:
> You are of course right, that one-line comment was just snakeoil :D.
> Now there are proper ones, thanks.
I have been thinking about this patch for a couple of days. What
makes me unhappy in this proposal is that RangeTblEntry is a
Re: Michael Paquier
> Fine by me as well to keep a dependency based on the fact that the
> structure is rather complicated, but I'd rather document that as well
> in parsenodes.h with a slightly fatter comment. What do you think?
You are of course right, that one-line comment was just snakeoil :D
On Tue, Mar 18, 2025 at 05:51:54PM +0100, Christoph Berg wrote:
> I had thought about it, but figured that integers and strings are
> already separate namespaces, so hashing them shouldn't have any
> conflicts. But it's more clear to do that, so added in the new
> version:
>
>AppendJumble(
Re: Michael Paquier
> This is OK on its own, still feels a bit incomplete, as the relid also
> includes an assumption about the namespace. I would suggested to add
> a hardcoded "pg_temp" here, to keep track of this assumption, at
> least.
I had thought about it, but figured that integers and str
42 matches
Mail list logo