Re: Query about pg asynchronous processing support
Hi,

You should search for information about postgres hooks, like those used in the extension pg_stat_statements:
https://github.com/postgres/postgres/blob/master/contrib/pg_stat_statements/pg_stat_statements.c

and about background-worker capabilities, like those used in the extension pg_background:
https://github.com/vibhorkum/pg_background

There are also extensions working with indexes, like hypopg:
https://github.com/HypoPG/hypopg

Good luck

Regards
PAscal

-- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: WIP: System Versioned Temporal Table
Hello,

it seems that Oracle (11gR2) doesn't add the start and end timestamp columns, and permits statements like:

select * from tt
union
select * from tt AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '6' SECOND)
minus
select * from tt VERSIONS BETWEEN TIMESTAMP (SYSTIMESTAMP - INTERVAL '6' SECOND) and SYSTIMESTAMP;

Regards
PAscal
Re: pg_stat_statements oddity with track = all
Hi Julien,

> The extra field I've proposed would increase the number of records, as it
> needs to be a part of the key.

For the number of records to increase, the same statement would have to appear at top level AND at a nested level. That seems like a corner case with a very low (negligible) occurrence rate. Did I miss something?

Regards
PAscal
Re: pg_stat_statements oddity with track = all
Hi,

a crazy idea:
- add a parent_statement_id column that would be NULL for top-level queries
- build the statement_id for nested queries based on the merge of:
  a/ the current statement_id and the parent one, or
  b/ the current statement_id and the nesting level.

This would offer the ability to track counters at any depth level ;o)

Regards
PAscal
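To illustrate idea a/, a sketch of how the counters could then be rolled up (parent_statement_id and statement_id are purely hypothetical columns, not part of pg_stat_statements today):

```sql
-- Roll nested counters up to their top-level statement.
-- parent_statement_id / statement_id are hypothetical columns.
SELECT top.query,
       top.total_exec_time         AS top_level_time,
       sum(nested.total_exec_time) AS nested_time
FROM pg_stat_statements top
JOIN pg_stat_statements nested
  ON nested.parent_statement_id = top.statement_id
WHERE top.parent_statement_id IS NULL
GROUP BY top.query, top.total_exec_time;
```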
Re: Implementing Incremental View Maintenance
Hello Konstantin,

I remember testing it with pg_stat_statements (with planning counters enabled). Maybe identifying the internal queries associated with this (simple) test case could help the dev team?

Regards
PAscal
Re: Is it useful to record whether plans are generic or custom?
Hi Atsushi,

+1: your proposal is a good answer for time-based performance analysis (even if parsing duration or blks counters are not differentiated).

As it makes the number of pgss columns grow, maybe another solution would be to create a pg_stat_statements_xxx view with the same key as pgss (dbid, userid, queryid) and all those new counters.

A last solution would be to display only the generic counters, because in most cases (and by default) executions are custom ones (and the generic counters = 0). When the generic counters != 0, the custom ones could be deduced as total_exec_time - total_generic_time and calls - generic_calls.

Good luck with this feature development

Regards
PAscal
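That last idea could be queried like this (a sketch only, assuming hypothetical total_generic_time and generic_calls columns were added to pg_stat_statements):

```sql
-- Deduce custom-plan counters from the totals and the
-- (hypothetical) generic-only counters.
SELECT queryid,
       calls - generic_calls                AS custom_calls,
       total_exec_time - total_generic_time AS custom_exec_time
FROM pg_stat_statements
WHERE generic_calls > 0;
```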
Re: enable pg_stat_statements to track rows processed by REFRESH MATERIALIZED VIEW
Oops, sorry; so +1 for this fix

Regards
PAscal
Re: enable pg_stat_statements to track rows processed by REFRESH MATERIALIZED VIEW
Hi,

isn't this already fixed in pg14?
https://www.postgresql.org/message-id/e1k0mzg-0002vn...@gemulon.postgresql.org

Regards
PAscal
Re: [PATCH] Add features to pg_stat_statements
Not limited to 3; more like a history table. We will have to think about whether that data should be saved at shutdown.
Re: [PATCH] Add features to pg_stat_statements
Hi,

Both are probably not needed. I would prefer it accessible in a view like:

Event      | date | victims
-----------|------|--------
Eviction   | ...  | ...
Reset      | ...  | ...
Part reset | ...  | ...

as there are other needs to track reset times.

Regards
PAscal
Re: [PATCH] Add features to pg_stat_statements
+1 !

Another way is to log evictions; it provides information about the time and the amount:

    for (i = 0; i < nvictims; i++)
    {
        hash_search(pgssp_hash, &entries[i]->key, HASH_REMOVE, NULL);
    }
    pfree(entries);

    /*
     * Trace when evicting entries; if happening too often this can slow
     * queries ... increasing the pg_stat_sql_plans.max value could help.
     */
    ereport(LOG,
            (errmsg("pg_stat_sql_plans evicting %d entries", nvictims),
             errhidecontext(true),
             errhidestmt(true)));
RE: Is it useful to record whether plans are generic or custom?
> I thought it might be preferable to make a GUC to enable or disable this
> feature, but changing the hash key makes it harder.
>>
>>> What happens if the server was running with this option enabled and then
>>> restarted with the option disabled? Firstly two entries for the same query
>>> were stored in pgss because the option was enabled. But when it's disabled
>>> and the server is restarted, those two entries should be merged into one
>>> at the startup of server? If so, that's problematic because it may take
>>> a long time.
>>
>> What would you think about a third value for this flag to handle the
>> disabled case (no need to merge entries in this situation)?
>
> Sorry I failed to understand your point. You mean that we can have another
> flag to specify whether to merge the entries for the same query at that
> case or not?
>
> If those entries are not merged, what does pg_stat_statements return?
> It returns the sum of those entries? Or either generic or custom entry is
> returned?

The idea is to use a plan_type enum with values 'generic', 'custom' and 'unknown' (or 'unset'). If tracking the plan type is disabled, then plan_type = 'unknown'; otherwise plan_type takes the 'generic' or 'custom' value. I'm not proposing to merge results for plan_type when disabling or enabling its tracking.

>>> Therefore I think that it's better and simple to just expose the number of
>>> times generic/custom plan was chosen for each query.
>>
>> I thought this feature was mainly needed to identify "under-optimal generic
>> plans". Without differentiating execution counters, this will be simpler but
>> useless in this case ... isn't it?
>
> Could you elaborate "under optimal generic plans"? Sorry, I failed to
> understand your point.. But maybe you're thinking to use this feature to
> check which generic or custom plan is better for each query?
> I was just thinking that this feature was useful to detect the case where
> the query was executed with an unexpected plan mode. That is, we can detect
> the unexpected case where the query that should be executed with generic
> plan is actually executed with custom plan, and vice versa.

There are many examples in the pg lists where users come saying that their query sometimes takes a (very much) longer time than before, without understanding why. In some of those examples, the cause is a switch from custom to generic plans after n executions, and the query takes longer because the generic plan is not as good as the custom one (I call those "under-optimal generic plans").

If pgss keeps counters aggregated for both plan types, I don't see how this (under-optimality) can be shown. If there is one line in pgss for custom and another for generic, then it becomes easy to compare.

Does this make sense?

Regards
PAscal

> Regards,
> --
> Fujii Masao
> Advanced Computing Technology Center
> Research and Development Headquarters
> NTT DATA CORPORATION
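With separate rows per plan type, spotting an under-optimal generic plan would be a simple self-join. A sketch only, assuming the hypothetical plan_type column were part of the pgss key:

```sql
-- Compare per-call execution time of generic vs custom plans
-- for the same statement (plan_type is a hypothetical column).
SELECT g.queryid,
       g.total_exec_time / g.calls AS generic_time_per_call,
       c.total_exec_time / c.calls AS custom_time_per_call
FROM pg_stat_statements g
JOIN pg_stat_statements c USING (userid, dbid, queryid)
WHERE g.plan_type = 'generic'
  AND c.plan_type = 'custom'
  AND g.total_exec_time / g.calls > 2 * (c.total_exec_time / c.calls);
```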
nested queries vs. pg_stat_activity
Hello,

Another solution is to expose the nested queryid, and to join it with pg_stat_statements.

The current development trying to add queryid to pg_stat_activity isn't helpful here, because it only exposes the top-level one.

The pg_stat_sql_plans extension (on github) proposes a function called pg_backend_queryid(pid), which gives the expected queryid (stored in shared memory for each backend) ...

Regards
PAscal
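Assuming the pg_stat_sql_plans extension is installed, such a join could look like this (a sketch; pg_backend_queryid() comes from that extension, not from core):

```sql
-- For each active backend, show the text of the statement it is
-- currently executing, even when nested.
SELECT a.pid,
       a.state,
       s.query AS current_statement
FROM pg_stat_activity a
JOIN pg_stat_statements s
  ON s.queryid = pg_backend_queryid(a.pid)
WHERE a.state = 'active';
```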
RE: Is it useful to record whether plans are generic or custom?
>> Main purpose is to decide (1) the user interface and (2) the
>> way to get the plan type from pg_stat_statements.
>>
>> (1) the user interface
>> I added a new boolean column 'generic_plan' to both the
>> pg_stat_statements view and the members of the hash key of
>> pg_stat_statements.
>>
>> This is because, as Legrand pointed out, the feature seems
>> useful under the condition of differentiating all the
>> counters for a queryid using a generic plan and the one
>> using a custom one.
>
> I don't like this because this may double the number of entries in pgss.
> Which means that the number of entries can more easily reach
> pg_stat_statements.max and some entries will be discarded.

Not all the entries will be doubled, only the ones that are prepared. And even if auto-prepare were implemented, having 5000, 1 or 2 max entries seems not a problem to me.

>> I thought it might be preferable to make a GUC to enable
>> or disable this feature, but changing the hash key makes
>> it harder.
>
> What happens if the server was running with this option enabled and then
> restarted with the option disabled? Firstly two entries for the same query
> were stored in pgss because the option was enabled. But when it's disabled
> and the server is restarted, those two entries should be merged into one
> at the startup of server? If so, that's problematic because it may take
> a long time.

What would you think about a third value for this flag to handle the disabled case (no need to merge entries in this situation)?

> Therefore I think that it's better and simple to just expose the number of
> times generic/custom plan was chosen for each query.

I thought this feature was mainly needed to identify "under-optimal generic plans". Without differentiating execution counters, this will be simpler but useless in that case ... isn't it?

> Regards,
> --
> Fujii Masao
> Advanced Computing Technology Center
> Research and Development Headquarters
> NTT DATA CORPORATION

Regards
PAscal
Re: Planning counters in pg_stat_statements (using pgss_store)
>> If we can store the plan for each statement, e.g., like the pg_store_plans
>> extension [1] does, rather than such partial information, which would
>> be enough for your cases?
>
> That'd definitely address way more use cases. Do you know if some
> benchmark were done to see how much overhead such an extension adds?

Hi Julien,

Did you ask how much overhead auto_explain adds?

The only extension that proposed storing plans with a decent planid calculation was pg_stat_plans, which has not been compatible with recent pg versions for years. We all know here that pg_store_plans, pg_show_plans and (my) pg_stat_sql_plans use ExplainPrintPlan through the executor hook, and that EXPLAIN is slow ...

EXPLAIN is slow because it was not designed for performance:

1/ colname_is_unique, see
https://www.postgresql-archive.org/Re-Explain-is-slow-with-tables-having-many-columns-td6047284.html

2/ hash_create from set_rtable_names. Look with perf top while running:

do $$
declare i int;
begin
  for i in 1..100 loop
    execute 'explain select 1';
  end loop;
end;
$$;

I may propose a "minimal" explain that only displays the plan's backbone and is much faster, see
https://github.com/legrandlegrand/pg_stat_sql_plans/blob/perf-explain/pgssp_explain.c

3/ All those extensions rebuild the explain output even for cached-plan queries ... a way to optimize this would be to build a planid during planning (using the associated hook).

4/ All those extensions try to rebuild the explain plan even for trivial queries/plans like "select 1" or "insert into t values (,,,)", and that's not great for highly transactional applications ...

So yes, pg_store_plans is one of the short-term answers to Andy Fan's needs; the answer for the long term would be to help extensions build a planid and store plans, by **adding a planid field in the PlannedStmt memory structure** and/or optimizing the EXPLAIN command ;o)

Regards
PAscal
Re: Is it useful to record whether plans are generic or custom?
> To track executed plan types, I think execution layer hooks
> are appropriate.
> These hooks, however, take QueryDesc as a param and it does
> not include cached plan information.

It seems that the same QueryDesc entry is reused when executing a generic plan. For example, when marking queryDesc->plannedstmt->queryId (outside pg_stat_statements) with a pseudo tag during ExecutorStart, that tag reappears in later executions with generic plans ...

Is this QueryDesc reusable by a custom plan? If not, maybe a solution could be to add a flag in queryDesc->plannedstmt? (Sorry, I haven't understood yet how and when this generic plan is managed during planning.)

Regards
PAscal
Re: Is it useful to record whether plans are generic or custom?
Hello,

yes, this can be useful, under the condition of differentiating all the counters for a queryid using a generic plan from the ones using a custom one. For me, one way to do that is adding a generic_plan column to the pg_stat_statements key, something like:
- userid,
- dbid,
- queryid,
- generic_plan

I don't know if generic/custom plan types are available during the planning and/or execution hooks ...

Regards
PAscal
Re: Planning counters in pg_stat_statements (using pgss_store)
Fujii Masao-4 wrote
> On 2020/04/03 16:26
> [...]
>>
>> "Note that planning and execution statistics are updated only at their
>> respective end phase, and only for successful operations.
>> For example the execution counters of a long running query
>> will only be updated at the execution end, without showing any progress
>> report before that.
>
> Probably since this is not the example for explaining the relationship of
> planning and execution stats, it's better to explain this separately or just
> drop it?
>
> What about the attached patch based on your proposals?

+1, your patch is perfect ;^>

Regards
PAscal
Re: Planning counters in pg_stat_statements (using pgss_store)
Fujii Masao-4 wrote
> On 2020/04/01 18:19, Fujii Masao wrote:
>> Finally I pushed the patch!
>> Many thanks for all involved in this patch!
>
> As a remaining TODO item, I'm thinking that the document would need to
> be improved. For example, previously the query was not stored in pgss
> when it failed. But, in v13, if pgss_planning is enabled, such a query is
> stored because the planning succeeds. Without the explanation about
> that behavior in the document, I'm afraid that users will get confused.
> Thought?
>
> Regards,
> --
> Fujii Masao
> Advanced Computing Technology Center
> Research and Development Headquarters
> NTT DATA CORPORATION

Thank you all for this work, and especially to Julien for his major contribution!

Regarding the TODO point: yes, I agree that it can be improved. My proposal:

"Note that planning and execution statistics are updated only at their
respective end phase, and only for successful operations. For example,
the execution counters of a long-running SELECT query will only be
updated at the execution end, without showing any progress report in
the interval. As another example, if a statement is successfully planned
but fails in the execution phase, only its planning statistics are
stored. This may give uncorrelated plans vs calls information."

Regards
PAscal
Re: Patch: to pass query string to pg_plan_query()
Tom Lane-2 wrote
> Fujii Masao <masao.fujii@.nttdata> writes:
>> Does anyone object to this patch? I'm thinking to commit it separately
>> at first before committing the planning_counter_in_pg_stat_statements
>> patch.
>
> I took a quick look through v9-0001-Pass-query-string-to-the-planner.patch
> and it's fine by me. It also matches up with something I've wanted to
> do for awhile, which is to make the query string available during
> planning and execution so that we can produce error cursors for
> run-time errors, when relevant.
>
> [...]
>
> regards, tom lane

Great! Good news ;o)

Regards
PAscal
Re: Allow auto_explain to log plans before queries are executed
Kyotaro Horiguchi-4 wrote
> At Thu, 27 Feb 2020 06:27:24 +0100, Pavel Stehule <pavel.stehule@> wrote in
>> Kyotaro Horiguchi <horikyota.ntt@> wrote:
>
> If we need a live plan dump of a running query, we could do that using
> some kind of inter-backend triggering. (I'm not sure if PG offers an
> inter-backend signalling facility usable by extensions..)
>
> =# select auto_explain.log_plan_backend(12345);
>
> postgresql.log:
> LOG: requested plan dump: blah, blah..
>
> regards.
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center

Did you know about
https://www.postgresql-archive.org/pg-show-plans-Seeing-all-execution-plans-at-once-td6129231.html
?

Regards
PAscal
Re: Planning counters in pg_stat_statements (using pgss_store)
> I'm instead attaching a v7 which removes the assert in pg_plan_query, and
> modifies pgss_planner_hook to also ignore queries without a query text, as
> this seems the best option.

OK, that was the second solution; go on!
RE: Planning counters in pg_stat_statements (using pgss_store)
> I don't know if it is useful, but there is also code that avoids an error
> when sourceText is NULL:
>
> executor_errposition(EState *estate, int location)
> {
> ...
>     /* Can't do anything if source text is not available */
>     if (estate == NULL || estate->es_sourceText == NULL)
> }

Or maybe would you prefer to replace the non-zero queryid test with a non-NULL sourceText one?
RE: Planning counters in pg_stat_statements (using pgss_store)
imai.yoshik...@fujitsu.com wrote
> On Thu, Mar 12, 2020 at 6:37 PM, Julien Rouhaud wrote:
>> On Thu, Mar 12, 2020 at 1:11 PM Marco Slot <marco.slot@> wrote:
>>> On Thu, Mar 12, 2020 at 11:31 AM Julien Rouhaud <rjuju123@> wrote:
>>>> There's at least the current version of IVM patchset that lacks the
>>>> querytext. Looking at various extensions, I see that pg_background
>>>> and pglogical call pg_plan_query internally but shouldn't have any
>>>> issue providing the query text. But there's also citus extension,
>>>> which don't keep around the query string at least when distributing
>>>> plans, which makes sense since it's of no use and they're heavily
>>>> modifying the original Query. I think that citus folks opinion on
>>>> the subject would be useful, so I'm Cc-ing Marco.
>>>
>>> Most of the time we keep our Query * data structures in a form that
>>> can be deparsed back into a query string by a modified copy of
>>> ruleutils.c, so we could generate a correct query string if absolutely
>>> necessary, though there are performance-sensitive cases where we'd
>>> rather not have the deparsing overhead.
>>
>> Yes, deparsing is probably too expensive for that use case.
>>
>>> In case of INSERT..SELECT into a distributed table, we might call
>>> pg_plan_query on the SELECT part (Query *) and send the output into a
>>> DestReceiver that sends tuples into shards of the distributed table
>>> via COPY. The fact that SELECT does not show up in pg_stat_statements
>>> separately is generally fine because it's an implementation detail,
>>> and it would probably be a little confusing because the user never ran
>>> the SELECT query. Moreover, the call to pg_plan_query would already be
>>> reflected in the planning or execution time of the top-level query, so
>>> it would be double counted if it had its own entry.
>>
>> Well, surprising statements can already appear in pg_stat_statements when
>> you use some psql features, or if you have triggers, as those will run
>> additional queries under the hood.
>>
>> The difference here is that since citus is a CustomNode, underlying calls
>> to the planner will be accounted for that node, and that's indeed annoying.
>> I can see that citus is doing some calls to spi_exec or
>> Executor* (in ExecuteLocalTaskPlan), which could also trigger
>> pg_stat_statements, but I don't know if a queryid is present there.
>>
>>> Another case is when some of the shards turn out to be local to the
>>> server that handles the distributed query. In that case we plan the
>>> queries on those shards via pg_plan_query instead of deparsing and
>>> sending the query string to a remote server. It would be less
>>> confusing for these queries to show in pg_stat_statements, because the
>>> queries on the shards on remote servers will show up as well. However,
>>> this is a performance-sensitive code path where we'd rather avoid
>>> deparsing.
>>
>> Agreed.
>>
>>> In general, I'd prefer if there was no requirement to pass a correct
>>> query string. I'm ok with passing "SELECT 'citus_internal'" or just ""
>>> if that does not lead to issues. Passing NULL to signal that the
>>> planner call should not be tracked separately does seem a bit cleaner.
>>
>> That's very interesting feedback, thanks! I'm not a fan of giving a way
>> for queries to say that they want to be ignored by pg_stat_statements,
>> but double counting the planning counters seems even worse, so I'm +0.5
>> to accept NULL query string in the planner, incidentally making pgss
>> ignore those.
>
> It is preferable that we can count various queries' statistics as much as
> possible, but if it causes overhead even when not using pg_stat_statements,
> we would not have to force them to set an appropriate query_text.
>
> About setting a fixed string in query_text, I think it doesn't make much
> sense because users can't take any actions by seeing those queries' stats.
> Moreover, if we set a fixed string in query_text to avoid
> pg_stat_statements' errors, the code would be inexplicable for other
> developers who don't know there's such a requirement.
> After all, I agree with accepting a NULL query string in the planner.
>
> I don't know if it is useful, but there is also code that avoids an error
> when sourceText is NULL:
>
> executor_errposition(EState *estate, int location)
> {
> ...
>     /* Can't do anything if source text is not available */
>     if (estate == NULL || estate->es_sourceText == NULL)
> }
>
> --
> Yoshikazu Imai

Hello Imai,

My understanding of the v5 patch, which checks for a non-zero queryid, is that it properly manages the case where sourceText is NULL.

A NULL sourceText means that there was no parsing for the associated query; if there was no parsing, there is no queryid (queryId = 0), and no planning-counters update.

It doesn't change pg_plan_query behaviour (no regression for Citus, IVM, ...), and was tested successfully with IVM.

If my understanding
Patch: to pass query string to pg_plan_query()
Hello,

This is a call for committers, reviewers and users regarding the "planning counters in pg_stat_statements" patch [1], but not only.

Historically, this version of pg_stat_statements with planning counters performed 3 calls to pgss_store() for non-utility statements, in:
1 - pgss_post_parse_analyze (init the entry with the queryid and store the query text)
2 - pgss_planner_hook (to store planning counters)
3 - pgss_ExecutorEnd (to store execution counters)

Then a new version was proposed that removes one call to pgss_store() by adding the query string to the planner entry point pg_plan_query():
1 - pgss_planner_hook (to store planning counters)
2 - pgss_ExecutorEnd (to store execution counters)

Many performance tests were performed, concluding that there is no impact on this subject.

The patch "to pass query string to the planner" could be committed by itself, and (maybe) used by other extensions. If this was done, this new version of pgss with planning counters could be committed as well, or even later (being used as a non-core extension starting with pg13).

So please give us your feedback regarding this patch "to pass query string to the planner", if you have other use cases, or any comment regarding the core architecture.

Note: a problem was discovered during IVM testing, because some queries without a SQL text were planned without being parsed, finishing in pgss with a zero queryid. A workaround is to set track_planning = false; we have chosen to fix that in pgss by ignoring zero queryids inside pgss_planner_hook.

Thanks in advance

Regards
PAscal

[1] https://www.postgresql.org/message-id/20200309103142.GA45401%40nol
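For reference, the workaround in the note can be applied like this (pg_stat_statements.track_planning is the GUC name as it ended up in PG13; it defaults to off):

```sql
-- Disable planning-counter collection if queries planned without a
-- parsed query text (zero queryid) cause trouble, then reload:
ALTER SYSTEM SET pg_stat_statements.track_planning = off;
SELECT pg_reload_conf();
```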
Re: Planning counters in pg_stat_statements (using pgss_store)
Never mind ... please consider the PG13 shortest path ;o)

Mine is the parse->queryId != UINT64CONST(0) check in pgss_planner_hook(). It fixes the IVM problem (verified), and keeps CTAS equal to pgss without planning counters (verified too).

Regards
PAscal
Re: Planning counters in pg_stat_statements (using pgss_store)
Julien Rouhaud wrote
> On Sun, Mar 1, 2020 at 3:55 PM legrand legrand <legrand_legrand@> wrote:
>>>> I like the idea of adding a check for a non-zero queryId in the new
>>>> pgss_planner_hook() (zero queryid shouldn't be reserved for
>>>> utility statements?).
>>>
>>> Some assert hit later, I can say that it's not always true. For
>>> instance a CREATE TABLE AS won't run parse analysis for the underlying
>>> query, as this has already been done for the original statement, but
>>> will still call the planner. I'll change pgss_planner_hook to ignore
>>> such cases, as pgss_store would otherwise think that it's a utility
>>> statement. That'll probably incidentally fix the IVM incompatibility.
>>
>> Today, with or without the test on parse->queryId != UINT64CONST(0),
>> CTAS is collected as a utility statement without planning counters.
>> This seems to me to respect the rule; not sure that this needs any
>> new (risky) change to the actual (quite stable) patch.
>
> But the queryid ends up not being computed the same way:
>
> # select queryid, query, plans, calls from pg_stat_statements where
>   query like 'create table%';
>        queryid       |             query              | plans | calls
> ---------------------+--------------------------------+-------+-------
>  8275950546884151007 | create table test as select 1; |     1 |     0
>  7546197440584636081 | create table test as select 1  |     0 |     1
> (2 rows)
>
> That's because CreateTableAsStmt->query doesn't have a query
> location/len, as transformTopLevelStmt is only setting that for the
> top-level Query. That's probably an oversight in ab1f0c82257, but I'm
> not sure what's the best way to fix that. Should we pass that
> information to all transformXXX functions, or let transformTopLevelStmt
> handle that?

Arf, this was not the case in my testing environment (which is not up to date) :o( and would not have appeared at all with the proposed test on parse->queryId != UINT64CONST(0) ...
Re: Planning counters in pg_stat_statements (using pgss_store)
>> I like the idea of adding a check for a non-zero queryId in the new
>> pgss_planner_hook() (zero queryid shouldn't be reserved for
>> utility statements?).
>
> Some assert hit later, I can say that it's not always true. For
> instance a CREATE TABLE AS won't run parse analysis for the underlying
> query, as this has already been done for the original statement, but
> will still call the planner. I'll change pgss_planner_hook to ignore
> such cases, as pgss_store would otherwise think that it's a utility
> statement. That'll probably incidentally fix the IVM incompatibility.

Today, with or without the test on parse->queryId != UINT64CONST(0), CTAS is collected as a utility statement without planning counters. This seems to me to respect the rule; not sure that this needs any new (risky) change to the actual (quite stable) patch.
Re: Planning counters in pg_stat_statements (using pgss_store)
Hi Julien,

>> But I would have preferred this new feature to work the same way with or
>> without track_planning activated ;o(
>
> Definitely, but fixing the issue in pgss (ignoring planner calls when
> we don't have a query text) means that pgss won't give an exhaustive
> view of activity anymore, so a fix in IVM would be a better solution.
> Let's wait and see if Nagata-san and other people involved in that
> have an opinion on it.

It seems the IVM team does not consider this point a priority ... We should not wait for them if we want to keep a chance to be included in PG13. So we have to make this feature more robust, an assert failure being considered a severe regression (even if it does not come from pgss).

I like the idea of adding a check for a non-zero queryId in the new pgss_planner_hook() (zero queryid shouldn't be reserved for utility statements?).

Fixing the corner case where a query (with no SQL text) can be planned without being parsed is another subject, which should be resolved in another thread. This kind of query was ignored in pgss before; it should be ignored in pgss with planning counters too.

Any thoughts?

Regards
PAscal
Re: Implementing Incremental View Maintenance
>> thank you for patch v14, which fixes problems inherited from temporary
>> tables. It seems that this ASSERT problem with the pgss patch is still
>> present ;o(
>
> Sorry, but we are busy fixing and improving the IVM patches. I think fixing
> the assertion failure needs non-trivial changes to other parts of
> PostgreSQL, so we would like to work on the issue you reported after the
> pgss patch gets committed.

Imagine it will happen tomorrow!
You may say I'm a dreamer
But I'm not the only one ...
...
Re: Implementing Incremental View Maintenance
> I have tried to use another patch with yours:
> "Planning counters in pg_stat_statements (using pgss_store)",
> setting
>   shared_preload_libraries = 'pg_stat_statements'
>   pg_stat_statements.track = all
> restarting the cluster and creating the extension.
>
> When trying the following syntax:
>   create table b1 (id integer, x numeric(10,3));
>   create incremental materialized view mv1 as select id, count(*), sum(x)
>   from b1 group by id;
>   insert into b1 values (1,1);
>
> I got an ASSERT FAILURE in pg_stat_statements.c on
>   Assert(query != NULL);
> coming from matview.c
>   refresh_matview_datafill(dest_old, query, queryEnv, NULL);
> or
>   refresh_matview_datafill(dest_new, query, queryEnv, NULL);
>
> If this (last) NULL field was replaced by the query text,
> a comment or just "n/a", it would fix the problem.
> Could this be investigated?

Hello,

thank you for patch v14, which fixes problems inherited from temporary tables. It seems that this ASSERT problem with the pgss patch is still present ;o(

Could we have a look?

Thanks in advance

Regards
PAscal
Re: Allow auto_explain to log plans before queries are executed
Hi,

that feature for dumping plans with auto_explain is already available in
https://github.com/legrandlegrand/pg_stat_sql_plans

This is a hybrid extension combining auto_explain and pg_stat_statements, adding a planid and tracking metrics even on error, ..., ...

With
  pg_stat_sql_plans.track_planid = true
  pg_stat_sql_plans.explain = true
it writes the explain plan to the log file after planning, and only one time per (queryid, planid), so there is no need for sampling.

And with
  pg_stat_sql_plans.track = 'all'
the function pgssp_backend_queryid(pid) retrieves the (nested) queryid of a stuck statement, and permits retrieving its plan (by its queryid) in the logs.

Regards
PAscal
Re: Using stat collector for collecting long SQL
Hi > There is a often problem with taking source of long SQL strings. The > pg_stat_activity field query is reduced to some short limit and is not too > practical to increase this limit. I thought this was an old story, since track_activity_query_size can now be increased widely: https://www.postgresql.org/message-id/flat/7b5ecc5a9991045e2f13c84e3047541d%40postgrespro.ru [...] > It can be base for implementation EXPLAIN PID ? +1 for this concept, that would be very useful for diagnosing stuck queries. Until now the only proposal I saw regarding this was detailed in https://www.postgresql.org/message-id/1582756552256-0.p...@n3.nabble.com a prerequisite to be able to use the extension postgrespro/pg_query_state Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
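For reference, a minimal sketch of raising that limit (the value is illustrative; changing track_activity_query_size requires a server restart):

```
# postgresql.conf -- keep up to 32kB of each statement's text
# visible in pg_stat_activity.query (the default is 1024 bytes)
track_activity_query_size = 32768
```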
Re: PoC: custom signal handler for extensions
Hello Maksim, reading about https://www.postgresql-archive.org/Allow-auto-explain-to-log-plans-before-queries-are-executed-td6124646.html makes me think (again) about your work and pg_query_state ... Is there any chance you might restart working on this patch ? Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: WIP: Aggregation push-down
Antonin Houska-2 wrote > Alvaro Herrera > alvherre@ > wrote: > >> This stuff seems very useful. How come it sits unreviewed for so long? > > I think the review is hard for people who are not interested in the > planner > very much. And as for further development, there are a few design > decisions > that can hardly be resolved without Tom Lane's comments. Right now I > recall > two problems: 1) is the way I currently store RelOptInfo for the grouped > relations correct?, 2) how should we handle types for which logical > equality > does not imply physical (byte-wise) equality? > > Fortunately it seems now that I'm not the only one who cares about 2), so > this > problem might be resolved soon: > > https://www.postgresql.org/message-id/CAH2-Wzn3Ee49Gmxb7V1VJ3-AC8fWn-Fr8pfWQebHe8rYRxt5OQ%40mail.gmail.com > > But 1) still remains. > > -- > Antonin Houska > Web: https://www.cybertec-postgresql.com Hello would "pgsql: Add equalimage B-Tree support functions." https://www.postgresql.org/message-id/e1j72ny-0002gi...@gemulon.postgresql.org help for 2) ? Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Implementing Incremental View Maintenance
Yugo Nagata wrote > Thank you for your suggestion! I agree that the feature to switch between > normal materialized view and incrementally maintainable view is useful. > We will add this to our ToDo list. Regarding its syntax, > I would not like to add new keyword like NONINCREMENTAL, so how about > the following > > ALTER MATERIALIZED VIEW ... SET {WITH | WITHOUT} INCREMENTAL REFRESH > > although this is just a idea and we will need discussion on it. Thanks I will follow that discussion on GitHub https://github.com/sraoss/pgsql-ivm/issues/79 Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Implementing Incremental View Maintenance
Takuma Hoshiai wrote > Hi, > > Attached is the latest patch (v12) to add support for Incremental > Materialized View Maintenance (IVM). > It is possible to apply to current latest master branch. > > Differences from the previous patch (v11) include: > * support executing REFRESH MATERIALIZED VIEW command with IVM. > * support unscannable state by WITH NO DATA option. > * add a check for LIMIT/OFFSET at creating an IMMV > > If REFRESH is executed for IMMV (incremental maintainable materialized > view), its contents is re-calculated as same as usual materialized views > (full REFRESH). Although IMMV is basically keeping up-to-date data, > rounding errors can be accumulated in aggregated value in some cases, for > example, if the view contains sum/avg on float type columns. Running > REFRESH command on IMMV will resolve this. Also, WITH NO DATA option > allows to make IMMV unscannable. At that time, IVM triggers are dropped > from IMMV because these become unneeded and useless. > > [...] Hello, regarding the syntax REFRESH MATERIALIZED VIEW x WITH NO DATA I understand that the triggers are removed from the source tables, transforming the INCREMENTAL MATERIALIZED VIEW into a(n unscannable) MATERIALIZED VIEW.

postgres=# refresh materialized view imv with no data;
REFRESH MATERIALIZED VIEW
postgres=# select * from imv;
ERROR: materialized view "imv" has not been populated
HINT: Use the REFRESH MATERIALIZED VIEW command.

This operation seems to me more of an ALTER command than a REFRESH one. Wouldn't the syntax ALTER MATERIALIZED VIEW [ IF EXISTS ] name SET WITH NO DATA or SET WITHOUT DATA be better ? Continuing in this direction, have you ever thought about another feature like: ALTER MATERIALIZED VIEW [ IF EXISTS ] name SET { NOINCREMENTAL } or even SET { NOINCREMENTAL | INCREMENTAL | INCREMENTAL CONCURRENTLY } that would permit switching between those modes and would keep frozen data available in the materialized view during heavy operations on the source tables ?
Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
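For illustration, a sketch of how the proposed mode switch could be used (hypothetical syntax, not implemented; view name is made up):

```sql
-- hypothetical: drop the IVM triggers, keeping the frozen contents readable
ALTER MATERIALIZED VIEW imv SET NOINCREMENTAL;

-- ... heavy batch operations on the source tables, with no per-row
-- maintenance overhead, while imv still serves its last-known data ...

-- hypothetical: re-create the triggers and recompute the view contents
ALTER MATERIALIZED VIEW imv SET INCREMENTAL;
```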
Re: POC: rational number type (fractions)
Hello, It seems you are not the first to be interested in such feature. There was a similar extension used in "incremental view maintenance" testing: https://github.com/nuko-yokohama/pg_fraction didn't tryed it myself. Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: WIP: Aggregation push-down
Hello, Thank you for this great feature ! I hope it will be reviewed/validated soon ;o) Just a comment: after set enable_agg_pushdown to true; the setting isn't displayed by the EXPLAIN (SETTINGS) syntax. The following modification to src/backend/utils/misc/guc.c seems to fix that:

{
    {"enable_agg_pushdown", PGC_USERSET, QUERY_TUNING_METHOD,
        gettext_noop("Enables aggregation push-down."),
        NULL,
        GUC_EXPLAIN    /* --- line added --- */
    },
    &enable_agg_pushdown,
    false,
    NULL, NULL, NULL
},

then

postgres=# set enable_agg_pushdown = true;
SET
postgres=# explain (settings) select 1;
                QUERY PLAN
------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=4)
 Settings: enable_agg_pushdown = 'on'
(2 rows)

Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Planning counters in pg_stat_statements (using pgss_store)
Hi Julien, the bot is still unhappy https://travis-ci.org/postgresql-cfbot/postgresql/builds/638701399

portalcmds.c: In function ‘PerformCursorOpen’:
portalcmds.c:93:7: error: ‘queryString’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
  plan = pg_plan_query(query, queryString, cstmt->options, params);
       ^
portalcmds.c:50:8: note: ‘queryString’ was declared here
  char *queryString;
        ^
cc1: all warnings being treated as errors
: recipe for target 'portalcmds.o' failed
make[3]: *** [portalcmds.o] Error 1
make[3]: Leaving directory '/home/travis/build/postgresql-cfbot/postgresql/src/backend/commands'
common.mk:39: recipe for target 'commands-recursive' failed
make[2]: *** [commands-recursive] Error 2
make[2]: *** Waiting for unfinished jobs

regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: postgresql-13devel initDB Running in debug mode.
Tom Lane-2 wrote > legrand legrand > legrand_legrand@ > writes: >> after building devel snapshot from 2020-01-17 with msys, >> initDB generates a lot of additional informations when launched: >> [ debug output snipped ] >> Is that the expected behavior, or just a temporary test ? > > It'd be the expected behavior if you'd given a -d switch to initdb. > > regards, tom lane So yes, it's me: -d directory instead of -D directory, and the doc is perfect: -d --debug Print debugging output from the bootstrap backend and a few other messages of lesser interest for the general public. The bootstrap backend is the program initdb uses to create the catalog tables. This option generates a tremendous amount of extremely boring output. Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
postgresql-13devel initDB Running in debug mode.
Hello, after building the devel snapshot from 2020-01-17 with msys, initDB generates a lot of additional information when launched:

VERSION=13devel
PGDATA=../data
share_path=C:/msys64/usr/local/pgsql/share
PGPATH=C:/msys64/usr/local/pgsql/bin
POSTGRES_SUPERUSERNAME=lemoyp
POSTGRES_BKI=C:/msys64/usr/local/pgsql/share/postgres.bki
POSTGRES_DESCR=C:/msys64/usr/local/pgsql/share/postgres.description
POSTGRES_SHDESCR=C:/msys64/usr/local/pgsql/share/postgres.shdescription
POSTGRESQL_CONF_SAMPLE=C:/msys64/usr/local/pgsql/share/postgresql.conf.sample
PG_HBA_SAMPLE=C:/msys64/usr/local/pgsql/share/pg_hba.conf.sample
PG_IDENT_SAMPLE=C:/msys64/usr/local/pgsql/share/pg_ident.conf.sample
2020-01-18 00:19:37.407 CET [10152] DEBUG: invoking IpcMemoryCreate(size=149061632)
2020-01-18 00:19:37.408 CET [10152] DEBUG: could not enable Lock Pages in Memory user right
2020-01-18 00:19:37.408 CET [10152] HINT: Assign Lock Pages in Memory user right to the Windows user account which runs PostgreSQL.
2020-01-18 00:19:37.408 CET [10152] DEBUG: disabling huge pages
2020-01-18 00:19:37.414 CET [10152] DEBUG: SlruScanDirectory invoking callback on pg_notify/
2020-01-18 00:19:37.414 CET [10152] DEBUG: removing file "pg_notify/"
2020-01-18 00:19:37.416 CET [10152] DEBUG: dynamic shared memory system will support 308 segments
2020-01-18 00:19:37.416 CET [10152] DEBUG: created dynamic shared memory control segment 718036776 (7408 bytes)
2020-01-18 00:19:37.416 CET [10152] DEBUG: transaction ID wrap limit is 2147483650, limited by database with OID 1
2020-01-18 00:19:37.416 CET [10152] DEBUG: MultiXactId wrap limit is 2147483648, limited by database with OID 1
2020-01-18 00:19:37.416 CET [10152] DEBUG: creating and filling new WAL file
2020-01-18 00:19:37.446 CET [10152] DEBUG: could not remove file "pg_wal/00010001": No such file or directory
2020-01-18 00:19:37.454 CET [10152] DEBUG: mapped win32 error code 5 to 13
2020-01-18 00:19:37.455 CET [10152] DEBUG: done creating and filling new WAL file
2020-01-18 00:19:37.467 CET [10152] DEBUG: InitPostgres
2020-01-18 00:19:37.467 CET [10152] DEBUG: my backend ID is 1
2020-01-18 00:19:37.467 CET [10152] NOTICE: database system was shut down at 2020-01-18 00:19:37 CET
2020-01-18 00:19:37.467 CET [10152] DEBUG: mapped win32 error code 2 to 2
2020-01-18 00:19:37.471 CET [10152] DEBUG: checkpoint record is at 0/128
2020-01-18 00:19:37.471 CET [10152] DEBUG: redo record is at 0/128; shutdown true
...

Is that the expected behavior, or just a temporary test ? Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: pg13 PGDLLIMPORT list
Michael Paquier-2 wrote > On Fri, Jan 17, 2020 at 03:07:48PM -0700, legrand legrand wrote: > [...] > > No objections from me to add both to what's imported. Do you have a > specific use-case in mind for an extension on Windows? Just > wondering.. > -- > Michael No specific use case, just the need to test features (IVM, push agg to base relations and joins, ...) on a professional laptop running Windows 10, for a nomad app that collects performance metrics on Oracle databases. Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
pg13 PGDLLIMPORT list
Hello, would it be possible to add PGDLLIMPORT to permit building the following extensions on Windows?

pg_stat_sql_plans: src/include/pgstat.h
extern PGDLLIMPORT bool pgstat_track_activities;

pg_background: src/include/storage/proc.h
extern PGDLLIMPORT int StatementTimeout;

Thanks in advance Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Implementing Incremental View Maintenance
Hello, It seems that patch v11 no longer applies. There is a problem with "scanRTEForColumn", maybe because of this change: https://git.postgresql.org/pg/commitdiff/b541e9accb28c90656388a3f827ca3a68dd2a308 Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: WIP: System Versioned Temporal Table
Vik Fearing-4 wrote > On 05/01/2020 16:01, legrand legrand wrote: > > > No, that is incorrect. The standard syntax is: > > > FROM tablename FOR SYSTEM_TIME AS OF '...' > > FROM tablename FOR SYSTEM_TIME BETWEEN '...' AND '...' > > FROM tablename FOR SYSTEM_TIME FROM '...' TO '...' > > -- > > Vik Fearing Oops, I re-read the links and docs and I'm wrong. Sorry for the noise. Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Planning counters in pg_stat_statements (using pgss_store)
Julien Rouhaud wrote > On Sun, Jan 5, 2020 at 4:11 PM legrand legrand > > legrand_legrand@ > wrote: >> >> Hi Julien, >> >> I would like to create a link with >> https://www.postgresql.org/message-id/ > 1577490124579-0.post@.nabble >> >> where we met an ASSERT FAILURE because query text was not initialized ... >> >> The question raised is: >> >> - should query text be always provided >> or >> - if not how to deal that case (in pgss). > > I'd think that since the query text was until now always provided, > there's no reason why this patch should change that. That being said, > there has been other concerns raised wrt. temporary tables in the IVM > patchset, so ISTM that there might be important architectural changes > upcoming, so having to deal with this case in pgss is not rushed > (especially since handling that in pgss would be trivial), and can > help to catch issue with the query text parsing. IVM revealed that ASSERT, but IVM works fine with pg_stat_statements.track_planning = off. There may be other parts of PostgreSQL that would have worked fine as well. This means 2 things: - there is a (little) risk of meeting other assert failures when using planning counters in pgss, - we have an easy workaround to fix it (disabling track_planning). But I would have preferred this new feature to work the same way with or without track_planning activated ;o( -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Planning counters in pg_stat_statements (using pgss_store)
Hi Julien, I would like to create a link with https://www.postgresql.org/message-id/1577490124579-0.p...@n3.nabble.com where we met an ASSERT FAILURE because the query text was not initialized ... The question raised is: - should the query text always be provided or - if not, how to deal with that case (in pgss). Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: WIP: System Versioned Temporal Table
Vik Fearing-4 wrote > On 05/01/2020 11:16, Surafel Temesgen wrote: >> >> >> On Fri, Jan 3, 2020 at 4:22 PM Vik Fearing >> > vik.fearing@ > mailto: > vik.fearing@ > > wrote: >> > > [...] > > You only test FROM-TO and with a really wide interval. There are no > tests for AS OF and no tests for BETWEEN-AND. > > > As for the syntax, you have: > > > select a from for stest0 system_time from '2000-01-01 00:00:00.0' to > 'infinity' ORDER BY a; > > > when you should have: > > > select a from stest0 for system_time from '2000-01-01 00:00:00.0' to > 'infinity' ORDER BY a; > > > That is, the FOR should be on the other side of the table name. > > [...] > > Vik Fearing Hello, I thought the standard syntax was "AS OF SYSTEM TIME", as discussed here https://www.postgresql.org/message-id/flat/A254CDC3-D308-4822-8928-8CC584E0CC71%40elusive.cx#06c5dbffd5cfb9a20cdeec7a54dc657f , which also explains how to parse such a syntax. Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Incremental View Maintenance: ERROR: out of shared memory
Hello, here is an unexpected error found while testing the IVM v11 patches:

create table b1 (id integer, x numeric(10,3));
create incremental materialized view mv1 as select id, count(*),sum(x) from b1 group by id;
do $$
declare i integer;
begin
  for i in 1..1 loop
    insert into b1 values (1,1);
  end loop;
end;
$$ ;

ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction.
CONTEXT: SQL statement "DROP TABLE pg_temp_3.pg_temp_66154"
SQL statement "insert into b1 values (1,1)"
PL/pgSQL function inline_code_block line 1 at SQL statement

Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Implementing Incremental View Maintenance
Hello, Thank you for this patch. I have tried to use another patch with yours: "Planning counters in pg_stat_statements (using pgss_store)" https://www.postgresql.org/message-id/CAOBaU_Y12bn0tOdN9RMBZn29bfYYH11b2CwKO1RO7dX9fQ3aZA%40mail.gmail.com setting shared_preload_libraries='pg_stat_statements' pg_stat_statements.track=all and creating the extension. When trying the following syntax: create table b1 (id integer, x numeric(10,3)); create incremental materialized view mv1 as select id, count(*),sum(x) from b1 group by id; insert into b1 values (1,1) I got an ASSERT FAILURE in pg_stat_statements.c on Assert(query != NULL); coming from matview.c refresh_matview_datafill(dest_old, query, queryEnv, NULL); or refresh_matview_datafill(dest_new, query, queryEnv, NULL); If this (last) NULL field was replaced by the query text, a comment or just "n/a", it would fix the problem. Could this be investigated ? Thanks in advance Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Implementing Incremental View Maintenance
Yugo Nagata wrote > On Mon, 23 Dec 2019 03:41:18 -0700 (MST) > legrand legrand > legrand_legrand@ > wrote: > > [ ...] > >> I would even >> prefer a common "table" shared between all sessions like GLOBAL TEMPORARY >> TABLE (or something similar) as described here: >> https://www.postgresql.org/message-id/flat/157703426606.1198.2452090605041230054.pgcf%40coridan.postgresql.org#331e8344bbae904350af161fb43a0aa6 > > Although I have not looked into this thread, this may be help if this is > implemented. However, it would be still necessary to truncate the table > before the view maintenance because such tables always exist and can be > accessed and modified by any users. > > -- > Yugo Nagata > nagata@.co > For information, in this table data is PRIVATE to each session, can be purged on the ON COMMIT event, and disappears at SESSION end. Yes, this feature could be useful only once it's implemented. And you are right, some data has to be deleted on the ON STATEMENT event (I am not sure whether TRUNCATE is global or session-specific in this situation). -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Implementing Incremental View Maintenance
Hello, regarding my initial post: > For each insert into a base table there are 3 statements: > - ANALYZE pg_temp_3.pg_temp_81976 > - WITH updt AS ( UPDATE public.mv1 AS mv SET __ivm_count__ = ... > - DROP TABLE pg_temp_3.pg_temp_81976 For me there were 3 points to discuss: - create/drop tables may bloat dictionary tables - create/drop tables prevents "WITH updt ..." from being shared (with some plan caching) - it generates many lines in pg_stat_statements In fact I like the idea of a table created per session, but I would even prefer a common "table" shared between all sessions, like a GLOBAL TEMPORARY TABLE (or something similar) as described here: https://www.postgresql.org/message-id/flat/157703426606.1198.2452090605041230054.pgcf%40coridan.postgresql.org#331e8344bbae904350af161fb43a0aa6 That would remove the drop/create issue, permit reducing planning time for "WITH updt ..." statements (as is done today in PL/pgSQL triggers), and would fix the pgss "bloat" issue. That way the "cost" of the immediate refresh approach would be easier to support ;o) Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
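A sketch of what the GLOBAL TEMPORARY TABLE approach could look like for IVM deltas (hypothetical syntax from the linked proposal, not in core PostgreSQL; table and column names are made up):

```sql
-- hypothetical: the definition is shared by all sessions, while the rows
-- stay private to each session and are cleared at transaction end
CREATE GLOBAL TEMPORARY TABLE ivm_delta_mv1
    (id integer, cnt bigint, sum_x numeric)
    ON COMMIT DELETE ROWS;
```

The "WITH updt AS ( UPDATE public.mv1 ... )" statement could then read from ivm_delta_mv1 instead of a freshly created pg_temp_N table, letting its plan be cached and shared.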
Re: Implementing Incremental View Maintenance
Hello, First of all many thanks for this great feature, replacing so many triggers with such a simple syntax ;o) I was wondering about performance and had a look at pg_stat_statements (with track=all) with IVM_v9.patch. For each insert into a base table there are 3 statements: - ANALYZE pg_temp_3.pg_temp_81976 - WITH updt AS ( UPDATE public.mv1 AS mv SET __ivm_count__ = ... - DROP TABLE pg_temp_3.pg_temp_81976 It generates a lot of lines in pg_stat_statements with calls = 1. Those statements cannot be shared because the temp table is dropped each time. Is there a plan to change this ? Many thanks again Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Columns correlation and adaptive query optimization
Hello Konstantin, What you have proposed regarding join_selectivity and multicolumn statistics is very good news ! Regarding your auto_explain modification, maybe an "advisor" mode would also be helpful (with auto_explain_add_statistics_threshold=-1 for example). This would allow tracking which missing statistics should be tested (manually or in another environment). From my point of view this advice should be an option of the EXPLAIN command, which would also permit the auto_explain module to offer a "learning" phase. Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: How to install login_hook in Postgres 10.5
pavan95 wrote > Hello Community, > > While I was searching for logon trigger in postgres similar to that of > Oracle, I came across "login_hook", which can be installed as a Postgres > database extension to mimic a logon trigger. > > But I tried to install but failed. Error is that it could not find its .so > file. Could you please help me in installing this login_hook ?? > > Looking forward to hear from you. Any suggestions would be of great help! > > Thanks & Regards, > Pavan > > > > -- > Sent from: > http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html The issue ERROR: could not access file "login_hook.so": No such file or directory has been fixed, see: https://github.com/splendiddata/login_hook/issues/1 Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Hooks for session start and end, take two
Michael Paquier-2 wrote > On Thu, Sep 26, 2019 at 09:57:57AM -0700, legrand legrand wrote: >> Does that mean that all processes seen in pg_stat_activity like >> - autovacuum launcher >> - logical replication launcher >> - background writer >> - checkpointer >> - walwriter >> ... >> - Parallel worker >> are available with that hook (it seems not) ? > > All processes using PostgresMain() for their startup take this code > path like WAL senders and normal backend sessions, but not things > going through StartChildProcess() (WAL receiver, bgwriter, etc.) or > other processes like autovacuum processes which use a different start > code path. OK, I confirm:
- "client backend" appears at the session start and end hooks,
- "autovacuum worker" and "pg_background" only appear at the session end hook (backend_start can be retrieved from pg_stat_activity),
- "parallel workers" are not visible at all (because an extension cannot assign XIDs during a parallel operation).
All seems fine to me. Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Hooks for session start and end, take two
Hello, Thank you for the work done on this subject. After starting to play with it, I have a question and a remark: > - previous hook calls were only called for normal backends, which was > incorrect as we define the backend so as we apply no backend-related > filtering for the hook. Does that mean that all processes seen in pg_stat_activity like
- autovacuum launcher
- logical replication launcher
- background writer
- checkpointer
- walwriter
...
- parallel worker
are available with that hook (it seems not) ? Also, the patch refers to a 2017 copyright; that's all I have found so far ;o) Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Proposal: Better query optimization for "NOT IN" clause
Hello, Just for information, there is some work in progress on including this in core that may interest you ;o) See "NOT IN subquery optimization" https://www.postgresql.org/message-id/flat/1550706289606-0.post%40n3.nabble.com commitfest entry: https://commitfest.postgresql.org/24/2023/ and "Converting NOT IN to anti-joins during planning" https://www.postgresql.org/message-id/flat/CAKJS1f82pqjqe3WT9_xREmXyG20aOkHc-XqkKZG_yMA7JVJ3Tw%40mail.gmail.com commitfest entry: https://commitfest.postgresql.org/24/2020/ Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: [survey] New "Stable" QueryId based on normalized query text
my understanding is
* pg_stat_statements.track = 'none' or 'top' (default) or 'all' to make the queryId optionally computed
* a new GUC: pg_stat_statements.queryid_based = 'oids' (default) or 'names' or 'fullnames' to choose the queryId computation algorithm
am I right ? -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: [survey] New "Stable" QueryId based on normalized query text
Hi Jim, It's never too late, as nothing has been concluded from that survey ;o) For information, I thought it would be possible to get a more stable queryId by hashing the relation name or fully qualified names. With the support of Julien Rouhaud, I tested with this kind of code:

case RTE_RELATION:
    if (pgss_queryid_oid)
    {
        APP_JUMB(rte->relid);
    }
    else
    {
        rel = RelationIdGetRelation(rte->relid);
        APP_JUMB_STRING(RelationGetRelationName(rel));
        APP_JUMB_STRING(get_namespace_name(get_rel_namespace(rte->relid)));
        RelationClose(rel);
    }

thinking that 3 hash options would be interesting in pgss: 1- actual OID 2- relation names only (for databases WITHOUT distinct schemas containing the same tables) 3- fully qualified names schema.relname (for databases WITH distinct schemas containing the same tables) but performance was quite bad (it was a few months ago, but I remember about a 1-5% decrease). I also remember that this was not portable between distinct pg versions 11/12, and I am also not sure it was stable between Windows / Linux ports ... So I stopped there ... Maybe it's time to test this alternative more deeply (getting fully qualified name hashes in one call), knowing that such transformations will have to be done for all object types (not only relations) ? I'm ready to continue testing, as it seems the least impacting solution that keeps the actual pgss ... If this doesn't work, then trying with a normalized query text (associated with the search_path) would be the other alternative, but the impact on the actual pgss would be higher ... Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
Kyotaro Horiguchi-4 wrote > At Sun, 4 Aug 2019 00:04:01 -0700 (MST), legrand legrand > legrand_legrand@ > wrote in < > 1564902241482-0.post@.nabble >> >> > However having the nested queryid in >> > pg_stat_activity would be convenient to track >> > what is a long stored functions currently doing. >> >> +1 >> >> And this could permit to get wait event sampling per queryid when >> pg_stat_statements.track = all > > I'm strongly on this side emotionally, but also I'm on Tom and > Andres's side that exposing querid that way is not the right > thing. > > Doing that means we don't need exact correspondence between > top-level query and queryId (in nested or multistatement queries) > in this patch. pg_stat_statements will allow us to do the same > thing by having additional uint64[MaxBackends] array in > pgssSharedState, instead of expanding PgBackendStatus array in > core by the same size. > > regards. > > -- > Kyotaro Horiguchi > NTT Open Source Software Center Hi Kyotaro, Thank you for this answer. What you propose here is already available inside the pg_stat_sql_plans extension (a derivative of pg_stat_statements and pg_store_plans), and I'm used to this queryid behavior with top-level queries... My emotion was high but I will accept it ! Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
> However having the nested queryid in > pg_stat_activity would be convenient to track > what is a long stored functions currently doing. +1 And this could permit to get wait event sampling per queryid when pg_stat_statements.track = all Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: How to install login_hook in Postgres 10.5
Hello, shouldn't we update the associated commitfest entry https://commitfest.postgresql.org/15/1318/ to give it a chance to be included in pg13 ? Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Adaptive query optimization
>>I tried to create much simpler version of AQO based on auto_explain >>extension. >>This extension provide all necessary infrastructure to analyze >>statements with long execution time. >>I have added two new modes to auto_explain: >>1. Auto generation of multicolumn statistics for variables using in >>clauses with large estimation error. >Interesting! I probably wouldn't consider this part of adaptive query >optimization, but it probably makes sense to make it part of this. I >wonder if we might improve this to also suggest "missing" indexes? Shouldn't this be extended to adjust the default_statistics_target configuration variable, or on a column-by-column basis by setting the per-column statistics target with ALTER TABLE ... ALTER COLUMN ... SET STATISTICS ? Regards PAscal -- Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
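For illustration, a hedged sketch of the per-column route mentioned above (the table and column names are made up; 1000 is an arbitrary target):

```sql
-- raise the per-column statistics target for a column with a large
-- estimation error, then refresh the statistics
ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 1000;
ANALYZE orders;
```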
Re: Adaptive query optimization
Hello, this seems very interesting and makes me think about 2 other projects: - https://github.com/trustly/pg_badplan - https://github.com/ossc-db/pg_plan_advsr As I understand all this, there are actually 3 steps: - compare actual / estimated rows - suggest some statistics-gathering modifications - store optimized plans (or optimized stats) for reuse I really like the "advisor" idea, permitting to identify where statistics are wrong and how to fix them with the ANALYZE command. Is there a chance that some future optimizer / statistics improvements make those "statistics advices" enough to get good plans (without having to store live stats or optimized plans) ? Thanks in advance Regards PAscal -- Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: How to install login_hook in Postgres 10.5
Hello, This extension https://github.com/splendiddata/login_hook seems very interesting ! But I didn't test it myself, and maybe the best place to ask for support is there: https://github.com/splendiddata/login_hook/issues For information, there is something equivalent for core, "[PATCH] A hook for session start" https://www.postgresql.org/message-id/flat/20171103164305.1f952c0f%40asp437-24-g082ur#d897fd6ec551d242493815312de5f5d5 which was committed as "pgsql: Add hooks for session start and session end" https://www.postgresql.org/message-id/flat/575d6fa2-78d0-4456-8600-302fc35b2591%40dunslane.net#0819e315c6e44c49a36c69080cab644d but was finally reverted because it didn't pass the installcheck test ... Maybe you are able to patch your pg installation; it would be a solution of choice (there is a nice example of an extension included). Showing interest in this may also help getting this feature back ;o) Regards PAscal -- Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
RE: Re: Logging the feature of SQL-level read/write commits
Hi, good point ! wal2json seems to correspond to your needs; it is primarily designed for Change Data Capture, which can generate a (very) big volume of logs. You didn't tell us much about your use case ... and maybe, if the number of data modifications is not too big and the number of tables to track is limited, a history table fed with (after insert, after update, before delete) triggers would be another (simpler) solution ? Regards PAscal -- Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
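A minimal sketch of such a history table (illustrative names; a single AFTER trigger is used here instead of separate after-insert/after-update/before-delete triggers, for brevity):

```sql
-- history table: same columns as t, plus change metadata
CREATE TABLE t_history (LIKE t);
ALTER TABLE t_history
    ADD COLUMN changed_at timestamptz,
    ADD COLUMN operation  text;

CREATE FUNCTION t_audit() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        -- OLD is the only row image available for deletes
        INSERT INTO t_history SELECT OLD.*, now(), TG_OP;
        RETURN OLD;
    END IF;
    INSERT INTO t_history SELECT NEW.*, now(), TG_OP;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_audit_trg
    AFTER INSERT OR UPDATE OR DELETE ON t
    FOR EACH ROW EXECUTE PROCEDURE t_audit();
```

Querying t_history then gives the modification history for any row, at the cost of one extra insert per write.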
Re: Logging the feature of SQL-level read/write commits
Hello, you may find more information on WAL concepts in "Write Ahead Logging — WAL" http://www.interdb.jp/pg/pgsql09.html It seems very complicated to change the WAL format ...

Maybe there are other solutions to answer your need; I found many interesting ones in the postgres archives, searching with the words "flashback", "timetravel", "tablelog", ... My preferred one is "AS OF queries" https://www.postgresql.org/message-id/flat/78aadf6b-86d4-21b9-9c2a-51f1efb8a499%40postgrespro.ru and more specifically the syntax VERSIONS BETWEEN SYSTEM TIME ... AND ..., which would make it possible to get the history of modifications for some rows and help fixing current data.

Regards
PAscal
RE: minimizing pg_stat_statements performance overhead
CF entry created https://commitfest.postgresql.org/23/2092/

Regards
PAscal
Re: minimizing pg_stat_statements performance overhead
Robert Haas wrote
> On Tue, Apr 2, 2019 at 5:37 AM Christoph Berg <myon@> wrote:
>> Re: Raymond Martin 2019-04-01 <BN8PR21MB121708579A3782866DF1F745B1550@.outlook>
>>> Thanks again Fabien. I am attaching the patch to this email in the hope
>>> of getting it approved during the next commit fest.
>>
>> you sent the patch as UTF-16, could you re-send it as plain ascii?
>
> One thing that needs some thought here is what happens if the value of
> pgss_enabled() changes. For example we don't want a situation where,
> if the value changes from off to on between one stage of processing
> and another, the server crashes.
>
> I don't know whether that's a risk here or not; it's just something to
> think about.
>
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

Hi, here is a simple test where I commented out that line in pgss_post_parse_analyze to force the return; (as if pgss_enabled() was disabled) but kept pgss_enabled() everywhere else:

    /* Safety check... */
    // if (!pgss || !pgss_hash || !pgss_enabled())
    return;

This works without a crash, as you can see hereafter:

postgres=# select pg_stat_statements_reset();
 pg_stat_statements_reset
--------------------------

(1 row)

postgres=# show pg_stat_statements.track;
 pg_stat_statements.track
--------------------------
 all
(1 row)

postgres=# create table a(id int);
CREATE TABLE
postgres=# select * from a where id=1;
 id
----
(0 rows)

postgres=# select queryid,query,calls from pg_stat_statements;
       queryid       |             query             | calls
---------------------+-------------------------------+-------
 1033669194118974675 | show pg_stat_statements.track |     1
 3022461129400094363 | create table a(id int)        |     1
(2 rows)

Regards
PAscal
RE: Planning counters in pg_stat_statements (using pgss_store)
Hi,

>> case  avg_tps  pct_diff
>>   0   89 278     --
>>   1   88 745    0.6%
>>   2   88 282    1.1%
>>   3   86 660    2.9%
>>
>> This means that even in this extreme test case, the worst degradation is less
>> than 3%
>> (this overhead can be removed using the pg_stat_statements.track_planning GUC)

> Is the difference between 2 and 3 the extraneous pgss_store call to
> always store the query text if the planner hook doesn't have access to the
> query text?

Yes it is, but I agree it seems a big gap (1.8%) compared to the difference between 1 and 2 (0.5%). Maybe this is just measurement "noise" ...

Regards
PAscal
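For reference, the percentages being discussed can be recomputed from the raw average tps figures in the quoted table; a quick sketch (plain arithmetic, nothing pgss-specific):

```python
# Recompute the overhead percentages from the benchmark's average tps figures.
baseline = 89278  # case 0: no pg_stat_statements
cases = {1: 88745, 2: 88282, 3: 86660}

for case, tps in cases.items():
    overhead = (baseline - tps) / baseline * 100
    print(f"case {case}: {overhead:.1f}%")

# The step from case 1 to 2 (one extra pgss hash access) costs about
# (88745 - 88282) / 89278 ~ 0.5%, while the step from 2 to 3 costs
# (88282 - 86660) / 89278 ~ 1.8% -- the "big gap" mentioned above.
```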
Re: DWIM mode for psql
Andreas Karlsson wrote
> On 3/31/19 10:52 PM, Thomas Munro wrote:
>> Building on the excellent work begun by commit e529cd4ffa60, I would
>> like to propose a do-what-I-mean mode for psql. Please find a POC
>> patch attached. It works like this:
>>
>> postgres=# select datnaam from pg_database where ooid = 12917;
>> ERROR: column "datnaam" does not exist
>> LINE 1: select datnaam from pg_database where ooid = 12917;
>>                ^
>> HINT: Perhaps you meant to reference the column "pg_database.datname".
>> postgres=# YES
>>  datname
>> ----------
>>  postgres
>> (1 row)
>
> I think it is potentially confusing that YES and NO do not look like
> other psql commands. Let's pick something which is more in line with
> existing commands like \y and \n.
>
> Andreas

+1

Regards >-)))°>
Re: Planning counters in pg_stat_statements (using pgss_store)
Hi, I have played with this patch, it works fine.

Note: the last position of the "new" total_time column is confusing:

+CREATE VIEW pg_stat_statements AS
+  SELECT *, total_plan_time + total_exec_time AS total_time
+  FROM pg_stat_statements(true);

I wanted to perform some benchmarks between these 4 cases:
0 - no pgss,
1 - original pgss (no planning counter and 1 access to the pgss hash),
2 - pgss reading the query text in the planner hook (2 accesses to the pgss hash),
3 - pgss reading the query text in the parse hook (3 accesses to the pgss hash).

The difference is so tiny that there was no other way than running a minimal "Select 1;" statement:

./pgbench -c 10 -j 5 -t 50 -f select1stmt.sql postgres

case  avg_tps  pct_diff
  0   89 278     --
  1   88 745    0.6%
  2   88 282    1.1%
  3   86 660    2.9%

This means that even in this extreme test case, the worst degradation is less than 3% (this overhead can be removed using the pg_stat_statements.track_planning GUC).

Notes:
- PostgreSQL 12devel on x86_64-w64-mingw32, compiled by gcc.exe (x86_64-win32-seh rev1, built by the MinGW-W64 project) 7.2.0, 64-bit,
- cpu usage was less than 95%,
- avg_tps is based on 3 runs,
- there was a wait of around 1 minute between each run to keep computer temperature and fan usage low.

Regards
PAscal
RE: minimizing pg_stat_statements performance overhead
Hi, it seems that your patch is not readable. If you want it to be included in a commitfest, you should add it yourself at https://commitfest.postgresql.org/ Not sure that there is any room left in the pg12 commitfest.

Regards
PAscal
RE: minimizing pg_stat_statements performance overhead
my test case:

drop table a;
create table a ();

DO $$
DECLARE
  i int;
BEGIN
  for i in 1..20 loop
    execute 'alter table a add column a'||i::text||' int';
  end loop;
END $$;

select pg_stat_statements_reset();
set pg_stat_statements.track='none';

DO $$
DECLARE
  i int;
  j int;
BEGIN
  for j in 1..20 loop
    for i in 1..20 loop
      execute 'select a'||i::text||',a'||j::text||' from a where 1=2';
    end loop;
  end loop;
END $$;
Re: Planning counters in pg_stat_statements (using pgss_store)
>> - trailing whitespaces and comments wider than 80 characters
>>   not fixed
> why? In case it's not clear, I'm talking about the .c file, not the
> regression tests.

I work on a poor msys install on windows 7, where perl is broken ;o( So no pgindent is available. Will fix that later, or as soon as I get a pgindent diff.

>> - "Assert(planning_time > 0 && total_time > 0);"
>>   moved at the beginning of pgss_store
> Have you tried to actually compile postgres and pg_stat_statements
> with --enable-cassert? This test can *never* be true, since you
> either provide the planning time or the execution time or neither. As
> I said in my previous mail, adding a parameter to say which counter
> you're updating, instead of adding another counter that's mutually
> exclusive with the other, would make everything clearer.

Yes, this "assert" is useless as is ... I'll remove it.

I understand your proposal of a pgss_store refactoring, but I don't have much time available now ... and I would like to check that performance is not broken before any other modification ...

Regards
PAscal
RE: minimizing pg_stat_statements performance overhead
Your fix is probably the best one. Maybe this could be considered a bug and backported to previous versions ...

Regards
PAscal
Re: minimizing pg_stat_statements performance overhead
Hi, the part that hurts in terms of performance is:

if (jstate.clocations_count > 0)
    pgss_store(pstate->p_sourcetext, query->queryId,
               query->stmt_location, query->stmt_len,
               0, 0, 0, NULL, ...);

which writes the query text to disk when the query has at least one parameter ...

Computing the queryId should stay very cheap, and can be very useful when used in conjunction with https://www.postgresql-archive.org/Feature-improvement-can-we-add-queryId-for-pg-catalog-pg-stat-activity-view-td6077275.html#a6077602 for wait event sampling.

I would propose to fix this with:

if (jstate.clocations_count > 0 && pgss_enabled())
    pgss_store(pstate->p_sourcetext, ...

Regards
PAscal
RE: Planning counters in pg_stat_statements (using pgss_store)
here is a new version:
- a "track_planning" GUC has been added, to permit keeping the previous behavior unchanged,
- column names have been changed / added: total_plan_time, total_exec_time, total_time,
- trailing whitespaces and comments wider than 80 characters: not fixed,
- "if (jstate.clocations_count > 0) pgss_store(pstate->p_sourcetext,..." has been reverted,
- exposing the query text to the planner: won't fix (out of my knowledge),
- "Assert(planning_time > 0 && total_time > 0);" moved to the beginning of pgss_store.

Regards
PAscal

pgss_add_planning_counters_v3
Description: pgss_add_planning_counters_v3
Re: Planning counters in pg_stat_statements (using pgss_store)
As there are now 3 locking times on the pgss hash struct, one day or another somebody will ask for a GUC to disable this feature (to be able to run pgss unchanged, with only one lock as today). With this GUC, pgss_store should be able to store the query text and the accumulated execution duration in the same call (as today). Will try to provide this soon.
RE: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
>> Shouldn't you add this to the commitfest?
> I added it last week, see https://commitfest.postgresql.org/23/2069/

Oops, sorry for the noise
RE: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
>> Would it make sense to add it to auto_explain?
>> I don't know for explain itself, but maybe ...
> I'd think that people interested in getting the queryid in the logs
> would configure the log_line_prefix to display it consistently rather
> than having it in only a subset of cases, so that's probably not
> really needed.

Ok. Shouldn't you add this to the commitfest?
RE: Planning counters in pg_stat_statements (using pgss_store)
> This patch has multiple trailing whitespace, indent and coding style
> issues. You should consider running pgindent before submitting a
> patch. I attach the diff after running pgindent if you want more
> details about the various issues.

fixed

> - * Track statement execution times across a whole database cluster.
> + * Track statement planning and execution times across a whole cluster.
> if we're changing this, we should also fix the fact that it's not
> tracking only the time but various resources?

fixed

> + /* calc differences of buffer counters. */
> + bufusage.shared_blks_hit =
> +   pgBufferUsage.shared_blks_hit - bufusage_start.shared_blks_hit;
> [...]
> This is an exact duplication of pgss_ProcessUtility(), it's probably
> better to create a macro or a function for that instead.

yes, maybe later (I don't know macros)

> + pgss_store("",
> +   parse->queryId, /* signal that it's a utility stmt */
> +   -1,
> the comment makes no sense, and also you can't pass an empty query
> string / unknown len. There's no guarantee that the entry for the
> given queryId won't have been evicted, and in this case you'll create
> a new and unrelated entry.

Fixed, the comment was wrong. The query text is not available in pgss_planner_hook; that's why a pgss_store execution is forced in pgss_post_parse_analyze (to initialize the pgss entry with its query text). There is a very small risk that the query has been evicted between pgss_post_parse_analyze and pgss_planner_hook.

> @@ -832,13 +931,13 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query)
>  * the normalized string would be the same as the query text anyway, so
>  * there's no need for an early entry.
>  */
> - if (jstate.clocations_count > 0)
>     pgss_store(pstate->p_sourcetext,
> Why did you remove this? pgss_store() isn't free, and asking to
> generate a normalized query for a query that doesn't have any constant
> or storing the entry early won't do anything useful AFAICT. Though if
> that's useful, you definitely can't remove the test without adapting
> the comment and the indentation.

See the explanation in my previous answer (comments have been updated accordingly).

> there are 4 tests to check if planning_time is zero or not, it's quite
> messy. Could you refactor the code to avoid so many tests? It would
> probably be useful to add some asserts to check that we don't provide
> both planning_time == 0 and execution related values. The function's
> comment would also need to be adapted to mention the new rationale
> with planning_time.

Fixed

> * hash table entry for the PREPARE (with hash calculated from the query
> * string), and then a different one with the same query string (but hash
> * calculated from the query tree) would be used to accumulate costs of
> -* ensuing EXECUTEs. This would be confusing, and inconsistent with other
> -* cases where planning time is not included at all.
> +* ensuing EXECUTEs.
> the comment about confusing behavior is still valid.

Fixed

>> Columns naming has not been modified, I would propose to change it to:
>> - plans: ok
>> - planning_time --> plan_time
>> - calls: ok
>> - total_time --> exec_time
>> - {min,max,mean,stddev}_time: ok
>> - new total_time (being the sum of plan_time and exec_time)
> plan_time and exec_time are accumulated counters, so we need to keep
> the total_ prefix in any case. I think it's ok to break the function
> output names if we keep some kind of compatibility at the view level
> (which can keep total_time as the sum of total_plan_time and
> total_exec_time), so current queries against the view wouldn't break,
> and get what they probably wanted.

Before changing this at all levels (view, function, code, doc), I would like to be sure that the column names will be:
- plans
- total_plan_time
- calls
- total_exec_time
- min_time (without exec in the name)
- max_time (without exec in the name)
- mean_time (without exec in the name)
- stddev_time (without exec in the name)
- total_time (being the sum of total_plan_time and total_exec_time)

Could other users confirm?

pgss_add_planning_counters_v2.diff
Description: pgss_add_planning_counters_v2.diff
RE: Planning counters in pg_stat_statements (using pgss_store)
Hi, here is a rebased and corrected version.

The column naming has not been modified; I would propose to change it to:
- plans: ok
- planning_time --> plan_time
- calls: ok
- total_time --> exec_time
- {min,max,mean,stddev}_time: ok
- new total_time (being the sum of plan_time and exec_time)

Regards
PAscal

pgss_add_planning_counters_v1.diff
Description: pgss_add_planning_counters_v1.diff
Re: [survey] New "Stable" QueryId based on normalized query text
Julien Rouhaud wrote
> On Wed, Mar 20, 2019 at 10:30 PM legrand legrand <legrand_legrand@> wrote:
>> maybe this patch (with a GUC)
>> https://www.postgresql.org/message-id/55E51C48.1060102@
>> would be enough for those actually using a text normalization function.
>
> The rest of the thread raises quite a lot of concerns about the semantics,
> the cost and the correctness of this patch. After 5 minutes checking,
> it wouldn't suit your need if you use custom functions, custom types,
> custom operators (say using the intarray extension) or if your tables
> don't have columns in the same order in every environment. And there
> are probably other caveats that I didn't see.

Yes, I know. It would have to be extended at least to function, type and operator names, with a GUC pg_stat_statements.queryid_based = 'names' (the default being 'oids'), and with a second GUC ('fully_qualified'?) that would also include table, function, type and operator namespaces.

Let "users" specify their needs, we will see ;o)
Re: [survey] New "Stable" QueryId based on normalized query text
Julien Rouhaud wrote
> On Wed, Mar 20, 2019 at 10:18 PM legrand legrand <legrand_legrand@> wrote:
>> From my personal point of view, I need to get the same QueryId between
>> (OLAP) environments, to be able to compare Production, Pre-production
>> and Qualification performances (and I don't need fully qualified relation
>> names). Today, to do that, I'm using a custom extension computing the
>> QueryId based on the normalized query text.
>
> IIUC, your need is to compare pgss (maybe other extensions) counters
> from different primary servers, for queries generated by the same
> application(s). A naive workaround like exporting each environment's
> counters (COPY (SELECT 'production', * FROM pg_stat_statements) TO
> '...'), importing all of them on a server and then comparing
> everything using the query text (which should be the same if the
> application is the same) instead of queryid wouldn't work? Or even
> using foreign tables if exporting data is too troublesome. That's
> clearly not ideal, but that's an easy workaround which doesn't add any
> performance hit at runtime.

Thank you Julien for the workaround. It is not easy to build "cross tables" in Excel to join metrics per query text ... and I'm not ready to build an MD5(query), as many queries could lead to the same QueryId.

I've been using SQL_IDs for ten years, and I have some (some would say old) habits :^)

Regards
PAscal
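To illustrate the workaround being discussed — joining the counters of two environments on the raw query text instead of the queryid — here is a minimal sketch; the row shape and metric names are assumptions, not the actual pg_stat_statements export format:

```python
# Join pg_stat_statements-style counters from two environments on the query
# text, so they can be compared without relying on environment-specific
# queryids. Hypothetical row shape, for illustration only.
def join_by_query_text(prod_rows, preprod_rows):
    prod = {r["query"]: r for r in prod_rows}
    joined = []
    for row in preprod_rows:
        match = prod.get(row["query"])
        if match is not None:
            joined.append({
                "query": row["query"],
                "prod_mean_time": match["mean_time"],
                "preprod_mean_time": row["mean_time"],
            })
    return joined

prod = [{"query": "select * from t where id = $1", "mean_time": 1.2}]
preprod = [{"query": "select * from t where id = $1", "mean_time": 4.8}]
print(join_by_query_text(prod, preprod))
```

This is essentially what a "cross table" in a spreadsheet would do, only without the manual work; queries whose text differs between environments simply fall out of the join.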
Re: [survey] New "Stable" QueryId based on normalized query text
maybe this patch (with a GUC) https://www.postgresql.org/message-id/55e51c48.1060...@uptime.jp would be enough for those actually using a text normalization function.
Re: [survey] New "Stable" QueryId based on normalized query text
Julien Rouhaud wrote
> On Wed, Mar 20, 2019 at 8:39 PM legrand legrand <legrand_legrand@> wrote:
>> Yes, I would like first to understand what the main needs are,
>
> I don't really see one implementation that suits every need, as
> probably not everyone will agree on using the relation name vs the
> fully qualified relation name, for starters. The idea to take into
> account or normalise comments will also probably require a lot of
> argumentation to reach a consensus.
>
> Also, most of what's listed here would require a catcache lookup for
> every object impacted in a query, at every execution. That would be
> *super* expensive (at least for OLTP workloads). As far as the need is
> to gather statistics like pg_stat_statements and similar extensions
> are doing, current queryid semantics and underlying limitations are not
> enough of a problem to justify paying that price IMHO. Or do you have
> other needs and/or problems that really can't be solved with the
> current implementation?
>
> In other words, my first need would be to be able to deactivate
> everything that would make queryid computation significantly more
> expensive than it is today, and/or to be able to replace it with a
> third-party implementation.
>
>>> needs.1: stable across different databases,
>> [...]
>>> needs.7: same value on both the primary and standby?
>> I would say yes (I don't use standby); isn't this included in needs.1?
>
> Physical replication servers have identical oids, so identical
> queryids. That's obviously not true for logical replication.

From my personal point of view, I need to get the same QueryId between (OLAP) environments, to be able to compare Production, Pre-production and Qualification performances (and I don't need fully qualified relation names). Today, to do that, I'm using a custom extension computing the QueryId based on the normalized query text.

This approach seems very popular, and maybe including it in core (as a dedicated extension), or proposing an alternate queryid (based on relation names) in pgss (GUC enabled), would fulfill 95% of the needs ...

I agree with you on the last point: being able to replace the actual QueryId with a third-party implementation IS the first need.

Regards
PAscal
Re: [survey] New "Stable" QueryId based on normalized query text
> From "Kyotaro HORIGUCHI-2" >>At Wed, 20 Mar 2019 00:23:30 +, "Tsunakawa, Takayuki" >>> From: legrand legrand [mailto:legrand_legrand@] >>> norm.9: comments aware >> Is this to distinguish queries that have different comments for optimizer >> hints? If yes, I agree. > Or, any means to give an explict query id? I saw many instances > of query that follows a comment describing a query id. Yes, in fact didn't thought about different kink of comments, but all of them >> needs.3: search_path / schema independant, >pg_stat_statements even ignores table/object/column names. there is a very interesting thread about that in "pg_stat_statements and non default search_path" https://www.postgresql.org/message-id/8f54c609-17c6-945b-fe13-8b07c0866...@dalibo.com expecting distinct QueryIds when using distinct schemas ... maybe that It should be switched to "Schema dependant" >> needs.4: pg version independant (as long as possible), >I don't think this cannot be guaranteed. maybe using a QueryId versioning GUC >> norm.1: case insensitive >> norm.2: blank reduction >> norm.3: hash algoritm ? >> norm.4: CURRENT_DATE, CURRENT_TIME, LOCALTIME, LOCALTIMESTAMP not >> normalized >> norm.5: NULL, IS NULL not normalized ? >> norm.6: booleans t, f, true, false not normalized >> norm.7: order by 1,2 or group by 1,2 should not be normalized >> norm.8: pl/pgsql anonymous blocks not normalized > pg_stat_statements can be the base of the discussion on them. OK Regards PAscal -- Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
RE: [survey] New "Stable" QueryId based on normalized query text
> From: "Tsunakawa, Takayuki" >> From: legrand legrand [mailto:legrand_legrand@] >> There are many projects that use alternate QueryId >> distinct from the famous pg_stat_statements jumbling algorithm. >I'd like to welcome the standard QueryID that DBAs and extension developers can depend on. >Are you surveying the needs for you to develop the QueryID that can meet as many needs as possible? Yes, I would like first to understand what are the main needs, then identify how it would be possible to implement it in core, in a new extension or simply with a modified pg_stat_statements. (I'm just a DBA not a C developer, so it will only be restricted to very simple enhancements) >> needs.1: stable accross different databases, >Does this mean different database clusters, not different databases in a single database cluster? Same looking query should give same QueryId on any database (in the same cluster or in distinct clusters). It can be differentiated with dbid. >needs.5: minimal overhead to calculate OK will add it >needs.6: doesn't change across database server restarts Really ? does this already occurs ? >needs.7: same value on both the primary and standby? I would say yes (I don't use standby) isn't this included into needs.1 ? >> norm.9: comments aware >Is this to distinguish queries that have different comments for optimizer hints? If yes, I agree. Yes and others like playing with : set ... select /* test 1*/ ... set ... select /* test 2*/ ... -- Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
Hi Jim, Robert, as this is a distinct subject from adding the QueryId to pg_stat_activity, would it be possible to continue the discussion on a "new QueryId definition" (for postgres open source software) here: https://www.postgresql.org/message-id/1553029215728-0.p...@n3.nabble.com

Thanks in advance.
Regards
PAscal
[survey] New "Stable" QueryId based on normalized query text
Hello, there are many projects that use an alternate QueryId distinct from the famous pg_stat_statements jumbling algorithm:
- https://github.com/postgrespro/aqo (query_hash)
- https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/AuroraPostgreSQL.Optimize.ViewPlans.html (sql_hash)
- https://github.com/ossc-db/pg_hint_plan (queryid)

Even pg_stat_statements has a normalize function that would answer the current question ...

Here are some *needs*:
- needs.1: stable across different databases,
- needs.2: doesn't change after a database or object rebuild,
- needs.3: search_path / schema independent,
- needs.4: pg version independent (as long as possible),
- ...

and some *normalization rules*:
- norm.1: case insensitive
- norm.2: blank reduction
- norm.3: hash algorithm?
- norm.4: CURRENT_DATE, CURRENT_TIME, LOCALTIME, LOCALTIMESTAMP not normalized
- norm.5: NULL, IS NULL not normalized?
- norm.6: booleans t, f, true, false not normalized
- norm.7: order by 1,2 or group by 1,2 should not be normalized
- norm.8: pl/pgsql anonymous blocks not normalized
- norm.9: comments aware

Do not hesitate to add your thoughts.

Regards
PAscal
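As an illustration only (this is not code from any of the projects cited above), a text-based QueryId applying norm.1 (case insensitive), norm.2 (blank reduction) and a crude constant-replacement step could be sketched like this; the regexes and the choice of md5 (norm.3 is an open question) are assumptions:

```python
import hashlib
import re

def stable_queryid(query):
    """Hash a crudely normalized query text into a 64-bit id (sketch only)."""
    q = query.lower()                   # norm.1: case insensitive
    q = re.sub(r"'[^']*'", "?", q)      # replace string literals with a marker
    q = re.sub(r"\b\d+\b", "?", q)      # replace numeric literals with a marker
    q = re.sub(r"\s+", " ", q).strip()  # norm.2: blank reduction
    return int(hashlib.md5(q.encode()).hexdigest()[:16], 16)

# Two spellings of the same statement get the same id:
a = stable_queryid("SELECT * FROM t  WHERE id = 1")
b = stable_queryid("select * from t where id = 42")
print(a == b)  # True
```

A real implementation would need to handle the other rules too (norm.4 through norm.9 are not addressed here), which is exactly why the survey asks which of them matter.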
Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
Great, thank you Julien!

Would it make sense to add it to auto_explain? I don't know for explain itself, but maybe ...

Regards
PAscal
Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
Hello, this is available in the https://github.com/legrandlegrand/pg_stat_sql_plans extension, with a specific function pgssp_backend_queryid(pid) that makes it possible to join pg_stat_activity with pg_stat_sql_plans (which is similar to pg_stat_statements), and also to collect samples of wait events per query id.

This extension computes its own queryid based on a normalized query text (which doesn't change after a table drop/create). Maybe the queryid calculation should stay in a dedicated extension, letting users choose their own queryid definition.

Regards
PAscal
Re: any plan to support shared servers like Oracle in PG?
There already are solutions for this feature in Postgres, under the "connection pooler" wording; see
- pgpool: http://www.pgpool.net/mediawiki/index.php/Main_Page
- pgbouncer: https://pgbouncer.github.io/

There are also discussions about including this as a core feature: https://www.postgresql.org/message-id/flat/4b971a8f-ff61-40eb-8f30-7b57eb0fdf9d%40postgrespro.ru
Re: Planning counters in pg_stat_statements (using pgss_store)
Thank you Sergei for your comments.

> Did you register the patch in the CF app? I did not find an entry.

Created today: https://commitfest.postgresql.org/22/1999/

> Currently we have pg_stat_statements version 1.7 and this patch does not apply ...

Will rebase and create a 1.8 version.

> - errmsg("could not read file \"%s\": %m",
> + errmsg("could not read pg_stat_statement file \"%s\": %m",

This is a mistake, will fix.

> +#define PG_STAT_STATEMENTS_COLS_V1_4 25

I thought it was needed when adding new columns, isn't it?

> And this patch does not have documentation changes.

Will fix, and will provide some kind of benchmark to compare with the actual version.

Regards
PAscal
Re: Planning counters in pg_stat_statements (using pgss_store)
Hi Sergei, thank you for this review!

> Did you register the patch in the CF app? I did not find an entry.

I think it is related to https://commitfest.postgresql.org/16/1373/ but I don't know how to link it to that entry.

Yes, there are many things to improve, but before going deeper, I would like to know if this way of doing it (with 3 accesses to the pgss hash) has a chance to reach consensus.

Regards
PAscal
Re: explain plans with information about (modified) gucs
Tomas Vondra-4 wrote
> Hello Sergei,
>
>> This patch correlates with my proposal
>> "add session information column to pg_stat_statements"
>> https://www.postgresql.org/message-id/3aa097d7-7c47-187b-5913-db8366cd4491%40gmail.com
>> They both address the problem of identifying the factors that make
>> different execution plans for the same SQL statements. You are
>> interested in the current settings that affect the execution plan; I'm
>> concerned about historical data in pg_stat_statements. From my
>> experience the most often offending settings are
>> current_schemas/search_path and current_user. Please have in mind that
>> probably the same approach that you will use to extend explain plan
>> functionality will eventually be implemented to extend
>> pg_stat_statements.
>
> Possibly, although I don't have an ambition to export the GUCs into
> pg_stat_statements in this patch. There's an issue with merging
> different values of GUCs in different executions of a statement, and
> it's unclear how to solve that.
>
> [...]
>
> Tomas Vondra http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

This "explain with GUCs" feature can very easily be included for historical data management in the pg_store_plans or pg_stat_sql_plans extensions (which both use a planid based on the normalized plan text).

Regards
PAscal
Planning counters in pg_stat_statements (using pgss_store)
Starting from https://www.postgresql.org/message-id/CAEepm%3D2vORBhWQZ1DJmKXmCVi%2B15Tgrv%2B9brHLanWU7XE_FWxQ%40mail.gmail.com

Here is a patch trying to implement what was proposed by Tom Lane:

"What we could/should do instead, I think, is have pgss_planner_hook make its own pgss_store() call to log the planning time results (which would mean we don't need the added PlannedStmt field at all). That would increase the overhead of this feature, which might mean that it'd be worth having a pg_stat_statements GUC to enable it. But it'd be a whole lot cleaner."

Now:
- pgss_post_parse_analyze initializes the pgss entry with the sql text,
- pgss_planner_hook adds planning_time and counts,
- pgss_ExecutorEnd works unchanged,

but it doesn't include any pg_stat_statements GUC to enable it yet.

Note: I didn't understand the sentence "which would mean we don't need the added PlannedStmt field at all".

Regarding the initial remark from Thomas Munro: "I agree with the sentiment on the old thread that {total,min,max,mean,stddev}_time now seem badly named, but adding execution makes them so long... Thoughts?"

What would you think about:
- userid
- dbid
- queryid
- query
- plans
- plan_time
- {min,max,mean,stddev}_plan_time
- calls
- exec_time
- {min,max,mean,stddev}_exec_time
- total_time (being the sum of plan_time and exec_time)
- rows
- ...

Regards
PAscal

pgss_add_planning_counters.diff
Description: pgss_add_planning_counters.diff