Re: Query about pg asynchronous processing support

2021-03-26 Thread legrand legrand
Hi,

You should look for information about PostgreSQL hooks like those used in
the pg_stat_statements extension
https://github.com/postgres/postgres/blob/master/contrib/pg_stat_statements/pg_stat_statements.c

And about background worker capabilities such as those used in the pg_background extension
https://github.com/vibhorkum/pg_background

There are also extensions working with indexes like hypopg
https://github.com/HypoPG/hypopg

Good luck 
Regards
PAscal



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: WIP: System Versioned Temporal Table

2021-01-15 Thread legrand legrand
Hello,

it seems that Oracle (11gR2) doesn't add the start and end timestamp columns,
and permits statements like

select * from tt
union
select * from tt
AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '6' SECOND)
minus 
select * from tt
VERSIONS BETWEEN TIMESTAMP (SYSTIMESTAMP - INTERVAL '6' second) and
SYSTIMESTAMP;

Regards
PAscal







Re: pg_stat_statements oddity with track = all

2020-12-03 Thread legrand legrand
Hi Julien,

> The extra field I've proposed would increase the number of records, as it
> needs to be a part of the key.

An increase in the number of records means that the same statement
would appear at top level AND at nested level. This seems a corner case with
a very low (negligible) occurrence rate. Did I miss something?

Regards
PAscal 







Re: pg_stat_statements oddity with track = all

2020-12-02 Thread legrand legrand
Hi,

a crazy idea:
- add a parent_statement_id column that would be NULL for top level queries
- build statement_id for nested queries based on the merge of:
   a/ current_statement_id and parent one
or
   b/ current_statement_id and nested level.

this would offer the ability to track counters at any depth level ;o)
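A quick illustrative sketch of options a/ and b/ (Python, purely hypothetical — pgss would do this in C, and the function name and hashing scheme here are my assumptions, not pgss code):

```python
import hashlib
import struct

def merged_statement_id(current_id: int, salt: int) -> int:
    """Derive a nested statement_id by hashing the current queryid together
    with either the parent statement_id (option a/) or the nesting level
    (option b/), passed here as `salt`."""
    digest = hashlib.sha1(struct.pack("<QQ", current_id, salt)).digest()
    return struct.unpack("<Q", digest[:8])[0]  # keep 64 bits, like queryids

# Option a/: merge with the parent statement_id (NULL/0 for top level).
parent_id = merged_statement_id(0x11AA, 0)
nested_a = merged_statement_id(0x22BB, parent_id)

# Option b/: merge with the nesting level instead.
nested_b = merged_statement_id(0x22BB, 1)

# Same inputs always give the same id; different parents/levels differ,
# so counters can be tracked at any depth.
assert merged_statement_id(0x22BB, parent_id) == nested_a
assert nested_a != nested_b
```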
Regards
PAscal







Re: Implementing Incremental View Maintenance

2020-11-11 Thread legrand legrand
Hello Konstantin,
I remember testing it with pg_stat_statements (and planning counters
enabled). Maybe identifying the internal queries associated with this (simple)
test case could help the dev team?
Regards
PAscal







Re: Is it useful to record whether plans are generic or custom?

2020-09-28 Thread legrand legrand
Hi Atsushi,

+1: your proposal is a good answer for time-based performance analysis
(even if parsing duration or blks are not differentiated).

As it makes the pgss column list much wider, maybe another solution
would be to create a pg_stat_statements_xxx view with the same key
as pgss (dbid, userid, queryid) and all those new counters.

And a last solution would be to display only generic counters,
because in most cases (and by default) executions are custom ones
(and generic counters = 0).
If not (when generic counters != 0), custom ones could be deduced from
total_exec_time - total_generic_time, calls - generic_calls.
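That deduction is simple arithmetic; a toy sketch of it (Python, made-up numbers, column names following the proposal above):

```python
def deduce_custom(total_exec_time, total_generic_time, calls, generic_calls):
    """Deduce custom-plan counters when only totals and generic counters
    are stored, as suggested above."""
    return {
        "custom_time": total_exec_time - total_generic_time,
        "custom_calls": calls - generic_calls,
    }

# 10 calls taking 50 ms in total, of which 4 generic calls took 30 ms.
row = deduce_custom(50.0, 30.0, 10, 4)
assert row == {"custom_time": 20.0, "custom_calls": 6}
```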

Good luck with this feature's development
Regards
PAscal







Re: enable pg_stat_statements to track rows processed by REFRESH MATERIALIZED VIEW

2020-09-25 Thread legrand legrand
oops, sorry
so +1 for this fix

Regards
PAscal







Re: enable pg_stat_statements to track rows processed by REFRESH MATERIALIZED VIEW

2020-09-25 Thread legrand legrand
Hi,

isn't this already fixed in pg14
https://www.postgresql.org/message-id/e1k0mzg-0002vn...@gemulon.postgresql.org
?

Regards
PAscal









Re: [PATCH] Add features to pg_stat_statements

2020-09-24 Thread legrand legrand
Not limited to 3, like a history table.

Will have to think about whether data should be saved at shutdown.







Re: [PATCH] Add features to pg_stat_statements

2020-09-23 Thread legrand legrand
Hi,
Both are probably not needed.
I would prefer it accessible in a view, like:

Event | date | victims
Eviction ...
Reset ...
Part reset ...

As there are other needs to track reset times.
Regards
PAscal








Re: [PATCH] Add features to pg_stat_statements

2020-09-19 Thread legrand legrand
+1 !

Another way is to log evictions; it provides information about time and
amount:

for (i = 0; i < nvictims; i++)
{
    hash_search(pgssp_hash, &entries[i]->key, HASH_REMOVE, NULL);
}

pfree(entries);

/*
 * Trace when evicting entries; if this happens too often it can slow
 * down queries: increasing the pg_stat_sql_plans.max value could help.
 */
ereport(LOG,
        (errmsg("pg_stat_sql_plans evicting %d entries", nvictims),
         errhidecontext(true), errhidestmt(true)));







RE: Is it useful to record whether plans are generic or custom?

2020-08-17 Thread legrand legrand

>>>> I thought it might be preferable to make a GUC to enable
>>>> or disable this feature, but changing the hash key makes
>>>> it harder.
>>
>>> What happens if the server was running with this option enabled and then
>>> restarted with the option disabled? Firstly two entries for the same query
>>> were stored in pgss because the option was enabled. But when it's disabled
>>> and the server is restarted, those two entries should be merged into one
>>> at the startup of server? If so, that's problematic because it may take
>>> a long time.
>>
>> What would you think about a third value for this flag to handle the disabled
>> case (no need to merge entries in this situation)?
>
> Sorry I failed to understand your point. You mean that we can have another 
> flag
> to specify whether to merge the entries for the same query at that case or 
> not?
>
> If those entries are not merged, what does pg_stat_statements return?
> It returns the sum of those entries? Or either generic or custom entry is
> returned?

The idea is to use a plan_type enum with values 'generic', 'custom', and
'unknown' (or 'unset').
If tracking plan_type is disabled, then plan_type = 'unknown';
else plan_type takes the 'generic' or 'custom' value.

I'm not proposing to merge results per plan_type when disabling or enabling its
tracking.
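The decision logic of that tri-state is trivial; a Python sketch (illustrative only — the real value would live in the C pgss entry, and the value names follow the proposal above):

```python
from enum import Enum

class PlanType(Enum):
    UNKNOWN = "unknown"  # tracking disabled: nothing to merge on restart
    GENERIC = "generic"
    CUSTOM = "custom"

def plan_type(tracking_enabled: bool, is_generic: bool) -> PlanType:
    """Resolve plan_type as proposed: 'unknown' when tracking is disabled,
    otherwise 'generic' or 'custom'."""
    if not tracking_enabled:
        return PlanType.UNKNOWN
    return PlanType.GENERIC if is_generic else PlanType.CUSTOM

assert plan_type(False, True) is PlanType.UNKNOWN
assert plan_type(True, True) is PlanType.GENERIC
assert plan_type(True, False) is PlanType.CUSTOM
```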


>>> Therefore I think that it's better and simple to just expose the number of
>>> times generic/custom plan was chosen for each query.
>>
>> I thought this feature was mainly needed to identify "under-optimal generic
>> plans". Without differentiating execution counters, this will be simpler but
>> useless in this case ... isn't it ?

> Could you elaborate "under optimal generic plans"? Sorry, I failed to
> understand your point.. But maybe you're thinking to use this feature to
> check which generic or custom plan is better for each query?

> I was just thinking that this feature was useful to detect the case where
> the query was executed with an unexpected plan mode. That is, we can detect
> the unexpected case where the query that should be executed with generic
> plan is actually executed with custom plan, and vice versa.

There are many examples in the pg lists where users come saying that sometimes
their query takes a (very much) longer time than before, without understanding
why.
In some of those examples, the cause is a switch from custom to generic plan
after
n executions, and it takes longer because the generic plan is not as good as the
custom one (I call these under-optimal generic plans). If pgss keeps counters
aggregated for both plan types, I don't see how this (under-optimality) can be
shown.
If there is one line in pgss for custom and another for generic, then it would
be easier
to compare.

Does this make sense?

Regards
PAscal

> Regards,
> --
> Fujii Masao
> Advanced Computing Technology Center
> Research and Development Headquarters
> NTT DATA CORPORATION


nested queries vs. pg_stat_activity

2020-08-10 Thread legrand legrand
Hello,

Another solution is to expose the nested queryid and join it with
pg_stat_statements.
The current development trying to add queryid to pg_stat_activity isn't helpful,
because it only exposes the top-level one.
The pg_stat_sql_plans extension (on GitHub) proposes a function called
pg_backend_queryid(pid),
which gives the expected queryid (stored in shared memory for each
backend) ...

Regards
PAscal


RE: Is it useful to record whether plans are generic or custom?

2020-07-30 Thread legrand legrand
>> Main purpose is to decide (1) the user interface and (2) the
>> way to get the plan type from pg_stat_statements.
>>
>> (1) the user interface
>> I added a new boolean column 'generic_plan' to both
>> pg_stat_statements view and the member of the hash key of
>> pg_stat_statements.
>>
>> This is because as Legrand pointed out the feature seems
>> useful under the condition of differentiating all the
>> counters for a queryid using a generic plan and the one
>> using a custom one.

> I don't like this because this may double the number of entries in pgss.
> Which means that the number of entries can more easily reach
> pg_stat_statements.max and some entries will be discarded.

Not all the entries will be doubled, only the ones that are prepared.
And even if auto prepare was implemented, having 5000, 1 or 2
max entries seems not a problem to me.

>> I thought it might be preferable to make a GUC to enable
>> or disable this feature, but changing the hash key makes
>> it harder.

> What happens if the server was running with this option enabled and then
> restarted with the option disabled? Firstly two entries for the same query
> were stored in pgss because the option was enabled. But when it's disabled
> and the server is restarted, those two entries should be merged into one
> at the startup of server? If so, that's problematic because it may take
> a long time.

What would you think about a third value for this flag to handle the disabled
case (no need to merge entries in this situation)?

> Therefore I think that it's better and simple to just expose the number of
> times generic/custom plan was chosen for each query.

I thought this feature was mainly needed to identify "under-optimal generic
plans". Without differentiating execution counters, this will be simpler but
useless in this case ... isn't it ?

> Regards,

> --
> Fujii Masao
> Advanced Computing Technology Center
> Research and Development Headquarters
> NTT DATA CORPORATION

Regards
PAscal


Re: Planning counters in pg_stat_statements (using pgss_store)

2020-05-22 Thread legrand legrand
>> If we can store the plan for each statement, e.g., like pg_store_plans
>> extension [1] does, rather than such partial information, which would
>> be enough for your cases?

> That'd definitely address way more use cases.  Do you know if some
> benchmark were done to see how much overhead such an extension adds?

Hi Julien,
Did you ask how much overhead auto_explain adds?

The only extension that proposed storing plans with a decent planid
calculation was pg_stat_plans, which has not been compatible with recent
pg versions for years.

We all know here that pg_store_plans, pg_show_plans, and (my) pg_stat_sql_plans
use ExplainPrintPlan through an executor hook, and that Explain is slow ...

Explain is slow because it was not designed for performance:
1/ colname_is_unique
see
https://www.postgresql-archive.org/Re-Explain-is-slow-with-tables-having-many-columns-td6047284.html

2/ hash_create from set_rtable_names
Look with perf top at:
   do $$ declare i int; begin for i in 1..100 loop execute 'explain
select 1'; end loop; end; $$;

I may propose a "minimal" explain that only displays the explain backbone and
is much faster;
see
https://github.com/legrandlegrand/pg_stat_sql_plans/blob/perf-explain/pgssp_explain.c
 
3/ All those extensions rebuild the explain output even for cached-plan
queries ...
A way to optimize this would be to build a planid during planning (using the
associated hook).

4/ All those extensions try to rebuild the explain plan even for trivial
queries/plans
like "select 1" or "insert into t values (...)", and that's not great for
highly transactional
applications ...

So yes, pg_store_plans is one of the short-term answers to Andy Fan's needs;
the answer for the long term would be to help extensions build a planid and
store plans,
by **adding a planid field to the plannedstmt memory structure** and/or
optimizing the explain command ;o)
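To illustrate point 3/, here is a toy of what hashing a plan "backbone" at planning time could look like (Python; the dict-based plan tree and function name are invented for this example and are not code from any of the extensions mentioned):

```python
import hashlib

def plan_backbone_id(node: dict) -> int:
    """Illustrative planid: hash only the plan backbone (node types, in
    tree order), skipping expressions and column lists -- the expensive
    parts of the Explain output."""
    backbone = [node["type"]]
    for child in node.get("children", []):
        backbone.append(hex(plan_backbone_id(child)))
    digest = hashlib.sha1("/".join(backbone).encode()).digest()
    return int.from_bytes(digest[:8], "little")  # 64 bits, like a queryid

agg_seqscan = {"type": "Aggregate", "children": [{"type": "Seq Scan"}]}
agg_idxscan = {"type": "Aggregate", "children": [{"type": "Index Only Scan"}]}

# Identical backbones share a planid; a different access path changes it,
# so the explain text only needs to be produced once per (queryid, planid).
assert plan_backbone_id(agg_seqscan) == plan_backbone_id(
    {"type": "Aggregate", "children": [{"type": "Seq Scan"}]})
assert plan_backbone_id(agg_seqscan) != plan_backbone_id(agg_idxscan)
```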

Regards
PAscal







Re: Is it useful to record whether plans are generic or custom?

2020-05-16 Thread legrand legrand
> To track executed plan types, I think execution layer hooks
> are appropriate.
> These hooks, however, take QueryDesc as a param and it does
> not include cached plan information.

It seems that the same QueryDesc entry is reused when executing
a generic plan.
For example, a pseudo tag written to queryDesc->plannedstmt->queryId (outside
pg_stat_statements) during ExecutorStart
reappears in later executions with generic plans ...

Is this QueryDesc reusable by a custom plan? If not, maybe a solution
could be to add a flag in queryDesc->plannedstmt?

(sorry, I haven't understood yet how and when this generic plan is 
managed during planning)

Regards
PAscal







Re: Is it useful to record whether plans are generic or custom?

2020-05-13 Thread legrand legrand
Hello,

yes this can be useful, under the condition of differentiating all the
counters
for a queryid using a generic plan from the one using a custom one.

For me, one way to do that is adding a generic_plan column to the
pg_stat_statements key, something like:
- userid,
- dbid,
- queryid,
- generic_plan

I don't know if generic/custom plan types are available during planning
and/or
execution hooks ...
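A toy model of that key (Python dict; the ids are hypothetical) showing how a prepared statement executed both ways would get two rows:

```python
from collections import defaultdict

# Counters keyed as proposed: (userid, dbid, queryid, generic_plan).
pgss = defaultdict(int)

def record_execution(userid, dbid, queryid, generic_plan):
    pgss[(userid, dbid, queryid, generic_plan)] += 1

# By default a prepared statement gets 5 custom plans before the planner
# may switch to a generic one, so both rows can appear for one queryid.
for generic in (False, False, False, False, False, True, True):
    record_execution(10, 1, 0xC0FFEE, generic)

assert pgss[(10, 1, 0xC0FFEE, False)] == 5  # custom executions
assert pgss[(10, 1, 0xC0FFEE, True)] == 2   # generic executions
assert len(pgss) == 2                       # the entry doubled for this query
```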

Regards
PAscal








Re: Planning counters in pg_stat_statements (using pgss_store)

2020-04-08 Thread legrand legrand
Fujii Masao-4 wrote
> On 2020/04/03 16:26
> [...]
>> 
>> "Note that planning and execution statistics are updated only at their
>> respective end phase, and only for successful operations.
>> For example the execution counters of a long running query
>> will only be updated at the execution end, without showing any progress
>> report before that.
> 
> Probably since this is not the example for explaining the relationship of
> planning and execution stats, it's better to explain this separately or
> just
> drop it?
> 
> What about the attached patch based on your proposals?

+1
Your patch is perfect ;^>

Regards
PAscal







Re: Planning counters in pg_stat_statements (using pgss_store)

2020-04-02 Thread legrand legrand
Fujii Masao-4 wrote
> On 2020/04/01 18:19, Fujii Masao wrote:
> 
> Finally I pushed the patch!
> Many thanks for all involved in this patch!
> 
> As a remaining TODO item, I'm thinking that the document would need to
> be improved. For example, previously the query was not stored in pgss
> when it failed. But, in v13, if pgss_planning is enabled, such a query is
> stored because the planning succeeds. Without the explanation about
> that behavior in the document, I'm afraid that users will get confused.
> Thought?
> 
> Regards,
> 
> -- 
> Fujii Masao
> Advanced Computing Technology Center
> Research and Development Headquarters
> NTT DATA CORPORATION

Thank you all for this work, and especially to Julien for his major
contribution!

Regarding the TODO point: Yes I agree that it can be improved.
My proposal:

"Note that planning and execution statistics are updated only at their
respective end phase, and only for successful operations.
For example, the execution counters of a long-running SELECT query
will only be updated at the execution end, without showing any progress
report in the interval.
Another example: if the statement is successfully planned but fails in
the execution phase, only its planning statistics are stored.
This may give uncorrelated plans vs calls information."

Regards
PAscal








Re: Patch: to pass query string to pg_plan_query()

2020-03-27 Thread legrand legrand
Tom Lane-2 wrote
> Fujii Masao 

> masao.fujii@.nttdata

>  writes:
>> Does anyone object to this patch? I'm thinking to commit it separetely
>> at first before committing the planning_counter_in_pg_stat_statements
>> patch.
> 
> I took a quick look through v9-0001-Pass-query-string-to-the-planner.patch
> and it's fine by me.  It also matches up with something I've wanted to
> do for awhile, which is to make the query string available during
> planning and execution so that we can produce error cursors for
> run-time errors, when relevant.
> 
> [...]
> 
>   regards, tom lane


Great !
Good news ;o)

Regards
PAscal







Re: Allow auto_explain to log plans before queries are executed

2020-03-27 Thread legrand legrand
Kyotaro Horiguchi-4 wrote
> At Thu, 27 Feb 2020 06:27:24 +0100, Pavel Stehule 

> pavel.stehule@

>  wrote in 
>> odesílatel Kyotaro Horiguchi 

> horikyota.ntt@

> 
>> napsal:
> 
> If we need a live plan dump of a running query, We could do that using
> some kind of inter-backend triggering. (I'm not sure if PG offers
> inter-backend signalling facility usable by extensions..)
> 
> =# select auto_explain.log_plan_backend(12345);
> 
> postgresql.log:
>  LOG: requested plan dump: blah, blah..
> 
> regards.
> 
> -- 
> Kyotaro Horiguchi
> NTT Open Source Software Center

Did you know
https://www.postgresql-archive.org/pg-show-plans-Seeing-all-execution-plans-at-once-td6129231.html
?

Regards
PAscal







Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-16 Thread legrand legrand
> I'm instead attaching a v7 which removes the assert in pg_plan_query, and
> modify pgss_planner_hook to also ignore queries without a query text, as
> this
> seems the best option. 

Ok, it was the second solution, go on !







RE: Planning counters in pg_stat_statements (using pgss_store)

2020-03-14 Thread legrand legrand
> I don't know it is useful but there are also codes that avoid an error when
> sourceText is NULL.

> executor_errposition(EState *estate, int location)
> {
> ...
>/* Can't do anything if source text is not available */
>if (estate == NULL || estate->es_sourceText == NULL)
> }


or maybe would you prefer to replace the non-zero queryid test
with a non-NULL sourceText one?







RE: Planning counters in pg_stat_statements (using pgss_store)

2020-03-14 Thread legrand legrand
imai.yoshik...@fujitsu.com wrote
> On Thu, Mar 12, 2020 at 6:37 PM, Julien Rouhaud wrote:
>> On Thu, Mar 12, 2020 at 1:11 PM Marco Slot 

> marco.slot@

>  wrote:
>> > On Thu, Mar 12, 2020 at 11:31 AM Julien Rouhaud 

> rjuju123@

> 
>> wrote:
>> > > There's at least the current version of IVM patchset that lacks the
>> > > querytext.  Looking at various extensions, I see that pg_background
>> > > and pglogical call pg_plan_query internally but shouldn't have any
>> > > issue providing the query text.  But there's also citus extension,
>> > > which don't keep around the query string at least when distributing
>> > > plans, which makes sense since it's of no use and they're heavily
>> > > modifying the original Query.  I think that citus folks opinion on
>> > > the subject would be useful, so I'm Cc-ing Marco.
>> >
>> > Most of the time we keep our Query * data structures in a form that
>> > can be deparsed back into a query string by a modified copy of
>> > ruleutils.c, so we could generate a correct query string if absolutely
>> > necessary, though there are performance-sensitive cases where we'd
>> > rather not have the deparsing overhead.
>> 
>> Yes, deparsing is probably too expensive for that use case.
>> 
>> > In case of INSERT..SELECT into a distributed table, we might call
>> > pg_plan_query on the SELECT part (Query *) and send the output into a
>> > DestReceiver that sends tuples into shards of the distributed table
>> > via COPY. The fact that SELECT does not show up in pg_stat_statements
>> > separately is generally fine because it's an implementation detail,
>> > and it would probably be a little confusing because the user never ran
>> > the SELECT query. Moreover, the call to pg_plan_query would already be
>> > reflected in the planning or execution time of the top-level query, so
>> > it would be double counted if it had its own entry.
>> 
>> Well, surprising statements can already appears in pg_stat_statements
>> when
>> you use some psql features, or if you have triggers as those will run
>> additional
>> queries under the hood.
>> 
>> The difference here is that since citus is a CustomNode, underlying calls
>> to
>> planner will be accounted for that node, and that's indeed annoying.  I
>> can see
>> that citus is doing some calls to spi_exec or
>> Executor* (in ExecuteLocalTaskPlan), which could also trigger
>> pg_stat_statements, but I don't know if a queryid is present there.
>> 
>> > Another case is when some of the shards turn out to be local to the
>> > server that handles the distributed query. In that case we plan the
>> > queries on those shards via pg_plan_query instead of deparsing and
>> > sending the query string to a remote server. It would be less
>> > confusing for these queries to show in pg_stat_statements, because the
>> > queries on the shards on remote servers will show up as well. However,
>> > this is a performance-sensitive code path where we'd rather avoid
>> > deparsing.
>> 
>> Agreed.
>> 
>> > In general, I'd prefer if there was no requirement to pass a correct
>> > query string. I'm ok with passing "SELECT 'citus_internal'" or just ""
>> > if that does not lead to issues. Passing NULL to signal that the
>> > planner call should not be tracked separately does seem a bit cleaner.
>> 
>> That's very interesting feedback, thanks!  I'm not a fan of giving a way
>> for
>> queries to say that they want to be ignored by pg_stat_statements, but
>> double
>> counting the planning counters seem even worse, so I'm +0.5 to accept
>> NULL
>> query string in the planner, incidentally making pgss ignore those.
> 
> It is preferable that we can count various queries statistics as much as
> possible
> but if it causes overhead even when without using pg_stat_statements, we
> would
> not have to force them to set appropriate query_text.
> About settings a fixed string in query_text, I think it doesn't make much
> sense
> because users can't take any actions by seeing those queries' stats.
> Moreover, if
> we set a fixed string in query_text to avoid pg_stat_statement's errors,
> codes
> would be inexplicable for other developers who don't know there's such
> requirements.
> After all, I agree accepting NULL query string in the planner.
> 
> I don't know it is useful but there are also codes that avoid an error
> when
> sourceText is NULL.
> 
> executor_errposition(EState *estate, int location)
> {
> ...
> /* Can't do anything if source text is not available */
> if (estate == NULL || estate->es_sourceText == NULL)
> }
> 
> --
> Yoshikazu Imai

Hello Imai,

My understanding is that the V5 patch, which checks for a non-zero queryid,
properly manages the case where sourceText is NULL.

A NULL sourceText means that there was no parsing for the associated
query; if there was no parsing, there is no queryid (queryId = 0)
and no planning counters update.

It doesn't change pg_plan_query behaviour (no regression for Citus, IVM,
...),
and was tested with success for IVM.

If my understanding 

Patch: to pass query string to pg_plan_query()

2020-03-09 Thread legrand legrand
Hello,

This is a call for committers, reviewers and users,
regarding the "planning counters in pg_stat_statements"
patch [1], but not only that.

Historically, this version of pg_stat_statements
with planning counters performed 3 calls to
pgss_store() for non-utility statements:
1 - pgss_post_parse_analyze (init entry with queryid 
and store query text)
2 - pgss_planner_hook (to store planning counters)
3 - pgss_ExecutorEnd (to store execution counters)

Then a new version was proposed to remove one call 
to pgss_store() by adding the query string to the 
planner pg_plan_query():
1 - pgss_planner_hook (to store planning counters)
2 - pgss_ExecutorEnd (to store execution counters)

Many performance tests were performed, concluding
that there is no impact on this subject.

Patch "to pass query string to the planner" could be
committed by itself, and (maybe) used by other extensions.

If this were done, this new version of pgss with planning
counters could be committed as well, or even later
(being used as a non-core extension starting with pg13).

So please give us your feedback regarding this patch 
"to pass query string to the planner", if you have other 
use cases, or any comment regarding core architecture.

note:
A problem was discovered during IVM testing,
because some queries without SQL text were planned
without being parsed, finishing in pgss with a zero
queryid.

A workaround is to set track_planning = false;
we have chosen to fix that in pgss by ignoring
zero queryids inside pgss_planner_hook.

Thanks in advance
Regards
PAscal
  
[1] https://www.postgresql.org/message-id/20200309103142.GA45401%40nol







Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-05 Thread legrand legrand
Never mind ...

Please consider PG13 shortest path ;o)

My one is parse->queryId != UINT64CONST(0) in pgss_planner_hook().
It fixes the IVM problem (verified),
and keeps CTAS equal to pgss without planning counters (verified too).

Regards
PAscal










Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-02 Thread legrand legrand
Julien Rouhaud wrote
> On Sun, Mar 1, 2020 at 3:55 PM legrand legrand
> 

> legrand_legrand@

>  wrote:
>>
>> >> I like the idea of adding a check for a non-zero queryId in the new
>> >> pgss_planner_hook() (zero queryid shouldn't be reserved for
>> >> utility_statements ?).
>>
>> > Some assert hit later, I can say that it's not always true.  For
>> > instance a CREATE TABLE AS won't run parse analysis for the underlying
>> > query, as this has already been done for the original statement, but
>> > will still call the planner.  I'll change pgss_planner_hook to ignore
>> > such cases, as pgss_store would otherwise think that it's a utility
>> > statement.  That'll probably incidentally fix the IVM incompatibility.
>>
>> Today, with or without the test on parse->queryId != UINT64CONST(0),
>> CTAS is collected as a utility statement without planning counters.
>> This seems to me to respect the rule; not sure that this needs any
>> new (risky) change to the actual (quite stable) patch.
> 
> But the queryid ends up not being computed the same way:
> 
> # select queryid, query, plans, calls from pg_stat_statements where
> query like 'create table%';
>queryid   | query  | plans | calls
> -++---+---
>  8275950546884151007 | create table test as select 1; | 1 | 0
>  7546197440584636081 | create table test as select 1  | 0 | 1
> (2 rows)
> 
> That's because CreateTableAsStmt->query doesn't have a query
> location/len, as transformTopLevelStmt is only setting that for the
> top level Query.  That's probably an oversight in ab1f0c82257, but I'm
> not sure what's the best way to fix that.  Should we pass that
> information to all transformXXX function, or let transformTopLevelStmt
> handle that.


arf, this was not the case in my testing env (which is not up to date) :o(
and would not have appeared at all with the proposed test on
parse->queryId != UINT64CONST(0) ...








Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-01 Thread legrand legrand
>> I like the idea of adding a check for a non-zero queryId in the new
>> pgss_planner_hook() (zero queryid shouldn't be reserved for
>> utility_statements ?).

> Some assert hit later, I can say that it's not always true.  For
> instance a CREATE TABLE AS won't run parse analysis for the underlying
> query, as this has already been done for the original statement, but
> will still call the planner.  I'll change pgss_planner_hook to ignore
> such cases, as pgss_store would otherwise think that it's a utility
> statement.  That'll probably incidentally fix the IVM incompatibility. 

Today, with or without the test on parse->queryId != UINT64CONST(0),
CTAS is collected as a utility statement without planning counters.
This seems to me to respect the rule; not sure that this needs any
new (risky) change to the actual (quite stable) patch.









Re: Planning counters in pg_stat_statements (using pgss_store)

2020-02-28 Thread legrand legrand
Hi Julien,

>> But I would have prefered this new feature to work the same way with or
>> without track_planning activated ;o(

> Definitely, but fixing the issue in pgss (ignoring planner calls when
> we don't have a query text) means that pgss won't give an exhaustive
> view of activity anymore, so a fix in IVM would be a better solution.
> Let's wait and see if Nagata-san and other people involved in that
> have an opinion on it.

It seems the IVM team does not consider this point a priority ...
We should not wait for them if we want to keep a chance of being
included in PG13.

So we have to make this feature more robust, an assert failure being
considered a severe regression (even if it is not coming from pgss).

I like the idea of adding a check for a non-zero queryId in the new
pgss_planner_hook() (zero queryid shouldn't be reserved for
utility_statements ?).

Fixing the corner case where a query (with no SQL text) can be planned
without being parsed is another subject that should be resolved in
another thread.

This kind of query was ignored in pgss; it should be ignored in pgss with
planning counters too.

Any thoughts ?
Regards
PAscal







Re: Implementing Incremental View Maintenance

2020-02-28 Thread legrand legrand
>> thank you for patch v14, which fixes problems inherited from temporary
>> tables.
>> it seems that this ASSERT problem with the pgss patch is still present ;o(
>> 
>
> Sorry, but we are busy fixing and improving the IVM patches. I think fixing
> the assertion failure needs non-trivial changes to other parts of
> PostgreSQL.
> So we would like to work on the issue you reported after the pgss patch
> gets committed.

Imagine it will happen tomorrow !
You may say I'm a dreamer
But I'm not the only one
...
...





--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: Implementing Incremental View Maintenance

2020-02-27 Thread legrand legrand
> I have tried to use an other patch with yours:
> "Planning counters in pg_stat_statements (using pgss_store)"

> setting
> shared_preload_libraries='pg_stat_statements'
> pg_stat_statements.track=all
> restarting the cluster and creating the extension


> When trying following syntax:

> create table b1 (id integer, x numeric(10,3));
> create incremental materialized view mv1 as select id, count(*),sum(x) 
> from b1 group by id;
> insert into b1 values (1,1)
>
> I got an ASSERT FAILURE in pg_stat_statements.c
> on
>   Assert(query != NULL);
>
> comming from matview.c
>   refresh_matview_datafill(dest_old, query, queryEnv, NULL);
> or
>   refresh_matview_datafill(dest_new, query, queryEnv, NULL);
>
> If this (last) NULL field was replaced by the query text, 
> a comment or just "n/a",
> it would fix the problem.

> Could this be investigated ?

Hello,

Thank you for patch v14, which fixes the problems inherited from temporary
tables. It seems this ASSERT problem with the pgss patch is still present ;o(

Could we have a look ?

Thanks in advance
Regards
PAscal












Re: Allow auto_explain to log plans before queries are executed

2020-02-27 Thread legrand legrand
Hi,

that feature for dumping plans with auto explain is already available in
https://github.com/legrandlegrand/pg_stat_sql_plans

This is a hybrid extension combining auto_explain and pg_stat_statements,
adding a planid and tracking metrics even on error, ..., ...

With 
pg_stat_sql_plans.track_planid = true
pg_stat_sql_plans.explain = true
-->  it writes the explain plan to the log file after planning, and only once
per (queryid, planid)
   so there is no need for sampling

and with
pg_stat_sql_plans.track = 'all'
  --> the function pgssp_backend_queryid(pid) retrieves the (nested) queryid of
a stuck statement,
and permits retrieving its plan (by its queryid) from the logs.

Regards
PAscal








Re: Using stat collector for collecting long SQL

2020-02-27 Thread legrand legrand
Hi

> There is a often problem with taking source of long SQL strings. The
> pg_stat_activity field query is reduced to some short limit and is not too
> practical to increase this limit.

I thought this was an old story, since track_activity_query_size can now be
increased widely:
https://www.postgresql.org/message-id/flat/7b5ecc5a9991045e2f13c84e3047541d%40postgrespro.ru

[...]

> It can be base for implementation EXPLAIN PID ?
+1 for this concept, which would be very useful for diagnosing stuck
queries.

Until now the only proposal I saw regarding this was detailed in
https://www.postgresql.org/message-id/1582756552256-0.p...@n3.nabble.com
a prerequisite for being able to use the postgrespro/pg_query_state extension

Regards 
PAscal







Re: PoC: custom signal handler for extensions

2020-02-26 Thread legrand legrand
Hello Maksim,

reading about
https://www.postgresql-archive.org/Allow-auto-explain-to-log-plans-before-queries-are-executed-td6124646.html
makes me think (again) about your work and pg_query_state ...

Is there a chance you will restart work on this patch?
Regards
PAscal







Re: WIP: Aggregation push-down

2020-02-26 Thread legrand legrand
Antonin Houska-2 wrote
> Alvaro Herrera 

> alvherre@

>  wrote:
> 
>> This stuff seems very useful.  How come it sits unreviewed for so long?
> 
> I think the review is hard for people who are not interested in the
> planner
> very much. And as for further development, there are a few design
> decisions
> that can hardly be resolved without Tom Lane's comments. Right now I
> recall
> two problems: 1) is the way I currently store RelOptInfo for the grouped
> relations correct?, 2) how should we handle types for which logical
> equality
> does not imply physical (byte-wise) equality?
> 
> Fortunately it seems now that I'm not the only one who cares about 2), so
> this
> problem might be resolved soon:
> 
> https://www.postgresql.org/message-id/CAH2-Wzn3Ee49Gmxb7V1VJ3-AC8fWn-Fr8pfWQebHe8rYRxt5OQ%40mail.gmail.com
> 
> But 1) still remains.
> 
> -- 
> Antonin Houska
> Web: https://www.cybertec-postgresql.com

Hello
would "pgsql: Add equalimage B-Tree support functions." 
https://www.postgresql.org/message-id/e1j72ny-0002gi...@gemulon.postgresql.org

help for 2) ?

Regards
PAscal








Re: Implementing Incremental View Maintenance

2020-02-13 Thread legrand legrand
Yugo Nagata wrote
> Thank you for your suggestion! I agree that the feature to switch between
> normal materialized view and incrementally maintainable view is useful.
> We will add this to our ToDo list. Regarding its syntax, 
> I would not like to add new keyword like NONINCREMENTAL, so how about 
> the following
> 
>  ALTER MATERIALIZED VIEW ... SET {WITH | WITHOUT} INCREMENTAL REFRESH
> 
> although this is just a idea and we will need discussion on it.

Thanks I will follow that discussion on GitHub 
https://github.com/sraoss/pgsql-ivm/issues/79

Regards
PAscal








Re: Implementing Incremental View Maintenance

2020-02-11 Thread legrand legrand
Takuma Hoshiai wrote
> Hi, 
> 
> Attached is the latest patch (v12) to add support for Incremental
> Materialized View Maintenance (IVM).
> It is possible to apply to current latest master branch.
> 
> Differences from the previous patch (v11) include:
> * support executing REFRESH MATERIALIZED VIEW command with IVM.
> * support unscannable state by WITH NO DATA option.
> * add a check for LIMIT/OFFSET at creating an IMMV
> 
>  If REFRESH is executed for IMMV (incremental maintainable materialized
> view),  its contents is re-calculated as same as usual materialized views
> (full REFRESH). Although IMMV is basically keeping up-to-date data,
> rounding errors can be accumulated in aggregated value in some cases, for
> example, if the view contains sum/avg on float type columns. Running
> REFRESH command on IMMV will resolve this. Also, WITH NO DATA option
> allows to make IMMV unscannable. At that time,  IVM triggers are dropped
> from IMMV because these become unneeded and useless. 
> 
> [...]

Hello,

regarding syntax REFRESH MATERIALIZED VIEW x WITH NO DATA

I understand that triggers are removed from the source tables, transforming 
the INCREMENTAL MATERIALIZED VIEW into a(n unscannable) MATERIALIZED VIEW.

postgres=# refresh materialized view imv with no data;
REFRESH MATERIALIZED VIEW
postgres=# select * from imv;
ERROR:  materialized view "imv" has not been populated
HINT:  Use the REFRESH MATERIALIZED VIEW command.

This operation seems to me more of an ALTER command than a REFRESH one.

Wouldn't the syntax
ALTER MATERIALIZED VIEW [ IF EXISTS ] name
SET WITH NO DATA
or
SET WITHOUT DATA
be better ?

Continuing in this direction, did you ever think about another feature
like:
ALTER MATERIALIZED VIEW [ IF EXISTS ] name
SET { NOINCREMENTAL }
or even
SET { NOINCREMENTAL | INCREMENTAL | INCREMENTAL CONCURRENTLY  }

that would permit switching between those modes and would keep frozen data
available in the materialized view during heavy operations on the source
tables?

Regards
PAscal 







Re: POC: rational number type (fractions)

2020-02-08 Thread legrand legrand
Hello,
It seems you are not the first to be interested in such a feature.

There was a similar extension used in "incremental view maintenance"
testing:
https://github.com/nuko-yokohama/pg_fraction

I haven't tried it myself.

Regards
PAscal 







Re: WIP: Aggregation push-down

2020-01-31 Thread legrand legrand
Hello,

Thank you for this great feature !
I hope this will be reviewed/validated soon ;o)

Just a comment:
enable_agg_pushdown = true
isn't displayed by the EXPLAIN (SETTINGS) syntax.


The following modification seems to fix that: 

src/backend/utils/misc/guc.c

	{
		{"enable_agg_pushdown", PGC_USERSET, QUERY_TUNING_METHOD,
			gettext_noop("Enables aggregation push-down."),
			NULL,
			GUC_EXPLAIN   /* --- line added --- */
		},
		&enable_agg_pushdown,
		false,
		NULL, NULL, NULL
	},


then
postgres=# set enable_agg_pushdown = true;
SET
postgres=# explain (settings) select 1;
QUERY PLAN
--
 Result  (cost=0.00..0.01 rows=1 width=4)
 Settings: enable_agg_pushdown = 'on'
(2 rows)


Regards
PAscal







Re: Planning counters in pg_stat_statements (using pgss_store)

2020-01-18 Thread legrand legrand
Hi Julien,

bot is still unhappy
https://travis-ci.org/postgresql-cfbot/postgresql/builds/638701399

portalcmds.c: In function ‘PerformCursorOpen’:
portalcmds.c:93:7: error: ‘queryString’ may be used uninitialized in this
function [-Werror=maybe-uninitialized]
  plan = pg_plan_query(query, queryString, cstmt->options, params);
   ^
portalcmds.c:50:8: note: ‘queryString’ was declared here
  char*queryString;
^
cc1: all warnings being treated as errors
: recipe for target 'portalcmds.o' failed
make[3]: *** [portalcmds.o] Error 1
make[3]: Leaving directory
'/home/travis/build/postgresql-cfbot/postgresql/src/backend/commands'
common.mk:39: recipe for target 'commands-recursive' failed
make[2]: *** [commands-recursive] Error 2
make[2]: *** Waiting for unfinished jobs

regards
PAscal







Re: postgresql-13devel initDB Running in debug mode.

2020-01-18 Thread legrand legrand
Tom Lane-2 wrote
> legrand legrand 

> legrand_legrand@

>  writes:
>> after building devel snapshot from 2020-01-17 with msys,
>> initDB generates a lot of additional informations when launched:
>> [ debug output snipped ]
>> Is that the expected behavior, or just a temporary test ?
> 
> It'd be the expected behavior if you'd given a -d switch to initdb.
> 
>   regards, tom lane

So yes, it's me:
-d directory
instead of
-D directory

and the doc is perfect:
-d
--debug

Print debugging output from the bootstrap backend and a few other
messages of lesser interest for the general public. The bootstrap backend is
the program initdb uses to create the catalog tables. This option generates
a tremendous amount of extremely boring output.

Regards
PAscal







postgresql-13devel initDB Running in debug mode.

2020-01-18 Thread legrand legrand
Hello,
after building the devel snapshot from 2020-01-17 with msys,
initDB generates a lot of additional information when launched:

VERSION=13devel
PGDATA=../data
share_path=C:/msys64/usr/local/pgsql/share
PGPATH=C:/msys64/usr/local/pgsql/bin
POSTGRES_SUPERUSERNAME=lemoyp
POSTGRES_BKI=C:/msys64/usr/local/pgsql/share/postgres.bki
POSTGRES_DESCR=C:/msys64/usr/local/pgsql/share/postgres.description
POSTGRES_SHDESCR=C:/msys64/usr/local/pgsql/share/postgres.shdescription
POSTGRESQL_CONF_SAMPLE=C:/msys64/usr/local/pgsql/share/postgresql.conf.sample
PG_HBA_SAMPLE=C:/msys64/usr/local/pgsql/share/pg_hba.conf.sample
PG_IDENT_SAMPLE=C:/msys64/usr/local/pgsql/share/pg_ident.conf.sample
2020-01-18 00:19:37.407 CET [10152] DEBUG:  invoking
IpcMemoryCreate(size=149061632)
2020-01-18 00:19:37.408 CET [10152] DEBUG:  could not enable Lock Pages in
Memory user right
2020-01-18 00:19:37.408 CET [10152] HINT:  Assign Lock Pages in Memory user
right to the Windows user account which runs PostgreSQL.
2020-01-18 00:19:37.408 CET [10152] DEBUG:  disabling huge pages
2020-01-18 00:19:37.414 CET [10152] DEBUG:  SlruScanDirectory invoking
callback on pg_notify/
2020-01-18 00:19:37.414 CET [10152] DEBUG:  removing file "pg_notify/"
2020-01-18 00:19:37.416 CET [10152] DEBUG:  dynamic shared memory system
will support 308 segments
2020-01-18 00:19:37.416 CET [10152] DEBUG:  created dynamic shared memory
control segment 718036776 (7408 bytes)
2020-01-18 00:19:37.416 CET [10152] DEBUG:  transaction ID wrap limit is
2147483650, limited by database with OID 1
2020-01-18 00:19:37.416 CET [10152] DEBUG:  MultiXactId wrap limit is
2147483648, limited by database with OID 1
2020-01-18 00:19:37.416 CET [10152] DEBUG:  creating and filling new WAL
file
2020-01-18 00:19:37.446 CET [10152] DEBUG:  could not remove file
"pg_wal/00010001": No such file or directory
2020-01-18 00:19:37.454 CET [10152] DEBUG:  mapped win32 error code 5 to 13
2020-01-18 00:19:37.455 CET [10152] DEBUG:  done creating and filling new
WAL file
2020-01-18 00:19:37.467 CET [10152] DEBUG:  InitPostgres
2020-01-18 00:19:37.467 CET [10152] DEBUG:  my backend ID is 1
2020-01-18 00:19:37.467 CET [10152] NOTICE:  database system was shut down
at 2020-01-18 00:19:37 CET
2020-01-18 00:19:37.467 CET [10152] DEBUG:  mapped win32 error code 2 to 2
2020-01-18 00:19:37.471 CET [10152] DEBUG:  checkpoint record is at
0/128
2020-01-18 00:19:37.471 CET [10152] DEBUG:  redo record is at 0/128;
shutdown true
...

Is that the expected behavior, or just a temporary test ?
Regards
PAscal







Re: pg13 PGDLLIMPORT list

2020-01-18 Thread legrand legrand
Michael Paquier-2 wrote
> On Fri, Jan 17, 2020 at 03:07:48PM -0700, legrand legrand wrote:
> [...]
> 
> No objections from me to add both to what's imported.  Do you have a 
> specific use-case in mind for an extension on Windows?  Just
> wondering..
> --
> Michael
> 
> signature.asc (849 bytes)
> https://www.postgresql-archive.org/attachment/6119761/0/signature.asc;

No specific use-case, just the need to test features (IVM, push agg to base
relations and joins, ...) on a professional laptop running Windows 10, for a
nomad app that collects performance metrics on Oracle
databases.

Regards
PAscal







pg13 PGDLLIMPORT list

2020-01-17 Thread legrand legrand
Hello,
would it be possible to add PGDLLIMPORT to permit building the following
extensions on Windows?

pg_stat_sql_plans:
src/include/pgstat.h
extern PGDLLIMPORT bool pgstat_track_activities;

pg_background:
src/include/storage/proc.h
extern PGDLLIMPORT int  StatementTimeout;

Thanks in advance
Regards
PAscal







Re: Implementing Incremental View Maintenance

2020-01-17 Thread legrand legrand
Hello,

It seems that patch v11 doesn't apply anymore.
The problem with "scanRTEForColumn" is maybe caused by this change:

https://git.postgresql.org/pg/commitdiff/b541e9accb28c90656388a3f827ca3a68dd2a308

Regards
PAscal







Re: WIP: System Versioned Temporal Table

2020-01-05 Thread legrand legrand
Vik Fearing-4 wrote
> On 05/01/2020 16:01, legrand legrand wrote:
> 
> 
> No, that is incorrect.  The standard syntax is:
> 
> 
>     FROM tablename FOR SYSTEM_TIME AS OF '...'
> 
>     FROM tablename FOR SYSTEM_TIME BETWEEN '...' AND '...'
> 
>     FROM tablename FOR SYSTEM_TIME FROM '...' TO '...'
> 
> -- 
> 
> Vik Fearing

Oops, I re-read the links and docs and I'm wrong.
Sorry for the noise.

Regards
PAscal







Re: Planning counters in pg_stat_statements (using pgss_store)

2020-01-05 Thread legrand legrand
Julien Rouhaud wrote
> On Sun, Jan 5, 2020 at 4:11 PM legrand legrand
> 

> legrand_legrand@

>  wrote:
>>
>> Hi Julien,
>>
>> I would like to create a link with
>> https://www.postgresql.org/message-id/

> 1577490124579-0.post@.nabble

>>
>> where we met an ASSET FAILURE because query text was not initialized ...
>>
>> The question raised is:
>>
>> - should query text be always provided
>> or
>> - if not how to deal that case (in pgss).
> 
> I'd think that since the query text was until now always provided,
> there's no reason why this patch should change that.  That being said,
> there has been other concerns raised wrt. temporary tables in the IVM
> patchset, so ISTM that there might be important architectural changes
> upcoming, so having to deal with this case in pgss is not rushed
> (especially since handling that in pgss would be trivial), and can
> help to catch issue with the query text pasing.

IVM revealed that ASSERT,
but IVM works fine with pg_stat_statements.track_planning = off.
There may be other parts of PostgreSQL that would have worked fine as
well.

This means 2 things:
- there is a (little) risk of meeting other assert failures when using planning
counters in pgss,
- we have an easy workaround to fix it (disabling track_planning).

But I would have preferred this new feature to work the same way with or
without track_planning activated ;o(







Re: Planning counters in pg_stat_statements (using pgss_store)

2020-01-05 Thread legrand legrand
Hi Julien,

I would like to create a link with
https://www.postgresql.org/message-id/1577490124579-0.p...@n3.nabble.com

where we met an ASSERT failure because the query text was not initialized ...

The question raised is:

- should query text be always provided 
or 
- if not how to deal that case (in pgss).

Regards
PAscal







Re: WIP: System Versioned Temporal Table

2020-01-05 Thread legrand legrand
Vik Fearing-4 wrote
> On 05/01/2020 11:16, Surafel Temesgen wrote:
>>
>>
>> On Fri, Jan 3, 2020 at 4:22 PM Vik Fearing
>> 

> vik.fearing@

>  mailto:

> vik.fearing@

> > wrote:
>>
> 
> [...]
> 
> You only test FROM-TO and with a really wide interval.  There are no
> tests for AS OF and no tests for BETWEEN-AND.
> 
> 
> As for the syntax, you have:
> 
> 
> select a from for stest0 system_time from '2000-01-01 00:00:00.0' to
> 'infinity' ORDER BY a;
> 
> 
> when you should have:
> 
> 
> select a from stest0 for system_time from '2000-01-01 00:00:00.0' to
> 'infinity' ORDER BY a;
> 
> 
> That is, the FOR should be on the other side of the table name.
> 
> [...] 
> 
> Vik Fearing

Hello,

I thought the standard syntax was "AS OF SYSTEM TIME",
as discussed here
https://www.postgresql.org/message-id/flat/A254CDC3-D308-4822-8928-8CC584E0CC71%40elusive.cx#06c5dbffd5cfb9a20cdeec7a54dc657f
which also explains how to parse such a syntax.

Regards
PAscal







Incremental View Maintenance: ERROR: out of shared memory

2019-12-28 Thread legrand legrand
Hello 
here is an unexpected error found while testing IVM v11 patches

create table b1 (id integer, x numeric(10,3));
create incremental materialized view mv1 
as select id, count(*),sum(x) from b1 group by id;

do $$ 
declare 
i integer;
begin 
for i in 1..1 
loop 
insert into b1 values (1,1); 
end loop; 
end;
$$
;

ERROR:  out of shared memory
HINT:  You might need to increase max_locks_per_transaction.
CONTEXT:  SQL statement "DROP TABLE pg_temp_3.pg_temp_66154"
SQL statement "insert into b1 values (1,1)"
PL/pgSQL function inline_code_block line 1 at SQL statement

Regards
PAscal







Re: Implementing Incremental View Maintenance

2019-12-27 Thread legrand legrand
Hello,
Thank you for this patch.

I have tried to use another patch with yours:
"Planning counters in pg_stat_statements (using pgss_store)"
https://www.postgresql.org/message-id/CAOBaU_Y12bn0tOdN9RMBZn29bfYYH11b2CwKO1RO7dX9fQ3aZA%40mail.gmail.com

setting
shared_preload_libraries='pg_stat_statements'
pg_stat_statements.track=all
and creating the extension


When trying following syntax:

create table b1 (id integer, x numeric(10,3));
create incremental materialized view mv1 as select id, count(*),sum(x) from
b1 group by id;
insert into b1 values (1,1)

I got an ASSERT FAILURE in pg_stat_statements.c
on
Assert(query != NULL);

comming from matview.c
refresh_matview_datafill(dest_old, query, queryEnv, NULL);
or
refresh_matview_datafill(dest_new, query, queryEnv, NULL);


If this (last) NULL field was replaced by the query text, a comment or just
"n/a",
it would fix the problem.

Could this be investigated ?

Thanks in advance
Regards
PAscal








Re: Implementing Incremental View Maintenance

2019-12-24 Thread legrand legrand
Yugo Nagata wrote
> On Mon, 23 Dec 2019 03:41:18 -0700 (MST)
> legrand legrand 

> legrand_legrand@

>  wrote:
> 
> [ ...]
> 
>> I would even
>> prefer a common "table" shared between all sessions like GLOBAL TEMPORARY
>> TABLE (or something similar) as described here:
>> https://www.postgresql.org/message-id/flat/157703426606.1198.2452090605041230054.pgcf%40coridan.postgresql.org#331e8344bbae904350af161fb43a0aa6
> 
> Although I have not looked into this thread, this may be help if this is
> implemented. However, it would be still necessary to truncate the table
> before the view maintenance because such tables always exist and can be
> accessed and modified by any users.
> 
> -- 
> Yugo Nagata 

> nagata@.co

> 

For information, in this table data is PRIVATE to each session; it can be
purged on the ON COMMIT event and disappears at SESSION end.
Yes, this feature would only be useful if it's implemented. And you are right
that some data has to be deleted on the ON STATEMENT event (I'm not sure
whether TRUNCATE is global or session-specific in this situation).









Re: Implementing Incremental View Maintenance

2019-12-23 Thread legrand legrand
Hello,
regarding my initial post:

> For each insert into a base table there are 3 statements:
> - ANALYZE pg_temp_3.pg_temp_81976
> - WITH updt AS (  UPDATE public.mv1 AS mv SET __ivm_count__ = ...
> - DROP TABLE pg_temp_3.pg_temp_81976

For me there were 3 points to discuss:
- create/drop tables may bloat dictionary tables
- create/drop tables prevent "WITH updt ..." from being shared (with some
plan caching)
- it generates many lines in pg_stat_statements

In fact I like the idea of a table created per session, but I would even
prefer a common "table" shared between all sessions like GLOBAL TEMPORARY
TABLE (or something similar) as described here:
https://www.postgresql.org/message-id/flat/157703426606.1198.2452090605041230054.pgcf%40coridan.postgresql.org#331e8344bbae904350af161fb43a0aa6

That would remove the drop/create issue, permit reducing planning time for
"WITH updt ..." statements (as done today in PL/pgSQL triggers), and would
fix the pgss "bloat" issue.

That way, the "cost" of the immediate refresh approach would be easier to
accept ;o)

Regards
PAscal







Re: Implementing Incremental View Maintenance

2019-12-22 Thread legrand legrand
Hello,

First of all, many thanks for this great feature,
replacing so many triggers with such a simple syntax ;o)

I was wondering about performance and had a look
at pg_stat_statements (with track=all) with IVM_v9.patch.

For each insert into a base table there are 3 statements:
- ANALYZE pg_temp_3.pg_temp_81976
- WITH updt AS (  UPDATE public.mv1 AS mv SET __ivm_count__ = ...
- DROP TABLE pg_temp_3.pg_temp_81976

It generates a lot of lines in pg_stat_statements with calls = 1.
Those statements cannot be shared because the temp table is dropped each
time.

Is there a plan to change this ?

Many Thanks again

Regards
PAscal







Re: Columns correlation and adaptive query optimization

2019-10-14 Thread legrand legrand
Hello Konstantin,

What you have proposed regarding join_selectivity and multicolumn statistics
is very good news!

Regarding your auto_explain modification, maybe an "advisor" mode would also
be helpful (with auto_explain_add_statistics_threshold = -1, for example).
This would allow tracking which missing statistics should be tested (manually
or in another environment).

In my view this advice should be an option of the EXPLAIN command, which
would also permit the auto_explain module to propose a "learning" phase.

Regards
PAscal







Re: How to install login_hook in Postgres 10.5

2019-10-05 Thread legrand legrand
pavan95 wrote
> Hello Community,
> 
> While I was searching for logon trigger in postgres similar to that of
> Oracle, I came across "login_hook", which can be installed as a  Postgres
> database extension to mimic a logon trigger.
> 
> But I tried to install but failed. Error is that it could not find its .so
> file.  Could you please help me in installing this login_hook ??
> 
> Looking forward to hear from you. Any suggestions would be of great help!
> 
> Thanks & Regards,
> Pavan
> 
> 
> 
> --
> Sent from:
> http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

The issue
ERROR:  could not access file "login_hook.so": No such file or directory
has been fixed, see:
https://github.com/splendiddata/login_hook/issues/1

Regards
PAscal







Re: Hooks for session start and end, take two

2019-09-27 Thread legrand legrand
Michael Paquier-2 wrote
> On Thu, Sep 26, 2019 at 09:57:57AM -0700, legrand legrand wrote:
>> Does that mean that all processes seen in pg_stat_activity like
>> - autovacuum launcher
>> - logical replication launcher
>> - background writer
>> - checkpointer
>> - walwriter
>> ... 
>> - Parallel worker
>> are available with that hook (it seems not) ?
> 
> All processes using PostgresMain() for their startup take this code
> path like WAL senders and normal backend sessions, but not things
> going through StartChildProcess() (WAL receiver, bgwriter, etc.) or
> other processes like autovacuum processes which use a different start
> code path.

OK I confirm:
- "client backend" appears at session start and end hook,
- "autovacuum worker" and "pg_background" only appears at session end hook
  (backend_start can be retreived from pg_stat_activity),
- "parallel workers" are not visible at all
  (because extension cannot assign XIDs during a parallel operation)

All seems fine to me.

Regards
PAscal










Re: Hooks for session start and end, take two

2019-09-26 Thread legrand legrand
Hello,

Thank you for the work done on this subject.
After starting to play with it, I have a question and a remark:

> - previous hook calls were only called for normal backends, which was
> incorrect as we define the backend so as we apply no backend-related
> filtering for the hook. 

Does that mean that all processes seen in pg_stat_activity like
- autovacuum launcher
- logical replication launcher
- background writer
- checkpointer
- walwriter
... 
- Parallel worker
are available with that hook (it seems not) ?

The patch refers to a 2017 copyright; that's all I've found so far ;o)

Regards
PAscal







Re: Proposal: Better query optimization for "NOT IN" clause

2019-09-23 Thread legrand legrand
Hello,

Just for information, there is some work on including this in
core that may interest you ;o)

see "NOT IN subquery optimization"
https://www.postgresql.org/message-id/flat/1550706289606-0.post%40n3.nabble.com

commitfest entry: https://commitfest.postgresql.org/24/2023/

and "Converting NOT IN to anti-joins during planning"
https://www.postgresql.org/message-id/flat/CAKJS1f82pqjqe3WT9_xREmXyG20aOkHc-XqkKZG_yMA7JVJ3Tw%40mail.gmail.com

commitfest entry: https://commitfest.postgresql.org/24/2020/

Regards
PAscal







Re: [survey] New "Stable" QueryId based on normalized query text

2019-08-12 Thread legrand legrand
my understanding is

* pg_stat_statements.track = 'none' or 'top' (default) or 'all' 
to make queryId optionally computed

* a new GUC: pg_stat_statements.queryid_based = 'oids' (default) or 'names'
or 'fullnames'
to choose the queryid computation algorithm

Am I right?








Re: [survey] New "Stable" QueryId based on normalized query text

2019-08-10 Thread legrand legrand
Hi Jim,

It's never too late, as nothing has been concluded from that survey ;o)

For information, I thought it would be possible to get a more stable
queryId by hashing relation names or fully qualified names.

With the support of Julien Rouhaud, I tested with this kind of code:

case RTE_RELATION:
	if (pgss_queryid_oid)
	{
		APP_JUMB(rte->relid);
	}
	else
	{
		rel = RelationIdGetRelation(rte->relid);
		APP_JUMB_STRING(RelationGetRelationName(rel));
		APP_JUMB_STRING(get_namespace_name(get_rel_namespace(rte->relid)));
		RelationClose(rel);
	}

thinking that 3 hash options would be interesting in pgss:
1- actual OIDs
2- relation names only (for databases WITHOUT distinct schemas containing
the same tables)
3- fully qualified names schema.relname (for databases WITH distinct schemas
containing the same tables)

but performance was quite bad (this was a few months ago, but I remember
about a 1-5% decrease).
I also remember that this was not portable between distinct pg versions
11/12, and I'm also not sure it was stable between Windows / Linux ports ...

So I stopped here ... Maybe it's time to test this alternative more deeply
(getting fully qualified name hashes in one call), knowing that such
transformations will have to be done for all object types (not only
relations)?

I'm ready to continue testing, as it seems the least impacting solution that
keeps the current pgss ...

If this doesn't work, then trying a normalized query text (associated with
the search_path) would be the other alternative, but the impact on the
current pgss would be higher ...

Regards
PAscal









Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2019-08-05 Thread legrand legrand
Kyotaro Horiguchi-4 wrote
> At Sun, 4 Aug 2019 00:04:01 -0700 (MST), legrand legrand 

> legrand_legrand@

>  wrote in <

> 1564902241482-0.post@.nabble

>>
>> >  However having the nested queryid in 
>> > pg_stat_activity would be convenient to track
>> > what is a long stored functions currently doing.
>> 
>> +1
>> 
>> And  this could permit to get wait event sampling per queryid when
>> pg_stat_statements.track = all
> 
> I'm strongly on this side emotionally, but also I'm on Tom and
> Andres's side that exposing querid that way is not the right
> thing.
> 
> Doing that means we don't need exact correspondence between
> top-level query and queryId (in nested or multistatement queries)
> in this patch.  pg_stat_statements will allow us to do the same
> thing by having additional uint64[MaxBackends] array in
> pgssSharedState, instead of expanding PgBackendStatus array in
> core by the same size.
> 
> regards.
> 
> -- 
> Kyotaro Horiguchi
> NTT Open Source Software Center

Hi Kyotaro,
Thank you for this answer.
What you propose here is already available
inside the pg_stat_sql_plans extension (a derivative of
pg_stat_statements and pg_store_plans),
and I'm used to this queryid behavior with top-level
queries...
My emotion was high but I will accept it!
Regards
PAscal








Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2019-08-04 Thread legrand legrand
>  However having the nested queryid in 
> pg_stat_activity would be convenient to track
> what is a long stored functions currently doing.

+1

And this would make it possible to get wait event sampling per queryid when
pg_stat_statements.track = all.

Regards 
PAscal







Re: How to install login_hook in Postgres 10.5

2019-08-01 Thread legrand legrand
Hello,

shouldn't we update the associated commitfest entry
https://commitfest.postgresql.org/15/1318/

to give it a chance to be included in PG13?

Regards
PAscal







Re: Adaptive query optimization

2019-06-12 Thread legrand legrand
>>I tried to create much simpler version of AQO based on auto_explain 
>>extension.
>>This extension provide all necessary infrastructure to analyze 
>>statements with long execution time.
>>I have added two new modes to auto_explain:
>>1. Auto generation of multicolumn statistics for variables using in 
>>clauses with large estimation error.

>Interesting! I probably wouldn't consider this part of adaptive query
>optimization, but it probably makes sense to make it part of this. I
>wonder if we might improve this to also suggest "missing" indexes? 

Shouldn't this be extended to adjust the default_statistics_target
configuration variable, or to act on a column-by-column basis by setting the
per-column statistics target with ALTER TABLE ... ALTER COLUMN ... SET
STATISTICS?

Regards
PAscal







Re: Adaptive query optimization

2019-06-11 Thread legrand legrand
Hello,
this seems very interesting and makes me think about 2 other projects:
- https://github.com/trustly/pg_badplan
- https://github.com/ossc-db/pg_plan_advsr

As I understand all this, there are actually 3 steps:
- compare actual / estimated rows
- suggest some statistics gathering modifications
- store optimized plans (or optimized stats) for reuse

I really like the "advisor" idea, permitting to identify where statistics
are wrong
and how to fix them with the ANALYZE command.
Is there a chance that some future optimizer / statistics improvements
make those "statistics advices" good enough to get good plans (without having
to store live stats
or optimized plans)?

Thanks in advance
Regards
PAscal







Re: How to install login_hook in Postgres 10.5

2019-05-13 Thread legrand legrand
Hello,

This extension https://github.com/splendiddata/login_hook
seems very interesting !
But I didn't test it myself, and the best place to ask for
support is probably there:
https://github.com/splendiddata/login_hook/issues

For information there is something equivalent in core
"[PATCH] A hook for session start"
https://www.postgresql.org/message-id/flat/20171103164305.1f952c0f%40asp437-24-g082ur#d897fd6ec551d242493815312de5f5d5

that was eventually committed
"pgsql: Add hooks for session start and session end"
https://www.postgresql.org/message-id/flat/575d6fa2-78d0-4456-8600-302fc35b2591%40dunslane.net#0819e315c6e44c49a36c69080cab644d

but was finally reverted because it didn't pass the installcheck tests
...

Maybe you are able to patch your PG installation;
it would be a solution of choice (there is a nice example
of an extension included).

Showing interest in this may also help to get this feature back ;o)

Regards
PAscal








RE: Re: Logging the feature of SQL-level read/write commits

2019-05-05 Thread legrand legrand
Hi,

good point!
wal2json seems to correspond to your needs;
it is primarily designed for Change Data Capture,
which can generate a (very) large volume of logs.

You didn't tell us much about your use case ...
and maybe, if the number of data modifications
is not too big and the number of tables to track is
limited, a history table fed with (after insert,
after update, before delete)
triggers is another (simpler) solution?

Regards
PAscal







Re: Logging the feature of SQL-level read/write commits

2019-05-05 Thread legrand legrand
Hello,

maybe you can find more information regarding WAL concepts in
Write Ahead Logging — WAL
http://www.interdb.jp/pg/pgsql09.html

It seems very complicated to change the WAL format ...

Maybe there are other solutions to answer your need;
I found many interesting solutions in the postgres archives searching
for the words "flashback", "timetravel", "tablelog", ...

My preferred one is "AS OF queries"
https://www.postgresql.org/message-id/flat/78aadf6b-86d4-21b9-9c2a-51f1efb8a499%40postgrespro.ru

and more specifically the syntax
VERSIONS BETWEEN SYSTEM TIME ... AND ...
which would make it possible to get the history of modifications for some rows
and help fix the current data

Regards
PAscal







RE: minimizing pg_stat_statements performance overhead

2019-04-08 Thread legrand legrand
CF entry created 
https://commitfest.postgresql.org/23/2092/

Regards
PAscal







Re: minimizing pg_stat_statements performance overhead

2019-04-03 Thread legrand legrand
Robert Haas wrote
> On Tue, Apr 2, 2019 at 5:37 AM Christoph Berg wrote:
>> Re: Raymond Martin 2019-04-01
>> > Thanks again Fabien. I am attaching the patch to this email in the hope
>> of getting it approved during the next commit fest.
>>
>> you sent the patch as UTF-16, could you re-send it as plain ascii?
> 
> One thing that needs some thought here is what happens if the value of
> pgss_enabled() changes.  For example we don't want a situation where
> if the value changes from off to on between one stage of processing
> and another, the server crashes.
> 
> I don't know whether that's a risk here or not; it's just something to
> think about.
> 
> -- 
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

Hi, here is a simple test where I commented out that line
in pgss_post_parse_analyze
to force the return; (as if pgss_enabled() was disabled)
but kept pgss_enabled() everywhere else:

/* Safety check... */
//  if (!pgss || !pgss_hash || !pgss_enabled())
return;

This works without crash as you can see here after:


postgres=# select pg_stat_statements_reset();
 pg_stat_statements_reset
--------------------------

(1 row)


postgres=# show pg_stat_statements.track;
 pg_stat_statements.track
--------------------------
 all
(1 row)


postgres=# create table a(id int);
CREATE TABLE


postgres=# select * from a where id=1;
 id
----
(0 rows)


postgres=# select queryid,query,calls from pg_stat_statements;
       queryid       |             query             | calls
---------------------+-------------------------------+-------
 1033669194118974675 | show pg_stat_statements.track |     1
 3022461129400094363 | create table a(id int)        |     1
(2 rows)

regards
PAscal









RE: Planning counters in pg_stat_statements (using pgss_store)

2019-04-02 Thread legrand legrand
Hi,

>>
>> case  avg_tps  pct_diff
>> 0     89 278   --
>> 1     88 745   0,6%
>> 2     88 282   1,1%
>> 3     86 660   2,9%
>>
>> This means that even in this extrem test case, the worst degradation is less
>> than 3%
>> (this overhead can be removed using pg_stat_statements.track_planning guc)

> Is the difference between 2 and 3 the extraneous pgss_store call to
> always store the query text if planner hook doesn't have access to the
> query text?

Yes it is,
but I agree it seems a big gap (1,8%) compared to the difference between 1 and
2 (0,5%).
Maybe this is just measurement "noise" ...

Regards
PAscal


Re: DWIM mode for psql

2019-04-01 Thread legrand legrand
Andreas Karlsson wrote
> On 3/31/19 10:52 PM, Thomas Munro wrote:> Building on the excellent work 
> begun by commit e529cd4ffa60, I would
>> like to propose a do-what-I-mean mode for psql.  Please find a POC
>> patch attached.  It works like this:
>> 
>> postgres=# select datnaam from pg_database where ooid = 12917;
>> ERROR:  column "datnaam" does not exist
>> LINE 1: select datnaam from pg_database where ooid = 12917;
>> ^
>> HINT:  Perhaps you meant to reference the column "pg_database.datname".
>> postgres=# YES
>>   datname
>> --
>>   postgres
>> (1 row)
> 
> I think it is potentially confusing that YES and NO does not look like 
> other psql commands. Let's pick something which is more in line with 
> existing commands like \y and \n.
> 
> Andreas

+1
Regards
>-)))°>







Re: Planning counters in pg_stat_statements (using pgss_store)

2019-04-01 Thread legrand legrand
Hi,

I have played with this patch, it works fine.

Note: the last position of the "new" total_time column is confusing:
+CREATE VIEW pg_stat_statements AS
+  SELECT *, total_plan_time + total_exec_time AS total_time
+FROM pg_stat_statements(true);

I wanted to perform some benchmarks comparing those 4 cases:
0 - no pgss,
1 - original pgss (no planning counter and 1 access to the pgss hash),
2 - pgss reading the query text in the planner hook (2 accesses to the pgss hash),
3 - pgss reading the query text in the parse hook (3 accesses to the pgss hash)

It seems that the difference is so tiny that there was no other way than
running a minimal "Select 1;" statement ...

./pgbench -c 10 -j 5 -t 50 -f select1stmt.sql postgres

case  avg_tps  pct_diff
0     89 278   --
1     88 745   0,6%
2     88 282   1,1%
3     86 660   2,9%

This means that even in this extreme test case, the worst degradation is less
than 3%
(this overhead can be removed using the pg_stat_statements.track_planning GUC).

notes:
- PostgreSQL 12devel on x86_64-w64-mingw32, compiled by gcc.exe
(x86_64-win32-seh-rev1, Built by MinGW-W64 project) 7.2.0, 64-bit,
- cpu usage was less than 95%,
- avg_tps is based on 3 runs,
- there was a wait of around 1 minute between each run to keep
  computer temperature and fan usage low.

Regards
PAscal







RE: minimizing pg_stat_statements performance overhead

2019-04-01 Thread legrand legrand
Hi,

it seems that your patch is not readable.
If you want it to be included in a commitfest, you should add it yourself
at https://commitfest.postgresql.org/

I'm not sure there is any room left in the PG12 commitfest.

Regards
PAscal







RE: minimizing pg_stat_statements performance overhead

2019-03-27 Thread legrand legrand
my test case:

drop table a;
create table a ();

DO
$$
DECLARE
i int;
BEGIN
for i in 1..20
loop
execute 'alter table a add column a'||i::text||' int';
end loop;
END
$$;

select  pg_stat_statements_reset();
set pg_stat_statements.track='none';

DO
$$
DECLARE
i int;
j int;
BEGIN
for j in 1..20
loop
for i in 1..20
loop
execute 'select a'||i::text||',a'||j::text||' from a where 1=2';
end loop;
end loop;
END
$$;








Re: Planning counters in pg_stat_statements (using pgss_store)

2019-03-27 Thread legrand legrand
>> - trailing whitespaces and comments wider than 80 characters
>>  not fixed

> why?  In case it's not clear, I'm talking about the .c file, not the
> regression tests.

I work on a poor msys install on Windows 7, where perl is broken ;o(
So no pgindent is available.
Will fix that later, or as soon as I get a pgindent diff.

>> - "Assert(planning_time > 0 && total_time > 0);"
>>  moved at the beginning of pgss_store

> Have you tried to actually compile postgres and pg_stat_statements
> with --enable-cassert?  This test can *never* be true, since you
> either provide the planning time or the execution time or neither.  As
> I said in my previous mail, adding a parameter to say which counter
> you're updating, instead of adding another counter that's mutually
> exclusive with the other would make everything clearer.

Yes, this "assert" is useless as is ... I'll remove it.
I understand your proposal of refactoring pgss_store, but I don't have
much time available now ... and I would like to check that performance
is not broken before any other modification ...

Regards
PAscal












RE: minimizing pg_stat_statements performance overhead

2019-03-27 Thread legrand legrand
Your fix is probably the best one.
Maybe this could be considered a bug and backported to previous versions
...

Regards
PAscal









Re: minimizing pg_stat_statements performance overhead

2019-03-27 Thread legrand legrand
Hi,
the part that hurts in terms of performance is:

if (jstate.clocations_count > 0)
pgss_store(pstate->p_sourcetext,
   query->queryId,
   query->stmt_location,
   query->stmt_len,
   0,
   0,
   0,
   NULL,
   &jstate);

that writes the query text to disk when it has at least one parameter ...
Computing the QueryId should stay very cheap, and it can be very useful when
used in conjunction with
https://www.postgresql-archive.org/Feature-improvement-can-we-add-queryId-for-pg-catalog-pg-stat-activity-view-td6077275.html#a6077602
for wait event sampling.

I would propose to fix this by
if (jstate.clocations_count > 0 && pgss_enabled())
pgss_store(pstate->p_sourcetext,
...

Regards
PAscal








RE: Planning counters in pg_stat_statements (using pgss_store)

2019-03-26 Thread legrand legrand
here is a new version:

- "track_planning" GUC added
 to keep the previous behavior unchanged
- columns names have been changed / added:
 total_plan_time, total_exec_time, total_time
- trailing whitespaces and comments wider than 80 characters
 not fixed
- "if (jstate.clocations_count > 0) pgss_store(pstate->p_sourcetext,..."
 has been reverted
- expose query text to the planner
 won't fix (out of my knowledge)
- "Assert(planning_time > 0 && total_time > 0);"
 moved at the beginning of pgss_store

Regards
PAscal


pgss_add_planning_counters_v3
Description: pgss_add_planning_counters_v3


Re: Planning counters in pg_stat_statements (using pgss_store)

2019-03-25 Thread legrand legrand
As there are now 3 lock acquisitions on the pgss hash structure, one day or
another somebody will ask for a GUC to disable this feature (to be able to run
pgss unchanged, with only one lock as today).

With this GUC, pgss_store should be able to store the query text and the
accumulated execution duration in the same call (as today).

Will try to provide this soon.







RE: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2019-03-25 Thread legrand legrand
>> Shoudn't you add this to commitfest ?

> I added it last week, see https://commitfest.postgresql.org/23/2069/

Oops, sorry for the noise


RE: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2019-03-25 Thread legrand legrand

>> Would it make sense to add it in auto explain ?
>> I don't know for explain itself, but maybe ...

> I'd think that people interested in getting the queryid in the logs
> would configure the log_line_prefix to display it consistently rather
> than having it in only a subset of cases, so that's probably not
> really needed.

Ok.
Shouldn't you add this to the commitfest?



RE: Planning counters in pg_stat_statements (using pgss_store)

2019-03-23 Thread legrand legrand
> This patch has multiple trailing whitespace, indent and coding style
> issues.  You should consider running pg_indent before submitting a
> patch.  I attach the diff after running pgindent if you want more
> details about the various issues.

fixed


> - * Track statement execution times across a whole database cluster.
> + * Track statement planning and execution times across a whole cluster.

> if we're changing this, we should also fix the fact that's it's not
> tracking only the time but various resources?

fixed


> +   /* calc differences of buffer counters. */
> +   bufusage.shared_blks_hit =
> +   pgBufferUsage.shared_blks_hit - bufusage_start.shared_blks_hit;> >
> [...]

> This is an exact duplication of pgss_ProcessUtility(), it's probably
> better to create a macro or a function for that instead.

yes, maybe later (I don't know macros)


> +   pgss_store("",
> +  parse->queryId,  /* signal that it's a
> utility stmt */
> +  -1,

> the comment makes no sense, and also you can't pass an empty query
> string / unknown len.  There's no guarantee that the entry for the
> given queryId won't have been evicted, and in this case you'll create
> a new and unrelated entry.

Fixed, the comment was wrong.
The query text is not available in pgss_planner_hook;
that's why a pgss_store execution is forced in pgss_post_parse_analyze
(to initialize the pgss entry with its query text).

There is a very small risk that the query has been evicted between
pgss_post_parse_analyze and pgss_planner_hook.



> @@ -832,13 +931,13 @@ pgss_post_parse_analyze(ParseState *pstate, Query 
> *query)
>  * the normalized string would be the same as the query text anyway, so
>  * there's no need for an early entry.
>  */
> -   if (jstate.clocations_count > 0)
> pgss_store(pstate->p_sourcetext,

> Why did you remove this?  pgss_store() isn't free, and asking to
> generate a normalized query for a query that doesn't have any constant
> or storing the entry early won't do anything useful AFAICT.  Though if
> that's useful, you definitely can't remove the test without adapting
> the comment and the indentation.

See explanation in previous answer (comments have been updated accordingly)


> there are 4 tests to check if planning_time is zero or not, it's quite
> messy.  Could you refactor the code to avoid so many tests?  It would
> probably be useful to add some asserts to check that we don't provide
> both planning_time == 0 and execution related values.  The function's
> comment would also need to be adapted to mention the new rationale
> with planning_time.

Fixed


> * hash table entry for the PREPARE (with hash calculated from the query
> * string), and then a different one with the same query string (but hash
> * calculated from the query tree) would be used to accumulate costs of
> -* ensuing EXECUTEs.  This would be confusing, and inconsistent with other
> -* cases where planning time is not included at all.
> +* ensuing EXECUTEs.

> the comment about confusing behavior is still valid.

Fixed


>> Columns naming has not been modified, I would propose to change it to:
>>  - plans: ok
>>  - planning_time --> plan_time
>>  - calls: ok
>>  - total_time --> exec_time
>>  - {min,max,mean,stddev}_time: ok
>>  - new total_time (being the sum of plan_time and exec_time)

> plan_time and exec_time are accumulated counters, so we need to keep
> the total_ prefix in any case.  I think it's ok to break the function
> output names if we keep some kind of compatibility at the view level
> (which can keep total_time as the sum of total_plan_time and
> total_exec_time), so current queries against the view wouldn't break,
> and get what they probably wanted.

Before changing this at all levels (view, function, code, doc),
I would like to be sure that the column names will be:
  - plans
  - total_plan_time
  - calls
  - total_exec_time
  - min_time (without exec in name)
  - max_time (without exec in name)
  - mean_time (without exec in name)
  - stddev_time (without exec in name)
  - total_time (being the sum of total_plan_time and total_exec_time)

Could other users confirm?




pgss_add_planning_counters_v2.diff
Description: pgss_add_planning_counters_v2.diff


RE: Planning counters in pg_stat_statements (using pgss_store)

2019-03-22 Thread legrand legrand
Hi,
Here is a rebased and corrected version.

Column naming has not been modified; I would propose to change it to:
 - plans: ok
 - planning_time --> plan_time
 - calls: ok
 - total_time --> exec_time
 - {min,max,mean,stddev}_time: ok
 - new total_time (being the sum of plan_time and exec_time)

Regards
PAscal




pgss_add_planning_counters_v1.diff
Description: pgss_add_planning_counters_v1.diff


Re: [survey] New "Stable" QueryId based on normalized query text

2019-03-20 Thread legrand legrand
Julien Rouhaud wrote
> On Wed, Mar 20, 2019 at 10:30 PM legrand legrand wrote:
>>
>> maybe this patch (with a GUC)
>> https://www.postgresql.org/message-id/55E51C48.1060102@uptime.jp
>> would be enough for those actually using a text normalization function.
> 
> The rest of thread raise quite a lot of concerns about the semantics,
> the cost and the correctness of this patch.  After 5 minutes checking,
> it wouldn't suits your need if you use custom functions, custom types,
> custom operators (say using intarray extension) or if your tables
> don't have columns in the same order in every environment.  And there
> are probably other caveats that I didn't see;

Yes I know,
it would have to be extended at least to function, type, and operator
names,
with a GUC pg_stat_statements.queryid_based = 'names' (default being 'oids'),

and with a second GUC ('fullyqualified'?) that
would include table, function, type, and operator namespaces.

Let "users" specify their needs, we will see ;o)






Re: [survey] New "Stable" QueryId based on normalized query text

2019-03-20 Thread legrand legrand
Julien Rouhaud wrote
> On Wed, Mar 20, 2019 at 10:18 PM legrand legrand wrote:
>>
>> On my personal point of view, I need to get the same Queryid between
>> (OLAP)
>> environments
>> to be able to compare Production, Pre-production, Qualif performances
>> (and I don't need Fully qualified relation names). Today to do that,
>> I'm using a custom extension computing the QueryId based on the
>> normalized
>> Query
>> text.
> 
> IIUC, your need is to compare pgss (maybe other extensions) counters
> from different primary servers, for queries generated by the same
> application(s).  A naive workaround like exporting each environment
> counters (COPY SELECT 'production', * FROM pg_stat_statements TO
> '...'), importing all of them on a server and then comparing
> everything using the query text (which should be the same if the
> application is the same) instead of queryid wouldn't work?  Or even
> using foreign tables if exporting data is too troublesome. That's
> clearly not ideal, but that's an easy workaround which doesn't add any
> performance hit at runtime.

Thank you Julien for the workaround.
It is not easy to build "cross tables" in Excel to join metrics per query
text ...
and I'm not ready to build an MD5(query), as many queries could lead to the
same QueryId.
I've been using SQL_IDs for ten years, and I have some (who said old?) habits
:^)

Regards
PAscal






Re: [survey] New "Stable" QueryId based on normalized query text

2019-03-20 Thread legrand legrand
maybe this patch (with a GUC)
https://www.postgresql.org/message-id/55e51c48.1060...@uptime.jp
would be enough for those actually using a text normalization function.







Re: [survey] New "Stable" QueryId based on normalized query text

2019-03-20 Thread legrand legrand
Julien Rouhaud wrote
> On Wed, Mar 20, 2019 at 8:39 PM legrand legrand wrote:
>>
>> Yes, I would like first to understand what are the main needs,
> 
> I don't really see one implementation that suits every need, as
> probably not everyone will agree on using relation name vs fully
> qualified relation name for starter.  The idea to take into account or
> normalise comments will also probably require a lot of argumentation
> to reach a consensus.
> 
> Also, most of what's listed here would require catcache lookup for
> every objects impacted in a query, at every execution.  That would be
> *super* expensive (at least for OLTP workload).  As far as the need is
> to gather statistics like pg_stat_statements and similar extensions
> are doing, current queryid semantics and underlying limitations is not
> enough of a problem to justify paying that price IMHO.  Or do you have
> other needs and/or problems that really can't be solved with current
> implementation?
> 
> In other words, my first need would be to be able to deactivate
> everything that would make queryid computation significantly more
> expensive than it's today, and/or to be able to replace it with
> third-party implementation.
> 
>> >> needs.1: stable accross different databases,
>> [...]
>>
>> >needs.7: same value on both the primary and standby?
>>
>> I would say yes (I don't use standby) isn't this included into needs.1 ?
> 
> Physical replication servers have identical oids, so identical
> queryid.  That's obviously not true for logical replication.

From my personal point of view, I need to get the same QueryId between (OLAP)
environments,
to be able to compare Production, Pre-production, and Qualification
performances
(and I don't need fully qualified relation names). Today, to do that,
I'm using a custom extension computing the QueryId based on the normalized
query
text.

This way of doing things seems very popular, and maybe including it in core
(as a dedicated extension)
or proposing an alternate queryid (based on relation names) in pgss
(GUC-enabled)
would fulfill 95% of the needs ...

I agree with you on the last point: being able to replace the actual QueryId
with a third-party
implementation IS the first need.

Regards
PAscal







Re: [survey] New "Stable" QueryId based on normalized query text

2019-03-20 Thread legrand legrand
> From "Kyotaro HORIGUCHI-2"
>>At Wed, 20 Mar 2019 00:23:30 +0000, "Tsunakawa, Takayuki" wrote:
>>> From: legrand legrand [mailto:legrand_legrand@]
>>> norm.9: comments aware
>> Is this to distinguish queries that have different comments for optimizer
>> hints?  If yes, I agree.

> Or, any means to give an explict query id? I saw many instances
> of query that follows a comment describing a query id.

Yes, in fact I didn't think about the different kinds of comments, but all of
them.


>> needs.3: search_path / schema independant,

>pg_stat_statements even ignores table/object/column names.

there is a very interesting thread about that, "pg_stat_statements and non
default search_path":
https://www.postgresql.org/message-id/8f54c609-17c6-945b-fe13-8b07c0866...@dalibo.com

expecting distinct QueryIds when using distinct schemas ...
Maybe it should be switched to "schema dependent".



>> needs.4: pg version independant (as long as possible),

>I don't think this cannot be guaranteed.

maybe using a QueryId versioning GUC 
 

>> norm.1: case insensitive
>> norm.2: blank reduction 
>> norm.3: hash algoritm ?
>> norm.4: CURRENT_DATE, CURRENT_TIME, LOCALTIME, LOCALTIMESTAMP not
>> normalized
>> norm.5: NULL, IS NULL not normalized ?
>> norm.6: booleans t, f, true, false not normalized
>> norm.7: order by 1,2 or group by 1,2 should not be normalized
>> norm.8: pl/pgsql anonymous blocks not normalized

> pg_stat_statements can be the base of the discussion on them.

OK

Regards
PAscal






RE: [survey] New "Stable" QueryId based on normalized query text

2019-03-20 Thread legrand legrand
> From: "Tsunakawa, Takayuki"
>> From: legrand legrand [mailto:legrand_legrand@]
>> There are many projects that use alternate QueryId
>> distinct from the famous pg_stat_statements jumbling algorithm.

>I'd like to welcome the standard QueryID that DBAs and extension developers
can depend on.  
>Are you surveying the needs for you to develop the QueryID that can meet as
many needs as possible?

Yes, I would first like to understand what the main needs are,
then identify how it would be possible to implement it
in core, in a new extension, or simply with a modified pg_stat_statements.
(I'm just a DBA, not a C developer, so it will only be restricted to very
simple enhancements.)


>> needs.1: stable accross different databases,

>Does this mean different database clusters, not different databases in a
single database cluster?

The same-looking query should give the same QueryId on any database (in the
same cluster or in distinct clusters). Entries can then be differentiated by dbid.


>needs.5: minimal overhead to calculate

OK will add it


>needs.6: doesn't change across database server restarts

Really? Does this already occur?


>needs.7: same value on both the primary and standby?

I would say yes (I don't use standbys); isn't this included in needs.1?


>> norm.9: comments aware

>Is this to distinguish queries that have different comments for optimizer
hints?  If yes, I agree.

Yes, and other cases like playing with:
set ...
select /* test 1*/ ...

set ... 
select /* test 2*/ ...
  







Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2019-03-20 Thread legrand legrand
Hi Jim, Robert,

As this is a distinct subject from adding QueryId to pg_stat_activity,
would it be possible to continue the discussion "new QueryId definition" 
(for postgres open source software) here:

https://www.postgresql.org/message-id/1553029215728-0.p...@n3.nabble.com

Thanks in advance.
Regards
PAscal







[survey] New "Stable" QueryId based on normalized query text

2019-03-19 Thread legrand legrand
Hello,

There are many projects that use an alternate QueryId,
distinct from the famous pg_stat_statements jumbling algorithm.

https://github.com/postgrespro/aqo
query_hash

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/AuroraPostgreSQL.Optimize.ViewPlans.html
sql_hash

https://github.com/ossc-db/pg_hint_plan
queryid

Even pg_stat_statements has a normalization function
that could answer the current question ...

Here are some *needs* :

needs.1: stable accross different databases,
needs.2: doesn't change after database or object rebuild,
needs.3: search_path / schema independant,
needs.4: pg version independant (as long as possible),
...

and some *normalization rules*:

norm.1: case insensitive
norm.2: blank reduction 
norm.3: hash algorithm?
norm.4: CURRENT_DATE, CURRENT_TIME, LOCALTIME, LOCALTIMESTAMP not normalized
norm.5: NULL, IS NULL not normalized ?
norm.6: booleans t, f, true, false not normalized
norm.7: order by 1,2 or group by 1,2 should not be normalized
norm.8: pl/pgsql anonymous blocks not normalized
norm.9: comments aware

Do not hesitate to add your thoughts.
Regards
PAscal






Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2019-03-19 Thread legrand legrand
Great, 
thank you Julien !

Would it make sense to add it to auto_explain?
I don't know about EXPLAIN itself, but maybe ...

Regards
PAscal







Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2019-03-16 Thread legrand legrand
Hello,

This is available in https://github.com/legrandlegrand/pg_stat_sql_plans
extension with a specific function
pgssp_backend_queryid(pid) that permits to join pg_stat_activity with
pg_stat_sql_plans (that is similar to pg_stat_statements) and also permits
to collect samples of wait events per query id.

This extension computes its own queryid based on a normalized query text
(one that doesn't change after a table drop/create).

Maybe that queryid calculation should stay in a dedicated extension,
permitting users to choose their own queryid definition.

Regards
PAscal






Re: any plan to support shared servers like Oracle in PG?

2019-03-05 Thread legrand legrand
There are already solutions providing this feature in Postgres,
under the "connection pooler" terminology.

see 

pgpool: http://www.pgpool.net/mediawiki/index.php/Main_Page

pgbouncer: https://pgbouncer.github.io/

There are also discussions about including this as a core feature:

https://www.postgresql.org/message-id/flat/4b971a8f-ff61-40eb-8f30-7b57eb0fdf9d%40postgrespro.ru
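For reference, here is a minimal pgbouncer configuration sketch showing transaction-level pooling; all hosts, paths, and sizes below are placeholder values, not recommendations:

```ini
; minimal pgbouncer.ini sketch (illustrative values only)
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; "transaction" releases the server connection at each commit/rollback,
; which is what makes a small pool serve many clients
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
```

Clients then connect to port 6432 instead of 5432, and pgbouncer multiplexes them over the smaller server-side pool.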






Re: Planning counters in pg_stat_statements (using pgss_store)

2019-02-14 Thread legrand legrand
Thank you Sergei for your comments,

> Did you register patch in CF app? I did not found entry.
created today: https://commitfest.postgresql.org/22/1999/

> Currently we have pg_stat_statements 1.7 version and this patch does not
> apply... 
will rebase and create a 1.8 version

> -  errmsg("could not read file \"%s\": %m",
> +  errmsg("could not read pg_stat_statement file \"%s\": %m",
this is a mistake, will fix

> +#define PG_STAT_STATEMENTS_COLS_V1_4 25
I thought it was needed when adding new columns; isn't it?

> And this patch does not have documentation changes.
will fix

and will provide some kind of benchmark to compare with the current version.

Regards
PAscal 






Re: Planning counters in pg_stat_statements (using pgss_store)

2019-02-12 Thread legrand legrand
Hi Sergei,

Thank you for this review !

>Did you register patch in CF app? I did not found entry. 
I think it is related to https://commitfest.postgresql.org/16/1373/
but I don't know how to link it to that entry.

Yes, there are many things to improve, but before going deeper,
I would like to know whether this way of doing it (with 3 accesses to the
pgss hash) has a chance of reaching consensus?

Regards
PAscal






Re: explain plans with information about (modified) gucs

2019-01-14 Thread legrand legrand
Tomas Vondra-4 wrote
> Hello Sergei,
> 
>> This patch correlates with my proposal
>> "add session information column to pg_stat_statements"
>> https://www.postgresql.org/message-id/3aa097d7-7c47-187b-5913-db8366cd4491%40gmail.com
>> They both address the problem to identify the factors that make
>> different execution plans for the same SQL statements. You are
>> interested in the current settings that affect the execution plan, I'm
>> concerned about historical data in pg_stat_statements. From my
>> experience the most often offending settings are
>> current_schemas/search_path and current_user. Please have in mind that
>> probably the same approach that you will use to extend explain plan
>> functionality will be eventually implemented to extend
>> pg_stat_statements.
> 
> Possibly, although I don't have an ambition to export the GUCs into
> pg_stat_statements in this patch. There's an issue with merging
> different values of GUCs in different executions of a statement, and
> it's unclear how to solve that.
> 
> [...]
> 
> -- 
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

This explain-with-GUCs feature can be included very easily for historical
data management in the pg_store_plans or pg_stat_sql_plans extensions
(both of which use a planid based on the normalized plan text).

Regards
PAscal






Planning counters in pg_stat_statements (using pgss_store)

2018-12-29 Thread legrand legrand
Starting from
https://www.postgresql.org/message-id/CAEepm%3D2vORBhWQZ1DJmKXmCVi%2B15Tgrv%2B9brHLanWU7XE_FWxQ%40mail.gmail.com

Here is a patch trying to implement what was proposed by Tom Lane:

"What we could/should do instead, I think, is have pgss_planner_hook
make its own pgss_store() call to log the planning time results
(which would mean we don't need the added PlannedStmt field at all).
That would increase the overhead of this feature, which might mean
that it'd be worth having a pg_stat_statements GUC to enable it.
But it'd be a whole lot cleaner."

Now:
- pgss_post_parse_analyze initializes the pgss entry with the SQL text,
- pgss_planner_hook adds planning time and counts,
- pgss_ExecutorEnd works unchanged.

but doesn't include any pg_stat_statements GUC to enable it yet.

Note: I didn't understand the sentence "which would mean we don't need the
added PlannedStmt field at all".


Regarding initial remark from Thomas Munro:

"I agree with the sentiment on the old thread that
{total,min,max,mean,stddev}_time now seem badly named, but adding
execution makes them so long...  Thoughts?"

What would you think about:
- userid
- dbid
- queryid
- query
- plans
- plan_time
- {min,max,mean,stddev}_plan_time
- calls
- exec_time
- {min,max,mean,stddev}_exec_time
- total_time (being the sum of plan_time and exec_time)
- rows
- ...
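To make the {min,max,mean,stddev} bookkeeping concrete, here is a hedged Python sketch of keeping two counter sets per entry (one for planning, one for execution, as proposed above), updating mean and variance incrementally with Welford's online algorithm as pg_stat_statements does; the class and field names are illustrative, not the actual pgss structures:

```python
import math

class Counters:
    """One set of timing counters, updated once per planning or execution."""
    def __init__(self):
        self.calls = 0
        self.total = 0.0
        self.min = float('inf')
        self.max = 0.0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, t: float):
        self.calls += 1
        self.total += t
        self.min = min(self.min, t)
        self.max = max(self.max, t)
        # Welford's online update: numerically stable, no per-call history kept
        delta = t - self.mean
        self.mean += delta / self.calls
        self.m2 += delta * (t - self.mean)

    @property
    def stddev(self) -> float:
        # population stddev, as pg_stat_statements computes it
        return math.sqrt(self.m2 / self.calls) if self.calls > 1 else 0.0

class Entry:
    """One pgss entry: separate counters for the planner hook and ExecutorEnd."""
    def __init__(self):
        self.plan = Counters()
        self.exec_ = Counters()

    @property
    def total_time(self) -> float:
        # total_time as the sum of plan_time and exec_time, per the proposal
        return self.plan.total + self.exec_.total
```

The point of Welford's formulation is that stddev can be reported at any time from just (calls, mean, m2), so the shared-memory entry stays fixed-size no matter how many calls accumulate.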

Regards
PAscal


Attachment: pgss_add_planning_counters.diff

