Re: Display individual query in pg_stat_activity

2020-12-01 Thread Georgios Kokolatos
This patch fails in the cfbot for quite some time now.
I shall close it as Return With Feedback and not move it to the next CF.
Please feel free to register an updated version afresh for the next CF.

Cheers,
//Georgios

Re: Display individual query in pg_stat_activity

2020-11-10 Thread Georgios Kokolatos
Hi,

I noticed that this patch is failing on the cfbot.
For this, I changed the status to: 'Waiting on Author'

Cheers,
Georgios

The new status of this patch is: Waiting on Author


Re: Display individual query in pg_stat_activity

2020-09-23 Thread Michael Paquier
On Thu, Sep 10, 2020 at 04:06:17PM +0200, Drouvot, Bertrand wrote:
> Attaching a new version as the previous one was not passing the Patch Tester
> anymore.

Ditto, the CF bot is complaining again.  Could you send a rebase?
--
Michael


signature.asc
Description: PGP signature


Re: Display individual query in pg_stat_activity

2020-08-28 Thread Pavel Stehule
pá 28. 8. 2020 v 10:06 odesílatel Masahiro Ikeda 
napsal:

> On 2020-08-19 14:48, Drouvot, Bertrand wrote:
> > Hi,
> > On 8/18/20 9:35 AM, Pavel Stehule wrote:
> >
> >> Hi
> >>
> >> út 18. 8. 2020 v 8:54 odesílatel Masahiro Ikeda
> >>  napsal:
> >>
> >>> Hi,
> >>>
>  I've attached a patch to display individual query in the
>  pg_stat_activity query field when multiple SQL statements are
>  currently displayed.
> 
>  Motivation:
> 
>  When multiple statements are displayed then we don’t know
> >>> which
>  one is currently running.
> 
>  I'm not sure I'd want that to happen, as it could make it much
>  harder to track the activity back to a query in the application
>  layer or server logs.
> 
>  Perhaps a separate field could be added for the current
> >>> statement,
>  or a value to indicate what the current statement number in the
>  query is?
> >>>
> >>> As a user, I think this feature is useful to users.
> >>>
> >>> It would be nice that pg_stat_activity also show currently running
> >>> query
> >>> in a user defined function(PL/pgSQL) .
> >>>
> >>> I understood that this patch is not for user defined functions.
> >>> Please let me know if it's better to make another thread.
> >
> > Yeah I think it would be nice to have.
> >
> > I also think it would be better to create a dedicated thread
> > (specially looking at Pavel's comment below)
>
> Thank you. I will.
>
> >>> In general, PL/pgSQL functions have multiple queries,
> >>> and users want to know the progress of query execution, doesn't
> >>> it?
> >>
> >> I am afraid of the significant performance impact of this feature.
> >> In this case you have to copy all nested queries to the stat
> >> collector process. Very common usage of PL is a glue of very fast
> >> queries. Sure, it is used like glue for very slow queries too.
> >> Just I thinking about two features:
>
> OK, thanks for much advice and show alternative solutions.
>
> >> 1. extra interface for auto_explain, that allows you to get a stack
> >> of statements assigned to some pid (probably these informations
> >> should be stored inside shared memory and collected before any query
> >> execution). Sometimes some slow function is slow due repeated
> >> execution of relatively fast queries. In this case, the deeper
> >> nested level is not too interesting. You need to see a stack of
> >> calls and you are searching the first slow level in the stack.
>
> Thanks. I didn't know auto_explain module.
> I agreed when only requested, it copy the stack of statements.
>
> >> 2. can be nice to have a status column in pg_stat_activity, and
> >> status GUC for sending a custom information from deep levels to the
> >> user. Now, users use application_name, but some special variables
> >> can be better for this purpose.  This value of status can be
> >> refreshed periodically and can substitute some tags. So developer
> >> can set
> >>
> >> BEGIN
> >> -- before slow long query
> >> SET status TO 'slow query calculation xxy %d';
> >> ...
> >>
> >> It is a alternative to RAISE NOTICE, but with different format -
> >> with format that is special for reading from pg_stat_activity
> >>
> >> For long (slow) queries usually you need to see the sum of all times
> >> of all levels from the call stack to get valuable information.
>
> In comparison to 1, user must implements logging statement to
> their query but user can control what he/she wants to know.
>
> I worry which solution is best.
>

There is no best solution - @1 doesn't need manual work, but @1 is not too
useful when queries are similar (first n chars) and are long. In this case
custom messages are much more practical.

I don't think so we can implement only one design - in this case we can
support more tools with similar purpose but different behaviors in corner
cases.


> >> p.s. pg_stat_activity is maybe too wide table already, and probably
> >> is not good to enhance this table too much
>
> Thanks. I couldn't think from this point of view.
>
> After I make some PoC patches, I will create a dedicated thread.
>
> Regards,
> --
> Masahiro Ikeda
> NTT DATA CORPORATION
>


Re: Display individual query in pg_stat_activity

2020-08-28 Thread Masahiro Ikeda

On 2020-08-19 14:48, Drouvot, Bertrand wrote:

Hi,
On 8/18/20 9:35 AM, Pavel Stehule wrote:


Hi

út 18. 8. 2020 v 8:54 odesílatel Masahiro Ikeda
 napsal:


Hi,


I've attached a patch to display individual query in the
pg_stat_activity query field when multiple SQL statements are
currently displayed.

Motivation:

When multiple statements are displayed then we don’t know

which

one is currently running.

I'm not sure I'd want that to happen, as it could make it much
harder to track the activity back to a query in the application
layer or server logs.

Perhaps a separate field could be added for the current

statement,

or a value to indicate what the current statement number in the
query is?


As a user, I think this feature is useful to users.

It would be nice that pg_stat_activity also show currently running
query
in a user defined function(PL/pgSQL) .

I understood that this patch is not for user defined functions.
Please let me know if it's better to make another thread.


Yeah I think it would be nice to have.

I also think it would be better to create a dedicated thread
(specially looking at Pavel's comment below)


Thank you. I will.


In general, PL/pgSQL functions have multiple queries,
and users want to know the progress of query execution, doesn't
it?


I am afraid of the significant performance impact of this feature.
In this case you have to copy all nested queries to the stat
collector process. Very common usage of PL is a glue of very fast
queries. Sure, it is used like glue for very slow queries too.
Just I thinking about two features:


OK, thanks for much advice and show alternative solutions.


1. extra interface for auto_explain, that allows you to get a stack
of statements assigned to some pid (probably these informations
should be stored inside shared memory and collected before any query
execution). Sometimes some slow function is slow due repeated
execution of relatively fast queries. In this case, the deeper
nested level is not too interesting. You need to see a stack of
calls and you are searching the first slow level in the stack.


Thanks. I didn't know auto_explain module.
I agreed when only requested, it copy the stack of statements.


2. can be nice to have a status column in pg_stat_activity, and
status GUC for sending a custom information from deep levels to the
user. Now, users use application_name, but some special variables
can be better for this purpose.  This value of status can be
refreshed periodically and can substitute some tags. So developer
can set

BEGIN
-- before slow long query
SET status TO 'slow query calculation xxy %d';
...

It is a alternative to RAISE NOTICE, but with different format -
with format that is special for reading from pg_stat_activity

For long (slow) queries usually you need to see the sum of all times
of all levels from the call stack to get valuable information.


In comparison to 1, user must implements logging statement to
their query but user can control what he/she wants to know.

I worry which solution is best.


p.s. pg_stat_activity is maybe too wide table already, and probably
is not good to enhance this table too much


Thanks. I couldn't think from this point of view.

After I make some PoC patches, I will create a dedicated thread.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION




Re: Display individual query in pg_stat_activity

2020-08-18 Thread Pavel Stehule
Hi

út 18. 8. 2020 v 8:54 odesílatel Masahiro Ikeda 
napsal:

> Hi,
>
> > I've attached a patch to display individual query in the
> > pg_stat_activity query field when multiple SQL statements are
> > currently displayed.
> >
> > Motivation:
> >
> > When multiple statements are displayed then we don’t know which
> > one is currently running.
> >
> > I'm not sure I'd want that to happen, as it could make it much
> > harder to track the activity back to a query in the application
> > layer or server logs.
> >
> > Perhaps a separate field could be added for the current statement,
> > or a value to indicate what the current statement number in the
> > query is?
>
> As a user, I think this feature is useful to users.
>
> It would be nice that pg_stat_activity also show currently running query
> in a user defined function(PL/pgSQL) .
>
> I understood that this patch is not for user defined functions.
> Please let me know if it's better to make another thread.
>
> In general, PL/pgSQL functions have multiple queries,
> and users want to know the progress of query execution, doesn't it?
>

I am afraid of the significant performance impact of this feature. In this
case you have to copy all nested queries to the stat collector process.
Very common usage of PL is a glue of very fast queries. Sure, it is used
like glue for very slow queries too.

Just I thinking about two features:

1. extra interface for auto_explain, that allows you to get a stack of
statements assigned to some pid (probably these informations should be
stored inside shared memory and collected before any query execution).
Sometimes some slow function is slow due repeated execution of relatively
fast queries. In this case, the deeper nested level is not too interesting.
You need to see a stack of calls and you are searching the first slow level
in the stack.

2. can be nice to have a status column in pg_stat_activity, and status GUC
for sending a custom information from deep levels to the user. Now, users
use application_name, but some special variables can be better for this
purpose.  This value of status can be refreshed periodically and can
substitute some tags. So developer can set

BEGIN
  -- before slow long query
  SET status TO 'slow query calculation xxy %d';
 ...

It is a alternative to RAISE NOTICE, but with different format - with
format that is special for reading from pg_stat_activity

For long (slow) queries usually you need to see the sum of all times of all
levels from the call stack to get valuable information.

Regards

Pavel

p.s. pg_stat_activity is maybe too wide table already, and probably is not
good to enhance this table too much



> --
> Masahiro Ikeda
> NTT DATA CORPORATION
>
>
>


Re: Display individual query in pg_stat_activity

2020-08-17 Thread Masahiro Ikeda

Hi,


I've attached a patch to display individual query in the
pg_stat_activity query field when multiple SQL statements are
currently displayed.

Motivation:

When multiple statements are displayed then we don’t know which
one is currently running.

I'm not sure I'd want that to happen, as it could make it much
harder to track the activity back to a query in the application
layer or server logs.

Perhaps a separate field could be added for the current statement,
or a value to indicate what the current statement number in the
query is?


As a user, I think this feature is useful to users.

It would be nice that pg_stat_activity also show currently running query
in a user defined function(PL/pgSQL) .

I understood that this patch is not for user defined functions.
Please let me know if it's better to make another thread.

In general, PL/pgSQL functions have multiple queries,
and users want to know the progress of query execution, doesn't it?

--
Masahiro Ikeda
NTT DATA CORPORATION




Re: Display individual query in pg_stat_activity

2020-08-06 Thread Magnus Hagander
On Thu, Aug 6, 2020 at 12:17 PM Drouvot, Bertrand 
wrote:

> Hi,
> On 7/27/20 4:57 PM, Dave Page wrote:
>
> *CAUTION*: This email originated from outside of the organization. Do not
> click links or open attachments unless you can confirm the sender and know
> the content is safe.
>
> Hi
>
> On Mon, Jul 27, 2020 at 3:40 PM Drouvot, Bertrand 
> wrote:
>
>> Hi hackers,
>>
>> I've attached a patch to display individual query in the pg_stat_activity
>> query field when multiple SQL statements are currently displayed.
>>
>> *Motivation:*
>>
>> When multiple statements are displayed then we don’t know which one is
>> currently running.
>>
>
> I'm not sure I'd want that to happen, as it could make it much harder to
> track the activity back to a query in the application layer or server logs.
>
> Perhaps a separate field could be added for the current statement, or a
> value to indicate what the current statement number in the query is?
>
> Thanks for he feedback.
>
> I like the idea of adding extra information without changing the current
> behavior.
>
> A value to indicate what the current statement number is, would need
> parsing the query field by the user to get the individual statement.
>
> I think the separate field makes sense (though it come with an extra
> memory price) as it will not change the existing behavior and would just
> provide extra information (without any extra parsing needed for the user).
>
>
>
Idle though without having considered it too much -- you might reduce the
memory overhead by just storing a start/end offset into the combined query
string instead of a copy of the query. That way the cost would only be paid
when doing the reading of pg_stat_activity (by extracting the piece of the
string), which I'd argue is done orders of magnitude fewer times than the
query changes at least on busy systems. Care would have to be taken for the
case of the current executing query actually being entirely past the end of
the query string buffer of course, but I don't think that's too hard to
define a useful behaviour for. (The user interface would stay the same,
showing the actual string and thus not requiring the user to do any parsing)

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Display individual query in pg_stat_activity

2020-07-27 Thread Dave Page
On Mon, Jul 27, 2020 at 4:28 PM Jeremy Schneider 
wrote:

> On 7/27/20 07:57, Dave Page wrote:
>
> I'm not sure I'd want that to happen, as it could make it much harder to
> track the activity back to a query in the application layer or server logs.
>
> Perhaps a separate field could be added for the current statement, or a
> value to indicate what the current statement number in the query is?
>
>
> Might be helpful to give some specifics about circumstances where strings
> can appear in pg_stat_activity.query with multiple statements.
>
> 1) First of all, IIUC multiple statements are only supported in the first
> place by the simple protocol and PLs.  Anyone using parameterized
> statements (bind variables) should be unaffected by this.
>
> 2) My read of the official pg JDBC driver is that even for batch
> operations it currently iterates and sends each statement individually. I
> don't think the JDBC driver has the capability to send multiple statements,
> so java apps using this driver should be unaffected.
>

That is just one of a number of different popular drivers of course.


>
> 3) psql -c will always send the string as a single "simple protocol"
> request.  Scripts will be impacted.
>
> 4) PLs also seem to have a code path that can put multiple statements in
> pg_stat_activity when parallel slaves are launched.  PL code will be
> impacted.
>
> 5) pgAdmin uses the simple protocol and when a user executes a block of
> statements, pgAdmin seems to send the whole block as a single "simple
> protocol" request.  Tools like pgAdmin will be impacted.
>

It does. It also prepends some queries with comments, specifically to allow
users to filter them out when they're analysing logs (a feature requested
by users, not just something we thought was a good idea). I'm assuming that
this patch would also strip those?


>
> At the application layer, it doesn't seem problematic to me if PostgreSQL
> reports each query one at a time.  IMO most people will find this to be a
> more useful behavior and they will still find their queries in their app
> code or app logs.
>

I think there are arguments to be made for both approaches.


>
> However at the PostgreSQL logging layer this is a good call-out.  I just
> did a quick test on 14devel to double-check my assumption and it does seem
> that PostgreSQL logs the entire combined query for psql -c.  I think it
> would be better for PostgreSQL to report queries individually in the log
> too - for example pgBadger summaries will be even more useful if they
> report information for each individual query rather than a single big block
> of multiple queries.
>
> Given how small this patch is, it seems worthwhile to at least investigate
> whether the logging component could be addressed just as easily.
>
> -Jeremy
>
> --
> Jeremy Schneider
> Database Engineer
> Amazon Web Services
>
>
>

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: http://www.enterprisedb.com


Re: Display individual query in pg_stat_activity

2020-07-27 Thread Dave Page
Hi

On Mon, Jul 27, 2020 at 3:40 PM Drouvot, Bertrand 
wrote:

> Hi hackers,
>
> I've attached a patch to display individual query in the pg_stat_activity
> query field when multiple SQL statements are currently displayed.
>
> *Motivation:*
>
> When multiple statements are displayed then we don’t know which one is
> currently running.
>

I'm not sure I'd want that to happen, as it could make it much harder to
track the activity back to a query in the application layer or server logs.

Perhaps a separate field could be added for the current statement, or a
value to indicate what the current statement number in the query is?

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: http://www.enterprisedb.com