Re: [HACKERS] [PROPOSAL] timestamp informations to pg_stat_statements

2017-02-14 Thread Robert Haas
On Sun, Jul 17, 2016 at 7:15 PM, Tom Lane  wrote:
> Peter Geoghegan  writes:
>> On Sun, Jul 17, 2016 at 12:22 AM, Jun Cheol Gim  wrote:
>>> If we have timestamp of first and last executed, we can easily gather thess
>>> informations and there are tons of more use cases.
>
>> -1 from me.
>
>> I think that this is the job of a tool that aggregates things from
>> pg_stat_statements. It's unfortunate that there isn't a good
>> third-party tool that does that, but there is nothing that prevents
>> it.
>
> The concern I've got about this proposal is that the results get very
> questionable as soon as we start dropping statement entries for lack
> of space.  last_executed would be okay, perhaps, but first_executed
> not so much.

ISTM that last_executed is useful - you can then see for yourself
which of the statements that you see in the pg_stat_statements output
have been issued recently, and which are older.  I mean, you could
also do that, as Peter says, with an additional tool that takes
periodic snapshots of the data and then figures out an approximate
last_executed time, but if you had this, you wouldn't need an
additional tool, at least not for simple cases.  Better yet, the data
would be more exact.  I dunno what's not to like about that, unless
we're worried that tracking it will incur too much overhead.

first_executed doesn't seem as useful as last_executed, but it isn't
useless either.  It can't properly be read as the first time that
statement was ever executed, but it can be properly read as the amount
of time that has passed during which that statement has been executed
frequently enough to stay in the hash table, which is something that
someone might want to know.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] timestamp informations to pg_stat_statements

2016-07-18 Thread Jason Kim
Hi guys,
Thank you for feedbacks.

> I think that this is the job of a tool that aggregates things from 
> pg_stat_statements. It's unfortunate that there isn't a good 
> third-party tool that does that, but there is nothing that prevents 
> it. 

Right. We can do this if we aggregate it frequently enough. However,
third-parties can do it better if we have more informations.
I think these are fundamental informations to strong third-parties could be.

> The concern I've got about this proposal is that the results get very 
> questionable as soon as we start dropping statement entries for lack 
> of space.  last_executed would be okay, perhaps, but first_executed 
> not so much. 

If we set pg_stat_statements.max to large enough, there could be long
lived entries and short lived ones simultaneously. In this case, per call 
statistics could be accurate but per seconds stats can not.
The idea of I named it as created and last_updated (not
first_executed and last_executed) was this. It represents timestamp of
 stats are created and updated, so we can calculate per second stats with
simple math.

> Also, for what it's worth, I should point out to Jun that 
> GetCurrentTimestamp() should definitely not be called when a spinlock 
> is held like that. 

I moved it outside of spinlock.

@@ -1204,6 +1209,11 @@ pgss_store(const char *query, uint32 queryId,
 */
volatile pgssEntry *e = (volatile pgssEntry *) entry;

+   /*
+* Read a timestamp before grab the spinlock
+*/
+   TimestampTz last_updated = GetCurrentTimestamp();
+
SpinLockAcquire(>mutex);

/* "Unstick" entry if it was previously sticky */
@@ -1251,6 +1261,7 @@ pgss_store(const char *query, uint32 queryId,
e->counters.blk_read_time +=
INSTR_TIME_GET_MILLISEC(bufusage->blk_read_time);
e->counters.blk_write_time +=
INSTR_TIME_GET_MILLISEC(bufusage->blk_write_time);
e->counters.usage += USAGE_EXEC(total_time);
+   e->counters.last_updated = last_updated;

SpinLockRelease(>mutex);
  }

pg_stat_statements_with_timestamp_v2.patch

  

Regards,
Jason Kim.



--
View this message in context: 
http://postgresql.nabble.com/PROPOSAL-timestamp-informations-to-pg-stat-statements-tp5912306p5912461.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] timestamp informations to pg_stat_statements

2016-07-17 Thread Julien Rouhaud
On 18/07/2016 01:06, Peter Geoghegan wrote:
> On Sun, Jul 17, 2016 at 12:22 AM, Jun Cheol Gim  wrote:
>> If we have timestamp of first and last executed, we can easily gather thess
>> informations and there are tons of more use cases.
> 
> -1 from me.
> 
> I think that this is the job of a tool that aggregates things from
> pg_stat_statements. It's unfortunate that there isn't a good
> third-party tool that does that, but there is nothing that prevents
> it.
> 

FWIW there's https://github.com/dalibo/powa-archivist which does that,
assuming you're using a 9.4+ server.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] timestamp informations to pg_stat_statements

2016-07-17 Thread Peter Geoghegan
On Sun, Jul 17, 2016 at 4:15 PM, Tom Lane  wrote:
> The concern I've got about this proposal is that the results get very
> questionable as soon as we start dropping statement entries for lack
> of space.  last_executed would be okay, perhaps, but first_executed
> not so much.

Agreed.

Also, for what it's worth, I should point out to Jun that
GetCurrentTimestamp() should definitely not be called when a spinlock
is held like that.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] timestamp informations to pg_stat_statements

2016-07-17 Thread Tom Lane
Peter Geoghegan  writes:
> On Sun, Jul 17, 2016 at 12:22 AM, Jun Cheol Gim  wrote:
>> If we have timestamp of first and last executed, we can easily gather thess
>> informations and there are tons of more use cases.

> -1 from me.

> I think that this is the job of a tool that aggregates things from
> pg_stat_statements. It's unfortunate that there isn't a good
> third-party tool that does that, but there is nothing that prevents
> it.

The concern I've got about this proposal is that the results get very
questionable as soon as we start dropping statement entries for lack
of space.  last_executed would be okay, perhaps, but first_executed
not so much.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] timestamp informations to pg_stat_statements

2016-07-17 Thread Peter Geoghegan
On Sun, Jul 17, 2016 at 12:22 AM, Jun Cheol Gim  wrote:
> If we have timestamp of first and last executed, we can easily gather thess
> informations and there are tons of more use cases.

-1 from me.

I think that this is the job of a tool that aggregates things from
pg_stat_statements. It's unfortunate that there isn't a good
third-party tool that does that, but there is nothing that prevents
it.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PROPOSAL] timestamp informations to pg_stat_statements

2016-07-17 Thread Jun Cheol Gim
Hi hackers!

Following is a proposal to add timestamp informations to
`pg_stat_statements`.

# Use case
- If we want to gather list and stats for queries executed at least once
last 1 hour, we had to reset a hours ago. There is no way if we didn't.
- If we found some strange query from `pg_stat_statments`, we might want to
identify when it ran firstly.

If we have timestamp of first and last executed, we can easily gather thess
informations and there are tons of more use cases.

# Implementations
Firstly, I added API version 1.5 to add additional fields and I added two
fields to Counters structure. Now it has 25 fields in total.

```
@@ -156,6 +158,8 @@ typedef struct Counters
  doubleblk_read_time;  /* time spent reading, in msec */
  doubleblk_write_time; /* time spent writing, in msec */
  doubleusage;  /* usage factor */
+ TimestampTz   created;  /* timestamp of created time */
+ TimestampTz   last_updated; /* timestamp of last updated */
 } Counters;

 /*
```

The `created` field is filled at the first time the entry will added to
hash table.

```
@@ -1663,6 +1690,8 @@ entry_alloc(pgssHashKey *key, Size query_offset, int
query_len, int encoding,

/* reset the statistics */
memset(>counters, 0, sizeof(Counters));
+   /* set the created timestamp */
+entry->counters.created = GetCurrentTimestamp();
/* set the appropriate initial usage count */
entry->counters.usage = sticky ? pgss->cur_median_usage : USAGE_INIT;
/* re-initialize the mutex each time ... we assume no one using it */
```

The `last_updated` will be updated every time `pgss_store()` updates stats.

```
@@ -1251,6 +1256,7 @@ pgss_store(const char *query, uint32 queryId,
e->counters.blk_read_time +=
INSTR_TIME_GET_MILLISEC(bufusage->blk_read_time);
e->counters.blk_write_time +=
INSTR_TIME_GET_MILLISEC(bufusage->blk_write_time);
e->counters.usage += USAGE_EXEC(total_time);
+   e->counters.last_updated = GetCurrentTimestamp();

SpinLockRelease(>mutex);
  }
```

The attached is my first implementation.

Regards,
Jason Kim.


pg_stat_statements_with_timestamp_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers