> > WHERE last_execution_start + max_exec_time * INTERVAL '1 ms' > NOW() -
> > polling_interval
>
> Is this extra complexity worth one saved GetCurrentTimestamp()?
>
> src/backend/access/transam/xact.c is calling GetCurrentTimestamp a
> lot already, so I don't really buy the argument it should be avoided
> at all cost in pg_stat_statements. Just storing the statement end time
> would make this use case much nicer.
We do call GetCurrentTimestamp() to set timestamps for xact_start, etc.
But, we also take special care to avoid extra calls.
```
/*
* set transaction_timestamp() (a/k/a now()). Normally, we want this to
* be the same as the first command's statement_timestamp(), so don't do a
* fresh GetCurrentTimestamp() call (which'd be expensive anyway). But
* for transactions started inside procedures (i.e., nonatomic SPI
* contexts), we do need to advance the timestamp. Also, in a parallel
* worker, the timestamp should already have been provided by a call to
* SetParallelStartTimestamps().
*/
if (!IsParallelWorker())
{
if (!SPI_inside_nonatomic_context())
xactStartTimestamp = stmtStartTimestamp;
else
xactStartTimestamp = GetCurrentTimestamp();
```
> OK, here is one more try. I discovered the `total_time` argument to
> the `pgss_store()` function! So we can calculate the finish time
> without calling `GetCurrentTimestamp()`.
>
> This is version 3 of the patch adding a `stats_last_updated` column
> (yes, again) to pg_stat_statements. Based on feedback, this version
> improves the implementation with better performance and correctness.
>
> The main improvement uses `statement_start + execution_duration` with
> `rint(total_time * 1000.0)` to convert milliseconds to microseconds
> with proper rounding. The calculation performed BEFORE acquiring
> spinlock and assigned within locked scope.
Correct, this also crossed my mind. Although I would consider doing things a bit
different. Instead of calculating the end time in a single column, I still
think it's worth having a last_executed_start and a
last_execution_time (duration),
and the user can calculate this on the SQL layer. I think it's better because
last_execution_start is already a known timestamp in
pg_stat_activity.query_start
and some tool that finds a long running query in pg_stat_activity, knowing the
query_start they could then go look it up in pg_stat_statements. What I'm
really getting at is separating these fields will open up more use cases, IMO.
Of course, we are adding 2 new columns instead of just 1, but these values
do have benefits.
--
Sami Imseih
Amazon Web Services (AWS)