> > I took a look at this today out of interest,
>
> Thanks!
>
> > so, instead of calling IS_INJECTION_POINT_ATTACHED macro which is
>
> I think that not calling IS_INJECTION_POINT_ATTACHED() but only relying on
> "ifdef USE_INJECTION_POINTS" would set the tiny timeout value to the entire 
> test
> suit.

Yes, you are right about that.

To Michael's earlier comments:

> I don't find the design of this patch appealing, and my mind points
> towards two pieces of it:
> 1) The new requirement related to pgstat_schedule_anytime_update()
> that a stats kind needs to call to enable a timeout.  This partially
> doubles with pgstat_report_fixed.  And I suspect that this extra set
> of requirements, introducing a new level of complexity for in-core
> stats kinds as well as extension developers, would be the source of
> more bugs.

I don't see how for fixed stats, adding a
pgstat_schedule_anytime_update() call for such kinds
will be too complex or more error prone. it looks quite straightforward
as this is done before pgstat_report_fixed is set to true. Also,
because processes like wal-sender can loop forever, as mentioned
here [1], a timeout at stats_flush_interval seems like a
straightforward way to deal with this problem.

For variable-length statistics, perhaps we can do things a bit
differently than what is currently proposed. 0005 requires
a relation anytime stat update to call
pgstat_schedule_anytime_update(). This is done this way because
it allows long-running queries to update their stats every
stats_flush_interval using a timeout.

But maybe what we should be doing for variable-numbered stats is
to schedule an anytime update whenever a "transaction goes idle".
This way, unlike the current state of things where we are only
updating relation stats at the end of a transaction, we are now
updating relation stats at the end of SQL execution, and within a
transaction.

Basically, we will continue scheduling anytime
updates every stats_flush_interval for fixed stats (xlog,
bgwriter, etc.), but for variable stats, we only update anytime
stats after SQL execution. This is better than what we have now,
where stats are only updated at the end of the transaction.

The timeout will only be needed to schedule an update
for fixed stats. For variable stats, we can use
GetCurrentStatementStartTimestamp, which is the timestamp of the
last query executed, to throttle pgstat_flush_pending_entries().
We can also flush variable number stats whenever we flush
fixed number stats, in case we enter a long
idle-in-transaction state after a few quick back-to-back queries.

> 2) The timeout requirement itself, relying on a timeout threshold
> controlled by a backend-side configuration.

Perhaps we may not need the stats_flush_inteval and just force
a 10 second timeout for fixed stats.


> With that in mind, wouldn't it be simpler if we introduced an API that
> could be used from client applications instead, in a model similar
> what we do for procsignal.c/h?  One such example is
> LOG_MEMORY_CONTEXT, where we have a SQL function that is able to tell
> to a backend that it needs to do something.  I could see various
> benefits to this approach, because it gives more flexibility with the
> timing of the stats flushes, which may not be a backend-side only
> policy:
> - Use a cron bgworker in the backend, that scans pg_stat_activity, for
> example for long-running transactions based on a threshold.
> - Do the same periodic scan of pg_stat_activity, but from a client
> application.

I find it odd to ask applications/clients to trigger a flush. I am not saying
that we should not offer such an API, especially if someone want to flush
stats more frequently than stats_flush_interval, but there should be
the ability for core to handle this automatically outside of the transaction
boundaries.

One comment about the current test. I think there is a bug that was
missed in the earlier review. For the var_anytime_update, we need to
have an escape before the pipe. Also, we should set
stats_fetch_consistency=none in that test.

-like($result, qr/^entry2|2|/m,
+like($result, qr/^entry2\|2\|/m,

Otherwise, the test is returning a false-positive.

[1] 
https://www.postgresql.org/message-id/erpzwxoptqhuptdrtehqydzjapvroumkhh7lc6poclbhe7jk7l%40l3yfsq5q4pw7

--
Sami Imseih
Amazon Web Services (AWS)


Reply via email to