Thanks for the updates!

> > > I don't think this feature could add a noticeable performance impact, so 
> > > the tests
> > > have been that simple. Do you think we should worry more?
> >
> > One observation is there's no coordination between ANYTIME and
> > TXN_BOUNDARY flushes. While PGSTAT_MIN_INTERVAL
> > prevents a backend from flushing more than once per second, a backend can
> > still perform both an ANYTIME flush and a TXN_BOUNDARY flush within
> > the same 1-second window. Not saying this will be a real problem in
> > the real-world,
> > but we definitely took measures in the current implementation to avoid
> > this scenario.
>
> Right. I think that the PGSTAT_MIN_INTERVAL throttling was put in place to 
> prevent
> flushing too frequently when the backend has a high commit rate. But here, 
> while
> it's true that we don't follow that rule (means a backend could flush more 
> than one
> time per second), that would be a maximum of 2 times (given that ANYTIME is
> flushing every second). So, I'm not sure that this single extra flush is worth
> worrying about. Plus we'd certainly need an extra GetCurrentTimestamp() call, 
> so
> I'm not sure it's worth it.

Yeah, all PGSTAT_MIN_INTERVAL does is throttle pgstat_flush_pending_entries.
Even in the current state, it does not limit how many kinds are flushed, etc.
I consider the ANYTIME flushes the same as just adding another stats kind.
So, I am not really worried about either.

I have some more comments:

-- v2-0001

#1.

+/* When to call pgstat_report_anytime_stat() again */
+#define PGSTAT_ANYTIME_FLUSH_INTERVAL       1000
+

We should just use PGSTAT_MIN_INTERVAL.

#2.

instead of ".flush_behavior", maybe ".flush_mode"? "mode" in the name is better
for configuration fields.

#3.

+/*
+ * Flush behavior for statistics kinds.
+ */
+typedef enum PgStat_FlushBehavior
+{
+       FLUSH_ANYTIME,                          /* All fields can be
flushed anytime,
+                                                                *
including within transactions */
+       FLUSH_AT_TXN_BOUNDARY,          /* All fields can only be flushed at
+                                                                *
transaction boundary */
+} PgStat_FlushBehavior;

FLUSH_AT_TXN_BOUNDARY should be the first value in PgStat_FlushBehavior.
Otherwise kinds ( built-in or custom ) that do not specify a flush_behavior
will default to FLUSH_ANYTIME. I don't think this is what we want.
FLUSH_AT_TXN_BOUNDARY should be the default.

#4. Can we add a test here? Maybe generate some wal inside a long
running transaction and
make sure the stats are updated after > 1 second

-- v2-0002

No comments for this one. With ANYTIME, indeed those flushes are not needed.

-- v2-0003

#1. Should we maybe make this a bit longer? maybe 2 or 3 seconds?
May make the tests slightly longer, but maybe better for test stability.

```
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
```

#2.
+       /*
+        * Check if there are any non-transactional stats to flush. Avoid
+        * unnecessarily locking the entry if nothing accumulated.
+        */
+       if (lstats->counts.numscans > 0 ||
+               lstats->counts.tuples_returned > 0 ||
+               lstats->counts.tuples_fetched > 0 ||
+               lstats->counts.blocks_fetched > 0 ||
+               lstats->counts.blocks_hit > 0)
+               has_nontxn_stats = true;
+
+       if (!has_nontxn_stats)
+               return true;

Can we just do this without a has_nontxn_stats?
This is also the same patter as a regular flush, although
in the case `pg_memory_is_all_zeros` is used.

```
if (lstats->counts.numscans == 0 &&
    lstats->counts.tuples_returned == 0 &&
    lstats->counts.tuples_fetched == 0 &&
    lstats->counts.blocks_fetched == 0 &&
    lstats->counts.blocks_hit == 0)
    return true;
```

#3.
+    are updated while the transactions are in progress. This means
that we can see
+    those statistics being updated without having to wait until the transaction
+    finishes.
+   </para>

The "This means ...... " line used several times does not add value, IMO.
"are updated while the transactions are in progress." is sufficient.

#4.
+  <note>
+   <para>
+    All the statistics are updated while the transactions are in
progress, except
+    for <structfield>xact_commit</structfield>,
<structfield>xact_rollback</structfield>,
+    <structfield>tup_inserted</structfield>,
<structfield>tup_updated</structfield> and
+    <structfield>tup_deleted</structfield> that are updated only when
the transactions
+    finish.
+   </para>
+  </note>

Only these 5 fields from pgstat_relation_flush_anytime_cb, so only the below are
"All the statistics are updated while the transactions are in progress", right?

numscans
tuples_returned
tuples_fetched
blocks_fetched
blocks_hit


--
Sami Imseih
Amazon Web Services (AWS)


Reply via email to