Hi,

On 1/5/23 1:27 AM, Andres Freund wrote:
Hi,

On 2023-01-03 15:19:18 +0100, Drouvot, Bertrand wrote:
diff --git a/src/backend/access/common/relation.c 
b/src/backend/access/common/relation.c
index 4017e175e3..fca166a063 100644
--- a/src/backend/access/common/relation.c
+++ b/src/backend/access/common/relation.c
@@ -73,7 +73,10 @@ relation_open(Oid relationId, LOCKMODE lockmode)
        if (RelationUsesLocalBuffers(r))
                MyXactFlags |= XACT_FLAGS_ACCESSEDTEMPNAMESPACE;
- pgstat_init_relation(r);
+       if (r->rd_rel->relkind == RELKIND_INDEX)
+               pgstat_init_index(r);
+       else
+               pgstat_init_table(r);
return r;
  }
@@ -123,7 +126,10 @@ try_relation_open(Oid relationId, LOCKMODE lockmode)
        if (RelationUsesLocalBuffers(r))
                MyXactFlags |= XACT_FLAGS_ACCESSEDTEMPNAMESPACE;
- pgstat_init_relation(r);
+       if (r->rd_rel->relkind == RELKIND_INDEX)
+               pgstat_init_index(r);
+       else
+               pgstat_init_table(r);
return r;
  }

Not this patch's fault, but the functions in relation.c have gotten duplicated
to an almost ridiculous degree :(


Thanks for looking at it!
Right, I'll have a look and will try to submit a dedicated patch for this.


diff --git a/src/backend/storage/buffer/bufmgr.c 
b/src/backend/storage/buffer/bufmgr.c
index 3fb38a25cf..98bb230b95 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -776,11 +776,19 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, 
BlockNumber blockNum,
         * Read the buffer, and update pgstat counters to reflect a cache hit or
         * miss.
         */
-       pgstat_count_buffer_read(reln);
+       if (reln->rd_rel->relkind == RELKIND_INDEX)
+               pgstat_count_index_buffer_read(reln);
+       else
+               pgstat_count_table_buffer_read(reln);
        buf = ReadBuffer_common(RelationGetSmgr(reln), 
reln->rd_rel->relpersistence,
                                                        forkNum, blockNum, mode, 
strategy, &hit);
        if (hit)
-               pgstat_count_buffer_hit(reln);
+       {
+               if (reln->rd_rel->relkind == RELKIND_INDEX)
+                       pgstat_count_index_buffer_hit(reln);
+               else
+                       pgstat_count_table_buffer_hit(reln);
+       }
        return buf;
  }

Not nice to have additional branches here :(.

Indeed, but that does look like the price to pay for the moment ;-(


I think going forward we should move buffer stats to a "per-relfilenode" stats
entry (which would allow to track writes too), then thiw branch would be
removed again.



Agree. I think the best approach is to have this patch committed and then 
resume working on [1] (which would most probably introduce
the "per-relfilenode" stats.) Does this approach make sense to you?


+/* -------------------------------------------------------------------------
+ *
+ * pgstat_index.c
+ *       Implementation of index statistics.

This is a fair bit of duplicated code. Perhaps it'd be worth keeping
pgstat_relation with code common to table/index stats?


Good point, will look at what can be done.


+bool
+pgstat_index_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+{
+       static const PgStat_IndexCounts all_zeroes;
+       Oid                     dboid;
+
+       PgStat_IndexStatus *lstats; /* pending stats entry  */
+       PgStatShared_Index *shrelcomstats;

What does "com" stand for in shrelcomstats?


Oops, thanks!

This naming is coming from my first try while working on this subject (that I 
did not share).
The idea I had at that time was to create a PGSTAT_KIND_RELATION_COMMON stat 
type for common stats between tables and indexes
and a dedicated one (PGSTAT_KIND_TABLE) for tables (given that indexes would 
have been fully part of the common one).
But it did not work well (specially as we want "dedicated" field names), so I 
preferred to submit the current proposal.

Will fix this bad naming.


+       PgStat_StatIndEntry *indentry;  /* index entry of shared stats */
+       PgStat_StatDBEntry *dbentry;    /* pending database entry */
+
+       dboid = entry_ref->shared_entry->key.dboid;
+       lstats = (PgStat_IndexStatus *) entry_ref->pending;
+       shrelcomstats = (PgStatShared_Index *) entry_ref->shared_stats;
+
+       /*
+        * Ignore entries that didn't accumulate any actual counts, such as
+        * indexes that were opened by the planner but not used.
+        */
+       if (memcmp(&lstats->i_counts, &all_zeroes,
+                          sizeof(PgStat_IndexCounts)) == 0)
+       {
+               return true;
+       }

I really need to propose pg_memiszero().


Oh yeah, great idea, that would be easier to read.



  Datum
-pg_stat_get_xact_numscans(PG_FUNCTION_ARGS)
+pg_stat_get_tab_xact_numscans(PG_FUNCTION_ARGS)
  {
        Oid                     relid = PG_GETARG_OID(0);
        int64           result;
@@ -1360,17 +1413,32 @@ pg_stat_get_xact_numscans(PG_FUNCTION_ARGS)
        PG_RETURN_INT64(result);
  }
+Datum
+pg_stat_get_ind_xact_numscans(PG_FUNCTION_ARGS)
+{
+       Oid                     relid = PG_GETARG_OID(0);
+       int64           result;
+       PgStat_IndexStatus *indentry;
+
+       if ((indentry = find_indstat_entry(relid)) == NULL)
+               result = 0;
+       else
+               result = (int64) (indentry->i_counts.i_numscans);
+
+       PG_RETURN_INT64(result);
+}

Why didn't all these get converted to the same macro based approach as the
!xact versions?


I think the "benefits" was not that "big" as compared to the number of non xact 
ones.
But, good point, now with the tables/indexes split I think it does: I'll submit 
a dedicated patch for it.


  Datum
  pg_stat_get_xact_tuples_returned(PG_FUNCTION_ARGS)
  {
        Oid                     relid = PG_GETARG_OID(0);
        int64           result;
-       PgStat_TableStatus *tabentry;
+       PgStat_IndexStatus *indentry;
- if ((tabentry = find_tabstat_entry(relid)) == NULL)
+       if ((indentry = find_indstat_entry(relid)) == NULL)
                result = 0;
        else
-               result = (int64) (tabentry->t_counts.t_tuples_returned);
+               result = (int64) (indentry->i_counts.i_tuples_returned);
PG_RETURN_INT64(result);
  }

There's a bunch of changes like this, and I don't understand -
pg_stat_get_xact_tuples_returned() now looks at index stats, even though it
afaics continues to be used in pg_stat_xact_all_tables? Huh?



Looks like a mistake (I probably messed up while doing all those changes that "look 
the same"), thanks for pointing out!
I'll go through each one and double check.

+/* ----------
+ * PgStat_IndexStatus                  Per-index status within a backend
+ *
+ * Many of the event counters are nontransactional, ie, we count events
+ * in committed and aborted transactions alike.  For these, we just count
+ * directly in the PgStat_IndexStatus.
+ * ----------
+ */

Which counters are transactional for indexes? None, no?

Right, will fix.


diff --git a/src/test/recovery/t/029_stats_restart.pl 
b/src/test/recovery/t/029_stats_restart.pl
index 83d6647d32..8b0b597419 100644
--- a/src/test/recovery/t/029_stats_restart.pl
+++ b/src/test/recovery/t/029_stats_restart.pl
@@ -43,8 +43,8 @@ my $sect = "initial";
  is(have_stats('database', $dboid, 0), 't', "$sect: db stats do exist");
  is(have_stats('function', $dboid, $funcoid),
        't', "$sect: function stats do exist");
-is(have_stats('relation', $dboid, $tableoid),
-       't', "$sect: relation stats do exist");
+is(have_stats('table', $dboid, $tableoid),
+       't', "$sect: table stats do exist");

Think this should grow a test for an index too. There's not that much point in
the isolation case, because we don't have transactional stats, but here it
seems worth testing?


+1, will do.


[1]: 
https://www.postgresql.org/message-id/flat/20221220181108.e5fddk3g7cive3v6%40alap3.anarazel.de#4efb4ea3593233bdb400bfb25eb30b81

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


Reply via email to