Re: Split index and table statistics into different types of stats
Hi, On Tue, Nov 14, 2023 at 09:04:03AM +0100, Drouvot, Bertrand wrote: > On 11/13/23 9:44 PM, Andres Freund wrote: > > Hi, > > > > It's not nice from a layering POV that we need this level of awareness in > > bufmgr.c. I wonder if this is an argument for first splitting out stats > > like > > blocks_hit, blocks_fetched into something like "relfilenode stats" - they're > > agnostic of the relkind. > > Thanks for looking at it! Yeah I think that would make a lot of sense > to track some stats per relfilenode. > > > There aren't that many such stats right now, > > admittedly, but I think we'll want to also track dirtied, written blocks on > > a > > per relation basis once we can (i.e. we key the relevant stats by > > relfilenode > > instead of oid, so we can associate stats when writing out buffers). > > > > > > Agree. Then, I think that would make sense to start this effort before the > split index/table one. I can work on a per relfilenode stat patch first. > > Does this patch ordering make sense to you? > > 1) Introduce per relfilenode stats > 2) Split index and table stats Just a quick update on this: I had a chat with Andres at pgconf.eu and we agreed on the above ordering so that: 1) I started working on relfilenode stats (I hope to be able to provide a POC patch soon). 2) The CF entry [1] status related to this thread has been changed to "Waiting on Author". [1]: https://commitfest.postgresql.org/47/4792/ Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Split index and table statistics into different types of stats
Hi, On 11/13/23 9:44 PM, Andres Freund wrote: Hi, On 2023-11-13 09:26:56 +0100, Drouvot, Bertrand wrote: --- a/src/backend/storage/buffer/bufmgr.c +++ b/src/backend/storage/buffer/bufmgr.c @@ -799,11 +799,19 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum, * Read the buffer, and update pgstat counters to reflect a cache hit or * miss. */ - pgstat_count_buffer_read(reln); + if (reln->rd_rel->relkind == RELKIND_INDEX) + pgstat_count_index_buffer_read(reln); + else + pgstat_count_table_buffer_read(reln); It's not nice from a layering POV that we need this level of awareness in bufmgr.c. I wonder if this is an argument for first splitting out stats like blocks_hit, blocks_fetched into something like "relfilenode stats" - they're agnostic of the relkind. Thanks for looking at it! Yeah I think that would make a lot of sense to track some stats per relfilenode. There aren't that many such stats right now, admittedly, but I think we'll want to also track dirtied, written blocks on a per relation basis once we can (i.e. we key the relevant stats by relfilenode instead of oid, so we can associate stats when writing out buffers). Agree. Then, I think that would make sense to start this effort before the split index/table one. I can work on a per relfilenode stat patch first. Does this patch ordering make sense to you? 1) Introduce per relfilenode stats 2) Split index and table stats +/* + * Initialize a relcache entry to count access statistics. Called whenever an + * index is opened. + * + * We assume that a relcache entry's pgstatind_info field is zeroed by relcache.c + * when the relcache entry is made; thereafter it is long-lived data. + * + * This does not create a reference to a stats entry in shared memory, nor + * allocate memory for the pending stats. That happens in + * pgstat_assoc_index(). + */ +void +pgstat_init_index(Relation rel) +{ + /* +* We only count stats for indexes +*/ + Assert(rel->rd_rel->relkind == RELKIND_INDEX); + + if (!pgstat_track_counts) + { + if (rel->pgstatind_info != NULL) + pgstat_unlink_index(rel); + + /* We're not counting at all */ + rel->pgstat_enabled = false; + rel->pgstatind_info = NULL; + return; + } + + rel->pgstat_enabled = true; +} + +/* + * Prepare for statistics for this index to be collected. + * + * This ensures we have a reference to the stats entry before stats can be + * generated. That is important because an index drop in another + * connection could otherwise lead to the stats entry being dropped, which then + * later would get recreated when flushing stats. + * + * This is separate from pgstat_init_index() as it is not uncommon for + * relcache entries to be opened without ever getting stats reported. + */ +void +pgstat_assoc_index(Relation rel) +{ + Assert(rel->pgstat_enabled); + Assert(rel->pgstatind_info == NULL); + + /* Else find or make the PgStat_IndexStatus entry, and update link */ + rel->pgstatind_info = pgstat_prep_index_pending(RelationGetRelid(rel), + rel->rd_rel->relisshared); + + /* don't allow link a stats to multiple relcache entries */ + Assert(rel->pgstatind_info->relation == NULL); + + /* mark this relation as the owner */ + rel->pgstatind_info->relation = rel; +} + +/* + * Break the mutual link between a relcache entry and pending index stats entry. + * This must be called whenever one end of the link is removed. + */ +void +pgstat_unlink_index(Relation rel) +{ + + if (rel->pgstatind_info == NULL) + return; + + /* link sanity check for the index stats */ + if (rel->pgstatind_info) + { + Assert(rel->pgstatind_info->relation == rel); + rel->pgstatind_info->relation = NULL; + rel->pgstatind_info = NULL; + } +} ... This is a fair bit of duplicated code - perhaps we could have shared helpers? Yeah, I had it in mind and that was part of the "Will now work on addressing the up-thread remaining comments" remark I made up-thread. +/* -- + * PgStat_IndexStatus Per-index status within a backend + * + * Many of the event counters are nontransactional, ie, we count events + * in committed and aborted transactions alike. For these, we just count + * directly in the PgStat_IndexStatus. + * -- + */ +typedef struct PgStat_IndexStatus +{ + Oid r_id; /* relation's OID */ + boolr_shared; /* is it a shared catalog? */ + struct PgStat_IndexXactStatus *trans; /* lowest subxact's counts */ + PgStat_IndexCounts counts; /* event
Re: Split index and table statistics into different types of stats
Hi, On 2023-11-13 09:26:56 +0100, Drouvot, Bertrand wrote: > --- a/src/backend/storage/buffer/bufmgr.c > +++ b/src/backend/storage/buffer/bufmgr.c > @@ -799,11 +799,19 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, > BlockNumber blockNum, >* Read the buffer, and update pgstat counters to reflect a cache hit or >* miss. >*/ > - pgstat_count_buffer_read(reln); > + if (reln->rd_rel->relkind == RELKIND_INDEX) > + pgstat_count_index_buffer_read(reln); > + else > + pgstat_count_table_buffer_read(reln); It's not nice from a layering POV that we need this level of awareness in bufmgr.c. I wonder if this is an argument for first splitting out stats like blocks_hit, blocks_fetched into something like "relfilenode stats" - they're agnostic of the relkind. There aren't that many such stats right now, admittedly, but I think we'll want to also track dirtied, written blocks on a per relation basis once we can (i.e. we key the relevant stats by relfilenode instead of oid, so we can associate stats when writing out buffers). > +/* > + * Initialize a relcache entry to count access statistics. Called whenever > an > + * index is opened. > + * > + * We assume that a relcache entry's pgstatind_info field is zeroed by > relcache.c > + * when the relcache entry is made; thereafter it is long-lived data. > + * > + * This does not create a reference to a stats entry in shared memory, nor > + * allocate memory for the pending stats. That happens in > + * pgstat_assoc_index(). > + */ > +void > +pgstat_init_index(Relation rel) > +{ > + /* > + * We only count stats for indexes > + */ > + Assert(rel->rd_rel->relkind == RELKIND_INDEX); > + > + if (!pgstat_track_counts) > + { > + if (rel->pgstatind_info != NULL) > + pgstat_unlink_index(rel); > + > + /* We're not counting at all */ > + rel->pgstat_enabled = false; > + rel->pgstatind_info = NULL; > + return; > + } > + > + rel->pgstat_enabled = true; > +} > + > +/* > + * Prepare for statistics for this index to be collected. > + * > + * This ensures we have a reference to the stats entry before stats can be > + * generated. That is important because an index drop in another > + * connection could otherwise lead to the stats entry being dropped, which > then > + * later would get recreated when flushing stats. > + * > + * This is separate from pgstat_init_index() as it is not uncommon for > + * relcache entries to be opened without ever getting stats reported. > + */ > +void > +pgstat_assoc_index(Relation rel) > +{ > + Assert(rel->pgstat_enabled); > + Assert(rel->pgstatind_info == NULL); > + > + /* Else find or make the PgStat_IndexStatus entry, and update link */ > + rel->pgstatind_info = pgstat_prep_index_pending(RelationGetRelid(rel), > + > rel->rd_rel->relisshared); > + > + /* don't allow link a stats to multiple relcache entries */ > + Assert(rel->pgstatind_info->relation == NULL); > + > + /* mark this relation as the owner */ > + rel->pgstatind_info->relation = rel; > +} > + > +/* > + * Break the mutual link between a relcache entry and pending index stats > entry. > + * This must be called whenever one end of the link is removed. > + */ > +void > +pgstat_unlink_index(Relation rel) > +{ > + > + if (rel->pgstatind_info == NULL) > + return; > + > + /* link sanity check for the index stats */ > + if (rel->pgstatind_info) > + { > + Assert(rel->pgstatind_info->relation == rel); > + rel->pgstatind_info->relation = NULL; > + rel->pgstatind_info = NULL; > + } > +} > ... This is a fair bit of duplicated code - perhaps we could have shared helpers? > +/* -- > + * PgStat_IndexStatusPer-index status within a > backend > + * > + * Many of the event counters are nontransactional, ie, we count events > + * in committed and aborted transactions alike. For these, we just count > + * directly in the PgStat_IndexStatus. > + * -- > + */ > +typedef struct PgStat_IndexStatus > +{ > + Oid r_id; /* relation's OID */ > + boolr_shared; /* is it a shared catalog? */ > + struct PgStat_IndexXactStatus *trans; /* lowest subxact's counts */ > + PgStat_IndexCounts counts; /* event counts to be sent */ > + Relationrelation; /* rel that is using this entry > */ > +} PgStat_IndexStatus; > + > /* -- > * PgStat_TableXactStatusPer-table, per-subtransaction status > * -- > @@ -227,6 +264,29 @@ typedef struct PgStat_TableXactStatus > } PgStat_TableXactStatus; > > > +/* -- > + * PgStat_IndexXactStatusPer-index,
Re: Split index and table statistics into different types of stats
Hi, On 7/10/23 11:14 AM, Daniel Gustafsson wrote: On 4 Apr 2023, at 12:04, Drouvot, Bertrand wrote: On 4/3/23 11:47 PM, Gregory Stark (as CFM) wrote: On Thu, 16 Mar 2023 at 05:25, Drouvot, Bertrand wrote: My plan was to get [1] done before resuming working on the "Split index and table statistics into different types of stats" one. [1]: https://commitfest.postgresql.org/42/4106/ Generate pg_stat_get_xact*() functions with Macros ([1]) was committed March 27. There's only a few days left in this CF. Would you like to leave this here? Should it be marked Needs Review or Ready for Commit? Or should we move it to the next CF now? I moved it to the next commitfest and marked the target version as v17. This patch no longer applies (with tests failing when it did), and the thread has stalled. I'm marking this returned with feedback for now, please feel free to resubmit to a future CF with a new version of the patch. Thanks for the update. I'll resume working on it and re-submit once done. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Split index and table statistics into different types of stats
> On 4 Apr 2023, at 12:04, Drouvot, Bertrand > wrote: > On 4/3/23 11:47 PM, Gregory Stark (as CFM) wrote: >> On Thu, 16 Mar 2023 at 05:25, Drouvot, Bertrand >> wrote: >>> >>> My plan was to get [1] done before resuming working on the "Split index and >>> table statistics into different types of stats" one. >>> [1]: https://commitfest.postgresql.org/42/4106/ >> Generate pg_stat_get_xact*() functions with Macros ([1]) was committed March >> 27. >> There's only a few days left in this CF. Would you like to leave this >> here? Should it be marked Needs Review or Ready for Commit? Or should >> we move it to the next CF now? > > I moved it to the next commitfest and marked the target version as v17. This patch no longer applies (with tests failing when it did), and the thread has stalled. I'm marking this returned with feedback for now, please feel free to resubmit to a future CF with a new version of the patch. -- Daniel Gustafsson
Re: Split index and table statistics into different types of stats
On Tue, Apr 04, 2023 at 12:04:35PM +0200, Drouvot, Bertrand wrote: > I moved it to the next commitfest and marked the target version as > v17. Thanks for moving it. I think that we should be able to do a bit more work for the switch to macros in pgstatfuncs.c, but this is going to require more review than the feature freeze date allow, I am afraid. -- Michael signature.asc Description: PGP signature
Re: Split index and table statistics into different types of stats
Hi, On 4/3/23 11:47 PM, Gregory Stark (as CFM) wrote: On Thu, 16 Mar 2023 at 05:25, Drouvot, Bertrand wrote: My plan was to get [1] done before resuming working on the "Split index and table statistics into different types of stats" one. [1]: https://commitfest.postgresql.org/42/4106/ Generate pg_stat_get_xact*() functions with Macros ([1]) was committed March 27. There's only a few days left in this CF. Would you like to leave this here? Should it be marked Needs Review or Ready for Commit? Or should we move it to the next CF now? I moved it to the next commitfest and marked the target version as v17. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Split index and table statistics into different types of stats
On Thu, 16 Mar 2023 at 05:25, Drouvot, Bertrand wrote: > > My plan was to get [1] done before resuming working on the "Split index and > table statistics into different types of stats" one. > [1]: https://commitfest.postgresql.org/42/4106/ Generate pg_stat_get_xact*() functions with Macros ([1]) was committed March 27. There's only a few days left in this CF. Would you like to leave this here? Should it be marked Needs Review or Ready for Commit? Or should we move it to the next CF now? -- Gregory Stark As Commitfest Manager
Re: Split index and table statistics into different types of stats
On Thu, Mar 16, 2023 at 10:24:32AM +0100, Drouvot, Bertrand wrote: > My plan was to get [1] done before resuming working on the "Split > index and table statistics into different types of stats" one. Okay, I was unsure what should be the order here. Let's see about [1] first, then. -- Michael signature.asc Description: PGP signature
Re: Split index and table statistics into different types of stats
On Sat, Jan 21, 2023 at 06:42:51AM +0100, Drouvot, Bertrand wrote: > Please find attached a rebased version (previous comments > up-thread still need to be addressed though). This patch has a lot of conflicts. Could you send a rebased version? -- Michael signature.asc Description: PGP signature
Re: Split index and table statistics into different types of stats
On Tue, 3 Jan 2023 at 19:49, Drouvot, Bertrand wrote: > > Hi, > > On 12/10/22 10:54 AM, Drouvot, Bertrand wrote: > > Hi, > > > > On 12/7/22 11:11 AM, Drouvot, Bertrand wrote: > >> Hi, > >> > >>> Hi, > >>> > >>> As [1] mentioned above has been committed (83a1a1b566), please find > >>> attached V5 related to this thread making use of the new macros (namely > >>> PG_STAT_GET_RELENTRY_INT64 and PG_STAT_GET_RELENTRY_TIMESTAMPTZ). > >>> > >>> I switched from using "CppConcat" to using "##", as it looks to me it's > >>> easier to read now that we are adding another concatenation to the game > >>> (due to the table/index split). > >>> > >>> The (Tab,tab) or (Ind,ind) passed as arguments to the macros look "weird" > >>> (I don't have a better idea yet): purpose is to follow the naming > >>> convention for PgStat_StatTabEntry/PgStat_StatIndEntry and > >>> pgstat_fetch_stat_tabentry/pgstat_fetch_stat_indentry, thoughts? > >>> > >>> Looking forward to your feedback, > >>> > >> > >> Attaching V6 (mandatory rebase due to 8018ffbf58). > >> > >> While at it, I got rid of the weirdness mentioned above by creating 2 sets > >> of macros (one for the tables and one for the indexes). > >> > >> Looking forward to your feedback, > >> > >> Regards, > >> > > > > Attaching V7, mandatory rebase due to 66dcb09246. > > > > Attaching V8, mandatory rebase due to c8e1ba736b. The patch does not apply on top of HEAD as in [1], please post a rebased patch: === Applying patches on top of PostgreSQL commit ID d540a02a724b9643205abce8c5644a0f0908f6e3 === === applying patch ./v8-0001-split_tables_indexes_stats.patch patching file src/backend/utils/activity/pgstat_table.c (renamed from src/backend/utils/activity/pgstat_relation.c) Hunk #25 FAILED at 759. 1 out of 29 hunks FAILED -- saving rejects to file src/backend/utils/activity/pgstat_table.c.rej [1] - http://cfbot.cputube.org/patch_41_3984.log Regards, Vignesh
Re: Split index and table statistics into different types of stats
Hi, On 2023-01-09 17:08:42 +0530, Nitin Jadhav wrote: > +1 to keep common functions/code between table and index stats. Only > the data structure should be different as the goal is to shrink the > current memory usage. I don't think the goal is solely to shrink memory usage - it's also to make it possible to add more stats that are specific to just indexes or just tables. Of course that's related to memory usage... Greetings, Andres Freund
Re: Split index and table statistics into different types of stats
>> +/* - >> + * >> + * pgstat_index.c >> + *Implementation of index statistics. > > This is a fair bit of duplicated code. Perhaps it'd be worth keeping > pgstat_relation with code common to table/index stats? +1 to keep common functions/code between table and index stats. Only the data structure should be different as the goal is to shrink the current memory usage. Thanks & Regards, Nitin Jadhav On Thu, Jan 5, 2023 at 3:35 PM Drouvot, Bertrand wrote: > > Hi, > > On 1/5/23 1:27 AM, Andres Freund wrote: > > Hi, > > > > On 2023-01-03 15:19:18 +0100, Drouvot, Bertrand wrote: > >> diff --git a/src/backend/access/common/relation.c > >> b/src/backend/access/common/relation.c > >> index 4017e175e3..fca166a063 100644 > >> --- a/src/backend/access/common/relation.c > >> +++ b/src/backend/access/common/relation.c > >> @@ -73,7 +73,10 @@ relation_open(Oid relationId, LOCKMODE lockmode) > >> if (RelationUsesLocalBuffers(r)) > >> MyXactFlags |= XACT_FLAGS_ACCESSEDTEMPNAMESPACE; > >> > >> -pgstat_init_relation(r); > >> +if (r->rd_rel->relkind == RELKIND_INDEX) > >> +pgstat_init_index(r); > >> +else > >> +pgstat_init_table(r); > >> > >> return r; > >> } > >> @@ -123,7 +126,10 @@ try_relation_open(Oid relationId, LOCKMODE lockmode) > >> if (RelationUsesLocalBuffers(r)) > >> MyXactFlags |= XACT_FLAGS_ACCESSEDTEMPNAMESPACE; > >> > >> -pgstat_init_relation(r); > >> +if (r->rd_rel->relkind == RELKIND_INDEX) > >> +pgstat_init_index(r); > >> +else > >> +pgstat_init_table(r); > >> > >> return r; > >> } > > > > Not this patch's fault, but the functions in relation.c have gotten > > duplicated > > to an almost ridiculous degree :( > > > > Thanks for looking at it! > Right, I'll have a look and will try to submit a dedicated patch for this. > > > > >> diff --git a/src/backend/storage/buffer/bufmgr.c > >> b/src/backend/storage/buffer/bufmgr.c > >> index 3fb38a25cf..98bb230b95 100644 > >> --- a/src/backend/storage/buffer/bufmgr.c > >> +++ b/src/backend/storage/buffer/bufmgr.c > >> @@ -776,11 +776,19 @@ ReadBufferExtended(Relation reln, ForkNumber > >> forkNum, BlockNumber blockNum, > >> * Read the buffer, and update pgstat counters to reflect a cache hit > >> or > >> * miss. > >> */ > >> -pgstat_count_buffer_read(reln); > >> +if (reln->rd_rel->relkind == RELKIND_INDEX) > >> +pgstat_count_index_buffer_read(reln); > >> +else > >> +pgstat_count_table_buffer_read(reln); > >> buf = ReadBuffer_common(RelationGetSmgr(reln), > >> reln->rd_rel->relpersistence, > >> forkNum, blockNum, > >> mode, strategy, ); > >> if (hit) > >> -pgstat_count_buffer_hit(reln); > >> +{ > >> +if (reln->rd_rel->relkind == RELKIND_INDEX) > >> +pgstat_count_index_buffer_hit(reln); > >> +else > >> +pgstat_count_table_buffer_hit(reln); > >> +} > >> return buf; > >> } > > > > Not nice to have additional branches here :(. > > Indeed, but that does look like the price to pay for the moment ;-( > > > > > I think going forward we should move buffer stats to a "per-relfilenode" > > stats > > entry (which would allow to track writes too), then thiw branch would be > > removed again. > > > > > > Agree. I think the best approach is to have this patch committed and then > resume working on [1] (which would most probably introduce > the "per-relfilenode" stats.) Does this approach make sense to you? > > > >> +/* > >> - > >> + * > >> + * pgstat_index.c > >> + *Implementation of index statistics. > > > > This is a fair bit of duplicated code. Perhaps it'd be worth keeping > > pgstat_relation with code common to table/index stats? > > > > Good point, will look at what can be done. > > > > >> +bool > >> +pgstat_index_flush_cb(PgStat_EntryRef *entry_ref, bool nowait) > >> +{ > >> +static const PgStat_IndexCounts all_zeroes; > >> +Oid dboid; > >> + > >> +PgStat_IndexStatus *lstats; /* pending stats entry */ > >> +PgStatShared_Index *shrelcomstats; > > > > What does "com" stand for in shrelcomstats? > > > > Oops, thanks! > > This naming is coming from my first try while working on this subject (that I > did not share). > The idea I had at that time was to create a PGSTAT_KIND_RELATION_COMMON stat > type for common stats between tables and indexes > and a dedicated one (PGSTAT_KIND_TABLE) for tables (given that indexes would > have been fully part of the common one). > But it did not work well (specially as we want "dedicated" field names), so I > preferred to submit the current proposal. > > Will fix this bad naming. > > > > >> +PgStat_StatIndEntry
Re: Split index and table statistics into different types of stats
Hi, On 1/5/23 1:27 AM, Andres Freund wrote: Hi, On 2023-01-03 15:19:18 +0100, Drouvot, Bertrand wrote: diff --git a/src/backend/access/common/relation.c b/src/backend/access/common/relation.c index 4017e175e3..fca166a063 100644 --- a/src/backend/access/common/relation.c +++ b/src/backend/access/common/relation.c @@ -73,7 +73,10 @@ relation_open(Oid relationId, LOCKMODE lockmode) if (RelationUsesLocalBuffers(r)) MyXactFlags |= XACT_FLAGS_ACCESSEDTEMPNAMESPACE; - pgstat_init_relation(r); + if (r->rd_rel->relkind == RELKIND_INDEX) + pgstat_init_index(r); + else + pgstat_init_table(r); return r; } @@ -123,7 +126,10 @@ try_relation_open(Oid relationId, LOCKMODE lockmode) if (RelationUsesLocalBuffers(r)) MyXactFlags |= XACT_FLAGS_ACCESSEDTEMPNAMESPACE; - pgstat_init_relation(r); + if (r->rd_rel->relkind == RELKIND_INDEX) + pgstat_init_index(r); + else + pgstat_init_table(r); return r; } Not this patch's fault, but the functions in relation.c have gotten duplicated to an almost ridiculous degree :( Thanks for looking at it! Right, I'll have a look and will try to submit a dedicated patch for this. diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c index 3fb38a25cf..98bb230b95 100644 --- a/src/backend/storage/buffer/bufmgr.c +++ b/src/backend/storage/buffer/bufmgr.c @@ -776,11 +776,19 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum, * Read the buffer, and update pgstat counters to reflect a cache hit or * miss. */ - pgstat_count_buffer_read(reln); + if (reln->rd_rel->relkind == RELKIND_INDEX) + pgstat_count_index_buffer_read(reln); + else + pgstat_count_table_buffer_read(reln); buf = ReadBuffer_common(RelationGetSmgr(reln), reln->rd_rel->relpersistence, forkNum, blockNum, mode, strategy, ); if (hit) - pgstat_count_buffer_hit(reln); + { + if (reln->rd_rel->relkind == RELKIND_INDEX) + pgstat_count_index_buffer_hit(reln); + else + pgstat_count_table_buffer_hit(reln); + } return buf; } Not nice to have additional branches here :(. Indeed, but that does look like the price to pay for the moment ;-( I think going forward we should move buffer stats to a "per-relfilenode" stats entry (which would allow to track writes too), then thiw branch would be removed again. Agree. I think the best approach is to have this patch committed and then resume working on [1] (which would most probably introduce the "per-relfilenode" stats.) Does this approach make sense to you? +/* - + * + * pgstat_index.c + * Implementation of index statistics. This is a fair bit of duplicated code. Perhaps it'd be worth keeping pgstat_relation with code common to table/index stats? Good point, will look at what can be done. +bool +pgstat_index_flush_cb(PgStat_EntryRef *entry_ref, bool nowait) +{ + static const PgStat_IndexCounts all_zeroes; + Oid dboid; + + PgStat_IndexStatus *lstats; /* pending stats entry */ + PgStatShared_Index *shrelcomstats; What does "com" stand for in shrelcomstats? Oops, thanks! This naming is coming from my first try while working on this subject (that I did not share). The idea I had at that time was to create a PGSTAT_KIND_RELATION_COMMON stat type for common stats between tables and indexes and a dedicated one (PGSTAT_KIND_TABLE) for tables (given that indexes would have been fully part of the common one). But it did not work well (specially as we want "dedicated" field names), so I preferred to submit the current proposal. Will fix this bad naming. + PgStat_StatIndEntry *indentry; /* index entry of shared stats */ + PgStat_StatDBEntry *dbentry;/* pending database entry */ + + dboid = entry_ref->shared_entry->key.dboid; + lstats = (PgStat_IndexStatus *) entry_ref->pending; + shrelcomstats = (PgStatShared_Index *) entry_ref->shared_stats; + + /* +* Ignore entries that didn't accumulate any actual counts, such as +* indexes that were opened by the planner but not used. +*/ + if (memcmp(>i_counts, _zeroes, + sizeof(PgStat_IndexCounts)) == 0) + { + return true; + } I really need to propose pg_memiszero(). Oh yeah, great idea, that would be easier to read. Datum -pg_stat_get_xact_numscans(PG_FUNCTION_ARGS) +pg_stat_get_tab_xact_numscans(PG_FUNCTION_ARGS) { Oid relid = PG_GETARG_OID(0); int64 result; @@
Re: Split index and table statistics into different types of stats
Hi, On 2023-01-03 15:19:18 +0100, Drouvot, Bertrand wrote: > diff --git a/src/backend/access/common/relation.c > b/src/backend/access/common/relation.c > index 4017e175e3..fca166a063 100644 > --- a/src/backend/access/common/relation.c > +++ b/src/backend/access/common/relation.c > @@ -73,7 +73,10 @@ relation_open(Oid relationId, LOCKMODE lockmode) > if (RelationUsesLocalBuffers(r)) > MyXactFlags |= XACT_FLAGS_ACCESSEDTEMPNAMESPACE; > > - pgstat_init_relation(r); > + if (r->rd_rel->relkind == RELKIND_INDEX) > + pgstat_init_index(r); > + else > + pgstat_init_table(r); > > return r; > } > @@ -123,7 +126,10 @@ try_relation_open(Oid relationId, LOCKMODE lockmode) > if (RelationUsesLocalBuffers(r)) > MyXactFlags |= XACT_FLAGS_ACCESSEDTEMPNAMESPACE; > > - pgstat_init_relation(r); > + if (r->rd_rel->relkind == RELKIND_INDEX) > + pgstat_init_index(r); > + else > + pgstat_init_table(r); > > return r; > } Not this patch's fault, but the functions in relation.c have gotten duplicated to an almost ridiculous degree :( > diff --git a/src/backend/storage/buffer/bufmgr.c > b/src/backend/storage/buffer/bufmgr.c > index 3fb38a25cf..98bb230b95 100644 > --- a/src/backend/storage/buffer/bufmgr.c > +++ b/src/backend/storage/buffer/bufmgr.c > @@ -776,11 +776,19 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, > BlockNumber blockNum, >* Read the buffer, and update pgstat counters to reflect a cache hit or >* miss. >*/ > - pgstat_count_buffer_read(reln); > + if (reln->rd_rel->relkind == RELKIND_INDEX) > + pgstat_count_index_buffer_read(reln); > + else > + pgstat_count_table_buffer_read(reln); > buf = ReadBuffer_common(RelationGetSmgr(reln), > reln->rd_rel->relpersistence, > forkNum, blockNum, > mode, strategy, ); > if (hit) > - pgstat_count_buffer_hit(reln); > + { > + if (reln->rd_rel->relkind == RELKIND_INDEX) > + pgstat_count_index_buffer_hit(reln); > + else > + pgstat_count_table_buffer_hit(reln); > + } > return buf; > } Not nice to have additional branches here :(. I think going forward we should move buffer stats to a "per-relfilenode" stats entry (which would allow to track writes too), then thiw branch would be removed again. > +/* - > + * > + * pgstat_index.c > + * Implementation of index statistics. This is a fair bit of duplicated code. Perhaps it'd be worth keeping pgstat_relation with code common to table/index stats? > +bool > +pgstat_index_flush_cb(PgStat_EntryRef *entry_ref, bool nowait) > +{ > + static const PgStat_IndexCounts all_zeroes; > + Oid dboid; > + > + PgStat_IndexStatus *lstats; /* pending stats entry */ > + PgStatShared_Index *shrelcomstats; What does "com" stand for in shrelcomstats? > + PgStat_StatIndEntry *indentry; /* index entry of shared stats */ > + PgStat_StatDBEntry *dbentry;/* pending database entry */ > + > + dboid = entry_ref->shared_entry->key.dboid; > + lstats = (PgStat_IndexStatus *) entry_ref->pending; > + shrelcomstats = (PgStatShared_Index *) entry_ref->shared_stats; > + > + /* > + * Ignore entries that didn't accumulate any actual counts, such as > + * indexes that were opened by the planner but not used. > + */ > + if (memcmp(>i_counts, _zeroes, > +sizeof(PgStat_IndexCounts)) == 0) > + { > + return true; > + } I really need to propose pg_memiszero(). > Datum > -pg_stat_get_xact_numscans(PG_FUNCTION_ARGS) > +pg_stat_get_tab_xact_numscans(PG_FUNCTION_ARGS) > { > Oid relid = PG_GETARG_OID(0); > int64 result; > @@ -1360,17 +1413,32 @@ pg_stat_get_xact_numscans(PG_FUNCTION_ARGS) > PG_RETURN_INT64(result); > } > > +Datum > +pg_stat_get_ind_xact_numscans(PG_FUNCTION_ARGS) > +{ > + Oid relid = PG_GETARG_OID(0); > + int64 result; > + PgStat_IndexStatus *indentry; > + > + if ((indentry = find_indstat_entry(relid)) == NULL) > + result = 0; > + else > + result = (int64) (indentry->i_counts.i_numscans); > + > + PG_RETURN_INT64(result); > +} Why didn't all these get converted to the same macro based approach as the !xact versions? > Datum > pg_stat_get_xact_tuples_returned(PG_FUNCTION_ARGS) > { > Oid relid = PG_GETARG_OID(0); > int64 result; > - PgStat_TableStatus *tabentry; > + PgStat_IndexStatus *indentry; > > - if ((tabentry = find_tabstat_entry(relid)) == NULL) > + if ((indentry = find_indstat_entry(relid)) == NULL) >
Re: Split index and table statistics into different types of stats
Hi, On 11/22/22 7:19 AM, Bharath Rupireddy wrote: On Mon, Nov 21, 2022 at 7:03 PM Drouvot, Bertrand wrote: On 11/21/22 12:19 AM, Andres Freund wrote: That's better, but still seems like quite a bit of repetition, given the number of accessors. I think I like my idea of a macro defining the whole function a bit better. Got it, what about creating another preparatory commit to first introduce something like: " #define PGSTAT_DEFINE_REL_FIELD_ACCESSOR(function_name_prefix, stat_name) \ Datum \ function_name_prefix##_##stat_name(PG_FUNCTION_ARGS) \ { \ Oid relid = PG_GETARG_OID(0); \ int64 result; \ PgStat_StatTabEntry *tabentry; \ if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL) \ result = 0; \ else \ result = (int64) (tabentry->stat_name); \ PG_RETURN_INT64(result); \ } \ PGSTAT_DEFINE_REL_FIELD_ACCESSOR(pg_stat_get, numscans); PGSTAT_DEFINE_REL_FIELD_ACCESSOR(pg_stat_get, tuples_returned); . . . " If that makes sense to you, I'll submit this preparatory patch. I think the macros stitching the function declarations and definitions is a great idea to avoid code duplicacy. We seem to be using that approach already - PG_FUNCTION_INFO_V1, SH_DECLARE, SH_DEFINE and its friends, STEMMER_MODULE and so on. +1 for first applying this principle for existing functions. Looking forward to the patch. Thanks! Patch proposal submitted in [1]. I'll resume working on the current thread once [1] is committed. [1]: https://www.postgresql.org/message-id/d547a9bc-76c2-f875-df74-3ad6fd9d6236%40gmail.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Split index and table statistics into different types of stats
On Mon, Nov 21, 2022 at 7:03 PM Drouvot, Bertrand wrote: > > On 11/21/22 12:19 AM, Andres Freund wrote: > > > > That's better, but still seems like quite a bit of repetition, given the > > number of accessors. I think I like my idea of a macro defining the whole > > function a bit better. > > > > Got it, what about creating another preparatory commit to first introduce > something like: > > " > #define PGSTAT_DEFINE_REL_FIELD_ACCESSOR(function_name_prefix, stat_name) \ > Datum \ > function_name_prefix##_##stat_name(PG_FUNCTION_ARGS) \ > { \ > Oid relid = PG_GETARG_OID(0); \ > int64 result; \ > PgStat_StatTabEntry *tabentry; \ > if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL) \ > result = 0; \ > else \ > result = (int64) (tabentry->stat_name); \ > PG_RETURN_INT64(result); \ > } \ > > PGSTAT_DEFINE_REL_FIELD_ACCESSOR(pg_stat_get, numscans); > > PGSTAT_DEFINE_REL_FIELD_ACCESSOR(pg_stat_get, tuples_returned); > . > . > . > " > > If that makes sense to you, I'll submit this preparatory patch. I think the macros stitching the function declarations and definitions is a great idea to avoid code duplicacy. We seem to be using that approach already - PG_FUNCTION_INFO_V1, SH_DECLARE, SH_DEFINE and its friends, STEMMER_MODULE and so on. +1 for first applying this principle for existing functions. Looking forward to the patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Split index and table statistics into different types of stats
Hi, On 11/21/22 12:19 AM, Andres Freund wrote: Hi, On 2022-11-18 12:18:38 +0100, Drouvot, Bertrand wrote: On 11/16/22 9:12 PM, Andres Freund wrote: This still leaves a fair bit of boilerplate. ISTM that the function body really should just be a single line. Might even be worth defining the whole function via a macro. Perhaps something like PGSTAT_DEFINE_REL_FIELD_ACCESSOR(PGSTAT_KIND_INDEX, pg_stat_get_index, numscans); Thanks for the feedback! Right, what about something like the following? " #define PGSTAT_FETCH_STAT_ENTRY(pgstat_entry_kind, pgstat_fetch_stat_function, relid, stat_name) \ do { \ pgstat_entry_kind *entry = pgstat_fetch_stat_function(relid); \ PG_RETURN_INT64(entry == NULL ? 0 : (int64) (entry->stat_name)); \ } while (0) Datum pg_stat_get_index_numscans(PG_FUNCTION_ARGS) { PGSTAT_FETCH_STAT_ENTRY(PgStat_StatIndEntry, pgstat_fetch_stat_indentry, PG_GETARG_OID(0), numscans); } " That's better, but still seems like quite a bit of repetition, given the number of accessors. I think I like my idea of a macro defining the whole function a bit better. Got it, what about creating another preparatory commit to first introduce something like: " #define PGSTAT_DEFINE_REL_FIELD_ACCESSOR(function_name_prefix, stat_name) \ Datum \ function_name_prefix##_##stat_name(PG_FUNCTION_ARGS) \ { \ Oid relid = PG_GETARG_OID(0); \ int64 result; \ PgStat_StatTabEntry *tabentry; \ if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL) \ result = 0; \ else \ result = (int64) (tabentry->stat_name); \ PG_RETURN_INT64(result); \ } \ PGSTAT_DEFINE_REL_FIELD_ACCESSOR(pg_stat_get, numscans); PGSTAT_DEFINE_REL_FIELD_ACCESSOR(pg_stat_get, tuples_returned); . . . " If that makes sense to you, I'll submit this preparatory patch. Now merged. Thanks! Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Split index and table statistics into different types of stats
Hi, On 2022-11-18 12:18:38 +0100, Drouvot, Bertrand wrote: > On 11/16/22 9:12 PM, Andres Freund wrote: > > This still leaves a fair bit of boilerplate. ISTM that the function body > > really should just be a single line. > > > > Might even be worth defining the whole function via a macro. Perhaps > > something like > > > > PGSTAT_DEFINE_REL_FIELD_ACCESSOR(PGSTAT_KIND_INDEX, pg_stat_get_index, > > numscans); > > Thanks for the feedback! > > Right, what about something like the following? > > " > #define PGSTAT_FETCH_STAT_ENTRY(pgstat_entry_kind, > pgstat_fetch_stat_function, relid, stat_name) \ > do { \ > pgstat_entry_kind *entry = pgstat_fetch_stat_function(relid); \ > PG_RETURN_INT64(entry == NULL ? 0 : (int64) > (entry->stat_name)); \ > } while (0) > > Datum > pg_stat_get_index_numscans(PG_FUNCTION_ARGS) > { > PGSTAT_FETCH_STAT_ENTRY(PgStat_StatIndEntry, > pgstat_fetch_stat_indentry, PG_GETARG_OID(0), numscans); > } > " That's better, but still seems like quite a bit of repetition, given the number of accessors. I think I like my idea of a macro defining the whole function a bit better. I'd define a "base" macro and then a version that's specific to tables and indexes each, so that the pieces related to that don't have to be repeated as often. > > This should probably be done in a preparatory commit. > > Proposal submitted in [1]. Now merged. Greetings, Andres Freund
Re: Split index and table statistics into different types of stats
Hi, On 11/16/22 9:12 PM, Andres Freund wrote: Hi, On 2022-11-15 10:48:40 +0100, Drouvot, Bertrand wrote: diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c index ae3365d917..be7f175bf1 100644 --- a/src/backend/utils/adt/pgstatfuncs.c +++ b/src/backend/utils/adt/pgstatfuncs.c @@ -36,24 +36,34 @@ #define HAS_PGSTAT_PERMISSIONS(role) (has_privs_of_role(GetUserId(), ROLE_PG_READ_ALL_STATS) || has_privs_of_role(GetUserId(), role)) +#define PGSTAT_FETCH_STAT_ENTRY(entry, stat_name) ((entry == NULL) ? 0 : (int64) (entry->stat_name)) + Datum -pg_stat_get_numscans(PG_FUNCTION_ARGS) +pg_stat_get_index_numscans(PG_FUNCTION_ARGS) { Oid relid = PG_GETARG_OID(0); int64 result; - PgStat_StatTabEntry *tabentry; + PgStat_StatIndEntry *indentry = pgstat_fetch_stat_indentry(relid); - if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL) - result = 0; - else - result = (int64) (tabentry->numscans); + result = PGSTAT_FETCH_STAT_ENTRY(indentry, numscans); PG_RETURN_INT64(result); } This still leaves a fair bit of boilerplate. ISTM that the function body really should just be a single line. Might even be worth defining the whole function via a macro. Perhaps something like PGSTAT_DEFINE_REL_FIELD_ACCESSOR(PGSTAT_KIND_INDEX, pg_stat_get_index, numscans); Thanks for the feedback! Right, what about something like the following? " #define PGSTAT_FETCH_STAT_ENTRY(pgstat_entry_kind, pgstat_fetch_stat_function, relid, stat_name) \ do { \ pgstat_entry_kind *entry = pgstat_fetch_stat_function(relid); \ PG_RETURN_INT64(entry == NULL ? 0 : (int64) (entry->stat_name)); \ } while (0) Datum pg_stat_get_index_numscans(PG_FUNCTION_ARGS) { PGSTAT_FETCH_STAT_ENTRY(PgStat_StatIndEntry, pgstat_fetch_stat_indentry, PG_GETARG_OID(0), numscans); } " I think the logic to infer which DB oid to use for a stats entry could be shared between different kinds of stats. We don't need to duplicate it. Agree, will provide a new patch version once [1] is committed. Is there any reason to not replace the "double lookup" in pgstat_fetch_stat_tabentry() with IsSharedRelation()? Thanks for the suggestion! This should probably be done in a preparatory commit. Proposal submitted in [1]. [1]: https://www.postgresql.org/message-id/flat/2e4a0ae1-2696-9f0c-301c-2330e447133f%40gmail.com#e47bf5d2902121461b61ed47413628fc Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Split index and table statistics into different types of stats
Hi, On 2022-11-15 10:48:40 +0100, Drouvot, Bertrand wrote: > diff --git a/src/backend/utils/adt/pgstatfuncs.c > b/src/backend/utils/adt/pgstatfuncs.c > index ae3365d917..be7f175bf1 100644 > --- a/src/backend/utils/adt/pgstatfuncs.c > +++ b/src/backend/utils/adt/pgstatfuncs.c > @@ -36,24 +36,34 @@ > > #define HAS_PGSTAT_PERMISSIONS(role) (has_privs_of_role(GetUserId(), > ROLE_PG_READ_ALL_STATS) || has_privs_of_role(GetUserId(), role)) > > +#define PGSTAT_FETCH_STAT_ENTRY(entry, stat_name) ((entry == NULL) ? 0 : > (int64) (entry->stat_name)) > + > Datum > -pg_stat_get_numscans(PG_FUNCTION_ARGS) > +pg_stat_get_index_numscans(PG_FUNCTION_ARGS) > { > Oid relid = PG_GETARG_OID(0); > int64 result; > - PgStat_StatTabEntry *tabentry; > + PgStat_StatIndEntry *indentry = pgstat_fetch_stat_indentry(relid); > > - if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL) > - result = 0; > - else > - result = (int64) (tabentry->numscans); > + result = PGSTAT_FETCH_STAT_ENTRY(indentry, numscans); > > PG_RETURN_INT64(result); > } This still leaves a fair bit of boilerplate. ISTM that the function body really should just be a single line. Might even be worth defining the whole function via a macro. Perhaps something like PGSTAT_DEFINE_REL_FIELD_ACCESSOR(PGSTAT_KIND_INDEX, pg_stat_get_index, numscans); I think the logic to infer which DB oid to use for a stats entry could be shared between different kinds of stats. We don't need to duplicate it. Is there any reason to not replace the "double lookup" in pgstat_fetch_stat_tabentry() with IsSharedRelation()? This should probably be done in a preparatory commit. Greetings, Andres Freund
Re: Split index and table statistics into different types of stats
Hi, On 11/4/22 9:51 PM, Melanie Plageman wrote: Hi Bertrand, I'm glad you are working on this. I had a few thoughts/ideas Thanks for looking at it! It seems better to have all of the counts in the various stats structs not be prefixed with n_, i_, t_ typedef struct PgStat_StatDBEntry { ... PgStat_Counter n_blocks_fetched; PgStat_Counter n_blocks_hit; PgStat_Counter n_tuples_returned; PgStat_Counter n_tuples_fetched; ... I've attached a patch (0002) to change this in case you are interested in making such a change I did not renamed initially the fields/columns to ease the review. Indeed, I think we should go further than removing the n_, i_ and t_ prefixes so that the fields actually match the view's columns. For example, currently pg_stat_all_indexes.idx_tup_read is linked to "tuples_returned", so that it would make sense to rename "tuples_returned" to "tuples_read" or even "tup_read" in the indexes counters. That's why I had in mind to do this fields/columns renaming into a separate patch (once this one is committed), so that the current one focus only on splitting the stats: what do you think? (I've attached all of my suggestions in patches along with your original patch so that cfbot still passes). On Wed, Nov 2, 2022 at 5:00 AM Drouvot, Bertrand wrote: On 11/1/22 1:30 AM, Andres Freund wrote: On 2022-10-31 14:14:15 +0100, Drouvot, Bertrand wrote: @@ -240,9 +293,23 @@ pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS) PG_RETURN_INT64(result); } +Datum +pg_stat_get_index_blocks_fetched(PG_FUNCTION_ARGS) +{ +Oid relid = PG_GETARG_OID(0); +int64 result; +PgStat_StatIndEntry *indentry; + +if ((indentry = pgstat_fetch_stat_indentry(relid)) == NULL) +result = 0; +else +result = (int64) (indentry->blocks_fetched); + +PG_RETURN_INT64(result); +} We have so many copies of this by now - perhaps we first should deduplicate them somehow? Even if it's just a macro or such. Yeah good point, a new macro has been defined for the "int64" return case in V3 attached. I looked for other opportunities to de-duplicate, but most of the functions that were added that are identical except the return type and PgStat_Kind are short enough that it doesn't make sense to make wrappers or macros. Yeah, agree. I do think it makes sense to reorder the members of the two structs PgStat_IndexCounts and PgStat_TableCounts so that they have a common header. I've done that in the attached patch (0003). That's a good idea, thanks! But for that we would need to have the same fields names, means: - Remove the prefixes (as you've done in 0002) - And probably reduce the number of fields in the new PgStat_RelationCounts that 003 is introducing (for example tuples_returned should be excluded if we're going to rename it later on to "tuples_read" for the indexes to map the pg_stat_all_indexes.idx_tup_read column). ISTM that we should do it in the "renaming" effort, after this patch is committed. In the flush functions, I was also thinking it might be nice to use the same pattern as is used in [1] and [2] to add the counts together. It makes the lines a bit easier to read, IMO. If we remove the prefixes from the count fields, this works for many of the fields. I've attached a patch (0004) that does something like this, in case you wanted to go in this direction. I like it too but same remarks as previously. I think it should be part of the "renaming" effort. Since you have made new parallel functions for indexes and tables for many of the functions in pgstat_relation.c, perhaps it makes sense to split it into pgstat_table.c and pgstat_index.c? Good point, thanks, I'll work on it. One question I had about the original code (not related to your refactor) is why the pending stats aren't memset in the flush functions after aggregating them into the shared stats. Not sure I'm getting your point, do you think something is not right with the flush functions? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Split index and table statistics into different types of stats
Hi, On 11/1/22 1:30 AM, Andres Freund wrote: Hi, On 2022-10-31 14:14:15 +0100, Drouvot, Bertrand wrote: Please find attached a patch proposal to split index and table statistics into different types of stats. This idea has been proposed by Andres in a couple of threads, see [1] and [2]. Thanks for working on this! Thanks for looking at it! diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c index 5b49cc5a09..8a715db82e 100644 --- a/src/backend/catalog/heap.c +++ b/src/backend/catalog/heap.c @@ -1853,7 +1853,7 @@ heap_drop_with_catalog(Oid relid) RelationDropStorage(rel); /* ensure that stats are dropped if transaction commits */ - pgstat_drop_relation(rel); + pgstat_drop_heap(rel); I don't think "heap" is a good name for these, even if there's some historical reasons for it. Particularly because you used "table" in some bits and pieces too. Agree, replaced by "table" where appropriate in V3 attached. /* @@ -168,39 +210,55 @@ pgstat_unlink_relation(Relation rel) void pgstat_create_relation(Relation rel) { - pgstat_create_transactional(PGSTAT_KIND_RELATION, - rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId, - RelationGetRelid(rel)); + if (rel->rd_rel->relkind == RELKIND_INDEX) + pgstat_create_transactional(PGSTAT_KIND_INDEX, + rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId, + RelationGetRelid(rel)); + else + pgstat_create_transactional(PGSTAT_KIND_TABLE, + rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId, + RelationGetRelid(rel)); +} Hm - why is this best handled on this level, rather than at the caller? Agree that it should be split in pgstat_create_table()/pgstat_create_index() (also as it was already split for the "drop" case): done in V3. +/* + * Support function for the SQL-callable pgstat* functions. Returns + * the collected statistics for one index or NULL. NULL doesn't mean + * that the index doesn't exist, just that there are no statistics, so the + * caller is better off to report ZERO instead. + */ +PgStat_StatIndEntry * +pgstat_fetch_stat_indentry(Oid relid) +{ + PgStat_StatIndEntry *indentry; + + indentry = pgstat_fetch_stat_indentry_ext(false, relid); + if (indentry != NULL) + return indentry; + + /* +* If we didn't find it, maybe it's a shared index. +*/ + indentry = pgstat_fetch_stat_indentry_ext(true, relid); + return indentry; +} + +/* + * More efficient version of pgstat_fetch_stat_indentry(), allowing to specify + * whether the to-be-accessed index is shared or not. + */ +PgStat_StatIndEntry * +pgstat_fetch_stat_indentry_ext(bool shared, Oid reloid) +{ + Oid dboid = (shared ? InvalidOid : MyDatabaseId); + + return (PgStat_StatIndEntry *) + pgstat_fetch_entry(PGSTAT_KIND_INDEX, dboid, reloid); } Do we need this split anywhere for now? I suspect not, the table case is mainly for the autovacuum launcher, which won't look at indexes "in isolation". Yes I think so as pgstat_fetch_stat_indentry_ext() has its use case in pgstat_copy_index_stats() (previously pgstat_copy_relation_stats()). @@ -240,9 +293,23 @@ pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS) PG_RETURN_INT64(result); } +Datum +pg_stat_get_index_blocks_fetched(PG_FUNCTION_ARGS) +{ + Oid relid = PG_GETARG_OID(0); + int64 result; + PgStat_StatIndEntry *indentry; + + if ((indentry = pgstat_fetch_stat_indentry(relid)) == NULL) + result = 0; + else + result = (int64) (indentry->blocks_fetched); + + PG_RETURN_INT64(result); +} We have so many copies of this by now - perhaps we first should deduplicate them somehow? Even if it's just a macro or such. Yeah good point, a new macro has been defined for the "int64" return case in V3 attached. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.comdiff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c index fe80b8b0ba..a1defc6838 100644 --- a/src/backend/access/index/indexam.c +++ b/src/backend/access/index/indexam.c @@ -582,7 +582,7 @@ index_fetch_heap(IndexScanDesc scan, TupleTableSlot *slot) >xs_heap_continue, _dead); if (found) - pgstat_count_heap_fetch(scan->indexRelation); +
Re: Split index and table statistics into different types of stats
Hi, On 2022-10-31 14:14:15 +0100, Drouvot, Bertrand wrote: > Please find attached a patch proposal to split index and table statistics > into different types of stats. > > This idea has been proposed by Andres in a couple of threads, see [1] and > [2]. Thanks for working on this! > diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c > index 5b49cc5a09..8a715db82e 100644 > --- a/src/backend/catalog/heap.c > +++ b/src/backend/catalog/heap.c > @@ -1853,7 +1853,7 @@ heap_drop_with_catalog(Oid relid) > RelationDropStorage(rel); > > /* ensure that stats are dropped if transaction commits */ > - pgstat_drop_relation(rel); > + pgstat_drop_heap(rel); I don't think "heap" is a good name for these, even if there's some historical reasons for it. Particularly because you used "table" in some bits and pieces too. > /* > @@ -168,39 +210,55 @@ pgstat_unlink_relation(Relation rel) > void > pgstat_create_relation(Relation rel) > { > - pgstat_create_transactional(PGSTAT_KIND_RELATION, > - > rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId, > - > RelationGetRelid(rel)); > + if (rel->rd_rel->relkind == RELKIND_INDEX) > + pgstat_create_transactional(PGSTAT_KIND_INDEX, > + > rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId, > + > RelationGetRelid(rel)); > + else > + pgstat_create_transactional(PGSTAT_KIND_TABLE, > + > rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId, > + > RelationGetRelid(rel)); > +} Hm - why is this best handled on this level, rather than at the caller? > +/* > + * Support function for the SQL-callable pgstat* functions. Returns > + * the collected statistics for one index or NULL. NULL doesn't mean > + * that the index doesn't exist, just that there are no statistics, so the > + * caller is better off to report ZERO instead. > + */ > +PgStat_StatIndEntry * > +pgstat_fetch_stat_indentry(Oid relid) > +{ > + PgStat_StatIndEntry *indentry; > + > + indentry = pgstat_fetch_stat_indentry_ext(false, relid); > + if (indentry != NULL) > + return indentry; > + > + /* > + * If we didn't find it, maybe it's a shared index. > + */ > + indentry = pgstat_fetch_stat_indentry_ext(true, relid); > + return indentry; > +} > + > +/* > + * More efficient version of pgstat_fetch_stat_indentry(), allowing to > specify > + * whether the to-be-accessed index is shared or not. > + */ > +PgStat_StatIndEntry * > +pgstat_fetch_stat_indentry_ext(bool shared, Oid reloid) > +{ > + Oid dboid = (shared ? InvalidOid : MyDatabaseId); > + > + return (PgStat_StatIndEntry *) > + pgstat_fetch_entry(PGSTAT_KIND_INDEX, dboid, reloid); > } Do we need this split anywhere for now? I suspect not, the table case is mainly for the autovacuum launcher, which won't look at indexes "in isolation". > @@ -240,9 +293,23 @@ pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS) > PG_RETURN_INT64(result); > } > > +Datum > +pg_stat_get_index_blocks_fetched(PG_FUNCTION_ARGS) > +{ > + Oid relid = PG_GETARG_OID(0); > + int64 result; > + PgStat_StatIndEntry *indentry; > + > + if ((indentry = pgstat_fetch_stat_indentry(relid)) == NULL) > + result = 0; > + else > + result = (int64) (indentry->blocks_fetched); > + > + PG_RETURN_INT64(result); > +} We have so many copies of this by now - perhaps we first should deduplicate them somehow? Even if it's just a macro or such. Greetings, Andres Freund
Re: Split index and table statistics into different types of stats
Hi, On 10/31/22 2:31 PM, Justin Pryzby wrote: I didn't looks closely, but there's a couple places where you wrote ";;", which looks unintentional. - PG_RETURN_TIMESTAMPTZ(tabentry->lastscan); + PG_RETURN_TIMESTAMPTZ(tabentry->lastscan);; Thanks for looking at it! oops, thanks for the keen eyes ;-) Fixed in v2 attached. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.comdiff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c index fe80b8b0ba..a1defc6838 100644 --- a/src/backend/access/index/indexam.c +++ b/src/backend/access/index/indexam.c @@ -582,7 +582,7 @@ index_fetch_heap(IndexScanDesc scan, TupleTableSlot *slot) >xs_heap_continue, _dead); if (found) - pgstat_count_heap_fetch(scan->indexRelation); + pgstat_count_index_fetch(scan->indexRelation); /* * If we scanned a whole HOT chain and found only dead tuples, tell index diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c index 5b49cc5a09..8a715db82e 100644 --- a/src/backend/catalog/heap.c +++ b/src/backend/catalog/heap.c @@ -1853,7 +1853,7 @@ heap_drop_with_catalog(Oid relid) RelationDropStorage(rel); /* ensure that stats are dropped if transaction commits */ - pgstat_drop_relation(rel); + pgstat_drop_heap(rel); /* * Close relcache entry, but *keep* AccessExclusiveLock on the relation diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index 61f1d3926a..28b94fef7f 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -1752,7 +1752,7 @@ index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName) changeDependenciesOn(RelationRelationId, oldIndexId, newIndexId); /* copy over statistics from old to new index */ - pgstat_copy_relation_stats(newClassRel, oldClassRel); + pgstat_copy_index_stats(newClassRel, oldClassRel); /* Copy data of pg_statistic from the old index to the new one */ CopyStatistics(oldIndexId, newIndexId); @@ -2326,7 +2326,7 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode) RelationDropStorage(userIndexRelation); /* ensure that stats are dropped if transaction commits */ - pgstat_drop_relation(userIndexRelation); + pgstat_drop_index(userIndexRelation); /* * Close and flush the index's relcache entry, to ensure relcache doesn't diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 2d8104b090..e92e50edf7 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -655,13 +655,13 @@ CREATE VIEW pg_stat_all_tables AS C.oid AS relid, N.nspname AS schemaname, C.relname AS relname, -pg_stat_get_numscans(C.oid) AS seq_scan, -pg_stat_get_lastscan(C.oid) AS last_seq_scan, -pg_stat_get_tuples_returned(C.oid) AS seq_tup_read, -sum(pg_stat_get_numscans(I.indexrelid))::bigint AS idx_scan, -max(pg_stat_get_lastscan(I.indexrelid)) AS last_idx_scan, -sum(pg_stat_get_tuples_fetched(I.indexrelid))::bigint + -pg_stat_get_tuples_fetched(C.oid) AS idx_tup_fetch, +pg_stat_get_heap_numscans(C.oid) AS seq_scan, +pg_stat_get_heap_lastscan(C.oid) AS last_seq_scan, +pg_stat_get_heap_tuples_returned(C.oid) AS seq_tup_read, +sum(pg_stat_get_index_numscans(I.indexrelid))::bigint AS idx_scan, +max(pg_stat_get_index_lastscan(I.indexrelid)) AS last_idx_scan, +sum(pg_stat_get_index_tuples_fetched(I.indexrelid))::bigint + +pg_stat_get_heap_tuples_fetched(C.oid) AS idx_tup_fetch, pg_stat_get_tuples_inserted(C.oid) AS n_tup_ins, pg_stat_get_tuples_updated(C.oid) AS n_tup_upd, pg_stat_get_tuples_deleted(C.oid) AS n_tup_del, @@ -689,9 +689,9 @@ CREATE VIEW pg_stat_xact_all_tables AS C.oid AS relid, N.nspname AS schemaname, C.relname AS relname, -pg_stat_get_xact_numscans(C.oid) AS seq_scan, +pg_stat_get_heap_xact_numscans(C.oid) AS seq_scan, pg_stat_get_xact_tuples_returned(C.oid) AS seq_tup_read, -sum(pg_stat_get_xact_numscans(I.indexrelid))::bigint AS idx_scan, +sum(pg_stat_get_index_xact_numscans(I.indexrelid))::bigint AS idx_scan, sum(pg_stat_get_xact_tuples_fetched(I.indexrelid))::bigint + pg_stat_get_xact_tuples_fetched(C.oid) AS idx_tup_fetch, pg_stat_get_xact_tuples_inserted(C.oid) AS n_tup_ins, @@ -729,31 +729,31 @@ CREATE VIEW pg_statio_all_tables AS C.oid AS relid,