Re: PATCH: add "--config-file=" option to pg_rewind
Am 07.04.22 um 01:54 schrieb Michael Paquier: > On Wed, Apr 06, 2022 at 02:32:44PM +0200, Gunnar "Nick" Bluth wrote: >> Ta! Sorry, been ridiculously busy these days, and I do confess that I've >> been struggling a bit with those tests before ;-) > > No problem. Double-checked this morning and applied, then. > -- > Michael Wonderful! So, @cyberdemn, who would have thought we get this fixed so quickly? Time to update the Patroni PR once more! ;-) Thanks & best regards, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Cato
Re: [PATCH] pg_stat_toast
Am 06.04.22 um 17:49 schrieb Alvaro Herrera: > On 2022-Apr-06, Robert Haas wrote: > >> Now if we're only incurring that overhead when this feature is >> enabled, then in fairness that problem is a lot less of an issue, >> especially if this is also disabled by default. People who want the >> data can get it and pay the cost, and others aren't much impacted. >> However, experience has taught me that a lot of skepticism is >> warranted when it comes to claims about how cheap extensions to the >> statistics system will be. > > Maybe this feature should provide a way to be enabled for tables > individually, so you pay the overhead only where you need it and don't > swamp the system with stats for uninteresting tables. > That would obviously be very nice (and Georgios pushed heavily in that direction ;-). However, I intentionally bound those stats to the database level (see my very first mail). The changes to get them bound to attributes (or tables) would have required mangling with quite a lot of very elemental stuff, (e.g. attribute stats only get refreshed by ANALYZE and their structure would have to be changed significantly, bloating them even if the feature is inactive). It also would have made the stats updates synchronous (at TX end), would have been "blind" for all TOAST efforts done by rolled back TXs etc. What you can do is of course (just like track_functions): ALTER DATABASE under_surveillance SET track_toast = [on|off]; Best, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Cato
Re: [PATCH] pg_stat_toast
Am 06.04.22 um 18:55 schrieb Andres Freund: > Hi, > > On 2022-04-06 12:24:20 -0400, Robert Haas wrote: >> On Wed, Apr 6, 2022 at 12:01 PM Gunnar "Nick" Bluth >> wrote: >>> Fair enough. At that point, a lot of things become unexpectedly painful. >>> How many % of the installed base may that be though? >> >> I don't have statistics on that, but it's large enough that the >> expense associated with the statistics collector is a reasonably >> well-known pain point, and for some users, a really severe one. > > Yea. I've seen well over 100MB/s of write IO solely due to stats files writes > on production systems, years ago. Wow. Yeah, I tend to forget there's systems like ads' out there ;-) >> I'm fairly sure it's not going to make things so cheap that we can afford to >> add all the statistics anybody wants, but it's so painful that even modest >> relief would be more than welcome. > > It definitely doesn't make stats free. But I'm hopefull that avoiding the > regular writing out / readin back in, and the ability to only optionally store > some stats (by varying allocation size or just having different kinds of > stats), will reduce the cost sufficiently that we can start keeping more > stats. Knock on wood! > Which is not to say that these stats are the right ones (nor that they're the > wrong ones). ;-) > I think if I were to tackle providing more information about toasting, I'd > start not by adding a new stats view, but by adding a function to pgstattuple > that scans the relation and collects stats for each toasted column. An SRF > returning one row for each toastable column. With information like > > - column name > - #inline datums > - #compressed inline datums > - sum(uncompressed inline datum size) > - sum(compressed inline datum size) > - #external datums > - #compressed external datums > - sum(uncompressed external datum size) > - sum(compressed external datum size) > > IIRC this shouldn't require visiting the toast table itself. But it would still require a seqscan and quite some cycles. However, sure, something like that is an option. > Perhaps also an SRF that returns information about each compression method > separately (i.e. collect above information, but split by compression method)? > Perhaps even with the ability to measure how big the gains of recompressing > into another method would be? Even more of the above, but yeah, sound nifty. >>>> However, experience has taught me that a lot of skepticism is >>>> warranted when it comes to claims about how cheap extensions to the >>>> statistics system will be. >>> >>> Again, fair enough! >>> Maybe we first need statistics about statistics collection and handling? ;-) >> >> Heh. > > I've wondered about adding pg_stat_stats the other day, actually :) > https://postgr.es/m/20220404193435.hf3vybaajlpfmbmt%40alap3.anarazel.de OMG LOL! -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Cato
Re: [PATCH] pg_stat_toast
Am 06.04.22 um 17:49 schrieb Robert Haas: > I feel like if you want to know whether externalization made a > difference, you can look at the size of the TOAST table. And by > selecting directly from that table, you can even see how many chunks > it contains, and how many are full-sized chunks vs. partial chunks, > and stuff like that. And for compression, how about looking at > pg_column_size(col1) vs. pg_column_size(col1||'') or something like > that? You might get a 1-byte varlena header on the former and a 4-byte > varlena header on the latter even if there's no compression, but any > gains beyond 3 bytes have to be due to compression. I'll probably give that a shot! It does feel a bit like noting your mileage on fuel receipts though, as I've done until I got my first decent car; works and will work perfectly well up to the day, but certainly is a bit out-of-time (and requires some basic math skills ;-). Best, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Cato
Re: [PATCH] pg_stat_toast
Am 06.04.22 um 17:22 schrieb Robert Haas: > On Tue, Apr 5, 2022 at 10:34 PM Andres Freund wrote: >>> Anyway, my (undisputed up to now!) understanding still is that only >>> backends _looking_ at these stats (so, e.g., accessing the pg_stat_toast >>> view) actually read the data. So, the 10-15% more space used for pg_stat >>> only affect the stats collector and _some few_ backends. >> >> It's not so simple. That stats collector constantly writes these stats out to >> disk. And disk bandwidth / space is of course a shared resource. > > Yeah, and just to make it clear, this really becomes an issue if you > have hundreds of thousands or even millions of tables. It's a lot of > extra data to be writing, and in some cases we're rewriting it all, > like, once per second. Fair enough. At that point, a lot of things become unexpectedly painful. How many % of the installed base may that be though? I'm far from done reading the patch and mail thread Andres mentioned, but I think the general idea is to move the stats to shared memory, so that reading (and thus, persisting) pg_stats is required far less often, right? > Now if we're only incurring that overhead when this feature is > enabled, then in fairness that problem is a lot less of an issue, > especially if this is also disabled by default. People who want the > data can get it and pay the cost, and others aren't much impacted. That's the idea, yes. I reckon folks with millions of tables will scan through the docs (and postgresql.conf) very thoroughly anyway. Hence the note there. > However, experience has taught me that a lot of skepticism is > warranted when it comes to claims about how cheap extensions to the > statistics system will be. Again, fair enough! Maybe we first need statistics about statistics collection and handling? ;-) Best, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Cato
Re: PATCH: add "--config-file=" option to pg_rewind
Am 06.04.22 um 14:04 schrieb Michael Paquier: > On Tue, Apr 05, 2022 at 11:10:30AM +0900, Michael Paquier wrote: >> In the tests, the only difference between the modes "archive_cli" and >> "archive" is the extra option given to the pg_rewind command, and >> that's a simple redirect to what pg_rewind would choose by default >> anyway. A more elegant solution would be to have an extra option in >> RewindTest::run_pg_rewind(), where any caller of the routine can feed >> extra options to the pg_rewind command. Now, in this case, we are >> not going to lose any coverage if the existing "archive" mode is >> changed so as we move away postgresql.conf from the target data folder >> and just use --config-file by default, no? I would make the choice of >> simplicity, by giving up on "archive_cli" and using >> primary-postgresql.conf.tmp as value for --config-file. There could >> be an argument for using --config-file for all the modes, as well. > > The clock is ticking, so I have looked at this patch by myself. I Ta! Sorry, been ridiculously busy these days, and I do confess that I've been struggling a bit with those tests before ;-) > have finished by tweaking the docs, the code and the tests as of the > attached. One thing that I found annoying is the lack of description > about the fact that this option affects pg_rewind when it internally > starts the target cluster, as of when we retrieve restore_command and > when we enforce crash recovery to have a target cluster with a clean > shutdown state. The tests are heavily simplified, having no impact on > the run-time while improving the coverage for the code paths changed > here. > -- > Michael LGTM! Thx again! -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Cato
Re: [PATCH] pg_stat_toast
Am 05.04.22 um 18:17 schrieb Robert Haas: > On Thu, Mar 31, 2022 at 9:16 AM Gunnar "Nick" Bluth > wrote: >> That was meant to say "v10", sorry! > > Hi, Hi Robert, and thx for looking at this. > From my point of view, at least, it would be preferable if you'd stop > changing the subject line every time you post a new version. Terribly sorry, I believed to do the right thing! I removed the "suffix" now for good. > Based on the test results in > http://postgr.es/m/42bfa680-7998-e7dc-b50e-480cdd986...@pro-open.de > and the comments from Andres in > https://www.postgresql.org/message-id/20211212234113.6rhmqxi5uzgipwx2%40alap3.anarazel.de > my judgement would be that, as things stand today, this patch has no > chance of being accepted, due to overhead. Now, Andres is currently > working on an overhaul of the statistics collector and perhaps that > would reduce the overhead of something like this to an acceptable > level. If it does, that would be great news; I just don't know whether > that's the case. AFAICT, Andres' work is more about the structure (e.g. 13619598f1080d7923454634a2570ca1bc0f2fec). Or I've missed something...? The attached v11 incorporates the latest changes in the area, btw. Anyway, my (undisputed up to now!) understanding still is that only backends _looking_ at these stats (so, e.g., accessing the pg_stat_toast view) actually read the data. So, the 10-15% more space used for pg_stat only affect the stats collector and _some few_ backends. And those 10-15% were gathered with 10.000 tables containing *only* TOASTable attributes. So the actual percentage would probably go down quite a bit once you add some INTs or such. Back then, I was curious myself on the impact and just ran a few syntetic tests quickly hacked together. I'll happily go ahead and run some tests on real world schemas if that helps clarifying matters! > As far as the statistics themselves are concerned, I am somewhat > skeptical about whether it's really worth adding code for this. > According to the documentation, the purpose of the patch is to allow > you to assess choice of storage and compression method settings for a > column and is not intended to be enabled permanently. However, it TBTH, the wording there is probably a bit over-cautious. I very much respect Andres and thus his reservations, and I know how careful the project is about regressions of any kind (see below on some elobarations on the latter). I alleviated the part a bit for v11. > seems to me that you could assess that pretty easily without this > patch: just create a couple of different tables with different > settings, load up the same data via COPY into each one, and see what > happens. Now you might answer that with the patch you would get more > detailed and accurate statistics, and I think that's true, but it > doesn't really look like the additional level of detail would be > critical to have in order to make a proper assessment. You might also > say that creating multiple copies of the table and loading the data > multiple times would be expensive, and that's also true, but you don't > really need to load it all. A representative sample of 1GB or so would > probably suffice in most cases, and that doesn't seem likely to be a > huge load on the system. At the end of the day, one could argue like you did there for almost all (non-attribute) stats. "Why track function execution times? Just set up a benchmark and call the function 1 mio times and you'll know how long it takes on average!". "Why track IO timings? Run a benchmark on your system and ..." etc. pp. I maintain a couple of DBs that house TBs of TOASTable data (mainly XML containing encrypted payloads). In just a couple of columns per cluster. I'm completely clueless if TOAST compression makes a difference there. Or externalization. And I'm not allowed to copy that data anywhere outside production without unrolling a metric ton of red tape. Guess why I started writing this patch ;-) *I* would certainly leave the option on, just to get an idea of what's happening... > Also, as we add more compression options, it's going to be hard to > assess this sort of thing without trying stuff anyway. For example if > you can set the lz4 compression level, you're not going to know which > level is actually going to work best without trying out a bunch of > them and seeing what happens. If we allow access to other sorts of > compression parameters like zstd's "long" option, similarly, if you > really care, you're going to have to try it. Funny that you mention it. When writing the first version, I was thinking about the LZ4 patch authors and was wondering how they tested/benchmarked all of it and why they didn't implement something like this patch for their tests ;-) Yes, you're gonna try it. And you're go
Re: [PATCH] pg_stat_toast v10
Am 31.03.22 um 15:14 schrieb Gunnar "Nick" Bluth: > Am 22.03.22 um 12:23 schrieb Gunnar "Nick" Bluth: >> Am 22.03.22 um 02:17 schrieb Andres Freund: >>> Hi, >>> >>> On 2022-03-08 19:32:03 +0100, Gunnar "Nick" Bluth wrote: >>>> v8 (applies cleanly to today's HEAD/master) attached. >>> >>> This doesn't apply anymore, likely due to my recent pgstat changes - which >>> you'd need to adapt to... >> >> Now, that's been quite an overhaul... kudos! >> >> >>> http://cfbot.cputube.org/patch_37_3457.log >>> >>> Marked as waiting on author. >> >> v9 attached. >> >> TBTH, I don't fully understand all the external/static stuff, but it >> applies to HEAD/master, compiles and passes all tests, so... ;-) > > And v10 catches up to master once again. > > Best, That was meant to say "v10", sorry! -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Cato doc/src/sgml/config.sgml | 26 doc/src/sgml/monitoring.sgml | 163 ++ doc/src/sgml/storage.sgml | 12 +- src/backend/access/table/toast_helper.c | 40 +++ src/backend/catalog/system_views.sql | 20 src/backend/postmaster/pgstat.c | 161 - src/backend/utils/activity/Makefile | 1 + src/backend/utils/activity/pgstat_toast.c | 157 + src/backend/utils/adt/pgstatfuncs.c | 72 src/backend/utils/misc/guc.c | 9 ++ src/backend/utils/misc/postgresql.conf.sample | 1 + src/include/catalog/pg_proc.dat | 25 src/include/pgstat.h | 110 - src/include/utils/pgstat_internal.h | 1 + src/test/regress/expected/rules.out | 17 +++ src/test/regress/expected/stats.out | 62 ++ src/test/regress/sql/stats.sql| 28 + 17 files changed, 897 insertions(+), 8 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 43e4ade83e..e6f0768472 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -7935,6 +7935,32 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; + + track_toast (boolean) + + track_toast configuration parameter + + + + +Enables tracking of TOAST activities. +Compressions and externalizations are tracked. +The default is off. +Only superusers can change this setting. + + + + +Be aware that this feature, depending on the amount of TOASTable columns in +your databases, may significantly increase the size of the statistics files +and the workload of the statistics collector. It is recommended to only +temporarily activate this to assess the right compression and storage method +for a column. + + + + + stats_temp_directory (string) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 3b9172f65b..cd0a5bea35 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -610,6 +610,17 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser yet included in pg_stat_user_functions). + + pg_stat_toastpg_stat_toast + + One row for each column that has ever been TOASTed (compressed and/or externalized). + Showing the number of externalizations, compression attempts / successes, compressed and + uncompressed sizes etc. + + pg_stat_toast for details. + + + pg_stat_slrupg_stat_slru One row per SLRU, showing statistics of operations. See @@ -4946,6 +4957,158 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i + + pg_stat_toast + + + pg_stat_toast + + + + The pg_stat_toast view will contain + one row for each column of variable size that has been TOASTed since + the last statistics reset. The parameter + controls whether TOAST activities are tracked or not. + + + + pg_stat_toast View + + + + + Column Type + + + Description + + + + + + + + schemaname name + + + Name of the schema the relation is in + + + + + + reloid oid + + + OID of the relation + + + + + + attnum int + + +
Re: [PATCH] pg_stat_toast v9
Am 22.03.22 um 12:23 schrieb Gunnar "Nick" Bluth: > Am 22.03.22 um 02:17 schrieb Andres Freund: >> Hi, >> >> On 2022-03-08 19:32:03 +0100, Gunnar "Nick" Bluth wrote: >>> v8 (applies cleanly to today's HEAD/master) attached. >> >> This doesn't apply anymore, likely due to my recent pgstat changes - which >> you'd need to adapt to... > > Now, that's been quite an overhaul... kudos! > > >> http://cfbot.cputube.org/patch_37_3457.log >> >> Marked as waiting on author. > > v9 attached. > > TBTH, I don't fully understand all the external/static stuff, but it > applies to HEAD/master, compiles and passes all tests, so... ;-) And v10 catches up to master once again. Best, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Cato doc/src/sgml/config.sgml | 26 doc/src/sgml/monitoring.sgml | 163 ++ doc/src/sgml/storage.sgml | 12 +- src/backend/access/table/toast_helper.c | 40 +++ src/backend/catalog/system_views.sql | 20 src/backend/postmaster/pgstat.c | 161 - src/backend/utils/activity/Makefile | 1 + src/backend/utils/activity/pgstat_toast.c | 157 + src/backend/utils/adt/pgstatfuncs.c | 72 src/backend/utils/misc/guc.c | 9 ++ src/backend/utils/misc/postgresql.conf.sample | 1 + src/include/catalog/pg_proc.dat | 25 src/include/pgstat.h | 110 - src/include/utils/pgstat_internal.h | 1 + src/test/regress/expected/rules.out | 17 +++ src/test/regress/expected/stats.out | 62 ++ src/test/regress/sql/stats.sql| 28 + 17 files changed, 897 insertions(+), 8 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 43e4ade83e..e6f0768472 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -7935,6 +7935,32 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; + + track_toast (boolean) + + track_toast configuration parameter + + + + +Enables tracking of TOAST activities. +Compressions and externalizations are tracked. +The default is off. +Only superusers can change this setting. + + + + +Be aware that this feature, depending on the amount of TOASTable columns in +your databases, may significantly increase the size of the statistics files +and the workload of the statistics collector. It is recommended to only +temporarily activate this to assess the right compression and storage method +for a column. + + + + + stats_temp_directory (string) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 3b9172f65b..cd0a5bea35 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -610,6 +610,17 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser yet included in pg_stat_user_functions). + + pg_stat_toastpg_stat_toast + + One row for each column that has ever been TOASTed (compressed and/or externalized). + Showing the number of externalizations, compression attempts / successes, compressed and + uncompressed sizes etc. + + pg_stat_toast for details. + + + pg_stat_slrupg_stat_slru One row per SLRU, showing statistics of operations. See @@ -4946,6 +4957,158 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i + + pg_stat_toast + + + pg_stat_toast + + + + The pg_stat_toast view will contain + one row for each column of variable size that has been TOASTed since + the last statistics reset. The parameter + controls whether TOAST activities are tracked or not. + + + + pg_stat_toast View + + + + + Column Type + + + Description + + + + + + + + schemaname name + + + Name of the schema the relation is in + + + + + + reloid oid + + + OID of the relation + + + + + + attnum int + + + Attribute (column) number in the relation + + + + + + relname name + + + Name of the relation + + + + + + attname name + + +
Re: [PATCH] pg_stat_toast v9
Am 22.03.22 um 02:17 schrieb Andres Freund: > Hi, > > On 2022-03-08 19:32:03 +0100, Gunnar "Nick" Bluth wrote: >> v8 (applies cleanly to today's HEAD/master) attached. > > This doesn't apply anymore, likely due to my recent pgstat changes - which > you'd need to adapt to... Now, that's been quite an overhaul... kudos! > http://cfbot.cputube.org/patch_37_3457.log > > Marked as waiting on author. v9 attached. TBTH, I don't fully understand all the external/static stuff, but it applies to HEAD/master, compiles and passes all tests, so... ;-) Best regards, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Cato doc/src/sgml/config.sgml | 26 doc/src/sgml/monitoring.sgml | 163 ++ doc/src/sgml/storage.sgml | 12 +- src/backend/access/table/toast_helper.c | 40 +++ src/backend/catalog/system_views.sql | 20 src/backend/postmaster/pgstat.c | 161 - src/backend/utils/activity/Makefile | 1 + src/backend/utils/activity/pgstat_toast.c | 157 + src/backend/utils/adt/pgstatfuncs.c | 72 src/backend/utils/misc/guc.c | 9 ++ src/backend/utils/misc/postgresql.conf.sample | 1 + src/include/catalog/pg_proc.dat | 25 src/include/pgstat.h | 110 - src/include/utils/pgstat_internal.h | 1 + src/test/regress/expected/rules.out | 17 +++ src/test/regress/expected/stats.out | 62 ++ src/test/regress/sql/stats.sql| 28 + 17 files changed, 897 insertions(+), 8 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 7a48973b3c..64b4219ded 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -7892,6 +7892,32 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; + + track_toast (boolean) + + track_toast configuration parameter + + + + +Enables tracking of TOAST activities. +Compressions and externalizations are tracked. +The default is off. +Only superusers can change this setting. + + + + +Be aware that this feature, depending on the amount of TOASTable columns in +your databases, may significantly increase the size of the statistics files +and the workload of the statistics collector. It is recommended to only +temporarily activate this to assess the right compression and storage method +for a column. + + + + + stats_temp_directory (string) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 35b2923c5e..7bbacd67fb 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -610,6 +610,17 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser yet included in pg_stat_user_functions). + + pg_stat_toastpg_stat_toast + + One row for each column that has ever been TOASTed (compressed and/or externalized). + Showing the number of externalizations, compression attempts / successes, compressed and + uncompressed sizes etc. + + pg_stat_toast for details. + + + pg_stat_slrupg_stat_slru One row per SLRU, showing statistics of operations. See @@ -4942,6 +4953,158 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i + + pg_stat_toast + + + pg_stat_toast + + + + The pg_stat_toast view will contain + one row for each column of variable size that has been TOASTed since + the last statistics reset. The parameter + controls whether TOAST activities are tracked or not. + + + + pg_stat_toast View + + + + + Column Type + + + Description + + + + + + + + schemaname name + + + Name of the schema the relation is in + + + + + + reloid oid + + + OID of the relation + + + + + + attnum int + + + Attribute (column) number in the relation + + + + + + relname name + + + Name of the relation + + + + + + attname name + + + Name of the attribute (column) + + + + + + storagemethod char + + + Storage method of the attribute + + + + +
Re: PATCH: add "--config-file=" option to pg_rewind
Am 10.03.22 um 14:43 schrieb Michael Banck: > Heya, > > some minor comments, I didn't look at the added test and I did not test > the patch at all, but (as part of the Debian/Ubuntu packaging team) I > think this patch is really important: Much appreciated! [...] > I think we usually just say "data directory"; pgdata was previously only > used as part of the option name, not like this in the text. Fixed. [...] > target-pgdata is the name of the option, but maybe it is useful to spell > out "target data directory" here, even if that means we spill over to > two lines (which we might have to do anyway, that new line is pretty > long). Fixed. > >> @@ -121,6 +123,7 @@ main(int argc, char **argv) >> {"no-sync", no_argument, NULL, 'N'}, >> {"progress", no_argument, NULL, 'P'}, >> {"debug", no_argument, NULL, 3}, >> +{"config-file", required_argument, NULL, 5}, > > Not sure what our policy is, but the way the numbered options are 1, 2, > 4, 3, 5 after this is weird; this was already off before this patch > though. That one has been triggering my inner Monk too ;-) Fixed! [...] > You removed an empty line here. Maybe rather prepend this with a comment > on what's going on, and have en empty line before and afterwards. > Kinde same here. It does smell a bit like "stating the obvious", but alas! Both fixed. Cheers, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Cato doc/src/sgml/ref/pg_rewind.sgml | 12 + src/bin/pg_rewind/pg_rewind.c | 24 - src/bin/pg_rewind/t/001_basic.pl | 1 + src/bin/pg_rewind/t/RewindTest.pm | 105 +++--- 4 files changed, 101 insertions(+), 41 deletions(-) diff --git a/doc/src/sgml/ref/pg_rewind.sgml b/doc/src/sgml/ref/pg_rewind.sgml index 33e6bb64ad..62fcb71825 100644 --- a/doc/src/sgml/ref/pg_rewind.sgml +++ b/doc/src/sgml/ref/pg_rewind.sgml @@ -241,6 +241,18 @@ PostgreSQL documentation + + --config-file=filepath + + +When using the -c / --restore-target-wal option, the restore_command +is extracted from the configuration of the target cluster. If the postgresql.conf +of that cluster is not in the target data directory (or if you want to use an alternative config), +use this option to provide a (relative or absolute) path to the postgresql.conf to be used. + + + + --debug diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c index b39b5c1aac..2e83c7ee8e 100644 --- a/src/bin/pg_rewind/pg_rewind.c +++ b/src/bin/pg_rewind/pg_rewind.c @@ -61,6 +61,7 @@ char *datadir_target = NULL; char *datadir_source = NULL; char *connstr_source = NULL; char *restore_command = NULL; +char *config_file = NULL; static bool debug = false; bool showprogress = false; @@ -87,6 +88,8 @@ usage(const char *progname) printf(_("Options:\n")); printf(_(" -c, --restore-target-wal use restore_command in target configuration to\n" " retrieve WAL files from archives\n")); + printf(_(" --config-file=FILE path to postgresql.conf if it resides outside\n")); + printf(_(" the target data directory (for -c)\n")); printf(_(" -D, --target-pgdata=DIRECTORY existing data directory to modify\n")); printf(_(" --source-pgdata=DIRECTORY source data directory to synchronize with\n")); printf(_(" --source-server=CONNSTRsource server to synchronize with\n")); @@ -114,13 +117,14 @@ main(int argc, char **argv) {"write-recovery-conf", no_argument, NULL, 'R'}, {"source-pgdata", required_argument, NULL, 1}, {"source-server", required_argument, NULL, 2}, + {"debug", no_argument, NULL, 3}, {"no-ensure-shutdown", no_argument, NULL, 4}, + {"config-file", required_argument, NULL, 5}, {"version", no_argument, NULL, 'V'}, {"restore-target-wal", no_argument, NULL, 'c'}, {"dry-run", no_argument, NULL, 'n'}, {"no-sync", no_argument, NULL, 'N'}, {"progress", no_argument, NULL, 'P'}, - {"debug", no_argument, NULL, 3}, {NULL, 0, NULL, 0} }; int option_index; @@ -205,6 +209,10 @@ main(int argc, char **argv) case 4: no_ensure_shutdown = true; break; + + case 5: +config_file = pg_strdup(optarg); +break; } } @@ -1061,6 +1069,13 @@ getRestoreCommand(c
Re: [PATCH] pg_stat_toast v8
Am 04.01.22 um 12:29 schrieb Gunnar "Nick" Bluth: > Am 03.01.22 um 22:23 schrieb Alvaro Herrera: >> Overall I think this is a good feature to have; assessing the need for >> compression is important for tuning, so +1 for the goal of the patch. > > Much appreciated! > > >> I didn't look into the patch carefully, but here are some minor >> comments: >> >> On 2022-Jan-03, Gunnar "Nick" Bluth wrote: >> >>> @@ -229,7 +230,9 @@ toast_tuple_try_compression(ToastTupleContext >>> *ttc, int attribute) >>> Datum *value = >ttc_values[attribute]; >>> Datum new_value; >>> ToastAttrInfo *attr = >ttc_attr[attribute]; >>> + instr_time start_time; >>> + INSTR_TIME_SET_CURRENT(start_time); >>> new_value = toast_compress_datum(*value, attr->tai_compression); >>> if (DatumGetPointer(new_value) != NULL) >> >> Don't INSTR_TIME_SET_CURRENT unconditionally; in some systems it's an >> expensive syscall. Find a way to only do it if the feature is enabled. > > Yeah, I was worried about that (and asking if it would be required) > already. > Adding the check was easier than I expected, though I'm absolutely > clueless if I did it right! > > #include "pgstat.h" > extern PGDLLIMPORT bool pgstat_track_toast; > As it works and nobody objected, it seems to be the right way... v8 (applies cleanly to today's HEAD/master) attached. Any takers? -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Cato doc/src/sgml/config.sgml | 26 +++ doc/src/sgml/monitoring.sgml | 163 ++ doc/src/sgml/storage.sgml | 12 +- src/backend/access/table/toast_helper.c | 40 src/backend/catalog/system_views.sql | 20 ++ src/backend/postmaster/pgstat.c | 302 +- src/backend/utils/adt/pgstatfuncs.c | 72 ++ src/backend/utils/misc/guc.c | 9 + src/backend/utils/misc/postgresql.conf.sample | 1 + src/include/catalog/pg_proc.dat | 25 +++ src/include/pgstat.h | 107 - src/test/regress/expected/rules.out | 17 ++ src/test/regress/expected/stats.out | 62 ++ src/test/regress/sql/stats.sql| 28 +++ 14 files changed, 877 insertions(+), 7 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 7ed8c82a9d..adbd516373 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -7889,6 +7889,32 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; + + track_toast (boolean) + + track_toast configuration parameter + + + + +Enables tracking of TOAST activities. +Compressions and externalizations are tracked. +The default is off. +Only superusers can change this setting. + + + + +Be aware that this feature, depending on the amount of TOASTable columns in +your databases, may significantly increase the size of the statistics files +and the workload of the statistics collector. It is recommended to only +temporarily activate this to assess the right compression and storage method +for a column. + + + + + stats_temp_directory (string) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 9fb62fec8e..06258e777a 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -610,6 +610,17 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser yet included in pg_stat_user_functions). + + pg_stat_toastpg_stat_toast + + One row for each column that has ever been TOASTed (compressed and/or externalized). + Showing the number of externalizations, compression attempts / successes, compressed and + uncompressed sizes etc. + + pg_stat_toast for details. + + + pg_stat_slrupg_stat_slru One row per SLRU, showing statistics of operations. See @@ -4933,6 +4944,158 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i + + pg_stat_toast + + + pg_stat_toast + + + + The pg_stat_toast view will contain + one row for each column of variable size that has been TOASTed since + the last statistics reset. The parameter + controls whether TOAST activities are tracked or n
Re: PATCH: add "--config-file=" option to pg_rewind
Am 01.03.22 um 05:18 schrieb Michael Paquier: On Mon, Feb 28, 2022 at 08:48:23PM +0100, Gunnar "Nick" Bluth wrote: So, how should we call the global "find the * 'postgres' executable and boil out if that fails" function? char postgres_exec_path[MAXPGPATH] = findPostgresExec(); ? That would mean only a couple of lines gained, and the readability gained is minimal, so I'd be fine to keep the code as-is. I am not sure about the full patch set yet, but the refactoring of the commands to use PQExpBuffer is good by itself, so I have extracted this part of the patch and applied that for now. Which made the remaining patch v5 (which also appends the --config-file for ensureCleanShutdown) much cleaner :) Cheers, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Catodiff --git a/doc/src/sgml/ref/pg_rewind.sgml b/doc/src/sgml/ref/pg_rewind.sgml index 33e6bb64ad..d0278e9b46 100644 --- a/doc/src/sgml/ref/pg_rewind.sgml +++ b/doc/src/sgml/ref/pg_rewind.sgml @@ -241,6 +241,18 @@ PostgreSQL documentation + + --config-file=filepath + + +When using the -c / --restore-target-wal option, the restore_command +is extracted from the configuration of the target cluster. If the postgresql.conf +of that cluster is not in the target pgdata directory (or if you want to use an alternative config), +use this option to provide a (relative or absolute) path to the postgresql.conf to be used. + + + + --debug diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c index b39b5c1aac..294353a9d5 100644 --- a/src/bin/pg_rewind/pg_rewind.c +++ b/src/bin/pg_rewind/pg_rewind.c @@ -61,6 +61,7 @@ char *datadir_target = NULL; char *datadir_source = NULL; char *connstr_source = NULL; char *restore_command = NULL; +char *config_file = NULL; static bool debug = false; bool showprogress = false; @@ -87,6 +88,7 @@ usage(const char *progname) printf(_("Options:\n")); printf(_(" -c, --restore-target-wal use restore_command in target configuration to\n" " retrieve WAL files from archives\n")); + printf(_(" --config-file=FILE path to postgresql.conf if outside target-pgdata (for -c)\n")); printf(_(" -D, --target-pgdata=DIRECTORY existing data directory to modify\n")); printf(_(" --source-pgdata=DIRECTORY source data directory to synchronize with\n")); printf(_(" --source-server=CONNSTRsource server to synchronize with\n")); @@ -121,6 +123,7 @@ main(int argc, char **argv) {"no-sync", no_argument, NULL, 'N'}, {"progress", no_argument, NULL, 'P'}, {"debug", no_argument, NULL, 3}, + {"config-file", required_argument, NULL, 5}, {NULL, 0, NULL, 0} }; int option_index; @@ -205,6 +208,10 @@ main(int argc, char **argv) case 4: no_ensure_shutdown = true; break; + + case 5: +config_file = pg_strdup(optarg); +break; } } @@ -1060,7 +1067,11 @@ getRestoreCommand(const char *argv0) /* add -D switch, with properly quoted data directory */ appendPQExpBufferStr(postgres_cmd, " -D "); appendShellString(postgres_cmd, datadir_target); - + if (config_file != NULL) + { + appendPQExpBufferStr(postgres_cmd, " --config_file="); + appendShellString(postgres_cmd, config_file); + } /* add -C switch, for restore_command */ appendPQExpBufferStr(postgres_cmd, " -C restore_command"); @@ -1138,6 +1149,11 @@ ensureCleanShutdown(const char *argv0) /* add set of options with properly quoted data directory */ appendPQExpBufferStr(postgres_cmd, " --single -F -D "); appendShellString(postgres_cmd, datadir_target); + if (config_file != NULL) + { + appendPQExpBufferStr(postgres_cmd, " --config_file="); + appendShellString(postgres_cmd, config_file); + } /* finish with the database name, and a properly quoted redirection */ appendPQExpBufferStr(postgres_cmd, " template1 < "); diff --git a/src/bin/pg_rewind/t/001_basic.pl b/src/bin/pg_rewind/t/001_basic.pl index db9201f38e..3c395ece12 100644 --- a/src/bin/pg_rewind/t/001_basic.pl +++ b/src/bin/pg_rewind/t/001_basic.pl @@ -190,5 +190,6 @@ in primary, before promotion run_test('local'); run_test('remote'); run_test('archive'); +run_test('archive_cli'); done_testing(); diff --git a/src/bin/pg_rewind/t/RewindTest.pm b/src/bin/pg_rewind/t/RewindTest.pm index 5651602858..f93e138b35 100644 --- a/src/bin/pg_rewind/t/RewindTest.pm +++ b/src/bin/pg_rewind/t/RewindTest.pm @@ -208,6 +208,68 @@ sub promote_standby return; } +
Re: PATCH: add "--config-file=" option to pg_rewind
Am 28.02.22 um 12:56 schrieb Michael Paquier: On Sun, Feb 27, 2022 at 02:30:33PM +0100, Gunnar "Nick" Bluth wrote: That's universally true ;-) -# Internal routine to enable archive recovery command on a standby node +# Internal routine to enable archive recovery command on a standby node. +# Returns generated restore_command. sub enable_restoring { my ($self, $root_node, $standby) = @_; @@ -1103,7 +1104,7 @@ restore_command = '$copy_command' { $self->set_recovery_mode(); } - return; + return $copy_command; I don't like this change. This makes an existing routine do some extra work for something it is not designed for. You could get this information with a postgres -C command, for example. Now, you cannot use postgres -C because of.. Reasons (1a9d802). But you could use a pg_ctl command for the same purpose. I guess that we'd better refactor this into a new routine of Cluster.pm where we execute a pg_ctl command and return both stdout and stderr back to the caller? Turns out this change isn't needed anyway (it was copied from the other patch). Afterall, the patch is about extracting the restore_command from the config, so... The TAP part could be refactored on its own. Agreed. I've struggled quite a bit even understanding what's happening in there ;-) +In case the postgresql.conf of your target cluster is not in the +target pgdata and you want to use the -c / --restore-target-wal option, +provide a (relative or absolute) path to the postgresql.conf using this option. This could do a better job to explain in which cases this option can be used for. You could say something like that: "This option specifies a path to a configuration file to be used when starting the target cluster. This makes easier the use of -c/--restore-target-wal by setting restore_command in the extra configuration file specified via this option." I reviewed the wording and think it is pretty universal now. Hmm. What about the target cluster started in --single mode as of ensureCleanShutdown()? Should we try to apply this option also in this case? I'd say so; as far as I can tell, "postgres" would deny starting a (Debian/non-standard) cluster the way the code is right now. Which led me to realize we have /* find postgres executable */ rc = find_other_exec(argv0, "postgres", PG_BACKEND_VERSIONSTR, postgres_exec_path); in getRestoreCommand _and_ /* locate postgres binary */ if ((ret = find_other_exec(argv0, "postgres", PG_BACKEND_VERSIONSTR, exec_path)) < 0) in ensureCleanShutdown. Both with the same error messages, AFAICS. D'oh... can of worms. So, how should we call the global "find the * 'postgres' executable and boil out if that fails" function? char postgres_exec_path[MAXPGPATH] = findPostgresExec(); ? Anyway, v4 attached so we can settle on the tests and wording... -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Catodiff --git a/doc/src/sgml/ref/pg_rewind.sgml b/doc/src/sgml/ref/pg_rewind.sgml index 33e6bb64ad..d0278e9b46 100644 --- a/doc/src/sgml/ref/pg_rewind.sgml +++ b/doc/src/sgml/ref/pg_rewind.sgml @@ -241,6 +241,18 @@ PostgreSQL documentation + + --config-file=filepath + + +When using the -c / --restore-target-wal option, the restore_command +is extracted from the configuration of the target cluster. If the postgresql.conf +of that cluster is not in the target pgdata directory (or if you want to use an alternative config), +use this option to provide a (relative or absolute) path to the postgresql.conf to be used. + + + + --debug diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c index efb82a4034..b8c92c5590 100644 --- a/src/bin/pg_rewind/pg_rewind.c +++ b/src/bin/pg_rewind/pg_rewind.c @@ -23,6 +23,7 @@ #include "common/restricted_token.h" #include "common/string.h" #include "fe_utils/recovery_gen.h" +#include "fe_utils/string_utils.h" #include "file_ops.h" #include "filemap.h" #include "getopt_long.h" @@ -60,6 +61,7 @@ char *datadir_target = NULL; char *datadir_source = NULL; char *connstr_source = NULL; char *restore_command = NULL; +char *config_file = NULL; static bool debug = false; bool showprogress = false; @@ -86,6 +88,7 @@ usage(const char *progname) printf(_(&q
Re: PATCH: add "--config-file=" option to pg_rewind
Am 27.02.22 um 13:06 schrieb Michael Paquier: On Sat, Feb 26, 2022 at 09:55:20AM +0100, Gunnar "Nick" Bluth wrote: Am 26.02.22 um 06:51 schrieb Michael Paquier: Shouldn't this one use appendShellString() on config_file? It probably should, yes. I don't fancy this repetitive code myself. But then, pg_rewind as a whole could use an overhaul. I don't see any use of PQExpBuffer in it, but a lot of char* ... Having a lot of char* does not necessarily mean that all of them have to be changed to accomodate with PQExpBuffer. My guess that this is a case-by-case, where we should apply that in places where it makes the code cleaner to follow. In the case of your patch though, the two areas changed would make the logic correct, instead, because we have to apply correct quoting rules to any commands executed. Alas! v3 attached. GSOC project? ;-) It does not seem so, though there are surely more areas that could gain in clarity. That's universally true ;-) Best regards, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Catodiff --git a/doc/src/sgml/ref/pg_rewind.sgml b/doc/src/sgml/ref/pg_rewind.sgml index 33e6bb64ad..31a1a1dedb 100644 --- a/doc/src/sgml/ref/pg_rewind.sgml +++ b/doc/src/sgml/ref/pg_rewind.sgml @@ -241,6 +241,17 @@ PostgreSQL documentation + + --config-file=filepath + + +In case the postgresql.conf of your target cluster is not in the +target pgdata and you want to use the -c / --restore-target-wal option, +provide a (relative or absolute) path to the postgresql.conf using this option. + + + + --debug diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c index efb82a4034..b8c92c5590 100644 --- a/src/bin/pg_rewind/pg_rewind.c +++ b/src/bin/pg_rewind/pg_rewind.c @@ -23,6 +23,7 @@ #include "common/restricted_token.h" #include "common/string.h" #include "fe_utils/recovery_gen.h" +#include "fe_utils/string_utils.h" #include "file_ops.h" #include "filemap.h" #include "getopt_long.h" @@ -60,6 +61,7 @@ char *datadir_target = NULL; char *datadir_source = NULL; char *connstr_source = NULL; char *restore_command = NULL; +char *config_file = NULL; static bool debug = false; bool showprogress = false; @@ -86,6 +88,7 @@ usage(const char *progname) printf(_("Options:\n")); printf(_(" -c, --restore-target-wal use restore_command in target configuration to\n" " retrieve WAL files from archives\n")); + printf(_(" --config-file=FILE path to postgresql.conf if outside target-pgdata (for -c)\n")); printf(_(" -D, --target-pgdata=DIRECTORY existing data directory to modify\n")); printf(_(" --source-pgdata=DIRECTORY source data directory to synchronize with\n")); printf(_(" --source-server=CONNSTRsource server to synchronize with\n")); @@ -120,6 +123,7 @@ main(int argc, char **argv) {"no-sync", no_argument, NULL, 'N'}, {"progress", no_argument, NULL, 'P'}, {"debug", no_argument, NULL, 3}, + {"config-file", required_argument, NULL, 5}, {NULL, 0, NULL, 0} }; int option_index; @@ -204,6 +208,10 @@ main(int argc, char **argv) case 4: no_ensure_shutdown = true; break; + + case 5: +config_file = pg_strdup(optarg); +break; } } @@ -1016,8 +1024,8 @@ getRestoreCommand(const char *argv0) { int rc; char postgres_exec_path[MAXPGPATH], -postgres_cmd[MAXPGPATH], cmd_output[MAXPGPATH]; + PQExpBuffer postgres_cmd; if (!restore_wal) return; @@ -1051,11 +1059,21 @@ getRestoreCommand(const char *argv0) * Build a command able to retrieve the value of GUC parameter * restore_command, if set. */ - snprintf(postgres_cmd, sizeof(postgres_cmd), - "\"%s\" -D \"%s\" -C restore_command", - postgres_exec_path, datadir_target); + postgres_cmd = createPQExpBuffer(); - if (!pipe_read_line(postgres_cmd, cmd_output, sizeof(cmd_output))) + /* path to postgres, properly quoted */ + appendShellString(postgres_cmd, postgres_exec_path); + + appendPQExpBufferStr(postgres_cmd, " -D "); + appendShellString(postgres_cmd, datadir_target); + if (config_file != NULL) + { + appendPQExpBufferStr(postgres_cmd, " --config_file="); + appendShellString(postgres_cmd, config_file); + } + appendPQExpBufferStr(postgres_cmd, " -C restore_command"); + + if (!pipe_read_line(postgres_cmd->data, cmd_output, sizeof(cmd_output))) exit(1); (void) pg_strip_crlf(cmd_output); diff --g
Re: PATCH: add "--config-file=" option to pg_rewind
Am 26.02.22 um 06:51 schrieb Michael Paquier: On Fri, Feb 25, 2022 at 10:35:49AM +0100, Gunnar "Nick" Bluth wrote: Am 24.02.22 um 14:46 schrieb Daniel Gustafsson: Actually, I think this looks like a saner approach. Putting a config setting in two place (postgresql.conf and on the commandline for pg_rewind) is a recipe for them diverging. FWIW, I have a bad feeling about passing down directly a command through an option itself part of a command, so what's discussed on this thread is refreshing. Ta. + } else { + snprintf(postgres_cmd, sizeof(postgres_cmd), +"\"%s\" -D \"%s\" --config_file=\"%s\" -C restore_command", +postgres_exec_path, datadir_target, config_file); + } Shouldn't this one use appendShellString() on config_file? It probably should, yes. I don't fancy this repetitive code myself. But then, pg_rewind as a whole could use an overhaul. I don't see any use of PQExpBuffer in it, but a lot of char* ... I myself am not a C hacker (as you can probably tell already) and don't really feel fit (enough) to rewrite pg_rewind.c to use fe_utils/string_utils.h :/ I might give it a try anyhow, but that may be a bit... heavy,,, just to add one option? While we're at it (I'm really good at opening cans of worms): 09:47 $ grep -r -l --include \*.c fe_utils/string_utils.h src/bin/ | wc -l 28 09:48 $ grep -r -L --include \*.c fe_utils/string_utils.h src/bin/ | wc -l 66 GSOC project? ;-) Best, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Cato
Re: PATCH: add "--config-file=" option to pg_rewind
Am 24.02.22 um 14:46 schrieb Daniel Gustafsson: On 24 Feb 2022, at 14:43, Gunnar Nick Bluth wrote: That looks just as good to me, and it already has tests, so I'll happily step down! Actually, I think this looks like a saner approach. Putting a config setting in two place (postgresql.conf and on the commandline for pg_rewind) is a recipe for them diverging. Well, well, reverted https://commitfest.postgresql.org/37/3562/ back to "needs review" and moved it to "Replication & Recovery". Patch v2 includes: * doc improvements as suggested * tests shamelessly copied & adapted from https://commitfest.postgresql.org/37/3213/ Since the need is pretty obvious (I *think* the Debian-ish userbase is quite significant alone), I'd love to see one of the approaches making it into 15! Best regards, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Catodiff --git a/doc/src/sgml/ref/pg_rewind.sgml b/doc/src/sgml/ref/pg_rewind.sgml index 33e6bb64ad..31a1a1dedb 100644 --- a/doc/src/sgml/ref/pg_rewind.sgml +++ b/doc/src/sgml/ref/pg_rewind.sgml @@ -241,6 +241,17 @@ PostgreSQL documentation + + --config-file=filepath + + +In case the postgresql.conf of your target cluster is not in the +target pgdata and you want to use the -c / --restore-target-wal option, +provide a (relative or absolute) path to the postgresql.conf using this option. + + + + --debug diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c index efb82a4034..d57a3a6852 100644 --- a/src/bin/pg_rewind/pg_rewind.c +++ b/src/bin/pg_rewind/pg_rewind.c @@ -60,6 +60,7 @@ char *datadir_target = NULL; char *datadir_source = NULL; char *connstr_source = NULL; char *restore_command = NULL; +char *config_file = NULL; static bool debug = false; bool showprogress = false; @@ -86,6 +87,7 @@ usage(const char *progname) printf(_("Options:\n")); printf(_(" -c, --restore-target-wal use restore_command in target configuration to\n" " retrieve WAL files from archives\n")); + printf(_(" --config-file=FILE path to postgresql.conf if outside target-pgdata (for -c)\n")); printf(_(" -D, --target-pgdata=DIRECTORY existing data directory to modify\n")); printf(_(" --source-pgdata=DIRECTORY source data directory to synchronize with\n")); printf(_(" --source-server=CONNSTRsource server to synchronize with\n")); @@ -120,6 +122,7 @@ main(int argc, char **argv) {"no-sync", no_argument, NULL, 'N'}, {"progress", no_argument, NULL, 'P'}, {"debug", no_argument, NULL, 3}, + {"config-file", required_argument, NULL, 5}, {NULL, 0, NULL, 0} }; int option_index; @@ -204,6 +207,10 @@ main(int argc, char **argv) case 4: no_ensure_shutdown = true; break; + + case 5: +config_file = pg_strdup(optarg); +break; } } @@ -1051,9 +1058,16 @@ getRestoreCommand(const char *argv0) * Build a command able to retrieve the value of GUC parameter * restore_command, if set. */ - snprintf(postgres_cmd, sizeof(postgres_cmd), + if (config_file == NULL) + { + snprintf(postgres_cmd, sizeof(postgres_cmd), "\"%s\" -D \"%s\" -C restore_command", postgres_exec_path, datadir_target); + } else { + snprintf(postgres_cmd, sizeof(postgres_cmd), + "\"%s\" -D \"%s\" --config_file=\"%s\" -C restore_command", + postgres_exec_path, datadir_target, config_file); + } if (!pipe_read_line(postgres_cmd, cmd_output, sizeof(cmd_output))) exit(1); diff --git a/src/bin/pg_rewind/t/001_basic.pl b/src/bin/pg_rewind/t/001_basic.pl index db9201f38e..3c395ece12 100644 --- a/src/bin/pg_rewind/t/001_basic.pl +++ b/src/bin/pg_rewind/t/001_basic.pl @@ -190,5 +190,6 @@ in primary, before promotion run_test('local'); run_test('remote'); run_test('archive'); +run_test('archive_cli'); done_testing(); diff --git a/src/bin/pg_rewind/t/RewindTest.pm b/src/bin/pg_rewind/t/RewindTest.pm index 5651602858..94d413dcb6 100644 --- a/src/bin/pg_rewind/t/RewindTest.pm +++ b/src/bin/pg_rewind/t/RewindTest.pm @@ -208,6 +208,69 @@ sub promote_standby return; } +sub run_pg_rewind_archive +{ + my $test_mode = shift; + my $restore_command; + my $primary_pgdata = $node_primary->data_dir; + + # Remove the existing archive directory and move all WAL + # segments from the old primary to the archives. These + # will be used by pg_rewind. + rmtree($node_primary->archive_dir); + PostgreSQL::Test::RecursiveCopy::copypath($node_primary-&
Re: PATCH: add "--config-file=" option to pg_rewind
Am 24.02.2022 um 14:08 schrieb Daniel Gustafsson: On 24 Feb 2022, at 10:27, Alexander Kukushkin wrote: Hello Gunnar, On Thu, 24 Feb 2022 at 10:03, Aleksander Alekseev wrote: wants to use the "-c" option on a typical Debian/Ubuntu installation (where the config resides below /etc/postgresql/), pg_rewind needs a way to be told where the postgresql.conf actually is. The attached patch implements this as an option. [...] Good catch! Yeah, it is a known problem, and there was already another patch trying to address it [1]. There is yet another related patch which provides a parameter to pass in restore_command in cases where postgresq.conf isn't reachable: https://commitfest.postgresql.org/37/3213/ That looks just as good to me, and it already has tests, so I'll happily step down! (Note to myself: check the CF first next time ;-) -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Cato
PATCH: add "--config-file=" option to pg_rewind
Hi! During a Patroni PR discussion (https://github.com/zalando/patroni/pull/2225), we realised that if one wants to use the "-c" option on a typical Debian/Ubuntu installation (where the config resides below /etc/postgresql/), pg_rewind needs a way to be told where the postgresql.conf actually is. The attached patch implements this as an option. I'm extremely unhappy that I called it "--config-file" here, while "postgres" itself wants "--config_file". But as the other dashed options of pg_rewind are, well, dashed and not underscored, it seemed to be better that the other way... As the "-c" feature appeared in 12 and existing Debian installations are unable to use it right now, I suggest to even backport the patch. Oh, and I'm still not subscribed to -hackers, so the usual "please CC me" applies! Best regards, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Catodiff --git a/doc/src/sgml/ref/pg_rewind.sgml b/doc/src/sgml/ref/pg_rewind.sgml index 33e6bb64ad..f80adbf5ae 100644 --- a/doc/src/sgml/ref/pg_rewind.sgml +++ b/doc/src/sgml/ref/pg_rewind.sgml @@ -241,6 +241,17 @@ PostgreSQL documentation + + --config-file=filepath + + +In case the postgresql.conf of your target cluster is not in the +target pgdata and you want to use the -c option, +provide a (relative or absolute) path to the postgresql.conf using this option. + + + + --debug diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c index efb82a4034..d57a3a6852 100644 --- a/src/bin/pg_rewind/pg_rewind.c +++ b/src/bin/pg_rewind/pg_rewind.c @@ -60,6 +60,7 @@ char *datadir_target = NULL; char *datadir_source = NULL; char *connstr_source = NULL; char *restore_command = NULL; +char *config_file = NULL; static bool debug = false; bool showprogress = false; @@ -86,6 +87,7 @@ usage(const char *progname) printf(_("Options:\n")); printf(_(" -c, --restore-target-wal use restore_command in target configuration to\n" " retrieve WAL files from archives\n")); + printf(_(" --config-file=FILE path to postgresql.conf if outside target-pgdata (for -c)\n")); printf(_(" -D, --target-pgdata=DIRECTORY existing data directory to modify\n")); printf(_(" --source-pgdata=DIRECTORY source data directory to synchronize with\n")); printf(_(" --source-server=CONNSTRsource server to synchronize with\n")); @@ -120,6 +122,7 @@ main(int argc, char **argv) {"no-sync", no_argument, NULL, 'N'}, {"progress", no_argument, NULL, 'P'}, {"debug", no_argument, NULL, 3}, + {"config-file", required_argument, NULL, 5}, {NULL, 0, NULL, 0} }; int option_index; @@ -204,6 +207,10 @@ main(int argc, char **argv) case 4: no_ensure_shutdown = true; break; + + case 5: +config_file = pg_strdup(optarg); +break; } } @@ -1051,9 +1058,16 @@ getRestoreCommand(const char *argv0) * Build a command able to retrieve the value of GUC parameter * restore_command, if set. */ - snprintf(postgres_cmd, sizeof(postgres_cmd), + if (config_file == NULL) + { + snprintf(postgres_cmd, sizeof(postgres_cmd), "\"%s\" -D \"%s\" -C restore_command", postgres_exec_path, datadir_target); + } else { + snprintf(postgres_cmd, sizeof(postgres_cmd), + "\"%s\" -D \"%s\" --config_file=\"%s\" -C restore_command", + postgres_exec_path, datadir_target, config_file); + } if (!pipe_read_line(postgres_cmd, cmd_output, sizeof(cmd_output))) exit(1);
Re: [PATCH] pg_stat_toast v6
Am 03.01.22 um 22:23 schrieb Alvaro Herrera: Overall I think this is a good feature to have; assessing the need for compression is important for tuning, so +1 for the goal of the patch. Much appreciated! I didn't look into the patch carefully, but here are some minor comments: On 2022-Jan-03, Gunnar "Nick" Bluth wrote: @@ -229,7 +230,9 @@ toast_tuple_try_compression(ToastTupleContext *ttc, int attribute) Datum *value = >ttc_values[attribute]; Datum new_value; ToastAttrInfo *attr = >ttc_attr[attribute]; + instr_time start_time; + INSTR_TIME_SET_CURRENT(start_time); new_value = toast_compress_datum(*value, attr->tai_compression); if (DatumGetPointer(new_value) != NULL) Don't INSTR_TIME_SET_CURRENT unconditionally; in some systems it's an expensive syscall. Find a way to only do it if the feature is enabled. Yeah, I was worried about that (and asking if it would be required) already. Adding the check was easier than I expected, though I'm absolutely clueless if I did it right! #include "pgstat.h" extern PGDLLIMPORT bool pgstat_track_toast; This also suggests that perhaps it'd be a good idea to allow this to be enabled for specific tables only, rather than system-wide. (Maybe in order for stats to be collected, the user should have to both set the GUC option *and* set a per-table option? Not sure.) That would of course be nice, but I seriously doubt the required additional logic would be justified. The patch currently tampers with as few internal structures as possible, and for good reason... ;-) @@ -82,10 +82,12 @@ typedef enum StatMsgType PGSTAT_MTYPE_DEADLOCK, PGSTAT_MTYPE_CHECKSUMFAILURE, PGSTAT_MTYPE_REPLSLOT, + PGSTAT_MTYPE_CONNECTION, I think this new enum value doesn't belong in this patch. Yeah, did I mention I'm struggling with rebasing? ;-| +/* -- + * PgStat_ToastEntry Per-TOAST-column info in a MsgFuncstat + * -- Not in "a MsgFuncstat", right? Obviously... fixed! +-- function to wait for counters to advance +create function wait_for_stats() returns void as $$ I don't think we want a separate copy of wait_for_stats; see commit fe60b67250a3 and the discussion leading to it. Maybe you'll want to move the test to stats.sql. I'm not sure what to say about relying on Did so. LZ4; maybe you'll want to leave that part out, and just verify in an LZ4-enabled build that some 'l' entry exists. BTW, don't we have any decent way to turn that 'l' into a more reasonable, descriptive string? Also, perhaps make the view-defining query turn an empty compression method into whatever the default is. I'm not even sure that having it in there is useful at all. It's simply JOINed in from pg_attribute. Which is where I'd see that "make it look nicer" change happening, tbth. ;-) Or even better, stats collection should store the real compression method used rather than empty string, to avoid confusing things when some stats are collected, then the default is changed, then some more stats are collected. I was thinking about that already, but came to the conclusion that it a) would blow up the size of these statistics by quite a bit and b) would be quite tricky to display in a useful way. I mean, the use case of track_toast is pretty limited anyway; you'll probably turn this feature on with a specific column in mind, of which you'll probably know which compression method is used at the time. Thanks for the feedback! v7 attached. -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - CatoFrom f07213d68c646ba64757e551e3587aab0ff221df Mon Sep 17 00:00:00 2001 From: "Gunnar \"Nick\" Bluth" Date: Tue, 4 Jan 2022 12:08:32 +0100 Subject: [PATCH v7] pg_stat_toast v7 --- doc/src/sgml/config.sgml | 26 ++ doc/src/sgml/monitoring.sgml | 163 ++ doc/src/sgml/storage.sgml | 12 +- src/backend/access/table/toast_helper.c | 44 ++- src/backend/catalog/system_views.sql | 20 ++ src/backend/postmaster/pgstat.c | 305 +- src/backend/utils/adt/pgstatfuncs.c | 72 + src/backend/utils/misc/guc.c | 9 + src/backend/utils/misc/postgresql.conf.sample | 1 + src/include/catalog/pg_proc.dat | 25 ++ src/include/pgstat.h | 107 ++ src/test/regress/expected/rules.out | 17 + src/test/regress/expected/stats.out | 62 src/test/regress/sql/stats.sql| 28 ++ 14 files changed, 882 insertions(+), 9 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/
Re: [PATCH] pg_stat_toast v6
Am 03.01.22 um 22:03 schrieb Justin Pryzby: +pgstat_report_toast_activity(Oid relid, int attr, + bool externalized, + bool compressed, + int32 old_size, + int32 new_size, ... + if (new_size) + { + htabent->t_counts.t_size_orig+=old_size; + if (new_size) + { I guess one of these is supposed to say old_size? Didn't make a difference, tbth, as they'd both be 0 or have a value. Streamlined the whole block now. +CREATE TABLE toast_test (cola TEXT, colb TEXT COMPRESSION lz4, colc TEXT , cold TEXT, cole TEXT); Is there a reason this uses lz4 ? I thought it might help later on, but alas! the LZ4 column mainly broke things, so I removed it for the time being. If that's needed for stable results, I think you should use pglz, since that's what's guaranteed to exist. I imagine LZ4 won't be required any time soon, seeing as zlib has never been required. Yeah. It didn't prove anything whatsoever. +Be aware that this feature, depending on the amount of TOASTable columns in +your databases, may significantly increase the size of the statistics files +and the workload of the statistics collector. It is recommended to only +temporarily activate this to assess the right compression and storage method +for (a) column(s). saying "a column" is fine Changed. + schemaname name + Attribute (column) number in the relation + relname name + + compressmethod char + + + Compression method of the attribute (empty means default) One thing to keep in mind is that the current compression method is only used for *new* data - old data can still use the old compression method. It probably doesn't need to be said here, but maybe you can refer to the docs about that in alter_table. + Number of times the compression was successful (gained a size reduction) It's more clear to say "was reduced in size" Changed the wording a bit, I guess it is clear enough now. The question is if the column should be there at all, as it's simply fetched from pg_attribute... + /* we assume this inits to all zeroes: */ + static const PgStat_ToastCounts all_zeroes; You don't have to assume; static/global allocations are always zero unless otherwise specified. Copy-pasta ;-) Removed. Thx for looking into this! Patch v7 will be in the next mail. -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Cato
Re: [PATCH] pg_stat_toast v6
Am 03.01.22 um 20:11 schrieb Alvaro Herrera: On 2022-Jan-03, Gunnar "Nick" Bluth wrote: 9:38 $ git format-patch PGDG/master -v5 -o .. ../v5-0001-ping-pong-of-thougths.patch ../v5-0002-ping-pong-of-thougths.patch ../v5-0003-adds-some-debugging-messages-in-toast_helper.c.patch ... Hmm, in such cases I would suggest to create a separate branch and then "git merge --squash" for submission. You can keep your development branch separate, with other merges if you want. I've found this to be easier to manage, though I don't always follow that workflow myself. Using --stdout does help ;-) I wonder why "track_toast.sql" test fails on Windows (with "ERROR: compression method lz4 not supported"), but "compression.sql" doesn't. Any hints? Anyway, I shamelessly copied "wait_for_stats()" from the "stats.sql" file and the tests _should_ now work at least on the platforms with lz4. v6 attached! -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - CatoFrom e743587fbd8f6592bbfa15f53733f79c405000e2 Mon Sep 17 00:00:00 2001 From: "Gunnar \"Nick\" Bluth" Date: Mon, 3 Jan 2022 20:35:05 +0100 Subject: [PATCH v6] pg_stat_toast v6 --- doc/src/sgml/config.sgml | 26 ++ doc/src/sgml/monitoring.sgml | 163 + doc/src/sgml/storage.sgml | 12 +- src/backend/access/table/toast_helper.c | 24 ++ src/backend/catalog/system_views.sql | 20 ++ src/backend/postmaster/pgstat.c | 309 +- src/backend/utils/adt/pgstatfuncs.c | 72 src/backend/utils/misc/guc.c | 9 + src/backend/utils/misc/postgresql.conf.sample | 1 + src/include/catalog/pg_proc.dat | 25 ++ src/include/pgstat.h | 108 ++ src/test/regress/expected/rules.out | 17 + src/test/regress/expected/track_toast.out | 102 ++ src/test/regress/parallel_schedule| 2 +- src/test/regress/sql/track_toast.sql | 64 15 files changed, 946 insertions(+), 8 deletions(-) create mode 100644 src/test/regress/expected/track_toast.out create mode 100644 src/test/regress/sql/track_toast.sql diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index afbb6c35e3..fa40befc16 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -7668,6 +7668,32 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; + + track_toast (boolean) + + track_toast configuration parameter + + + + +Enables tracking of TOAST activities. +Compressions and externalizations are tracked. +The default is off. +Only superusers can change this setting. + + + + +Be aware that this feature, depending on the amount of TOASTable columns in +your databases, may significantly increase the size of the statistics files +and the workload of the statistics collector. It is recommended to only +temporarily activate this to assess the right compression and storage method +for (a) column(s). + + + + + stats_temp_directory (string) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 62f2a3332b..32d7818096 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -610,6 +610,17 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser yet included in pg_stat_user_functions). + + pg_stat_toastpg_stat_toast + + One row for each column that has ever been TOASTed (compressed and/or externalized). + Showing the number of externalizations, compression attempts / successes, compressed and + uncompressed sizes etc. + + pg_stat_toast for details. + + + pg_stat_slrupg_stat_slru One row per SLRU, showing statistics of operations. See @@ -4969,6 +4980,158 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i + + pg_stat_toast + + + pg_stat_toast + + + + The pg_stat_toast view will contain + one row for each column of variable size that has been TOASTed since + the last statistics reset. The parameter + controls whether TOAST activities are tracked or not. + + + + pg_stat_toast View + + + + + Column Type + + + Description + + + + + + + + schemaname name + + + Name of the schema the relation is in + + + + + +
Re: [PATCH] pg_stat_toast v0.4
Am 03.01.22 um 19:30 schrieb Alvaro Herrera: On 2022-Jan-03, Gunnar "Nick" Bluth wrote: Am 03.01.22 um 17:50 schrieb Justin Pryzby: Soon you will think this is fun :) As long as you're happy with plain patches like the attached one, I may ;-) Well, with a zero-byte patch, not very much ... D'oh BTW why do you number with a "0." prefix? It could just be "4" and "5" and so on. There's no value in two-part version numbers for patches. Also, may I suggest that "git format-patch -vN" with varying N is an easier way to generate patches to attach? Not when you have a metric ton of commits in the history... I'll hopefully find a way to start over soon :/ 9:38 $ git format-patch PGDG/master -v5 -o .. ../v5-0001-ping-pong-of-thougths.patch ../v5-0002-ping-pong-of-thougths.patch ../v5-0003-adds-some-debugging-messages-in-toast_helper.c.patch ../v5-0004-adds-some-groundwork-for-pg_stat_toast-to-pgstat..patch ../v5-0005-fixes-wrong-type-for-pgstat_track_toast-GUC.patch ../v5-0006-introduces-PgStat_BackendAttrIdentifier-OID-attr-.patch ../v5-0007-implements-and-calls-pgstat_report_toast_activity.patch ../v5-0008-Revert-adds-some-debugging-messages-in-toast_help.patch ../v5-0009-adds-more-detail-to-logging.patch ../v5-0010-adds-toastactivity-to-table-stats-and-many-helper.patch ../v5-0011-fixes-missed-replacement-in-comment.patch ../v5-0012-adds-SQL-support-functions.patch ../v5-0013-Add-SQL-functions.patch ../v5-0014-reset-to-HEAD.patch ../v5-0015-makes-DEBUG2-messages-more-precise.patch ../v5-0016-adds-timing-information-to-stats-and-view.patch ../v5-0017-adds-a-basic-set-of-tests.patch ../v5-0018-adds-a-basic-set-of-tests.patch ../v5-0019-chooses-a-PGSTAT_TOAST_HASH_SIZE-of-64-changes-ha.patch ../v5-0020-removes-whitespace-trash.patch ../v5-0021-returns-to-PGDG-master-.gitignore.patch ../v5-0022-pg_stat_toast_v0.5.patch ../v5-0023-moves-tests-to-regress.patch But alas! INT versioning it is from now on! -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Catodiff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index afbb6c35e3..fa40befc16 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -7668,6 +7668,32 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; + + track_toast (boolean) + + track_toast configuration parameter + + + + +Enables tracking of TOAST activities. +Compressions and externalizations are tracked. +The default is off. +Only superusers can change this setting. + + + + +Be aware that this feature, depending on the amount of TOASTable columns in +your databases, may significantly increase the size of the statistics files +and the workload of the statistics collector. It is recommended to only +temporarily activate this to assess the right compression and storage method +for (a) column(s). + + + + + stats_temp_directory (string) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 62f2a3332b..32d7818096 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -610,6 +610,17 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser yet included in pg_stat_user_functions). + + pg_stat_toastpg_stat_toast + + One row for each column that has ever been TOASTed (compressed and/or externalized). + Showing the number of externalizations, compression attempts / successes, compressed and + uncompressed sizes etc. + + pg_stat_toast for details. + + + pg_stat_slrupg_stat_slru One row per SLRU, showing statistics of operations. See @@ -4969,6 +4980,158 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i + + pg_stat_toast + + + pg_stat_toast + + + + The pg_stat_toast view will contain + one row for each column of variable size that has been TOASTed since + the last statistics reset. The parameter + controls whether TOAST activities are tracked or not. + + + + pg_stat_toast View + + + + + Column Type + + + Description + + + + + + + + schemaname name + + + Name of the schema the relation is in + + + + + + reloid oid + + + OID of the relation + + + + + + attnum int + + + Attribute (column) number in the relation + + + + + + relname name + + +
Re: [PATCH] pg_stat_toast v0.4
Am 03.01.22 um 17:50 schrieb Justin Pryzby: On Mon, Jan 03, 2022 at 05:00:45PM +0100, Gunnar "Nick" Bluth wrote: Am 03.01.22 um 16:52 schrieb Gunnar "Nick" Bluth: pg_stat_toast_v0.4.patch attached. Note that the cfbot says this fails under windows Thanks for the heads up! http://cfbot.cputube.org/gunnar-quotnickquot-bluth.html ... [16:47:05.347] Could not determine contrib module type for track_toast [16:47:05.347] at src/tools/msvc/mkvcbuild.pl line 31. Not only Window$... as it turns out, one of the checks was pretty bogus. Kicked that one and instead wrote two (hopefully) meaningful ones. Also, I moved the tests to regress/, as they're not really for a module anyway. Let's see how this fares! Aaaand I attached a former version of the patch file... sorry, I'm kind of struggling with all the squashing/rebasing... Soon you will think this is fun :) As long as you're happy with plain patches like the attached one, I may ;-) All the best, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Cato
Re: [PATCH] pg_stat_toast v0.4
Am 03.01.22 um 16:52 schrieb Gunnar "Nick" Bluth: pg_stat_toast_v0.4.patch attached. Aaaand I attached a former version of the patch file... sorry, I'm kind of struggling with all the squashing/rebasing... -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - CatoFrom 560cfb4082804709d952b3f741b505025f400f97 Mon Sep 17 00:00:00 2001 From: "Gunnar \"Nick\" Bluth" Date: Mon, 3 Jan 2022 16:18:36 +0100 Subject: [PATCH] pg_stat_toast_v0.4 --- doc/src/sgml/config.sgml | 26 ++ doc/src/sgml/monitoring.sgml | 163 + doc/src/sgml/storage.sgml | 12 +- src/backend/access/table/toast_helper.c | 24 ++ src/backend/catalog/system_views.sql | 20 ++ src/backend/postmaster/pgstat.c | 309 +- src/backend/utils/adt/pgstatfuncs.c | 72 src/backend/utils/misc/guc.c | 9 + src/backend/utils/misc/postgresql.conf.sample | 1 + src/include/catalog/pg_proc.dat | 25 ++ src/include/pgstat.h | 108 ++ src/test/modules/Makefile | 1 + src/test/modules/track_toast/Makefile | 18 + .../track_toast/expected/track_toast.out | 55 .../modules/track_toast/sql/track_toast.sql | 20 ++ src/test/modules/track_toast/track_toast.conf | 1 + src/test/regress/expected/rules.out | 17 + 17 files changed, 874 insertions(+), 7 deletions(-) create mode 100644 src/test/modules/track_toast/Makefile create mode 100644 src/test/modules/track_toast/expected/track_toast.out create mode 100644 src/test/modules/track_toast/sql/track_toast.sql create mode 100644 src/test/modules/track_toast/track_toast.conf diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index afbb6c35e3..fa40befc16 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -7668,6 +7668,32 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; + + track_toast (boolean) + + track_toast configuration parameter + + + + +Enables tracking of TOAST activities. +Compressions and externalizations are tracked. +The default is off. +Only superusers can change this setting. + + + + +Be aware that this feature, depending on the amount of TOASTable columns in +your databases, may significantly increase the size of the statistics files +and the workload of the statistics collector. It is recommended to only +temporarily activate this to assess the right compression and storage method +for (a) column(s). + + + + + stats_temp_directory (string) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 62f2a3332b..32d7818096 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -610,6 +610,17 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser yet included in pg_stat_user_functions). + + pg_stat_toastpg_stat_toast + + One row for each column that has ever been TOASTed (compressed and/or externalized). + Showing the number of externalizations, compression attempts / successes, compressed and + uncompressed sizes etc. + + pg_stat_toast for details. + + + pg_stat_slrupg_stat_slru One row per SLRU, showing statistics of operations. See @@ -4969,6 +4980,158 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i + + pg_stat_toast + + + pg_stat_toast + + + + The pg_stat_toast view will contain + one row for each column of variable size that has been TOASTed since + the last statistics reset. The parameter + controls whether TOAST activities are tracked or not. + + + + pg_stat_toast View + + + + + Column Type + + + Description + + + + + + + + schemaname name + + + Name of the schema the relation is in + + + + + + reloid oid + + + OID of the relation + + + + + + attnum int + + + Attribute (column) number in the relation + + + + + + relname name + + + Name of the relation + + + + + + attname name + + + Name of the attribute (column) + + + + + + storagemethod char + + + Storage method of the attribute + +
Re: [PATCH] pg_stat_toast v0.4
Am 21.12.21 um 13:51 schrieb kuroda.hay...@fujitsu.com: Dear Gunnar, Hayato-san, all, thanks for the feedback! * gathering timing information Here is a minor comment: I'm not sure when we should start to measure time. If you want to record time spent for compressing, toast_compress_datum() should be sandwiched by INSTR_TIME_SET_CURRENT() and caclurate the time duration. Currently time_spent is calcurated in the pgstat_report_toast_activity(), but this have a systematic error. If you want to record time spent for inserting/updating, however, I think we should start measuring at the upper layer, maybe heap_toast_insert_or_update(). Yes, both toast_compress_datum() and toast_save_datum() are sandwiched the way you mentioned, as that's exactly what we want to measure (time spent doing compression and/or externalizing data). Implementation-wise, I (again) took "track_functions" as a template there, assuming that jumping into pgstat_report_toast_activity() and only then checking if "track_toast = on" isn't too costly (we call pgstat_init_function_usage() and pgstat_end_function_usage() a _lot_). I did miss though that INSTR_TIME_SET_CURRENT(time_spent); should be called right after entering pgstat_report_toast_activity(), as that might need additional clock ticks for setting up the hash etc. That's fixed now. What I can't assess is the cost of the unconditional call to INSTR_TIME_SET_CURRENT(start_time) in both toast_tuple_try_compression() and toast_tuple_externalize(). Would it be wise (cheaper) to add a check like if (pgstat_track_toast) before querying the system clock? Any hints on how to write meaningful tests would be much appreciated! I.e., will a test I searched hints from PG sources, and I thought that modules/ subdirectory can be used. Following is the example: https://github.com/postgres/postgres/tree/master/src/test/modules/commit_ts I attached a patch to create a new test. Could you rewrite the sql and the output file? Thanks for the kickstart ;-) Some tests (as meaningful as they may get, I'm afraid) are now in src/test/modules/track_toast/. "make check-world" executes them successfully, although only after I introduced a "SELECT pg_sleep(1);" to them. pg_stat_toast_v0.4.patch attached. Best regards, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Catodiff --git a/.gitignore b/.gitignore index 794e35b73c..ca2ee84742 100644 --- a/.gitignore +++ b/.gitignore @@ -31,6 +31,8 @@ win32ver.rc *.exe lib*dll.def lib*.pc +tags +/.vscode # Local excludes in root directory /GNUmakefile @@ -42,3 +44,5 @@ lib*.pc /Debug/ /Release/ /tmp_install/ +.gitignore +pg_stat_toast.* diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index afbb6c35e3..fa40befc16 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -7668,6 +7668,32 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; + + track_toast (boolean) + + track_toast configuration parameter + + + + +Enables tracking of TOAST activities. +Compressions and externalizations are tracked. +The default is off. +Only superusers can change this setting. + + + + +Be aware that this feature, depending on the amount of TOASTable columns in +your databases, may significantly increase the size of the statistics files +and the workload of the statistics collector. It is recommended to only +temporarily activate this to assess the right compression and storage method +for (a) column(s). + + + + + stats_temp_directory (string) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 62f2a3332b..32d7818096 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -610,6 +610,17 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser yet included in pg_stat_user_functions). + + pg_stat_toastpg_stat_toast + + One row for each column that has ever been TOASTed (compressed and/or externalized). + Showing the number of externalizations, compression attempts / successes, compressed and + uncompressed sizes etc. + + pg_stat_toast for details. + + + pg_stat_slrupg_stat_slru One row per SLRU, showing statistics of operations. See @@ -4969,6 +4980,158 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i + + pg_stat_toast + + + pg_stat_toast + + + + The pg_stat_toast view will contain + one row for each column of variab
[PATCH] pg_stat_toast v0.3
Am 12.12.21 um 17:20 schrieb Gunnar "Nick" Bluth: Hello -hackers! Please have a look at the attached patch, which implements some statistics for TOAST. The attached v0.3 includes * a proper GUC "track_toast" incl. postgresql.conf.sample line * gathering timing information * the system view "pg_stat_toast" * naming improvements more than welcome! * columns "storagemethod" and "compressmethod" added as per Hayato Kuroda's suggestion * documentation (pointing out the potential impacts as per Andres Freund's reservations) Any hints on how to write meaningful tests would be much appreciated! I.e., will a test INSERTing some long text to a column raises the view counts and "compressedsize" is smaller than "originalsize" suffice?!? Cheers & best regards, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - CatoFrom e4f6b8a9b398ff08f76e0c02ca7a1201062beb78 Mon Sep 17 00:00:00 2001 From: "Gunnar \"Nick\" Bluth" Date: Mon, 20 Dec 2021 14:09:36 +0100 Subject: [PATCH] pg_stat_toast v0.3 * adds timing information to stats and view * adds system view * adds documentation * adds GUC parameter to postgresql.conf diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index afbb6c35e3..fa40befc16 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -7668,6 +7668,32 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; + + track_toast (boolean) + + track_toast configuration parameter + + + + +Enables tracking of TOAST activities. +Compressions and externalizations are tracked. +The default is off. +Only superusers can change this setting. + + + + +Be aware that this feature, depending on the amount of TOASTable columns in +your databases, may significantly increase the size of the statistics files +and the workload of the statistics collector. It is recommended to only +temporarily activate this to assess the right compression and storage method +for (a) column(s). + + + + + stats_temp_directory (string) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 62f2a3332b..32d7818096 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -610,6 +610,17 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser yet included in pg_stat_user_functions). + + pg_stat_toastpg_stat_toast + + One row for each column that has ever been TOASTed (compressed and/or externalized). + Showing the number of externalizations, compression attempts / successes, compressed and + uncompressed sizes etc. + + pg_stat_toast for details. + + + pg_stat_slrupg_stat_slru One row per SLRU, showing statistics of operations. See @@ -4969,6 +4980,158 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i + + pg_stat_toast + + + pg_stat_toast + + + + The pg_stat_toast view will contain + one row for each column of variable size that has been TOASTed since + the last statistics reset. The parameter + controls whether TOAST activities are tracked or not. + + + + pg_stat_toast View + + + + + Column Type + + + Description + + + + + + + + schemaname name + + + Name of the schema the relation is in + + + + + + reloid oid + + + OID of the relation + + + + + + attnum int + + + Attribute (column) number in the relation + + + + + + relname name + + + Name of the relation + + + + + + attname name + + + Name of the attribute (column) + + + + + + storagemethod char + + + Storage method of the attribute + + + + + + externalized bigint + + + Number of times this attribute was externalized (pushed to TOAST relation) + + + + + + compressmethod char + + + Compression method of the attribute (empty means default) + + + + + + compressattempts bigint + + + Number of times this attribute was compressed + + + + + + compresssuccesses bigint + + + Number of times the
Re: [PATCH] pg_stat_toast
Am 20.12.2021 um 04:20 schrieb kuroda.hay...@fujitsu.com: Dear Gunnar, Hi Kuroda-San! postgres=# CREATE TABLE test (i int, lz4 text COMPRESSION lz4, std text); postgres=# INSERT INTO test SELECT i,repeat(md5(i::text),100),repeat(md5(i::text),100) FROM generate_series(0,10) x(i); postgres=# SELECT * FROM pg_stat_toast WHERE schemaname = 'public'; -[ RECORD 1 ]+-- schemaname | public reloid | 16829 attnum | 2 relname | test attname | lz4 externalizations | 0 compressions | 11 compressionsuccesses | 11 compressionsizesum | 6299710 originalsizesum | 320403204 -[ RECORD 2 ]+-- schemaname | public reloid | 16829 attnum | 3 relname | test attname | std externalizations | 0 compressions | 11 compressionsuccesses | 11 compressionsizesum | 8198819 originalsizesum | 320403204 I'm not sure about TOAST, but currently compressions are configurable: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=bbe0a81db69bd10bd166907c3701492a29aca294 How about adding a new attribute "method" to pg_stat_toast? ToastAttrInfo *attr->tai_compression represents how compress the data, so I think it's easy to add. Or, is it not needed because pg_attr has information? That information could certainly be included in the view, grabbing the information from pg_attribute.attcompression. It probably should! I guess the next step will be to include that view in the catalog anyway, so I'll do that next. Thx for the feedback! -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Cato
Re: [PATCH] pg_stat_toast
Am 13.12.21 um 00:41 schrieb Andres Freund: Hi, On 2021-12-13 00:00:23 +0100, Gunnar "Nick" Bluth wrote: Regarding stats size; it adds one PgStat_BackendToastEntry (PgStat_BackendAttrIdentifier + PgStat_ToastCounts, should be 56-64 bytes or something in that ballpark) per TOASTable attribute, I can't see that make any system break sweat ;-) That's actually a lot. The problem is that all the stats data for a database is loaded into private memory for each connection to that database, and that the stats collector regularly writes out all the stats data for a database. My understanding is that the stats file is only pulled into the backend when the SQL functions (for the view) are used (see "pgstat_fetch_stat_toastentry()"). Otherwise, a backend just initializes an empty hash, right? Of which I reduced the initial size from 512 to 32 for the below tests (I guess the "truth" lies somewhere in between here), along with making the GUC parameter an actual GUC parameter and disabling the elog() calls I scattered all over the place ;-) for the v0.2 patch attached. A quick run comparing 1.000.000 INSERTs (2 TOASTable columns each) with and without "pgstat_track_toast" resulted in 12792.882 ms vs. 12810.557 ms. So at least the call overhead seems to be neglectible. Yea, you'd probably need a few more tables and a few more connections for it to have a chance of mattering meaningfully. So, I went ahead and * set up 2 clusters with "track_toast" off and on resp. * created 100 DBs * each with 100 tables * with one TOASTable column in each table * filling those with 32000 bytes of md5 garbage These clusters sum up to ~ 2GB each, so differences should _start to_ show up, I reckon. $ du -s testdb* 2161208 testdb 2163240 testdb_tracking $ du -s testdb*/pg_stat 4448testdb/pg_stat 4856testdb_tracking/pg_stat The db_*.stat files are 42839 vs. 48767 bytes each (so confirmed, the differences do show). No idea if this is telling us anything, tbth, but the /proc//smaps_rollup for a backend serving one of these DBs look like this ("0 kB" lines omitted): track_toast OFF === Rss: 12428 kB Pss:5122 kB Pss_Anon: 1310 kB Pss_File: 2014 kB Pss_Shmem: 1797 kB Shared_Clean: 5864 kB Shared_Dirty: 3500 kB Private_Clean: 1088 kB Private_Dirty: 1976 kB Referenced:11696 kB Anonymous: 2120 kB track_toast ON (view not called yet): = Rss: 12300 kB Pss:4883 kB Pss_Anon: 1309 kB Pss_File: 1888 kB Pss_Shmem: 1685 kB Shared_Clean: 6040 kB Shared_Dirty: 3468 kB Private_Clean: 896 kB Private_Dirty: 1896 kB Referenced:11572 kB Anonymous: 2116 kB track_toast ON (view called): = Rss: 15408 kB Pss:7482 kB Pss_Anon: 2083 kB Pss_File: 2572 kB Pss_Shmem: 2826 kB Shared_Clean: 6616 kB Shared_Dirty: 3532 kB Private_Clean: 1472 kB Private_Dirty: 3788 kB Referenced:14704 kB Anonymous: 2884 kB That backend used some memory for displaying the result too, of course... A backend with just two TOAST columns in one table (filled with 1.000.001 rows) looks like this before and after calling the "pg_stat_toast" view: Rss: 146208 kB Pss: 116181 kB Pss_Anon: 2050 kB Pss_File: 2787 kB Pss_Shmem:111342 kB Shared_Clean: 6636 kB Shared_Dirty: 45928 kB Private_Clean: 1664 kB Private_Dirty: 91980 kB Referenced: 145532 kB Anonymous: 2844 kB Rss: 147736 kB Pss: 103296 kB Pss_Anon: 2430 kB Pss_File: 3147 kB Pss_Shmem: 97718 kB Shared_Clean: 6992 kB Shared_Dirty: 74056 kB Private_Clean: 1984 kB Private_Dirty: 64704 kB Referenced: 147092 kB Anonymous: 3224 kB After creating 10.000 more tables (view shows 10.007 rows now), before and after calling "TABLE pg_stat_toast": Rss: 13816 kB Pss:4898 kB Pss_Anon: 1314 kB Pss_File: 1755 kB Pss_Shmem: 1829 kB Shared_Clean: 5972 kB Shared_Dirty: 5760 kB Private_Clean: 832 kB Private_Dirty: 1252 kB Referenced:13088 kB Anonymous: 2124 kB Rss: 126816 kB Pss: 55213 kB Pss_Anon: 5383 kB Pss_File: 2615 kB Pss_Shmem: 47215 kB Shared_Clean: 6460 kB Shared_Dirty: 113028 kB Private_Clean: 1600 kB Private_Dirty: 5728 kB Referenced: 126112 kB Anonymous: 6184 kB That DB's stat-file is now 4.119.254 bytes (3.547.439 without track_toast). After VACUUM ANALYZE, the size goes up to 5.919.812 (5
Re: [PATCH] pg_stat_toast
Am 12.12.21 um 22:52 schrieb Andres Freund: Hi, On 2021-12-12 17:20:58 +0100, Gunnar "Nick" Bluth wrote: Please have a look at the attached patch, which implements some statistics for TOAST. The idea (and patch) have been lurking here for quite a while now, so I decided to dust it off, rebase it to HEAD and send it out for review today. A big shoutout to Georgios Kokolatos, who gave me a crash course in PG hacking, some very useful hints and valueable feedback early this year. I'd like to get some feedback about the general idea, approach, naming etc. before refining this further. I'm worried about the additional overhead this might impose. For some workload it'll substantially increase the amount of stats traffic. Have you tried to qualify the overheads? Both in stats size and in stats management overhead? I'd lie if I claimed so... Regarding stats size; it adds one PgStat_BackendToastEntry (PgStat_BackendAttrIdentifier + PgStat_ToastCounts, should be 56-64 bytes or something in that ballpark) per TOASTable attribute, I can't see that make any system break sweat ;-) A quick run comparing 1.000.000 INSERTs (2 TOASTable columns each) with and without "pgstat_track_toast" resulted in 12792.882 ms vs. 12810.557 ms. So at least the call overhead seems to be neglectible. Obviously, this was really a quick run and doesn't reflect real life. I'll have the machine run some reasonable tests asap, also looking at stat size, of course! Best regards, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Cato
[PATCH] pg_stat_toast
Hello -hackers! Please have a look at the attached patch, which implements some statistics for TOAST. The idea (and patch) have been lurking here for quite a while now, so I decided to dust it off, rebase it to HEAD and send it out for review today. A big shoutout to Georgios Kokolatos, who gave me a crash course in PG hacking, some very useful hints and valueable feedback early this year. I'd like to get some feedback about the general idea, approach, naming etc. before refining this further. I'm not a C person and I s**k at git, so please be kind with me! ;-) Also, I'm not subscribed here, so a CC would be much appreciated! Why gather TOAST statistics? TOAST is transparent and opaque at the same time. Whilst we know that it's there and we know _that_ it works, we cannot generally tell _how well_ it works. What we can't answer (easily) are questions like e.g. - how many datums have been externalized? - how many datums have been compressed? - how often has a compression failed (resulted in no space saving)? - how effective is the compression algorithm used on a column? - how much time did the DB spend compressing/decompressing TOAST values? The patch adds some functionality that will eventually be able to answer these (and probably more) questions. Currently, #1 - #4 can be answered based on the view contained in "pg_stats_toast.sql": postgres=# CREATE TABLE test (i int, lz4 text COMPRESSION lz4, std text); postgres=# INSERT INTO test SELECT i,repeat(md5(i::text),100),repeat(md5(i::text),100) FROM generate_series(0,10) x(i); postgres=# SELECT * FROM pg_stat_toast WHERE schemaname = 'public'; -[ RECORD 1 ]+-- schemaname | public reloid | 16829 attnum | 2 relname | test attname | lz4 externalizations | 0 compressions | 11 compressionsuccesses | 11 compressionsizesum | 6299710 originalsizesum | 320403204 -[ RECORD 2 ]+-- schemaname | public reloid | 16829 attnum | 3 relname | test attname | std externalizations | 0 compressions | 11 compressionsuccesses | 11 compressionsizesum | 8198819 originalsizesum | 320403204 Implementation == I added some callbacks in backend/access/table/toast_helper.c to "pgstat_report_toast_activity" in backend/postmaster/pgstat.c. The latter (and the other additions there) are essentially 1:1 copies of the function statistics. Those were the perfect template, as IMHO the TOAST activities (well, what we're interested in at least) are very much comparable to function calls: a) It doesn't really matter if the TOASTed data was committed, as "the damage is done" (i.e. CPU cycles were used) anyway b) The information can (thus/best) be stored on DB level, no need to touch the relation or attribute statistics I didn't find anything that could have been used as a hash key, so the PgStat_StatToastEntry uses the shiny new PgStat_BackendAttrIdentifier (containing relid Oid, attr int). For persisting in the statsfile, I chose the identifier 'O' (as 'T' was taken). What's working? === - Gathering of TOAST externalization and compression events - collecting the sizes before and after compression - persisting in statsfile - not breaking "make check" - not crashing anything (afaict) What's missing (yet)? === - proper definition of the "pgstat_track_toast" GUC - Gathering of times (for compression [and decompression?]) - improve "pg_stat_toast" view and include it in the catalog - documentation (obviously) - proper naming (of e.g. the hash key type, functions, view columns etc.) - would it be necessary to implement overflow protection for the size & time sums? Thanks in advance & best regards, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - CatoFrom 05f81229fd2e81b8674649c8fd1e3857e2406fcd Mon Sep 17 00:00:00 2001 From: "Gunnar \"Nick\" Bluth" Date: Sun, 12 Dec 2021 15:35:35 +0100 Subject: [PATCH] initial patch of pg_stat_toast for -hackers --- pg_stat_toast.sql | 19 ++ src/backend/access/table/toast_helper.c | 19 ++ src/backend/postmaster/pgstat.c | 311 +++- src/backend/utils/adt/pgstatfuncs.c | 60 + src/include/catalog/pg_proc.dat | 21 ++ src/include/pgstat.h| 109 + 6 files changed, 533 insertions(+), 6 deletions(-) create mode 100644 pg_stat_toast.sql diff --git a/pg_stat_toast.sql b/pg_stat_toast.sql new file mode 100644 index 00..1c653254ab --- /dev/nu