On 2020/03/23 21:01, Fujii Masao wrote:


On 2020/03/23 7:32, Kirill Bychik wrote:
I'm attaching a v5 with fp records only for temp tables, so there's no risk of
instability.  As I previously said I'm fine with your two patches, so unless
you have objections on the fpi test for temp tables or the documentation
changes, I believe those should be ready for committer.

You added the columns into pg_stat_database, but seem to forget to
update the document for pg_stat_database.

Ah right, I totally missed that when I tried to clean up the original POC.

Is it really reasonable to add the columns for vacuum's WAL usage into
pg_stat_database? I'm not sure how much the information about
the amount of WAL generated by vacuum per database is useful.

The amount per database isn't really useful, but I didn't had a better idea on
how to expose (auto)vacuum WAL usage until this:

Isn't it better to make VACUUM VERBOSE and autovacuum log include
that information, instead, to see how much each vacuum activity
generates the WAL? Sorry if this discussion has already been done
upthread.

That's a way better idea!  I'm attaching the full patchset with the 3rd patch
to use this approach instead.  There's a bit a duplicate code for computing the
WalUsage, as I didn't find a better way to avoid that without exposing
WalUsageAccumDiff().

Autovacuum log sample:

2020-03-19 15:49:05.708 CET [5843] LOG:  automatic vacuum of table 
"rjuju.public.t1": index scans: 0
         pages: 0 removed, 2213 remain, 0 skipped due to pins, 0 skipped frozen
         tuples: 250000 removed, 250000 remain, 0 are dead but not yet 
removable, oldest xmin: 502
         buffer usage: 4448 hits, 4 misses, 4 dirtied
         avg read rate: 0.160 MB/s, avg write rate: 0.160 MB/s
         system usage: CPU: user: 0.13 s, system: 0.00 s, elapsed: 0.19 s
         WAL usage: 6643 records, 4 full page records, 1402679 bytes

VACUUM log sample:

# vacuum VERBOSE t1;
INFO:  vacuuming "public.t1"
INFO:  "t1": removed 50000 row versions in 443 pages
INFO:  "t1": found 50000 removable, 0 nonremovable row versions in 443 out of 
443 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 512
There were 50000 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
1332 WAL records, 4 WAL full page records, 306901 WAL bytes
CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s.
INFO:  "t1": truncated 443 to 0 pages
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO:  vacuuming "pg_toast.pg_toast_16385"
INFO:  index "pg_toast_16385_index" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "pg_toast_16385": found 0 removable, 0 nonremovable row versions in 0 
out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 513
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
0 WAL records, 0 WAL full page records, 0 WAL bytes
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

Note that the 3rd patch is an addition on top of Kirill's original patch, as
this is information that would have been greatly helpful to investigate in some
performance issues I had to investigate recently.  I'd be happy to have it land
into v13, but if that's controversial or too late I'm happy to postpone it to
v14 if the infrastructure added in Kirill's patches can make it to v13.

Dear all, can we please focus on getting the core patch committed?
Given the uncertainity regarding autovacuum stats, can we please get
parts 1 and 2 into the codebase, and think about exposing autovacuum
stats later?

Here are the comments for 0001 patch.

+            /*
+             * Report a full page image constructed for the WAL record
+             */
+            pgWalUsage.wal_fp_records++;

Isn't it better to use "fpw" or "fpi" for the variable name rather than
"fp" here? In other places, "fpw" and "fpi" are used for full page
writes/image.

ISTM that this counter could be incorrect if XLogInsertRecord() determines to
calculate again whether FPI is necessary or not. No? IOW, this issue could
happen if XLogInsert() calls  XLogRecordAssemble() multiple times in
its do-while loop. Isn't this problematic?

+    long        wal_bytes;        /* size of wal records produced */

Isn't it safer to use uint64 (i.e., XLogRecPtr) as the type of this variable
rather than long?

+    shm_toc_insert(pcxt->toc, PARALLEL_KEY_WAL_USAGE, bufusage_space);

bufusage_space should be walusage_space here?

/*
  * Finish parallel execution.  We wait for parallel workers to finish, and
  * accumulate their buffer usage.
  */

There are some comments mentioning buffer usage, in execParallel.c.
For example, the top comment for ExecParallelFinish(), as the above.
These should be updated.

Here are the comments for 0002 patch.

+    OUT wal_write_bytes int8,
+    OUT wal_write_records int8,
+    OUT wal_write_fp_records int8

Isn't "write" part in the column names confusing because it's WAL
*generated* (not written) by the statement?

+RETURNS SETOF record
+AS 'MODULE_PATHNAME', 'pg_stat_statements_1_4'
+LANGUAGE C STRICT VOLATILE;

PARALLEL SAFE should be specified?

+/* contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql */

ISTM it's good timing to have also pg_stat_statements--1.8.sql since
the definition of pg_stat_statements() is changed. Thought?

+-- CHECKPOINT before WAL tests to ensure test stability
+CHECKPOINT;

Is this true? I thought you added this because the number of FPI
should be larger than zero in the subsequent test. No? But there
seems no such test. I'm not excited about adding the test checking
the number of FPI because it looks fragile, though...

+UPDATE pgss_test SET b = '333' WHERE a = 3 \;
+UPDATE pgss_test SET b = '444' WHERE a = 4 ;

Could you tell me why several queries need to be run to test
the WAL usage? Isn't running a few query enough for the test purpase?

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters


Reply via email to