Re: Fix parallel vacuum buffer usage reporting
Hi,

On Fri, 10 May 2024 at 19:09, Masahiko Sawada wrote:
> On Fri, May 10, 2024 at 7:26 PM Nazir Bilal Yavuz wrote:
> > Hi,
> >
> > Thank you for working on this!
> >
> > On Wed, 1 May 2024 at 06:37, Masahiko Sawada wrote:
> > > Thank you for further testing! I've pushed the patch.
> >
> > I noticed a behaviour change while looking at the 'Use pgBufferUsage for
> > block reporting in analyze' thread [1]. Since that change applies here
> > as well, I thought it would be better to mention it here.
> >
> > Before this commit, VacuumPageMiss did not count a block if its read
> > had already been completed by another backend [2]. Now,
> > 'pgBufferUsage.local_blks_read + pgBufferUsage.shared_blks_read'
> > counts every block we attempt to read, possibly double counting when
> > someone else has already completed the read.
>
> True. IIUC there is such a difference only in HEAD but not in v15 and
> v16. The following comment in WaitReadBuffers() says that it is
> traditional behavior to count blocks as read even if someone else has
> already completed the I/O:
>
> /*
>  * We count all these blocks as read by this backend. This is traditional
>  * behavior, but might turn out to be not true if we find that someone
>  * else has beaten us and completed the read of some of these blocks. In
>  * that case the system globally double-counts, but we traditionally don't
>  * count this as a "hit", and we don't have a separate counter for "miss,
>  * but another backend completed the read".
>  */
>
> So I think using pgBufferUsage for (parallel) vacuum is a legitimate
> usage and makes it more consistent with other parallel operations.

That sounds logical. Thank you for the clarification.

--
Regards,
Nazir Bilal Yavuz
Microsoft
Re: Fix parallel vacuum buffer usage reporting
On Fri, May 10, 2024 at 7:26 PM Nazir Bilal Yavuz wrote:
> Hi,
>
> Thank you for working on this!
>
> On Wed, 1 May 2024 at 06:37, Masahiko Sawada wrote:
> > Thank you for further testing! I've pushed the patch.
>
> I noticed a behaviour change while looking at the 'Use pgBufferUsage for
> block reporting in analyze' thread [1]. Since that change applies here
> as well, I thought it would be better to mention it here.
>
> Before this commit, VacuumPageMiss did not count a block if its read
> had already been completed by another backend [2]. Now,
> 'pgBufferUsage.local_blks_read + pgBufferUsage.shared_blks_read'
> counts every block we attempt to read, possibly double counting when
> someone else has already completed the read.

True. IIUC there is such a difference only in HEAD but not in v15 and
v16. The following comment in WaitReadBuffers() says that it is
traditional behavior to count blocks as read even if someone else has
already completed the I/O:

/*
 * We count all these blocks as read by this backend. This is traditional
 * behavior, but might turn out to be not true if we find that someone
 * else has beaten us and completed the read of some of these blocks. In
 * that case the system globally double-counts, but we traditionally don't
 * count this as a "hit", and we don't have a separate counter for "miss,
 * but another backend completed the read".
 */

So I think using pgBufferUsage for (parallel) vacuum is a legitimate
usage and makes it more consistent with other parallel operations.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Re: Fix parallel vacuum buffer usage reporting
Hi,

On Fri, 10 May 2024 at 16:55, Nazir Bilal Yavuz wrote:
> On Fri, 10 May 2024 at 16:21, Nazir Bilal Yavuz wrote:
> > On Fri, 10 May 2024 at 14:49, Alena Rybakina wrote:
> > > Hi! I could try to check it with the test, but I want to ask you about
> > > the details, because I'm not sure that I completely understand the
> > > test case.
> > >
> > > You mean that we need to have two backends, and on one of them we
> > > deleted the tuples before vacuum was called on the other, right?

There should be some other backend(s) trying to read the same buffers as
the ongoing VACUUM operation. I think it works now, but the reproduction
steps are a bit racy. See:

1- Build Postgres with the attached diff; it is the same
see_previous_output.diff that I shared two mails ago.
2- Run Postgres, all settings are default.
3- Use two client backends; let's name them client backends A and B.
4- On client backend A, run:

CREATE TABLE vacuum_fix (aid int, bid int) with (autovacuum_enabled=false);
INSERT INTO vacuum_fix SELECT *, * FROM generate_series(1, 2000);
VACUUM vacuum_fix;
UPDATE vacuum_fix SET aid = aid + 1, bid = bid + 1;

5- Now it will be a bit racy; the SQL commands below need to be run at
the same time. The aim is for VACUUM on client backend A and SELECT on
client backend B to read the same buffers at the same time, so that some
of the buffers will be double counted. First, run VACUUM on client
backend A; immediately after running VACUUM, run the SELECT on backend B.

Client backend A: VACUUM VERBOSE vacuum_fix;
Client backend B: SELECT * from vacuum_fix WHERE aid = -1;

This is the output of the VACUUM VERBOSE on my end:

INFO:  vacuuming "test.public.vacuum_fix"
INFO:  finished vacuuming "test.public.vacuum_fix": index scans: 0
pages: 0 removed, 176992 remain, 176992 scanned (100.00% of total)
...
buffer usage: 254181 hits, 99030 misses in the previous version, 99865
misses in the patched version, 106830 dirtied
...
VACUUM
Time: 2578.217 ms (00:02.578)

VACUUM does not run in parallel here, so this test case does not trigger
what was fixed in this thread. As can be seen, there is a difference of
roughly 1000 buffers. I am not sure whether there is an easier way to
reproduce this, but I hope this helps.

--
Regards,
Nazir Bilal Yavuz
Microsoft

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 84cc983b6e6..582973d575b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -309,6 +309,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	PgStat_Counter startreadtime = 0,
 				startwritetime = 0;
 	WalUsage	startwalusage = pgWalUsage;
+	int64		StartPageMiss = VacuumPageMiss;
 	BufferUsage startbufferusage = pgBufferUsage;
 	ErrorContextCallback errcallback;
 	char	  **indnames = NULL;
@@ -606,6 +607,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 			StringInfoData buf;
 			char	   *msgfmt;
 			int32		diff;
+			int64		PageMissOp = VacuumPageMiss - StartPageMiss;
 			double		read_rate = 0,
 						write_rate = 0;
@@ -748,8 +750,9 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 			appendStringInfo(&buf, _("avg read rate: %.3f MB/s, avg write rate: %.3f MB/s\n"),
 							 read_rate, write_rate);
 			appendStringInfo(&buf,
-							 _("buffer usage: %lld hits, %lld misses, %lld dirtied\n"),
+							 _("buffer usage: %lld hits, %lld misses in the previous version, %lld misses in the patched version, %lld dirtied\n"),
 							 (long long) (bufferusage.shared_blks_hit + bufferusage.local_blks_hit),
+							 (long long) (PageMissOp),
 							 (long long) (bufferusage.shared_blks_read + bufferusage.local_blks_read),
 							 (long long) (bufferusage.shared_blks_dirtied + bufferusage.local_blks_dirtied));
 			appendStringInfo(&buf,
Re: Fix parallel vacuum buffer usage reporting
Hi,

On Fri, 10 May 2024 at 16:21, Nazir Bilal Yavuz wrote:
> On Fri, 10 May 2024 at 14:49, Alena Rybakina wrote:
> > Hi! I could try to check it with the test, but I want to ask you about
> > the details, because I'm not sure that I completely understand the
> > test case.
> >
> > You mean that we need to have two backends, and on one of them we
> > deleted the tuples before vacuum was called on the other, right?
>
> I think triggering a parallel vacuum is enough. I am able to see the
> differences with the following:
>
> You can apply the attached diff file to see the differences between
> the previous version and the patched version. Then, run these queries:
>
> CREATE TABLE vacuum_fix (aid int, bid int, cid int) with
> (autovacuum_enabled=false);
> INSERT INTO vacuum_fix SELECT *, *, * FROM generate_series(1, 100);
> CREATE INDEX a_idx on vacuum_fix (aid);
> CREATE INDEX b_idx on vacuum_fix (bid);
> CREATE INDEX c_idx on vacuum_fix (cid);
> VACUUM vacuum_fix;
> UPDATE vacuum_fix SET aid = aid + 1;
> VACUUM (VERBOSE, PARALLEL 2) vacuum_fix;
>
> After that I saw:
>
> INFO:  vacuuming "test.public.vacuum_fix"
> INFO:  launched 2 parallel vacuum workers for index vacuuming (planned: 2)
> INFO:  finished vacuuming "test.public.vacuum_fix": index scans: 1
> ...
> buffer usage: 29343 hits, 9580 misses in the previous version, 14165
> misses in the patched version, 14262 dirtied
>
> The patched version counts 14165 misses but the previous version counts
> 9580 misses in this specific example.

I am sorry, I showed the wrong thing; this is exactly what is fixed by
this patch. Actually, I do not know how to trigger the double counting;
I am currently looking for a way to do so. I will share if anything
comes to my mind.

--
Regards,
Nazir Bilal Yavuz
Microsoft

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 49637284f91..a6f1df11066 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1355,6 +1355,7 @@ WaitReadBuffers(ReadBuffersOperation *operation)
 	IOContext	io_context;
 	IOObject	io_object;
 	char		persistence;
+	static int	double_counts = 0;

 	/*
 	 * Currently operations are only allowed to include a read of some range,
@@ -1426,6 +1427,7 @@ WaitReadBuffers(ReadBuffersOperation *operation)
 							  operation->smgr->smgr_rlocator.locator.relNumber,
 							  operation->smgr->smgr_rlocator.backend,
 							  true);
+			double_counts++;
 			continue;
 		}
@@ -1523,6 +1525,7 @@ WaitReadBuffers(ReadBuffersOperation *operation)
 		if (VacuumCostActive)
 			VacuumCostBalance += VacuumCostPageMiss * io_buffers_len;
 	}
+	elog(LOG, "Double counts = %d", double_counts);
 }

 /*
Re: Fix parallel vacuum buffer usage reporting
Hi,

On Fri, 10 May 2024 at 14:49, Alena Rybakina wrote:
> Hi! I could try to check it with the test, but I want to ask you about
> the details, because I'm not sure that I completely understand the test
> case.
>
> You mean that we need to have two backends, and on one of them we
> deleted the tuples before vacuum was called on the other, right?

I think triggering a parallel vacuum is enough. I am able to see the
differences with the following:

You can apply the attached diff file to see the differences between the
previous version and the patched version. Then, run these queries:

CREATE TABLE vacuum_fix (aid int, bid int, cid int) with
(autovacuum_enabled=false);
INSERT INTO vacuum_fix SELECT *, *, * FROM generate_series(1, 100);
CREATE INDEX a_idx on vacuum_fix (aid);
CREATE INDEX b_idx on vacuum_fix (bid);
CREATE INDEX c_idx on vacuum_fix (cid);
VACUUM vacuum_fix;
UPDATE vacuum_fix SET aid = aid + 1;
VACUUM (VERBOSE, PARALLEL 2) vacuum_fix;

After that I saw:

INFO:  vacuuming "test.public.vacuum_fix"
INFO:  launched 2 parallel vacuum workers for index vacuuming (planned: 2)
INFO:  finished vacuuming "test.public.vacuum_fix": index scans: 1
...
buffer usage: 29343 hits, 9580 misses in the previous version, 14165
misses in the patched version, 14262 dirtied

The patched version counts 14165 misses but the previous version counts
9580 misses in this specific example.

--
Regards,
Nazir Bilal Yavuz
Microsoft

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 84cc983b6e6..582973d575b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -309,6 +309,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	PgStat_Counter startreadtime = 0,
 				startwritetime = 0;
 	WalUsage	startwalusage = pgWalUsage;
+	int64		StartPageMiss = VacuumPageMiss;
 	BufferUsage startbufferusage = pgBufferUsage;
 	ErrorContextCallback errcallback;
 	char	  **indnames = NULL;
@@ -606,6 +607,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 			StringInfoData buf;
 			char	   *msgfmt;
 			int32		diff;
+			int64		PageMissOp = VacuumPageMiss - StartPageMiss;
 			double		read_rate = 0,
 						write_rate = 0;
@@ -748,8 +750,9 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 			appendStringInfo(&buf, _("avg read rate: %.3f MB/s, avg write rate: %.3f MB/s\n"),
 							 read_rate, write_rate);
 			appendStringInfo(&buf,
-							 _("buffer usage: %lld hits, %lld misses, %lld dirtied\n"),
+							 _("buffer usage: %lld hits, %lld misses in the previous version, %lld misses in the patched version, %lld dirtied\n"),
 							 (long long) (bufferusage.shared_blks_hit + bufferusage.local_blks_hit),
+							 (long long) (PageMissOp),
 							 (long long) (bufferusage.shared_blks_read + bufferusage.local_blks_read),
 							 (long long) (bufferusage.shared_blks_dirtied + bufferusage.local_blks_dirtied));
 			appendStringInfo(&buf,
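[Editor's note: the gap between the two counters in the output above can be re-derived with a small sketch. This is illustrative Python only, not PostgreSQL code; it assumes, as the fix in this thread states, that the entire difference is buffer usage accrued by the parallel workers that the pre-patch report dropped.]

```python
# Numbers quoted from the VACUUM (VERBOSE, PARALLEL 2) output above.
leader_only_misses = 9580   # "previous version": leader's counters only
combined_misses = 14165     # patched: leader plus parallel worker usage

# Misses done by the two parallel index-vacuum workers, previously lost.
worker_misses = combined_misses - leader_only_misses
print(worker_misses)  # 4585
```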
Re: Fix parallel vacuum buffer usage reporting
Hi! I could try to check it with the test, but I want to ask you about
the details, because I'm not sure that I completely understand the test
case.

You mean that we need to have two backends, and on one of them we
deleted the tuples before vacuum was called on the other, right?

On 10.05.2024 13:25, Nazir Bilal Yavuz wrote:
> Hi,
>
> Thank you for working on this!
>
> On Wed, 1 May 2024 at 06:37, Masahiko Sawada wrote:
> > Thank you for further testing! I've pushed the patch.
>
> I noticed a behaviour change while looking at the 'Use pgBufferUsage for
> block reporting in analyze' thread [1]. Since that change applies here
> as well, I thought it would be better to mention it here.
>
> Before this commit, VacuumPageMiss did not count a block if its read
> had already been completed by another backend [2]. Now,
> 'pgBufferUsage.local_blks_read + pgBufferUsage.shared_blks_read'
> counts every block we attempt to read, possibly double counting when
> someone else has already completed the read. I do not know which
> behaviour is correct, but I wanted to mention it.
>
> [1] https://postgr.es/m/CAO6_Xqr__kTTCLkftqS0qSCm-J7_xbRG3Ge2rWhucxQJMJhcRA%40mail.gmail.com
>
> [2] In the WaitReadBuffers() function, see this comment:
>
> /*
>  * Skip this block if someone else has already completed it.  If an
>  * I/O is already in progress in another backend, this will wait for
>  * the outcome: either done, or something went wrong and we will
>  * retry.
>  */

--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: Fix parallel vacuum buffer usage reporting
Hi,

Thank you for working on this!

On Wed, 1 May 2024 at 06:37, Masahiko Sawada wrote:
> Thank you for further testing! I've pushed the patch.

I noticed a behaviour change while looking at the 'Use pgBufferUsage for
block reporting in analyze' thread [1]. Since that change applies here
as well, I thought it would be better to mention it here.

Before this commit, VacuumPageMiss did not count a block if its read had
already been completed by another backend [2]. Now,
'pgBufferUsage.local_blks_read + pgBufferUsage.shared_blks_read' counts
every block we attempt to read, possibly double counting when someone
else has already completed the read. I do not know which behaviour is
correct, but I wanted to mention it.

[1] https://postgr.es/m/CAO6_Xqr__kTTCLkftqS0qSCm-J7_xbRG3Ge2rWhucxQJMJhcRA%40mail.gmail.com

[2] In the WaitReadBuffers() function, see this comment:

/*
 * Skip this block if someone else has already completed it.  If an
 * I/O is already in progress in another backend, this will wait for
 * the outcome: either done, or something went wrong and we will
 * retry.
 */

--
Regards,
Nazir Bilal Yavuz
Microsoft
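[Editor's note: the counting difference described above can be modeled with a toy sketch. This is illustrative Python only, not PostgreSQL code; the flag list and helper names are invented for the example.]

```python
# Each attempted block read is flagged True when another backend
# completed the I/O first (the "someone else beat us" case).

def misses_old_vacuumpagemiss(reads):
    """Pre-commit VacuumPageMiss: skip blocks whose read was already
    completed by another backend."""
    return sum(1 for completed_elsewhere in reads if not completed_elsewhere)

def misses_pgbufferusage(reads):
    """pgBufferUsage.*_blks_read: every attempted read counts, even if
    another backend completed it, so the system may double-count."""
    return len(reads)

# Five attempted reads; another backend completed two of them first.
attempts = [False, True, False, True, False]
print(misses_old_vacuumpagemiss(attempts))  # 3
print(misses_pgbufferusage(attempts))       # 5
```

The two counters agree only when no other backend touches the same blocks concurrently, which is why the difference is hard to reproduce deterministically.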
Re: Fix parallel vacuum buffer usage reporting
On Fri, May 3, 2024 at 3:41 PM Anthonin Bonnefoy wrote:
> On Wed, May 1, 2024 at 5:37 AM Masahiko Sawada wrote:
> > Thank you for further testing! I've pushed the patch.
>
> Thanks!
>
> Here is the rebased version of the follow-up patch removing the
> VacuumPage variables. Though I'm not sure if I should create a
> dedicated mail thread, since the bug was fixed and the follow-up is
> more of a refactoring. What do you think?

I'd suggest starting a new thread or changing the subject, as the
current subject no longer matches what we're discussing.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Re: Fix parallel vacuum buffer usage reporting
On Wed, May 1, 2024 at 5:37 AM Masahiko Sawada wrote:
> Thank you for further testing! I've pushed the patch.

Thanks!

Here is the rebased version of the follow-up patch removing the
VacuumPage variables. Though I'm not sure if I should create a dedicated
mail thread, since the bug was fixed and the follow-up is more of a
refactoring. What do you think?

Attachment: v1-0001-Use-pgBufferUsage-for-block-reporting-in-analyze.patch
Re: Fix parallel vacuum buffer usage reporting
Hi!

On 30.04.2024 05:18, Masahiko Sawada wrote:
> On Fri, Apr 26, 2024 at 9:12 PM Alena Rybakina wrote:
> > Hi!
> >
> > > > The same script was run, but using vacuum verbose analyze, and I
> > > > saw the difference again in the fifth step:
> > > > with your patch: buffer usage: 32312 hits, 607 misses, 1566 dirtied
> > > > master: buffer usage: 32346 hits, 573 misses, 1360 dirtied
> > >
> > > Isn't there a chance for the checkpointer to run during this time?
> > > That could make the conditions between the two runs slightly
> > > different and explain the change in the buffer report.
> > >
> > > [0] https://github.com/postgres/postgres/blob/8a1b31e6e59631807a08a4e9465134c343bbdf5e/src/backend/access/heap/vacuumlazy.c#L2826-L2831
> > >
> > > Looking at the script, you won't trigger the problem.
> >
> > Thank you for the link; I accounted for it in my next experiments.
> >
> > I repeated the test without checkpoint processing, with a single
> > index, and the number of buffer pages used almost matched:
> >
> > master branch: buffer usage: 32315 hits, 606 misses, 4486 dirtied
> > with patch v4 applied: buffer usage: 32315 hits, 606 misses, 4489 dirtied
> >
> > I think you are right - the problem was interference from the
> > checkpoint process; by the way, I checked the first version of the
> > patch. To cut a long story short, everything is fine now with one
> > index.
> >
> > Just in case, I'll explain: I considered this case because your patch
> > could have influenced it too.
> >
> > On 25.04.2024 10:17, Anthonin Bonnefoy wrote:
> > > On Wed, Apr 24, 2024 at 4:01 PM Alena Rybakina wrote:
> > > > I tested the main postgres branch with and without your fix using
> > > > a script that was written by me. It consists of five scenarios,
> > > > and I compared the logs between the original master branch and the
> > > > master branch with your patch.
> > >
> > > Hi! Thanks for the tests.
> > >
> > > > I have attached a test file (vacuum_check_logs.sql)
> > >
> > > The reporting issue will only happen if there's a parallel index
> > > vacuum, and that will only happen if there are at least 2 indexes
> > > [0]. You will need to create an additional index.
> >
> > Speaking of the problem, I added another index, repeated the test, and
> > found a significant difference:
> >
> > I found it when I committed the transaction (3):
> > master: 2964 hits, 0 misses, 0 dirtied
> > with patch v4 applied: buffer usage: 33013 hits, 0 misses, 3 dirtied
> >
> > When I deleted all the data from the table and later started vacuum
> > verbose again (4):
> > master: buffer usage: 51486 hits, 0 misses, 0 dirtied
> > with patch v4 applied: buffer usage: 77924 hits, 0 misses, 0 dirtied
> >
> > When I inserted 1 million rows into the table and updated them (5):
> > master: buffer usage: 27904 hits, 5021 misses, 1777 dirtied
> > with patch v4 applied: buffer usage: 41051 hits, 9973 misses, 2564 dirtied
> >
> > As I see, the number of pages is significantly higher than it was in
> > the master branch and, frankly, I couldn't fully figure out whether it
> > was a mistake or not.
>
> I think that the patch fixes the problem correctly. I've run pgindent
> and updated the commit message.
>
> I realized that parallel vacuum was introduced in pg13 but buffer usage
> reporting in the VACUUM command was implemented in pg15. Therefore, in
> pg13 and pg14, VACUUM (PARALLEL) is available but VACUUM (PARALLEL,
> VERBOSE) doesn't show the buffer usage report. Autovacuum does show the
> buffer usage report, but parallel autovacuum is not supported.
> Therefore, we should backpatch it down to 15, not 13.
>
> I'm going to push the patch down to pg15, barring any objections.

I agree with you about the backporting, and I saw that the patch is
working correctly.

--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: Fix parallel vacuum buffer usage reporting
On Tue, Apr 30, 2024 at 3:34 PM Anthonin Bonnefoy wrote:
> I've done some additional tests to validate the reported numbers. Using
> pg_statio, it's possible to get the minimum number of block hits (full
> script attached).
>
> -- Save block hits before vacuum
> SELECT pg_stat_force_next_flush();
> SELECT heap_blks_hit, idx_blks_hit FROM pg_statio_all_tables where relname='vestat' \gset
> vacuum (verbose, index_cleanup on) vestat;
> -- Check the difference
> SELECT pg_stat_force_next_flush();
> SELECT heap_blks_hit - :heap_blks_hit as delta_heap_hit,
>        idx_blks_hit - :idx_blks_hit as delta_idx_hit,
>        heap_blks_hit - :heap_blks_hit + idx_blks_hit - :idx_blks_hit as sum
> FROM pg_statio_all_tables where relname='vestat';
>
> Output:
> ...
> buffer usage: 14676 hits, 0 misses, 667 dirtied
> buffer usage (new): 16081 hits, 0 misses, 667 dirtied
> ...
> -[ RECORD 1 ]--+------
> delta_heap_hit | 9747
> delta_idx_hit  | 6325
> sum            | 16072
>
> From pg_statio, we had 16072 block hits for the relation + indexes.
> Pre-patch, we are under-reporting with 14676.
> Post-patch, we have 16081. The 9 additional block hits come from vacuum
> accessing catalog tables like pg_class or pg_class_oid_index.

Thank you for further testing! I've pushed the patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Re: Fix parallel vacuum buffer usage reporting
I've done some additional tests to validate the reported numbers. Using
pg_statio, it's possible to get the minimum number of block hits (full
script attached).

-- Save block hits before vacuum
SELECT pg_stat_force_next_flush();
SELECT heap_blks_hit, idx_blks_hit FROM pg_statio_all_tables where relname='vestat' \gset
vacuum (verbose, index_cleanup on) vestat;
-- Check the difference
SELECT pg_stat_force_next_flush();
SELECT heap_blks_hit - :heap_blks_hit as delta_heap_hit,
       idx_blks_hit - :idx_blks_hit as delta_idx_hit,
       heap_blks_hit - :heap_blks_hit + idx_blks_hit - :idx_blks_hit as sum
FROM pg_statio_all_tables where relname='vestat';

Output:
...
buffer usage: 14676 hits, 0 misses, 667 dirtied
buffer usage (new): 16081 hits, 0 misses, 667 dirtied
...
-[ RECORD 1 ]--+------
delta_heap_hit | 9747
delta_idx_hit  | 6325
sum            | 16072

From pg_statio, we had 16072 block hits for the relation + indexes.
Pre-patch, we are under-reporting with 14676.
Post-patch, we have 16081. The 9 additional block hits come from vacuum
accessing catalog tables like pg_class or pg_class_oid_index.

Attachment: vacuum_block_with_pgstatio.sql
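[Editor's note: the arithmetic in this cross-check can be re-derived directly from the figures quoted above. Plain Python, just reproducing the reported numbers; nothing here is PostgreSQL code.]

```python
# Figures quoted from the pg_statio cross-check above.
delta_heap_hit = 9747    # heap block hits during the vacuum
delta_idx_hit = 6325     # index block hits during the vacuum
hits_pre_patch = 14676   # "buffer usage" line, old report
hits_post_patch = 16081  # "buffer usage (new)" line, patched report

# pg_statio gives a lower bound for hits the VACUUM must report.
statio_sum = delta_heap_hit + delta_idx_hit
print(statio_sum)                    # 16072
print(hits_post_patch - statio_sum)  # 9 extra hits from catalog access
print(statio_sum - hits_pre_patch)   # 1396 hits missing pre-patch
```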
Re: Fix parallel vacuum buffer usage reporting
On Fri, Apr 26, 2024 at 9:12 PM Alena Rybakina wrote:
> Hi!
>
> > > The same script was run, but using vacuum verbose analyze, and I saw
> > > the difference again in the fifth step:
> > > with your patch: buffer usage: 32312 hits, 607 misses, 1566 dirtied
> > > master: buffer usage: 32346 hits, 573 misses, 1360 dirtied
> >
> > Isn't there a chance for the checkpointer to run during this time?
> > That could make the conditions between the two runs slightly different
> > and explain the change in the buffer report.
> >
> > [0] https://github.com/postgres/postgres/blob/8a1b31e6e59631807a08a4e9465134c343bbdf5e/src/backend/access/heap/vacuumlazy.c#L2826-L2831
> >
> > Looking at the script, you won't trigger the problem.
>
> Thank you for the link; I accounted for it in my next experiments.
>
> I repeated the test without checkpoint processing, with a single index,
> and the number of buffer pages used almost matched:
>
> master branch: buffer usage: 32315 hits, 606 misses, 4486 dirtied
> with patch v4 applied: buffer usage: 32315 hits, 606 misses, 4489 dirtied
>
> I think you are right - the problem was interference from the
> checkpoint process; by the way, I checked the first version of the
> patch. To cut a long story short, everything is fine now with one index.
>
> Just in case, I'll explain: I considered this case because your patch
> could have influenced it too.
>
> On 25.04.2024 10:17, Anthonin Bonnefoy wrote:
> > On Wed, Apr 24, 2024 at 4:01 PM Alena Rybakina wrote:
> > > I tested the main postgres branch with and without your fix using a
> > > script that was written by me. It consists of five scenarios, and I
> > > compared the logs between the original master branch and the master
> > > branch with your patch.
> >
> > Hi! Thanks for the tests.
> >
> > > I have attached a test file (vacuum_check_logs.sql)
> >
> > The reporting issue will only happen if there's a parallel index
> > vacuum, and that will only happen if there are at least 2 indexes [0].
> > You will need to create an additional index.
>
> Speaking of the problem, I added another index, repeated the test, and
> found a significant difference:
>
> I found it when I committed the transaction (3):
> master: 2964 hits, 0 misses, 0 dirtied
> with patch v4 applied: buffer usage: 33013 hits, 0 misses, 3 dirtied
>
> When I deleted all the data from the table and later started vacuum
> verbose again (4):
> master: buffer usage: 51486 hits, 0 misses, 0 dirtied
> with patch v4 applied: buffer usage: 77924 hits, 0 misses, 0 dirtied
>
> When I inserted 1 million rows into the table and updated them (5):
> master: buffer usage: 27904 hits, 5021 misses, 1777 dirtied
> with patch v4 applied: buffer usage: 41051 hits, 9973 misses, 2564 dirtied
>
> As I see, the number of pages is significantly higher than it was in
> the master branch and, frankly, I couldn't fully figure out whether it
> was a mistake or not.

I think that the patch fixes the problem correctly. I've run pgindent
and updated the commit message.

I realized that parallel vacuum was introduced in pg13 but buffer usage
reporting in the VACUUM command was implemented in pg15. Therefore, in
pg13 and pg14, VACUUM (PARALLEL) is available but VACUUM (PARALLEL,
VERBOSE) doesn't show the buffer usage report. Autovacuum does show the
buffer usage report, but parallel autovacuum is not supported.
Therefore, we should backpatch it down to 15, not 13.

I'm going to push the patch down to pg15, barring any objections.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment: v5-0001-Fix-parallel-vacuum-buffer-usage-reporting.patch
Re: Fix parallel vacuum buffer usage reporting
Hi!

> > The same script was run, but using vacuum verbose analyze, and I saw
> > the difference again in the fifth step:
> > with your patch: buffer usage: 32312 hits, 607 misses, 1566 dirtied
> > master: buffer usage: 32346 hits, 573 misses, 1360 dirtied
>
> Isn't there a chance for the checkpointer to run during this time? That
> could make the conditions between the two runs slightly different and
> explain the change in the buffer report.
>
> [0] https://github.com/postgres/postgres/blob/8a1b31e6e59631807a08a4e9465134c343bbdf5e/src/backend/access/heap/vacuumlazy.c#L2826-L2831
>
> Looking at the script, you won't trigger the problem.

Thank you for the link; I accounted for it in my next experiments.

I repeated the test without checkpoint processing, with a single index,
and the number of buffer pages used almost matched:

master branch: buffer usage: 32315 hits, 606 misses, 4486 dirtied
with patch v4 applied: buffer usage: 32315 hits, 606 misses, 4489 dirtied

I think you are right - the problem was interference from the checkpoint
process; by the way, I checked the first version of the patch. To cut a
long story short, everything is fine now with one index.

Just in case, I'll explain: I considered this case because your patch
could have influenced it too.

On 25.04.2024 10:17, Anthonin Bonnefoy wrote:
> On Wed, Apr 24, 2024 at 4:01 PM Alena Rybakina wrote:
> > I tested the main postgres branch with and without your fix using a
> > script that was written by me. It consists of five scenarios, and I
> > compared the logs between the original master branch and the master
> > branch with your patch.
>
> Hi! Thanks for the tests.
>
> > I have attached a test file (vacuum_check_logs.sql)
>
> The reporting issue will only happen if there's a parallel index
> vacuum, and that will only happen if there are at least 2 indexes [0].
> You will need to create an additional index.

Speaking of the problem, I added another index, repeated the test, and
found a significant difference:

* I found it when I committed the transaction (3):
  master: 2964 hits, 0 misses, 0 dirtied
  with patch v4 applied: buffer usage: 33013 hits, 0 misses, 3 dirtied

* When I deleted all the data from the table and later started vacuum
  verbose again (4):
  master: buffer usage: 51486 hits, 0 misses, 0 dirtied
  with patch v4 applied: buffer usage: 77924 hits, 0 misses, 0 dirtied

* When I inserted 1 million rows into the table and updated them (5):
  master: buffer usage: 27904 hits, 5021 misses, 1777 dirtied
  with patch v4 applied: buffer usage: 41051 hits, 9973 misses, 2564 dirtied

As I see, the number of pages is significantly higher than it was in the
master branch and, frankly, I couldn't fully figure out whether it was a
mistake or not.

I attached a test script (vacuum_check_logs.sql) with two indexes and no
checkpoints. I also attached the log files: the first one (vacuum_test)
is the result of testing on the master branch; the second one is with
your patch applied (vacuum_test_v4).

--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment: vacuum_check_logs.sql

postgres=# \i ~/vacuum_check_logs.sql
psql:/home/alena/vacuum_check_logs.sql:2: ERROR:  extension "dblink" does not exist
CREATE EXTENSION
 dblink_connect
----------------
 OK
(1 row)

psql:/home/alena/vacuum_check_logs.sql:5: ERROR:  table "vestat" does not exist
SET
SET
CREATE TABLE
CREATE INDEX
CREATE INDEX
psql:/home/alena/vacuum_check_logs.sql:16: INFO:  vacuuming "postgres.public.vestat"
psql:/home/alena/vacuum_check_logs.sql:16: INFO:  finished vacuuming "postgres.public.vestat": index scans: 0
pages: 0 removed, 0 remain, 0 scanned (100.00% of total)
tuples: 0 removed, 0 remain, 0 are dead but not yet removable
removable cutoff: 742, which was 0 XIDs old when operation ended
new relfrozenxid: 742, which is 3 XIDs ahead of previous value
frozen: 0 pages from table (100.00% of total) had 0 tuples frozen
index scan not needed: 0 pages from table (100.00% of total) had 0 dead item identifiers removed
I/O timings: read: 0.049 ms, write: 0.000 ms
avg read rate: 40.584 MB/s, avg write rate: 0.000 MB/s
buffer usage: 23 hits, 2 misses, 0 dirtied
WAL usage: 1 records, 0 full page images, 237 bytes
system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
VACUUM
 pg_sleep
----------

(1 row)

INSERT 0 100
DELETE 10
psql:/home/alena/vacuum_check_logs.sql:25: INFO:  vacuuming "postgres.public.vestat"
psql:/home/alena/vacuum_check_logs.sql:25: INFO:  launched 1 parallel vacuum worker for index vacuuming (planned: 1)
psql:/home/alena/vacuum_check_logs.sql:25: INFO:  table "vestat": truncated 3922 to 3530 pages
psql:/home/alena/vacuum_check_logs.sql:25: INFO:  finished vacuuming "postgres.public.vestat": index scans: 1
pages: 392 removed, 3530 remain, 3922 scanned (100.00% of total)
tuples: 10 removed, 90 remain, 0 are dead but not yet
Re: Fix parallel vacuum buffer usage reporting
On Wed, Apr 24, 2024 at 4:01 PM Alena Rybakina wrote:
> I tested the main postgres branch with and without your fix using a
> script that was written by me. It consists of five scenarios, and I
> compared the logs between the original master branch and the master
> branch with your patch.

Hi! Thanks for the tests.

> I have attached a test file (vacuum_check_logs.sql)

Looking at the script, you won't trigger the problem. The reporting
issue will only happen if there's a parallel index vacuum, and that will
only happen if there are at least 2 indexes [0]. You will need to create
an additional index.

> The same script was run, but using vacuum verbose analyze, and I saw
> the difference again in the fifth step:
> with your patch: buffer usage: 32312 hits, 607 misses, 1566 dirtied
> master: buffer usage: 32346 hits, 573 misses, 1360 dirtied

Isn't there a chance for the checkpointer to run during this time? That
could make the conditions between the two runs slightly different and
explain the change in the buffer report.

[0] https://github.com/postgres/postgres/blob/8a1b31e6e59631807a08a4e9465134c343bbdf5e/src/backend/access/heap/vacuumlazy.c#L2826-L2831
Re: Fix parallel vacuum buffer usage reporting
On 22.04.2024 11:07, Anthonin Bonnefoy wrote:
> On Sat, Apr 20, 2024 at 2:00 PM Alena Rybakina wrote:
>> Hi, thank you for your work on this subject.
>>
>> While I was reviewing your code, I noticed that your patch conflicts with
>> another patch [0] that has been committed.
>>
>> [0] https://www.postgresql.org/message-id/flat/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com
>
> I've rebased the patch and also split the changes:
> 1: Use pgBufferUsage in Vacuum and Analyze block reporting
> 2: Stop tracking buffer usage in the now unused Vacuum{Hit,Miss,Dirty} variables
> 3: Remove declarations of Vacuum{Hit,Miss,Dirty}

Hi! Thank you for your work; I have reviewed your corrections.

I tested the main postgres branch with and without your fix using a script I wrote. It consists of five scenarios, and I compared the logs between the original master branch and the master branch with your patch:

1. I added 1 million rows to the table and deleted 10% of them. I ran vacuum verbose and didn't see any differences.
buffer usage: 12585 hits, 0 misses, 4 dirtied

2. I opened another connection with a repeatable read transaction through the dblink extension and launched a query updating the records in the table under test. Later, I ran vacuum verbose again and also didn't see any differences.
buffer usage: 19424 hits, 0 misses, 6 dirtied

3. I committed the transaction and ran vacuum verbose again. Everything is fine in the logs.
buffer usage: 22960 hits, 0 misses, 11456 dirtied

4. I deleted all the data from the table and later started vacuum verbose again. The number of pages in the buffer matched with your patch too.
5. I inserted 1 million rows into the table and updated them, and I found a difference between the original master version and the version with your patch:
with your patch: buffer usage: 32315 hits, 606 misses, 1547 dirtied
original version: buffer usage: 32348 hits, 573 misses, 1456 dirtied
I suspect something isn't being counted.

The same script was run, but using vacuum verbose analyze, and I saw the difference again in the fifth step:
with your patch: buffer usage: 32312 hits, 607 misses, 1566 dirtied
master: buffer usage: 32346 hits, 573 misses, 1360 dirtied

I have attached a test file (vacuum_check_logs.sql) and four log files: two with the vacuum verbose command (the vacuum_log and vacuum_log_with_patch files) and two others with the vacuum verbose analyze command (vacuum_analyze_test_with_patch and vacuum_analyze_test). Both test approaches show logs with and without your changes.

> 1: Use pgBufferUsage in Vacuum and Analyze block reporting
>
> I think that if the analyze command doesn't have the same issue, we
> don't need to change it. Making the vacuum and the analyze consistent
> is a good point but I'd like to avoid making unnecessary changes in
> back branches. I think the patch set would contain:
>
> (a) make lazy vacuum use BufferUsage instead of
> VacuumPage{Hit,Miss,Dirty}. (backpatched down to pg13).
> (b) make analyze use BufferUsage and remove VacuumPage{Hit,Miss,Dirty}
> variables for consistency and simplicity (only for HEAD, if we agree).
>
> BTW I realized that VACUUM VERBOSE running on a temp table always
> shows the number of dirtied buffers being 0, which seems to be a bug.
> The patch (a) will resolve it as well.

I agree with that. I think we can leave the analyze changes for master, but I also doubt the need to backport these changes to the back branches.
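Scenarios 2 and 3 above can be sketched roughly as follows; the connection name and target table are assumptions for illustration, not the attached script's exact contents:

```sql
-- Hedged sketch of scenarios 2-3: hold an open repeatable read transaction
-- in a second session via dblink, so the deleted/updated tuples stay
-- "dead but not yet removable" during the first vacuum.
CREATE EXTENSION IF NOT EXISTS dblink;
SELECT dblink_connect('conn', 'dbname=postgres');
SELECT dblink_exec('conn', 'BEGIN ISOLATION LEVEL REPEATABLE READ');
SELECT dblink_exec('conn', 'UPDATE vestat SET y = y + 1');
-- The open snapshot on 'conn' holds back the removable cutoff here.
VACUUM (VERBOSE) vestat;
-- Scenario 3: commit the remote transaction; the next vacuum can now
-- actually remove the dead tuples, dirtying many more buffers.
SELECT dblink_exec('conn', 'COMMIT');
VACUUM (VERBOSE) vestat;
SELECT dblink_disconnect('conn');
```

Comparing the "buffer usage" lines of the two VACUUM VERBOSE outputs, with and without the patch, is what the five-scenario comparison above does.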
--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

postgres=# \i ~/vacuum_check_logs.sql
psql:/home/alena/vacuum_check_logs.sql:2: ERROR: extension "dblink" does not exist
CREATE EXTENSION
 dblink_connect
----------------
 OK
(1 row)

psql:/home/alena/vacuum_check_logs.sql:5: ERROR: table "vestat" does not exist
SET
SET
CREATE TABLE
CREATE INDEX
psql:/home/alena/vacuum_check_logs.sql:15: INFO: vacuuming "postgres.public.vestat"
psql:/home/alena/vacuum_check_logs.sql:15: INFO: finished vacuuming "postgres.public.vestat": index scans: 0
pages: 0 removed, 0 remain, 0 scanned (100.00% of total)
tuples: 0 removed, 0 remain, 0 are dead but not yet removable
removable cutoff: 741, which was 0 XIDs old when operation ended
new relfrozenxid: 741, which is 2 XIDs ahead of previous value
frozen: 0 pages from table (100.00% of total) had 0 tuples frozen
index scan not needed: 0 pages from table (100.00% of total) had 0 dead item identifiers removed
I/O timings: read: 0.100 ms, write: 0.000 ms
avg read rate: 29.260 MB/s, avg write rate: 0.000 MB/s
buffer usage: 13 hits, 1 misses, 0 dirtied
WAL usage: 1 records, 0 full page images, 237 bytes
system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
VACUUM
 pg_sleep
----------

(1 row)

INSERT 0 100
DELETE 10
psql:/home/alena/vacuum_check_logs.sql:24: INFO: vacuuming "postgres.public.vestat"
Re: Fix parallel vacuum buffer usage reporting
Thanks for the review!

> I think that if the analyze command doesn't have the same issue, we
> don't need to change it.

Good point. I had wrongly assumed that analyze was also impacted, but there's no parallel analyze, so its block count is correct.

> (a) make lazy vacuum use BufferUsage instead of
> VacuumPage{Hit,Miss,Dirty}. (backpatched down to pg13).
> (b) make analyze use BufferUsage and remove VacuumPage{Hit,Miss,Dirty}
> variables for consistency and simplicity (only for HEAD, if we agree).

I've isolated the fix in the first patch. I've kept the analyze change and the removal of the VacuumPage* variables split for clarity's sake.

> BTW I realized that VACUUM VERBOSE running on a temp table always
> shows the number of dirtied buffers being 0, which seems to be a bug.
> The patch (a) will resolve it as well.

Indeed, it's visible with the following test:

SET temp_buffers='1024';
CREATE TEMPORARY TABLE vacuum_fix (aid int, bid int);
INSERT INTO vacuum_fix SELECT *, * FROM generate_series(1, 100);
VACUUM vacuum_fix;
UPDATE vacuum_fix SET bid = bid+1;
VACUUM (VERBOSE, INDEX_CLEANUP ON) vacuum_fix;

Pre-patch:
avg read rate: 254.751 MB/s, avg write rate: 0.029 MB/s
buffer usage: 8853 hits, 8856 misses, 1 dirtied
WAL usage: 1 records, 1 full page images, 3049 bytes

The dirtied page is from pg_class (so not a local buffer).

With the fix:
avg read rate: 250.230 MB/s, avg write rate: 250.230 MB/s
buffer usage: 8853 hits, 8856 misses, 8856 dirtied
WAL usage: 1 records, 1 full page images, 3049 bytes

v4-0002-Use-pgBufferUsage-for-analyze-block-reporting.patch
Description: Binary data
v4-0001-Fix-parallel-and-temp-vacuum-buffer-usage-reporti.patch
Description: Binary data
v4-0003-Remove-unused-variables-for-vacuum-buffer-usage.patch
Description: Binary data
Re: Fix parallel vacuum buffer usage reporting
On Mon, Apr 22, 2024 at 5:07 PM Anthonin Bonnefoy wrote:
> On Sat, Apr 20, 2024 at 2:00 PM Alena Rybakina wrote:
>> Hi, thank you for your work on this subject.
>>
>> While I was reviewing your code, I noticed that your patch conflicts with
>> another patch [0] that has been committed.
>>
>> [0] https://www.postgresql.org/message-id/flat/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com
>
> I've rebased the patch and also split the changes:

Thank you for updating the patch!

> 1: Use pgBufferUsage in Vacuum and Analyze block reporting

I think that if the analyze command doesn't have the same issue, we don't need to change it. Making the vacuum and the analyze consistent is a good point, but I'd like to avoid making unnecessary changes in back branches. I think the patch set would contain:

(a) make lazy vacuum use BufferUsage instead of VacuumPage{Hit,Miss,Dirty} (backpatched down to pg13).
(b) make analyze use BufferUsage and remove the VacuumPage{Hit,Miss,Dirty} variables for consistency and simplicity (only for HEAD, if we agree).

BTW I realized that VACUUM VERBOSE running on a temp table always shows the number of dirtied buffers being 0, which seems to be a bug. The patch (a) will resolve it as well.

Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Re: Fix parallel vacuum buffer usage reporting
On Sat, Apr 20, 2024 at 2:00 PM Alena Rybakina wrote:
> Hi, thank you for your work on this subject.
>
> While I was reviewing your code, I noticed that your patch conflicts with
> another patch [0] that has been committed.
>
> [0] https://www.postgresql.org/message-id/flat/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com

I've rebased the patch and also split the changes:
1: Use pgBufferUsage in Vacuum and Analyze block reporting
2: Stop tracking buffer usage in the now unused Vacuum{Hit,Miss,Dirty} variables
3: Remove declarations of Vacuum{Hit,Miss,Dirty}

v3-0003-Remove-declaration-of-Vacuum-block-usage-tracking.patch
Description: Binary data
v3-0001-Fix-parallel-vacuum-buffer-usage-reporting.patch
Description: Binary data
v3-0002-Remove-unused-tracking-of-vacuum-buffer-usage.patch
Description: Binary data
Re: Fix parallel vacuum buffer usage reporting
Hi, thank you for your work on this subject.

While I was reviewing your code, I noticed that your patch conflicts with another patch [0] that has been committed.

[0] https://www.postgresql.org/message-id/flat/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com

On 28.03.2024 13:12, Anthonin Bonnefoy wrote:
> Hi, Thanks for the review.
>
> On Thu, Mar 28, 2024 at 4:07 AM Masahiko Sawada wrote:
>> As for the proposed patch, the following part should handle the temp
>> tables too:
>
> True, I've missed the local blocks. I've updated the patch:
> - read_rate and write_rate now include local block usage
> - I've added a specific output for reporting local blocks instead of
>   combining them in the same output.
>
> Regards,
> Anthonin

--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: Fix parallel vacuum buffer usage reporting
Hi, Thanks for the review.

On Thu, Mar 28, 2024 at 4:07 AM Masahiko Sawada wrote:
> As for the proposed patch, the following part should handle the temp
> tables too:

True, I've missed the local blocks. I've updated the patch:
- read_rate and write_rate now include local block usage
- I've added a specific output for reporting local blocks instead of combining them in the same output.

Regards,
Anthonin

v2-0001-Fix-parallel-vacuum-buffer-usage-reporting.patch
Description: Binary data
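The local-buffer path the updated patch is meant to cover can be exercised with a sketch like the one below; the sizes and names are illustrative assumptions, not taken from the patch itself:

```sql
-- Hedged sketch: temp tables live in local buffers, not shared buffers,
-- so their hits/reads/dirties are only visible if local block counters
-- are included in the VACUUM VERBOSE report.
SET temp_buffers = '1024';
CREATE TEMPORARY TABLE vacuum_fix (aid int, bid int);
INSERT INTO vacuum_fix SELECT i, i FROM generate_series(1, 1000000) i;
UPDATE vacuum_fix SET bid = bid + 1;
-- With local blocks counted, the buffer usage line should show the
-- dirtied temp-table pages instead of (almost) nothing.
VACUUM (VERBOSE, INDEX_CLEANUP ON) vacuum_fix;
```

Exactly this kind of temp-table test is what later surfaced the "0 dirtied" bug discussed upthread.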
Re: Fix parallel vacuum buffer usage reporting
Hi, Thank you for the report.

On Fri, Feb 9, 2024 at 6:10 PM Anthonin Bonnefoy wrote:
> Hi,
>
> With a db set up with pgbench, we add an additional index:
> CREATE INDEX ON pgbench_accounts(abalance)
>
> And trigger several updates and vacuums to reach a stable amount of
> dirtied pages:
> UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid=1;
> CHECKPOINT;
> VACUUM (VERBOSE, INDEX_CLEANUP ON) pgbench_accounts
>
> The vacuum will report the following:
> INFO: vacuuming "postgres.public.pgbench_accounts"
> INFO: launched 1 parallel vacuum worker for index vacuuming (planned: 1)
> INFO: finished vacuuming "postgres.public.pgbench_accounts": index scans: 1
> ...
> buffer usage: 122 hits, 165 misses, 4 dirtied
>
> 4 pages were reported dirtied. However, we have 5 dirtied blocks at
> the end of the vacuum when looking at pg_buffercache:
>
> SELECT c.relname, b.relfilenode
> FROM pg_buffercache b
> LEFT JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
> WHERE isdirty = true;
>             relname            | relfilenode
> -------------------------------+-------------
>  pg_class                      |        1259
>  pgbench_accounts              |       16400
>  pgbench_accounts              |       16400
>  pgbench_accounts_pkey         |       16408
>  pgbench_accounts_abalance_idx |       16480
>
> The missing dirty block comes from the parallel worker vacuuming the
> abalance index. Running vacuum with parallel disabled will give the
> correct result.
>
> Vacuum uses dedicated VacuumPage{Hit,Miss,Dirty} globals to track
> buffer usage. However, those values are not collected at the end of
> parallel vacuum workers, leading to an incorrect buffer count.

True. I think we should also fix it on back branches.

> Those vacuum-specific globals are redundant with the existing
> pgBufferUsage and only used in the verbose output. This patch removes
> them and replaces them with pgBufferUsage, which is already correctly
> collected at the end of parallel workers, fixing the buffer count.

It seems to make sense to remove VacuumPageHit and friends, only on the master branch, if we can use BufferUsage instead.
As for the proposed patch, the following part should handle the temp tables too:

         appendStringInfo(&buf, _("avg read rate: %.3f MB/s, avg write rate: %.3f MB/s\n"),
                          read_rate, write_rate);
         appendStringInfo(&buf, _("buffer usage: %lld hits, %lld misses, %lld dirtied\n"),
-                         (long long) AnalyzePageHit,
-                         (long long) AnalyzePageMiss,
-                         (long long) AnalyzePageDirty);
+                         (long long) bufferusage.shared_blks_hit,
+                         (long long) bufferusage.shared_blks_read,
+                         (long long) bufferusage.shared_blks_dirtied);
         appendStringInfo(&buf, _("system usage: %s"), pg_rusage_show(&ru0));

Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Fix parallel vacuum buffer usage reporting
Hi,

With a db set up with pgbench, we add an additional index:
CREATE INDEX ON pgbench_accounts(abalance)

And trigger several updates and vacuums to reach a stable amount of dirtied pages:
UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid=1;
CHECKPOINT;
VACUUM (VERBOSE, INDEX_CLEANUP ON) pgbench_accounts

The vacuum will report the following:
INFO: vacuuming "postgres.public.pgbench_accounts"
INFO: launched 1 parallel vacuum worker for index vacuuming (planned: 1)
INFO: finished vacuuming "postgres.public.pgbench_accounts": index scans: 1
...
buffer usage: 122 hits, 165 misses, 4 dirtied

4 pages were reported dirtied. However, we have 5 dirtied blocks at the end of the vacuum when looking at pg_buffercache:

SELECT c.relname, b.relfilenode
FROM pg_buffercache b
LEFT JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE isdirty = true;

            relname            | relfilenode
-------------------------------+-------------
 pg_class                      |        1259
 pgbench_accounts              |       16400
 pgbench_accounts              |       16400
 pgbench_accounts_pkey         |       16408
 pgbench_accounts_abalance_idx |       16480

The missing dirty block comes from the parallel worker vacuuming the abalance index. Running vacuum with parallel disabled will give the correct result.

Vacuum uses dedicated VacuumPage{Hit,Miss,Dirty} globals to track buffer usage. However, those values are not collected at the end of parallel vacuum workers, leading to an incorrect buffer count.

Those vacuum-specific globals are redundant with the existing pgBufferUsage and only used in the verbose output. This patch removes them and replaces them with pgBufferUsage, which is already correctly collected at the end of parallel workers, fixing the buffer count.

Regards,
Anthonin

v01-0001-Fix-parallel-vacuum-buffer-usage-reporting.patch
Description: Binary data