On Tue, 24 Nov 2020 at 15:05, Fujii Masao <masao.fu...@oss.nttdata.com> wrote:
>
> On 2020/11/21 2:32, Matthias van de Meent wrote:
> > Hi,
> >
> > The pg_stat_progress_cluster view can report incorrect
> > heap_blks_scanned values when synchronize_seqscans is enabled, because
> > it allows the sequential heap scan to not start at block 0. This can
> > result in wraparounds in the heap_blks_scanned column when the table
> > scan wraps around, and starting the next phase with heap_blks_scanned
> > != heap_blks_total. This issue was introduced with the
> > pg_stat_progress_cluster view.
>
> Good catch! I agree that this is a bug.
>
> >
> > The attached patch fixes the issue by accounting for a non-0
> > heapScan->rs_startblock and calculating the correct number with a
> > non-0 heapScan->rs_startblock in mind.
>
> Thanks for the patch! It basically looks good to me.

Thanks for the feedback!

> It's a bit waste of cycles to calculate and update the number of scanned
> blocks every cycles. So I'm inclined to change the code as follows.
> Thought?
>
> +       BlockNumber     prev_cblock = InvalidBlockNumber;
> <snip>
> +                       if (prev_cblock != heapScan->rs_cblock)
> +                       {
> +                               pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_BLKS_SCANNED,
> +                                                            (heapScan->rs_cblock +
> +                                                             heapScan->rs_nblocks -
> +                                                             heapScan->rs_startblock
> +                                                            ) % heapScan->rs_nblocks + 1);
> +                               prev_cblock = heapScan->rs_cblock;
> +                       }

That seems quite reasonable.
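
As a quick sanity check of that expression, here is a minimal standalone
sketch (not part of the patch). blocks_scanned() is a hypothetical helper
name, and BlockNumber is redefined locally as uint32_t purely so the snippet
compiles outside the PostgreSQL tree; both are assumptions for illustration.

```c
#include <stdint.h>

/* Local stand-in for PostgreSQL's BlockNumber (assumption for this sketch). */
typedef uint32_t BlockNumber;

/*
 * Hypothetical helper: number of blocks a wrapping sequential scan has
 * visited so far, given the current block, the block the scan started at,
 * and the total number of blocks.  Assumes nblocks > 0 and that both cblock
 * and startblock are below nblocks.
 */
static BlockNumber
blocks_scanned(BlockNumber cblock, BlockNumber startblock, BlockNumber nblocks)
{
	/* Adding nblocks before subtracting avoids unsigned underflow. */
	return (cblock + nblocks - startblock) % nblocks + 1;
}
```

For a 10-block table with the scan starting at block 7, this should give 1 at
block 7, 3 at block 9, 4 after wrapping to block 0, and 10 at block 6, so the
reported value never decreases across the wraparound.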

I noticed that with my proposed patch it is still possible to go to
the next phase while heap_blks_scanned != heap_blks_total. This can
happen when the final heap pages contain only dead tuples, so no tuple
is returned from the last heap page(s) of the scan. As the
heapScan->rs_cblock is set to InvalidBlockNumber when the scan is
finished (see heapam.c#1060-1072), I think it would be correct to set
heap_blks_scanned to heapScan->rs_nblocks at the end of the scan
instead.
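
To illustrate the case described above, here is a rough standalone model of
the loop (again not PostgreSQL code). simulate_reported_blocks() and the
has_live array are hypothetical names made up for this sketch; the model only
assumes that progress is reported when a block returns at least one tuple.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for PostgreSQL's BlockNumber (assumption for this sketch). */
typedef uint32_t BlockNumber;

/*
 * Hypothetical model of the scan loop: has_live[i] says whether block i
 * returns at least one tuple.  Progress is only updated when a tuple is
 * returned, so trailing blocks with only dead tuples are never counted
 * unless the value is corrected once the scan has finished.
 */
static BlockNumber
simulate_reported_blocks(const bool *has_live, BlockNumber nblocks,
						 BlockNumber startblock, bool fix_at_end)
{
	BlockNumber reported = 0;
	BlockNumber i;

	for (i = 0; i < nblocks; i++)
	{
		/* The scan visits blocks in order, wrapping around at the end. */
		BlockNumber cblock = (startblock + i) % nblocks;

		if (has_live[cblock])
			reported = (cblock + nblocks - startblock) % nblocks + 1;
	}

	/* Scan finished: report the full block count. */
	if (fix_at_end)
		reported = nblocks;

	return reported;
}
```

With five blocks, a start block of 3 and no live tuples in block 2 (the last
block visited), the model reports 4 without the final update and 5 with it.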

Please find attached a patch applying the suggested changes.

Matthias van de Meent
From b3327cace3bebdb15006834e21672fc30cb2f0bb Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm+postgre...@gmail.com>
Date: Fri, 20 Nov 2020 16:23:59 +0100
Subject: [PATCH v2] Fix CLUSTER progress reporting of number of blocks
 scanned.

The heapScan need not start at block 0, so heapScan->rs_cblock + 1 need not be
the correct number of blocks scanned. A more accurate value is
 ((heapScan->rs_cblock - heapScan->rs_startblock + heapScan->rs_nblocks) %
   heapScan->rs_nblocks) + 1, as it accounts for the wraparound and the initial
offset of the heapScan.

Additionally, a heap scan need not return tuples from the last scanned page.
This means that when table_scan_getnextslot returns false, we must manually
update the heap_blks_scanned parameter to the number of blocks in the heap
scan.
---
 src/backend/access/heap/heapam_handler.c | 28 ++++++++++++++++++++++--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index dcaea7135f..f20d4bed07 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -698,6 +698,7 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 	Datum	   *values;
 	bool	   *isnull;
 	BufferHeapTupleTableSlot *hslot;
+	BlockNumber prev_cblock = InvalidBlockNumber;
 
 	/* Remember if it's a system catalog */
 	is_system_catalog = IsSystemRelation(OldHeap);
@@ -793,14 +794,37 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 		else
 		{
 			if (!table_scan_getnextslot(tableScan, ForwardScanDirection, slot))
+			{
+				/*
+				 * A heap scan need not return tuples for the last page it has
+				 * scanned. To ensure that heap_blks_scanned is equal to
+				 * heap_blks_total after the table scan phase, this parameter
+				 * is manually updated to the correct value when the table scan
+				 * finishes.
+				 */
+				pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_BLKS_SCANNED,
+											 heapScan->rs_nblocks);
 				break;
+			}
 
 			/*
 			 * In scan-and-sort mode and also VACUUM FULL, set heap blocks
 			 * scanned
+			 *
+			 * Note that heapScan may start at an offset and wrap around, i.e.
+			 * rs_startblock may be >0, and rs_cblock may end with a number
+			 * below rs_startblock. To prevent showing this wraparound to the
+			 * user, we offset rs_cblock by rs_startblock (modulo rs_nblocks).
 			 */
-			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_BLKS_SCANNED,
-										 heapScan->rs_cblock + 1);
+			if (prev_cblock != heapScan->rs_cblock)
+			{
+				pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_BLKS_SCANNED,
+											 (heapScan->rs_cblock +
+											  heapScan->rs_nblocks -
+											  heapScan->rs_startblock
+											 ) % heapScan->rs_nblocks + 1);
+				prev_cblock = heapScan->rs_cblock;
+			}
 		}
 
 		tuple = ExecFetchSlotHeapTuple(slot, false, NULL);
-- 
2.20.1
