[HACKERS] CLUSTER command progress monitor

Tatsuro Yamada Wed, 30 Aug 2017 19:14:53 -0700

Hi,

Following is a proposal for reporting the progress of CLUSTER command:


It seems that the following could be the phases of CLUSTER processing:

  1. scanning heap
  2. sort tuples
  3. writing new heap
  4. scan heap and write new heap
  5. swapping relation files
  6. rebuild index
  7. performing final cleanup

These phases are based on Rahila's presentation at PGCon 2017
 
(https://www.pgcon.org/2017/schedule/attachments/472_Progress%20Measurement%20PostgreSQL.pdf)
and I added some phases on it.

CLUSTER command may use Index Scan or Seq Scan when scanning the heap.
Depending on which one is chosen, the command will proceed in the
following sequence of phases:

  * Seq Scan
    1. scanning heap
    2. sort tuples
    3. writing new heap
    5. swapping relation files
    6. rebuild index
    7. performing final cleanup

  * Index Scan
    4. scan heap and write new heap
    5. swapping relation files
    6. rebuild index
    7. performing final cleanup

The view provides the information of CLUSTER command progress details as follows
postgres=# \d pg_stat_progress_cluster
           View "pg_catalog.pg_stat_progress_cluster"
       Column        |  Type   | Collation | Nullable | Default
---------------------+---------+-----------+----------+---------
 pid                 | integer |           |          |
 datid               | oid     |           |          |
 datname             | name    |           |          |
 relid               | oid     |           |          |
 phase               | text    |           |          |
 scan_method         | text    |           |          |
 scan_index_relid    | bigint  |           |          |
 heap_tuples_total   | bigint  |           |          |
 heap_tuples_scanned | bigint  |           |          |


Then I have questions.

  * Should we have separate views for them?  Or should both be covered by the
    same view with some indication of which command (CLUSTER or VACUUM FULL)
    is actually running?
    I mean this progress monitor could be covering not only CLUSTER command but 
also
    VACUUM FULL command.

  * I chose tuples as scan heap's counter (heap_tuples_scanned) since it's not
    easy to get current blocks from Index Scan. Is it Ok?


I'll add this patch to CF2017-09.
Any comments or suggestion are welcome.

Regards,
Tatsuro Yamada
NTT Open Source Software Center

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 5575c2c..18fe2c6 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -332,6 +332,14 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
       </entry>
      </row>
 
+     <row>
+      <entry><structname>pg_stat_progress_cluster</><indexterm><primary>pg_stat_progress_cluster</primary></indexterm></entry>
+      <entry>One row for each backend running
+       <command>CLUSTER</>, showing current progress.
+       See <xref linkend='cluster-progress-reporting'>.
+      </entry>
+     </row>
+
     </tbody>
    </tgroup>
   </table>
@@ -3229,9 +3237,9 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
   <para>
    <productname>PostgreSQL</> has the ability to report the progress of
-   certain commands during command execution.  Currently, the only command
-   which supports progress reporting is <command>VACUUM</>.  This may be
-   expanded in the future.
+   certain commands during command execution.  Currently, the suppoted 
+   progress reporting commands are <command>VACUUM</> and <command>CLUSTER</>.
+   This may be expanded in the future.
   </para>
 
  <sect2 id="vacuum-progress-reporting">
@@ -3423,6 +3431,157 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
   </table>
 
  </sect2>
+
+ <sect2 id="cluster-progress-reporting">
+  <title>CLUSTER Progress Reporting</title>
+
+  <para>
+   Whenever <command>CLUSTER</> is running, the
+   <structname>pg_stat_progress_cluster</structname> view will contain
+   one row for each backend that is currently clustering. 
+   The tables below describe the information that will be reported and
+   provide information about how to interpret it.
+  </para>
+
+  <table id="pg-stat-progress-cluster-view" xreflabel="pg_stat_progress_cluster">
+   <title><structname>pg_stat_progress_cluster</structname> View</title>
+   <tgroup cols="3">
+    <thead>
+    <row>
+      <entry>Column</entry>
+      <entry>Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><structfield>pid</></entry>
+     <entry><type>integer</></entry>
+     <entry>Process ID of backend.</entry>
+    </row>
+    <row>
+     <entry><structfield>datid</></entry>
+     <entry><type>oid</></entry>
+     <entry>OID of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>datname</></entry>
+     <entry><type>name</></entry>
+     <entry>Name of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>relid</></entry>
+     <entry><type>oid</></entry>
+     <entry>OID of the table being clustered.</entry>
+    </row>
+    <row>
+     <entry><structfield>phase</></entry>
+     <entry><type>text</></entry>
+     <entry>
+       Current processing phase of cluster.  See <xref linkend='cluster-phases'>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>scan_method</></entry>
+     <entry><type>text</></entry>
+     <entry>
+       Scan method of table: index scan/seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>scan_index_relid</></entry>
+     <entry><type>bigint</></entry>
+     <entry>
+       OID of the index.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_total</></entry>
+     <entry><type>bigint</></entry>
+     <entry>
+       Total number of heap tuples in the table.  This number is reported
+       as of the beginning of the scan; tuples added later will not be (and
+       need not be) visited by this <command>CLUSTER</>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_scanned</></entry>
+     <entry><type>bigint</></entry>
+     <entry>
+       Number of heap tuples scanned.
+       This counter only advances when the phase is <literal>scanning heap</>, 
+       <literal>writing new heap</> and <literal>scan heap and write new heap</>.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="cluster-phases">
+   <title>CLUSTER phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       <command>CLUSTER</> is preparing to begin scanning the heap.  This
+       phase is expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>scanning heap</literal></entry>
+     <entry>
+       <command>CLUSTER</> is currently scanning heap from the table by
+       seq scan. This phase is shown when the <structfield>scan_method</> is seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>sorting tuples</literal></entry>
+     <entry>
+       <command>CLUSTER</> is currently sorting tuples. 
+       This phase is shown when the <structfield>scan_method</> is seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>scan heap and write new heap</literal></entry>
+     <entry>
+       <command>CLUSTER</> is currently scanning heap from the table and
+       writing new clusterd heap.  This phase is shown when the <structfield>scan_method</> is
+       index scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       <command>CLUSTER</> is currently swapping old heap and new clustered heap.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       <command>CLUSTER</> is rebuilding index.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       <command>CLUSTER</> is performing final cleanup.  When this phase is 
+       completed, <command>CLUSTER</> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+ </sect2>
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index dc40cde..c10c830 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -899,6 +899,30 @@ CREATE VIEW pg_stat_progress_vacuum AS
     FROM pg_stat_get_progress_info('VACUUM') AS S
 		LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_cluster AS
+    SELECT
+        S.pid AS pid,
+        S.datid AS datid,
+        D.datname AS datname,
+        S.relid AS relid,
+        CASE S.param1 WHEN 0 THEN 'initializing'
+                      WHEN 1 THEN 'scanning heap'
+                      WHEN 2 THEN 'sorting tuples'
+                      WHEN 3 THEN 'writing new heap'
+                      WHEN 4 THEN 'scan heap and write new heap'
+                      WHEN 5 THEN 'swapping relation files'
+                      WHEN 6 THEN 'rebuilding index'
+                      WHEN 7 THEN 'performing final cleanup'
+                      END AS phase,
+        CASE S.param2 WHEN 0 THEN 'index scan'
+                      WHEN 1 THEN 'seq scan'
+                      END AS scan_method,
+        S.param3 AS scan_index_relid,
+        S.param4 AS heap_tuples_total,
+        S.param5 AS heap_tuples_scanned
+    FROM pg_stat_get_progress_info('CLUSTER') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 48f1e6e..8f2a473 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -34,10 +34,12 @@
 #include "catalog/objectaccess.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
+#include "commands/progress.h"
 #include "commands/tablecmds.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "optimizer/planner.h"
+#include "pgstat.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/predicate.h"
@@ -105,6 +107,7 @@ static void reform_and_rewrite_tuple(HeapTuple tuple,
 void
 cluster(ClusterStmt *stmt, bool isTopLevel)
 {
+
 	if (stmt->relation != NULL)
 	{
 		/* This is the single-relation case. */
@@ -276,6 +279,11 @@ cluster_rel(Oid tableOid, Oid indexOid, bool recheck, bool verbose)
 	if (!OldHeap)
 		return;
 
+	/* Start progress monitor for cluster command */
+	pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
+	/* Set indexOid to column */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_INDEX_RELID, indexOid);
+
 	/*
 	 * Since we may open a new transaction for each relation, we have to check
 	 * that the relation still is what we think it is.
@@ -404,6 +412,8 @@ cluster_rel(Oid tableOid, Oid indexOid, bool recheck, bool verbose)
 	rebuild_relation(OldHeap, indexOid, verbose);
 
 	/* NB: rebuild_relation does heap_close() on OldHeap */
+
+	pgstat_progress_end_command();
 }
 
 /*
@@ -771,6 +781,9 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	else
 		OldIndex = NULL;
 
+	/* Set reltuples to total_tuples */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_TUPLES, OldHeap->rd_rel->reltuples);
+
 	/*
 	 * Their tuple descriptors should be exactly alike, but here we only need
 	 * assume that they have the same number of columns.
@@ -902,12 +915,16 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (OldIndex != NULL && !use_sort)
 	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP);
+		pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD, PROGRESS_CLUSTER_METHOD_INDEX_SCAN);
 		heapScan = NULL;
 		indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
 		index_rescan(indexScan, NULL, 0, NULL, 0);
 	}
 	else
 	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SCAN_HEAP);
+		pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD, PROGRESS_CLUSTER_METHOD_SEQ_SCAN);
 		heapScan = heap_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
 		indexScan = NULL;
 	}
@@ -1039,6 +1056,9 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 NewHeap->rd_rel->relhasoids, rwstate);
+
+		/* Regardless of index scan or seq scan, update tuples_scanned column */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, num_tuples);
 	}
 
 	if (indexScan != NULL)
@@ -1052,8 +1072,15 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (tuplesort != NULL)
 	{
+		double num_tuples = 0;
+
+		/* Report that we are now sorting tuples */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SORT_TUPLES);
 		tuplesort_performsort(tuplesort);
 
+		/* Report that we are now writing new heap */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP);
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, num_tuples);
 		for (;;)
 		{
 			HeapTuple	tuple;
@@ -1064,10 +1091,13 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 			if (tuple == NULL)
 				break;
 
+			num_tuples += 1;
 			reform_and_rewrite_tuple(tuple,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 NewHeap->rd_rel->relhasoids, rwstate);
+
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, num_tuples);
 		}
 
 		tuplesort_end(tuplesort);
@@ -1480,6 +1510,9 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	int			reindex_flags;
 	int			i;
 
+	/* Report that we are now swapping relation files */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES);
+
 	/* Zero out possible results from swapped_relation_files */
 	memset(mapped_tables, 0, sizeof(mapped_tables));
 
@@ -1514,6 +1547,10 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	 * because the new heap won't contain any HOT chains at all, let alone
 	 * broken ones, so it can't be necessary to set indcheckxmin.
 	 */
+
+	/* Report that we are now reindexing relations */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_REBUILD_INDEX);
+
 	reindex_flags = REINDEX_REL_SUPPRESS_INDEX_USE;
 	if (check_constraints)
 		reindex_flags |= REINDEX_REL_CHECK_CONSTRAINTS;
@@ -1529,6 +1566,9 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 
 	reindex_relation(OIDOldHeap, reindex_flags, 0);
 
+	/* Report that we are now doing clean up */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP);
+
 	/*
 	 * If the relation being rebuild is pg_class, swap_relation_files()
 	 * couldn't update pg_class's own pg_class entry (check comments in
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 20ce48b..90bde85 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -467,6 +467,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 	/* Translate command name into command type code. */
 	if (pg_strcasecmp(cmd, "VACUUM") == 0)
 		cmdtype = PROGRESS_COMMAND_VACUUM;
+	else if(pg_strcasecmp(cmd, "CLUSTER") == 0)
+		cmdtype = PROGRESS_COMMAND_CLUSTER;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 9472ecc..28ccf38 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -34,4 +34,24 @@
 #define PROGRESS_VACUUM_PHASE_TRUNCATE			5
 #define PROGRESS_VACUUM_PHASE_FINAL_CLEANUP		6
 
+/* Progress parameters for cluster */
+#define PROGRESS_CLUSTER_PHASE					0
+#define PROGRESS_CLUSTER_SCAN_METHOD			1
+#define PROGRESS_CLUSTER_SCAN_INDEX_RELID		2
+#define PROGRESS_CLUSTER_TOTAL_HEAP_TUPLES	  	3
+#define PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED	4
+
+/* Phases of cluster (as dvertised via PROGRESS_CLUSTER_PHASE) */
+#define PROGRESS_CLUSTER_PHASE_SCAN_HEAP						1
+#define PROGRESS_CLUSTER_PHASE_SORT_TUPLES						2
+#define PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP					3
+#define PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP		4
+#define PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES					5
+#define PROGRESS_CLUSTER_PHASE_REBUILD_INDEX					6
+#define PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP					7
+
+/* Scan methods of cluster */
+#define PROGRESS_CLUSTER_METHOD_INDEX_SCAN		0
+#define PROGRESS_CLUSTER_METHOD_SEQ_SCAN		1
+
 #endif
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index cb05d9b..1c6d5c7 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -915,7 +915,8 @@ typedef enum
 typedef enum ProgressCommandType
 {
 	PROGRESS_COMMAND_INVALID,
-	PROGRESS_COMMAND_VACUUM
+	PROGRESS_COMMAND_VACUUM,
+	PROGRESS_COMMAND_CLUSTER
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	10
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index d582bc9..cacece5 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1841,6 +1841,31 @@ pg_stat_progress_vacuum| SELECT s.pid,
     s.param7 AS num_dead_tuples
    FROM (pg_stat_get_progress_info('VACUUM'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_cluster| SELECT
+    s.pid,
+    s.datid,
+    d.datname,
+    s.relid,
+        CASE s.param1
+            WHEN 0 THEN 'initializing'::text
+            WHEN 1 THEN 'scanning heap'::text
+            WHEN 2 THEN 'sorting tuples'::text
+            WHEN 3 THEN 'writing new heap'::text
+            WHEN 4 THEN 'scan heap and write new heap'::text
+            WHEN 5 THEN 'swapping relation files'::text
+            WHEN 6 THEN 'rebuilding index'::text
+            WHEN 7 THEN 'performing final cleanup'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE S.param2
+            WHEN 0 THEN 'index scan'
+            WHEN 1 THEN 'seq scan'
+            END AS scan_method,
+    s.param3 AS index_relid,
+    s.param4 AS heap_blks_total,
+    s.param5 AS heap_blks_scanned
+   FROM (pg_stat_get_progress_info('CLUSTER'::text) s(pid, datid, relid, param1, param2, param3, param4, param5)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_replication| SELECT s.pid,
     s.usesysid,
     u.rolname AS usename,

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] CLUSTER command progress monitor

Reply via email to