Re: ALTER tbl rewrite loses CLUSTER ON index (consider moving indisclustered to pg_class)

2020-02-28 Thread Tom Lane
Justin Pryzby  writes:
> I think the attached is 80% complete (I didn't touch pg_dump).
> One objection to this change would be that all relations (including indices)
> end up with relclustered fields, and pg_index already has a number of bools, 
> so
> it's not like this one bool is wasting a byte.
> I think relisclustered was a's clever way of avoiding that overhead 
> (c0ad5953).
> So I would be -0.5 on moving it to pg_class..
> But I think 0001 and 0002 are worthy.  Maybe the test in 0002 should live
> somewhere else.

0001 has been superseded by events (faade5d4c), so the cfbot is choking
on that one's failure to apply, and not testing any further.  Please
repost without 0001 so that we can get this testing again.

regards, tom lane




Re: ALTER tbl rewrite loses CLUSTER ON index (consider moving indisclustered to pg_class)

2020-02-16 Thread Justin Pryzby
On Mon, Feb 17, 2020 at 02:31:42PM +0900, Amit Langote wrote:
> Hi Justin,
> 
> On Fri, Feb 7, 2020 at 11:39 PM Justin Pryzby  wrote:
> > On Thu, Feb 06, 2020 at 02:24:47PM -0300, Alvaro Herrera wrote:
> > > On 2020-Feb-06, Justin Pryzby wrote:
> > >
> > > > I wondered if it wouldn't be better if CLUSTER ON was stored in 
> > > > pg_class as the
> > > > Oid of a clustered index, rather than a boolean in pg_index.
> > >
> > > Maybe.  Do you want to try a patch?
> >
> > I think the attached is 80% complete (I didn't touch pg_dump).
> >
> > One objection to this change would be that all relations (including indices)
> > end up with relclustered fields, and pg_index already has a number of 
> > bools, so
> > it's not like this one bool is wasting a byte.
> >
> > I think relisclustered was a's clever way of avoiding that overhead 
> > (c0ad5953).
> > So I would be -0.5 on moving it to pg_class..

In case there's any confusion: "a's" was probably me halfway changing
"someone's" to "a".

> Are you still for fixing ALTER TABLE losing relisclustered with the
> patch we were working on earlier [1], if not for moving relisclustered
> to pg_class anymore?

Thanks for remembering this one.

I think your patch is the correct fix.

I forgot to say it, but moving relisclustered to pg_class doesn't help to avoid
losting indisclustered: it still needs a fix just like this.  Anyway, I
withdrew my suggestion for moving to pg_class, since it has more overhead, even
for pg_class rows for relations which can't have indexes.

> I have read elsewhere [2] that forcing ALTER TABLE to rewrite in
> clustered order might not be a good option, but maybe that one is a
> more radical proposal than this.

Right; your fix seems uncontroversial.  I ran into this (indisclustered) bug
while starting to write that patch for "ALTER rewrite in clustered order".

-- 
Justin




Re: ALTER tbl rewrite loses CLUSTER ON index (consider moving indisclustered to pg_class)

2020-02-16 Thread Amit Langote
Hi Justin,

On Fri, Feb 7, 2020 at 11:39 PM Justin Pryzby  wrote:
> On Thu, Feb 06, 2020 at 02:24:47PM -0300, Alvaro Herrera wrote:
> > On 2020-Feb-06, Justin Pryzby wrote:
> >
> > > I wondered if it wouldn't be better if CLUSTER ON was stored in pg_class 
> > > as the
> > > Oid of a clustered index, rather than a boolean in pg_index.
> >
> > Maybe.  Do you want to try a patch?
>
> I think the attached is 80% complete (I didn't touch pg_dump).
>
> One objection to this change would be that all relations (including indices)
> end up with relclustered fields, and pg_index already has a number of bools, 
> so
> it's not like this one bool is wasting a byte.
>
> I think relisclustered was a's clever way of avoiding that overhead 
> (c0ad5953).
> So I would be -0.5 on moving it to pg_class..

Are you still for fixing ALTER TABLE losing relisclustered with the
patch we were working on earlier [1], if not for moving relisclustered
to pg_class anymore?

I have read elsewhere [2] that forcing ALTER TABLE to rewrite in
clustered order might not be a good option, but maybe that one is a
more radical proposal than this.

Thanks,
Amit

[1] 
https://postgr.es/m/CA%2BHiwqEt1HnXYckCdaO8%2BpOoFs7NNS5byoZ6Xg2B7epKbhS85w%40mail.gmail.com
[2] https://postgr.es/m/10984.1581181029%40sss.pgh.pa.us




Re: ALTER tbl rewrite loses CLUSTER ON index (consider moving indisclustered to pg_class)

2020-02-07 Thread Justin Pryzby
On Thu, Feb 06, 2020 at 02:24:47PM -0300, Alvaro Herrera wrote:
> On 2020-Feb-06, Justin Pryzby wrote:
> 
> > I wondered if it wouldn't be better if CLUSTER ON was stored in pg_class as 
> > the
> > Oid of a clustered index, rather than a boolean in pg_index.
> 
> Maybe.  Do you want to try a patch?

I think the attached is 80% complete (I didn't touch pg_dump).

One objection to this change would be that all relations (including indices)
end up with relclustered fields, and pg_index already has a number of bools, so
it's not like this one bool is wasting a byte.

I think relisclustered was a's clever way of avoiding that overhead (c0ad5953).
So I would be -0.5 on moving it to pg_class..

But I think 0001 and 0002 are worthy.  Maybe the test in 0002 should live
somewhere else.
>From 7eea0a17e495fe13379ffd589b551f2f145f5672 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 6 Feb 2020 21:48:13 -0600
Subject: [PATCH v1 1/3] Update comment obsolete since b9b8831a

---
 src/backend/commands/cluster.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index e9d7a7f..3adcbeb 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1539,9 +1539,9 @@ get_tables_to_cluster(MemoryContext cluster_context)
 
 	/*
 	 * Get all indexes that have indisclustered set and are owned by
-	 * appropriate user. System relations or nailed-in relations cannot ever
-	 * have indisclustered set, because CLUSTER will refuse to set it when
-	 * called with one of them as argument.
+	 * appropriate user. Shared relations cannot ever have indisclustered
+	 * set, because CLUSTER will refuse to set it when called with one as
+	 * an argument.
 	 */
 	indRelation = table_open(IndexRelationId, AccessShareLock);
 	ScanKeyInit(,
-- 
2.7.4

>From 4777be522a7aa8b8c77b13f765cbd02043438f2a Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 7 Feb 2020 08:12:50 -0600
Subject: [PATCH v1 2/3] Give developer a helpful kick in the pants if they
 change natts in one place but not another

---
 src/backend/bootstrap/bootstrap.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index bfc629c..d5e1888 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -25,7 +25,9 @@
 #include "access/xlog_internal.h"
 #include "bootstrap/bootstrap.h"
 #include "catalog/index.h"
+#include "catalog/pg_class.h"
 #include "catalog/pg_collation.h"
+#include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "common/link-canary.h"
 #include "libpq/pqsignal.h"
@@ -49,6 +51,7 @@
 #include "utils/ps_status.h"
 #include "utils/rel.h"
 #include "utils/relmapper.h"
+#include "utils/syscache.h"
 
 uint32		bootstrap_data_checksum_version = 0;	/* No checksum */
 
@@ -602,6 +605,26 @@ boot_openrel(char *relname)
 	TableScanDesc scan;
 	HeapTuple	tup;
 
+	/* Check that pg_class data is consistent now, rather than failing obscurely later */
+	struct { Oid oid; int natts; }
+		checknatts[] = {
+		{RelationRelationId, Natts_pg_class,},
+		{TypeRelationId, Natts_pg_type,},
+		{AttributeRelationId, Natts_pg_attribute,},
+		{ProcedureRelationId, Natts_pg_proc,},
+	};
+
+	for (int i=0; irelnatts);
+		ReleaseSysCache(tuple);
+	}
+
 	if (strlen(relname) >= NAMEDATALEN)
 		relname[NAMEDATALEN - 1] = '\0';
 
-- 
2.7.4

>From ed886f8202486dea8069b719d35a5d0db7f3277c Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 6 Feb 2020 12:56:34 -0600
Subject: [PATCH v1 3/3] Make cluster a property of table in pg_index..

..rather than of indexes in pg_index.

The only issue with this is that it makes pg_class larger, and the new column
applies not only to tables, but to indices.
---
 doc/src/sgml/catalogs.sgml |  14 +--
 src/backend/catalog/heap.c |   1 +
 src/backend/catalog/index.c|   6 --
 src/backend/commands/cluster.c | 172 +
 src/backend/commands/tablecmds.c   |   5 +-
 src/backend/utils/cache/relcache.c |   1 -
 src/bin/psql/describe.c|   4 +-
 src/include/catalog/pg_class.dat   |   2 +-
 src/include/catalog/pg_class.h |   3 +
 src/include/catalog/pg_index.h |   1 -
 10 files changed, 93 insertions(+), 116 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index a10b665..8efeaff 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1752,6 +1752,13 @@ SCRAM-SHA-256$iteration count:
  
 
  
+  relclustered
+  oid
+  
+  The OID of the index last clustered, or zero
+ 
+
+ 
   relpages
   int4
   
@@ -3808,13 +3815,6 @@ SCRAM-SHA-256$iteration count:
  
 
  
-  indisclustered
-  bool
-  
-  If true, the table was last clustered on this index
- 
-
- 
   indisvalid
   bool
   
diff --git