Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-25 Thread Alvaro Herrera
Alvaro Herrera wrote:
> I think this is the right fix for this problem.  I wonder about
> exploring other callers of RelationGetIndexList to see who else could be
> confused ...

CLUSTER was also affected (and ALTER TABLE .. CLUSTER ON).  Patched both
and added simple tests.  Couldn't detect any other immediate problems.
Maybe UNIQUE indexes will have a similar hazard, with replica identity.

Case closed.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-24 Thread Alvaro Herrera
I think this is the right fix for this problem.  I wonder about
exploring other callers of RelationGetIndexList to see who else could be
confused ...

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 53506fd3a9f0cb81334024bf5a0e8856fd8e5e82 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Wed, 24 Jan 2018 14:40:26 -0300
Subject: [PATCH] Ignore partitioned indexes in get_relation_info

---
 src/backend/optimizer/util/plancat.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/backend/optimizer/util/plancat.c 
b/src/backend/optimizer/util/plancat.c
index 8c60b35068..60f21711f4 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -208,6 +208,16 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, 
bool inhparent,
}
 
/*
+* Ignore partitioned indexes, since they are not 
usable for
+* queries.
+*/
+   if (indexRelation->rd_rel->relkind == 
RELKIND_PARTITIONED_INDEX)
+   {
+   index_close(indexRelation, NoLock);
+   continue;
+   }
+
+   /*
 * If the index is valid, but cannot yet be used, 
ignore it; but
 * mark the plan we are generating as transient. See
 * src/backend/access/heap/README.HOT for discussion.
-- 
2.11.0



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-19 Thread Alvaro Herrera
Hi Jesper

Jesper Pedersen wrote:

> I get
> 
> pg_restore: creating INDEX "pg_catalog.pg_authid_oid_index"
> pg_restore: [archiver (db)] Error while PROCESSING TOC:
> pg_restore: [archiver (db)] Error from TOC entry 5493; 1259 2677 INDEX
> pg_authid_oid_index jpedersen
> pg_restore: [archiver (db)] could not execute query: ERROR:  permission
> denied: "pg_authid" is a system catalog
> Command was:
> -- For binary upgrade, must preserve pg_class oids
> SELECT 
> pg_catalog.binary_upgrade_set_next_index_pg_class_oid('2677'::pg_catalog.oid);
> 
> CREATE UNIQUE INDEX "pg_authid_oid_index" ON "pg_authid" USING "btree"
> ("oid");
> 
> during check-world.

Wow, thanks for reporting.  That's freaking weird ...  Maybe I need to
use a more restrictive condition.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-18 Thread Alvaro Herrera
Here's a patch to add pg_dump tests.  This verifies that the
CREATE INDEX on parent and partition appear, as well as ALTER INDEX ..
ATTACH PARTITION.

While doing this, I noticed a small bug: the ALTER INDEX would not be
dumped when only one of the schemas are specified to pg_dump (schema of
parent and schema of partition), but it would be if both were specified.
This patch fixes that problem too.  AFAICS the test is correct after
that fix (ie. the "like" and "unlike" sets are correct now.)

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From f969224f403e317e9e96b519e02d1d215c651b9b Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Thu, 18 Jan 2018 18:39:52 -0300
Subject: [PATCH] Add tests for pg_dump; fix minor bug discovered while at it

---
 src/bin/pg_dump/common.c |  5 ++-
 src/bin/pg_dump/pg_dump.c| 10 -
 src/bin/pg_dump/t/002_pg_dump.pl | 95 
 3 files changed, 107 insertions(+), 3 deletions(-)

diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 9483053680..2ec3627a68 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -349,7 +349,10 @@ flagInhTables(Archive *fout, TableInfo *tblinfo, int 
numTables,
if (find_parents)
findParentsByOid([i], inhinfo, numInherits);
 
-   /* If needed, mark the parents as interesting for 
getTableAttrs. */
+   /*
+* If needed, mark the parents as interesting for getTableAttrs
+* and getIndexes.
+*/
if (mark_parents)
{
int numParents = 
tblinfo[i].numParents;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 92b29e2e5f..40beb3bd48 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -6535,8 +6535,14 @@ getIndexes(Archive *fout, TableInfo tblinfo[], int 
numTables)
if (!tbinfo->hasindex)
continue;
 
-   /* Ignore indexes of tables whose definitions are not to be 
dumped */
-   if (!(tbinfo->dobj.dump & DUMP_COMPONENT_DEFINITION))
+   /*
+* Ignore indexes of tables whose definitions are not to be 
dumped.
+*
+* We also need indexes on partitioned tables which have 
partitions to
+* be dumped, in order to dump the indexes on the partitions.
+*/
+   if (!(tbinfo->dobj.dump & DUMP_COMPONENT_DEFINITION) &&
+   !tbinfo->interesting)
continue;
 
if (g_verbose)
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 7cf9bdadb2..6e36b73174 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -5203,6 +5203,101 @@ qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL 
WITH FUNCTION pg_catalog
section_pre_data => 1,
test_schema_plus_blobs   => 1, }, },
 
+   'CREATE INDEX ON ONLY measurement' => {
+   all_runs => 1,
+   catch_all=> 'CREATE ... commands',
+   create_order => 92,
+   create_sql   => 'CREATE INDEX ON dump_test.measurement 
(city_id, logdate);',
+   regexp => qr/^
+   \QCREATE INDEX measurement_city_id_logdate_idx ON ONLY 
measurement USING\E
+   /xm,
+   like => {
+   binary_upgrade   => 1,
+   clean=> 1,
+   clean_if_exists  => 1,
+   createdb => 1,
+   defaults => 1,
+   exclude_dump_test_schema => 1,
+   exclude_test_table   => 1,
+   exclude_test_table_data  => 1,
+   no_blobs => 1,
+   no_privs => 1,
+   no_owner => 1,
+   only_dump_test_schema=> 1,
+   pg_dumpall_dbprivs   => 1,
+   schema_only  => 1,
+   section_post_data=> 1,
+   test_schema_plus_blobs   => 1,
+   with_oids=> 1, },
+   unlike => {
+   only_dump_test_table => 1,
+   pg_dumpall_globals   => 1,
+   pg_dumpall_globals_clean => 1,
+   role => 1,
+   section_pre_data => 1, }, },
+
+   'CREATE INDEX ... ON measurement_y2006_m2' => {
+   all_runs => 1,
+  

Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-17 Thread Robert Haas
On Tue, Jan 16, 2018 at 6:08 PM, David Rowley
 wrote:
> On 17 January 2018 at 03:58, Alvaro Herrera  wrote:
>>> 9. Error details claim that p2_a_idx is not a partition of p.
>>> Shouldn't it say table "p2" is not a partition of "p"?
>>
>> You missed the "on" in the DETAIL:
>>   DETAIL:  Index "p2_a_idx" is not on a partition of table "p".
>> You could argue that this is obscurely worded, but if you look at the
>> command:
>>ALTER INDEX p_a_idx ATTACH PARTITION p2_a_idx;
>> nowhere is table p2 mentioned, so I'm not sure it's a great idea to
>> mention the table in the error message.
>
> I think I did miss the "on".

I think you will not be the only person to make that mistake.  I think
it would be better phrased as

DETAIL: "p2_a_idx" is not an index of any partition of table "p"

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-16 Thread David Rowley
On 17 January 2018 at 03:58, Alvaro Herrera  wrote:
>> 9. Error details claim that p2_a_idx is not a partition of p.
>> Shouldn't it say table "p2" is not a partition of "p"?
>
> You missed the "on" in the DETAIL:
>   DETAIL:  Index "p2_a_idx" is not on a partition of table "p".
> You could argue that this is obscurely worded, but if you look at the
> command:
>ALTER INDEX p_a_idx ATTACH PARTITION p2_a_idx;
> nowhere is table p2 mentioned, so I'm not sure it's a great idea to
> mention the table in the error message.

I think I did miss the "on".

>> 10. You've added code to get_relation_info() to handle partitioned
>> indexes, but that code is skipped [...]
>> The new code should either be removed, or you should load the index
>> list for a partitioned table.
>
> That's a leftover from previous versions too.  YAGNI principle says I
> should remove it rather than activate it, I think, since the optimizer
> is not going to use the data for anything.

Agreed.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-15 Thread David Rowley
On 12 January 2018 at 07:52, Alvaro Herrera  wrote:
> The delta patch turned out to have at least one stupid bug, and a few
> non-stupid bugs.  Here's an updated version that should behave
> correctly, and addresses all reported problems.

Thanks for updating the patch.

I've just made another pass over the patch and have a few more comments.

1. Expression varattnos don't seem to get translated correctly.
drop table p;
create table p (a int, b int) partition by list (a);
create table p1 (b int, a int);
alter table p attach partition p1 for values in(1);
create index on p (abs(a));
\d p1
 Table "public.p1"
 Column |  Type   | Collation | Nullable | Default
+-+---+--+-
 b  | integer |   |  |
 a  | integer |   |  |
Partition of: p FOR VALUES IN (1)
Indexes:
"p1_abs_idx" btree (abs(b))

CompareIndexInfo() seems to validate existing indexes correctly.

Can you also write a test for this?

2. No stats are gathered for the partitioned expression index.

drop table p;
create table p (a int, b int) partition by range (a);
create table p1 (b int, a int);
alter table p attach partition p1 for values from (1) to (1001);
create index on p1 (abs(a));
create index on p (abs(a));
insert into p select x,x from generate_series(1,1000) x;
analyze p, p1;

select * from pg_statistic where starelid = 'p_abs_idx'::regclass;
select * from pg_statistic where starelid = 'p1_abs_idx'::regclass;

3. deadlocking: I've not yet debugged this to see if this can be avoided.

drop table p;
create table p (a int not null) partition by list (a);
create table p1 partition of p for values in(1);
insert into p1 select 1 from generate_Series(1,1000);

-- Session 1:
create index concurrently on p1 (a);

-- Session 2:
create index on p (a);

-- Session 1:
ERROR:  deadlock detected
DETAIL:  Process 10860 waits for ShareLock on virtual transaction
3/97; blocked by process 7968.
Process 7968 waits for ShareLock on relation 90492 of database 12342;
blocked by process 10860.
HINT:  See server log for query details.

4. nparts initialized twice in DefineIndex()

int nparts = partdesc->nparts;
Oid*part_oids;
TupleDesc parentDesc;
bool invalidate_parent = false;

nparts = partdesc->nparts;

5. In DefineIndex, should the following have a heap_freetuple(newtup); ?

newtup = heap_copytuple(tup);
((Form_pg_index) GETSTRUCT(newtup))->indisvalid = false;
CatalogTupleUpdate(pg_index, >t_self, newtup);
ReleaseSysCache(tup);
heap_close(pg_index, RowExclusiveLock);

6. Header comment in ReindexPartitionedIndex() claims it "obtains the
list of children and releases the lock on parent before applying
reindex on each child.", but it does not do that. It just returns an
error.

7. I see you changed StoreSingleInheritance to have the seqnumber as
int32 instead of int16, but StoreCatalogInheritance1 still has an
int16 seqNumber parameter. Maybe that should be fixed independently
and backpatched?

8. ATExecCmd: You've added an Assert() in the DETACH PARTITION case to
ensure we're working with a table, but what makes you certain that the
"else" case on the ATTACH PARTITION is always going to get a
partitioned index?

Maybe it would be better to just move the Assert()s into the function
being called?

9. Error details claim that p2_a_idx is not a partition of p.
Shouldn't it say table "p2" is not a partition of "p"?

DROP TABLE p;
CREATE TABLE p (a INT NOT NULL, b INT NOT NULL) PARTITION BY LIST (a);
CREATE TABLE p1 PARTITION OF p FOR VALUES IN(1) PARTITION BY LIST (b);
CREATE TABLE p2 PARTITION OF p1 FOR VALUES IN(1);

CREATE INDEX p_a_idx ON ONLY p (a);

CREATE INDEX p2_a_idx ON p2 (a);
ALTER INDEX p_a_idx ATTACH PARTITION p2_a_idx;
ERROR:  cannot attach index "p2_a_idx" as a partition of index "p_a_idx"
DETAIL:  Index "p2_a_idx" is not on a partition of table "p".

10. You've added code to get_relation_info() to handle partitioned
indexes, but that code is skipped due to:

/*
* Make list of indexes.  Ignore indexes on system catalogs if told to.
* Don't bother with indexes for an inheritance parent, either.
*/
if (inhparent ||

The new code should either be removed, or you should load the index
list for a partitioned table.

11. In ProcessUtilitySlow(), the following if needs to check if we're
dealing with a partitioned table:

/* Also, lock any descendant tables if recursive */
if (stmt->relation->inh)
inheritors = find_all_inheritors(relid, lockmode, NULL);

Otherwise we will (perhaps harmlessly) lock children in the following case:

DROP TABLE p;
CREATE TABLE p (a INT NOT NULL, b INT NOT NULL);
CREATE TABLE p1 (a INT NOT NULL, b INT NOT NULL) INHERITS (p);
CREATE INDEX ON p (a);

> I'll set this as Ready for Committer now.

Will set back to waiting on author until the above things are addressed.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-12 Thread Peter Eisentraut
On 1/11/18 13:52, Alvaro Herrera wrote:
> The delta patch turned out to have at least one stupid bug, and a few
> non-stupid bugs.  Here's an updated version that should behave
> correctly, and addresses all reported problems.

It seems that CompareIndexInfo() still doesn't compare indexes' operator
classes and collations.

Also, some new test cases for pg_dump would be nice.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-10 Thread David Rowley
On 11 January 2018 at 04:37, Alvaro Herrera  wrote:
> In prior incarnations of the patch, I had an if-test to prevent
> attaching invalid indexes, but I decided to remove it at some point --
> mainly thinking of attaching a partition for which a CREATE INDEX
> CONCURRENTLY was running which already had the index as invalid and was
> later expected to become valid.  I suppose that doesn't really work
> anyway because of locking considerations (you can't attach a partition
> in which CIC is concurrently running, can you).  I'll think some more
> about this case and post an updated version later.

I guess CIC will need to check if the index has a parent index when
setting indisvalid = true, and do likewise to the parent index if all
other siblings are valid.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-08 Thread David Rowley
On 9 January 2018 at 09:54, Alvaro Herrera  wrote:
> I removed the pg_index.indparentidx column that previous versions add,
> and replaced it with pg_inherits rows.  This makes the code a little bit
> bulkier in a couple of places, but nothing terrible.  As a benefit,
> there's no extra index in pg_index now.

Hi Álvaro,

I've made a pass over this patch. It's mostly in pretty good shape,
but I did find a few things to note down.

1. "either partition" -> "either the partition"

+   so the partition index is dropped together with either
partition it indexes,
+   or with the parent index it is attached to.

2. In findDependentObjects(), should the following if test be below
the ReleaseDeletionLock call?

+ if (foundDep->deptype == DEPENDENCY_INTERNAL_AUTO)
+ break;
  ReleaseDeletionLock(object);

3. The following existing comment indicates that we'll be creating a
disk file for the index. Should we update this comment to say that
this is the case only for RELKIND_INDEX?

/*
* create the index relation's relcache entry and physical disk file. (If
* we fail further down, it's the smgr's responsibility to remove the disk
* file again.)
*/

Maybe:

/*
* create the index relation's relcache entry and for non-partitioned
* indexes, a physical disk file. (If we fail further down, it's the
* smgr's responsibility to remove the disk file again.)
*/

4. Extra newline

+ recordDependencyOn(, , DEPENDENCY_INTERNAL_AUTO);
+ }
+
+

5. Do you think it's out of scope to support expression indexes?

/*
* Expression indexes are currently not considered equal.  Not needed for
* current callers.
*/
if (info1->ii_Expressions != NIL || info2->ii_Expressions != NIL)
return false;

6. pg_inherits.inhseqno is int, not smallint. I understand you copied
this from StoreCatalogInheritance1(), so maybe a fix should be put in
before this patch is?

void
StoreSingleInheritance(Oid relationId, Oid parentOid, int16 seqNumber)

and

values[Anum_pg_inherits_inhseqno - 1] = Int16GetDatum(seqNumber);

7. Is the following a bug fix? If so, should it not be fixed and
backpatched, rather than sneaked in here?

@@ -10119,6 +10195,7 @@ ATExecChangeOwner(Oid relationOid, Oid
newOwnerId, bool recursing, LOCKMODE lock
  * relation, as well as its toast table (if it has one).
  */
  if (tuple_class->relkind == RELKIND_RELATION ||
+ tuple_class->relkind == RELKIND_PARTITIONED_TABLE ||

8. Missing if (attachInfos) pfree(attachInfos);

+ if (idxes)
+ pfree(idxes);
+ if (attachRelIdxs)
+ pfree(attachRelIdxs);

9. The first OR branch of the Assert will always be false, so might as
well get rid of it.

+ if (!has_superclass(idxid))
+ continue;
+
+ Assert(!has_superclass(idxid) ||
+(IndexGetRelation(get_partition_parent(idxid), false) ==
+RelationGetRelid(rel)));

10. lock -> locks:

/* we already hold lock on both tables, so this is safe: */

11. "already the right" -> "already in the right"

/* Silently do nothing if already the right state */

12. It seems to be RangeVarGetRelidExtended() that locks the partition
index, not the callback.

/* no deadlock risk: our callback above already acquired the lock */

13. We seem to only be holding an AccessShareLock on the partitioned
table. What happens if someone concurrently does ALTER TABLE
partitioned_table DETACH our_partition;?

parentTbl = relation_open(parentIdx->rd_index->indrelid, AccessShareLock);

and;
/* Make sure it indexes a partition of the other index's table */
partDesc = RelationGetPartitionDesc(parentTbl);
found = false;
for (i = 0; i < partDesc->nparts; i++)
{
if (partDesc->oids[i] == state.partitionOid)
{
found = true;
break;
}
}

Wouldn't you also need an AccessExclusiveLock to prevent another
session trying to ALTER INDEX part_index ATTACH PARTITION leaf_index;
at the same time? The concurrent session could be trying to ATTACH it
to a subpartition and this comment could be attaching to the parent.

14. validatePartitionedIndex does not verify that the index it is
checking is valid. Consider the following:

create table part (a int, b int) partition by list(a);
create table part1 partition of part for values in(1) partition by list(b)
create table part1 partition of part for values in(1) partition by list(b);
create table part2 partition of part1 for values in (1);
create index on only part1 (a);
create index on only part (a);
alter index part_a_idx attach partition part1_a_idx;
\d part
Table "public.part"
 Column |  Type   | Collation | Nullable | Default
+-+---+--+-
 a  | integer |   |  |
 b  | integer |   |  |
Partition key: LIST (a)
Indexes:
"part_a_idx" btree (a) <--- should be INVALID
Number of partitions: 1 (Use \d+ to list them.)

\d part1
   Table "public.part1"
 Column |  Type   | Collation | Nullable | Default
+-+---+--+-
 a  | integer |   |  |
 b  | integer |  

Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-08 Thread Jesper Pedersen

Hi Alvaro,

On 01/08/2018 03:36 PM, Alvaro Herrera wrote:

Jesper Pedersen wrote:


Maybe a warning for existing indexes on the same column(s), but with a
different type, should be emitted during ATTACH, e.g.



[detach one partition, replace index with a different one, attach
partition]



Of course, this could also fall under index maintenance and no warning
emitted. Docs have "Each partition is first checked to determine whether an
equivalent index already exists," so it is covered.


Yeah, I'm of two minds about this also -- in the initial versions I had
a code comment wondering exactly about having a hash index in a
partition attached to a btree index on parent.

As another example, having a documents table with two partitions (one
"long term archival" and other "documents currently being messed with")
you could have a text search index which is GIN in the former and GiST
in the latter.  There is a performance argument for doing it that way,
so it's not merely academic.

Anyway, while I think attaching an index that doesn't match the
properties of the index on parent can be a useful feature, on the other
hand it could be surprising (you end up losing an index because it was
matched during attach that you didn't think was going to be matched).
One idea would be to have a way to specify at ATTACH time what to do
about indexes when they don't match exactly, but I think the user
interface would be pretty messy: exactly how different do you want to
allow the indexes to be?  Is an index having one more column than the
one in parent good enough?  I think the answer to this is mostly a
judgement call, and I'd rather not spend my time debating those.



Yeah, agreed - lets leave as is.

Migrating an index to another type would mean to drop the top-level 
definition anyway.


Best regards,
 Jesper



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-08 Thread Alvaro Herrera
Jesper Pedersen wrote:

> Maybe a warning for existing indexes on the same column(s), but with a
> different type, should be emitted during ATTACH, e.g.

> [detach one partition, replace index with a different one, attach
> partition]

> Of course, this could also fall under index maintenance and no warning
> emitted. Docs have "Each partition is first checked to determine whether an
> equivalent index already exists," so it is covered.

Yeah, I'm of two minds about this also -- in the initial versions I had
a code comment wondering exactly about having a hash index in a
partition attached to a btree index on parent.

As another example, having a documents table with two partitions (one
"long term archival" and other "documents currently being messed with")
you could have a text search index which is GIN in the former and GiST
in the latter.  There is a performance argument for doing it that way,
so it's not merely academic.

Anyway, while I think attaching an index that doesn't match the
properties of the index on parent can be a useful feature, on the other
hand it could be surprising (you end up losing an index because it was
matched during attach that you didn't think was going to be matched).
One idea would be to have a way to specify at ATTACH time what to do
about indexes when they don't match exactly, but I think the user
interface would be pretty messy: exactly how different do you want to
allow the indexes to be?  Is an index having one more column than the
one in parent good enough?  I think the answer to this is mostly a
judgement call, and I'd rather not spend my time debating those.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-08 Thread Jesper Pedersen

Hi Alvaro,

On 01/04/2018 09:30 AM, Alvaro Herrera wrote:

v11 fixes the dependency problem I mentioned in
https://postgr.es/m/20171229203818.pqxf2cyl4g2wre6h@alvherre.pgsql
and adds some test to cover that stuff.



Thanks, make check-world passes.

Maybe a warning for existing indexes on the same column(s), but with a 
different type, should be emitted during ATTACH, e.g.


-- test.sql --
CREATE TABLE test (a integer NOT NULL) PARTITION BY HASH(a);
CREATE TABLE test_p00 PARTITION OF test FOR VALUES WITH (MODULUS 2, 
REMAINDER 0);
CREATE TABLE test_p01 PARTITION OF test FOR VALUES WITH (MODULUS 2, 
REMAINDER 1);

CREATE INDEX idx_test_a ON test (a);

INSERT INTO test (SELECT i FROM generate_series(1, 100) AS i);

ANALYZE;

ALTER TABLE test DETACH PARTITION test_p00;
DROP INDEX test_p00_a_idx;
CREATE INDEX test_p00_a_idx ON test_p00 USING hash (a);
ALTER TABLE test ATTACH PARTITION test_p00 FOR VALUES WITH (MODULUS 2, 
REMAINDER 0);


-- test.sql --

Of course, this could also fall under index maintenance and no warning 
emitted. Docs have "Each partition is first checked to determine whether 
an equivalent index already exists," so it is covered.


Best regards,
 Jesper



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-05 Thread Alvaro Herrera
Robert Haas wrote:
> On Fri, Jan 5, 2018 at 4:57 PM, Peter Eisentraut
>  wrote:
> > On 1/4/18 23:08, David Rowley wrote:

> >> I admit to not having had a chance to look at any code with this yet,
> >> but I'm just thinking about a case like the following.
> >>
> >> CREATE TABLE part (a INT, b INT) PARTITION BY RANGE (a);
> >> CREATE TABLE part_a1 PARTITION OF part FOR VALUES FROM (0) TO (10)
> >> PARTITION BY RANGE (b);
> >> CREATE TABLE part_a1_b1 PARTITION OF part_a1 FOR VALUES FROM (0) TO (10);
> >>
> >> CREATE INDEX ON part_a1 (a); -- sub-partition index (creates index on
> >> part_a1_b1)
> >>
> >> CREATE INDEX ON part (a); -- What do we do here?
> >>
> >> Should we:
> >>
> >> 1. Create another identical index on part_a1_b1; or
> >> 2. Allow the existing index on part_a1_b1 to have multiple parents; or
> >> 3. ERROR... (probably not)
> >
> > 4. It should adopt part_a1 and its subindexes into its hierarchy.  That
> > shouldn't be a problem under the current theory, should it?
> 
> +1.

This is what the code does today.  IIRC there's a test for this exact
scenario.  Now, part_a1_b1 is grandchild of part, not direct parent --
so there exists an intermediate relkind=I index in part_a1, and that is
the one that becomes parent of part_a1_b1, not part's directly.

Now, if you drop the index in part, then it gets dropped in all
descendants too, including part_a1_b1.  If you DETACH the partition
first, then the index remains.  All of that seems good to me, and is as
discussed previously.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-05 Thread Robert Haas
On Thu, Jan 4, 2018 at 5:01 PM, Alvaro Herrera  wrote:
> Tangentially: I didn't like very much that I added a new index to
> pg_index to support this feature.  I thought maybe it'd be better to
> change the index on indrelid to be on (indrelid,indparentidx) instead,
> but that doesn't seem great either because it bloats that index which is
> used to support common relcache operations ...
>
> (The more I think of this, the more I believe that pg_inherits is a
> better answer.  Opinions?)

I actually haven't looked at the code, but the idea that pg_inherits
is on the way out is news to me.  If that method will work, I don't
quite see why we should invent something new.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-05 Thread Peter Eisentraut
On 1/4/18 23:08, David Rowley wrote:
> On 5 January 2018 at 11:01, Alvaro Herrera  wrote:
>> (The more I think of this, the more I believe that pg_inherits is a
>> better answer.  Opinions?)
> 
> I admit to not having had a chance to look at any code with this yet,
> but I'm just thinking about a case like the following.
> 
> CREATE TABLE part (a INT, b INT) PARTITION BY RANGE (a);
> CREATE TABLE part_a1 PARTITION OF part FOR VALUES FROM (0) TO (10)
> PARTITION BY RANGE (b);
> CREATE TABLE part_a1_b1 PARTITION OF part_a1 FOR VALUES FROM (0) TO (10);
> 
> CREATE INDEX ON part_a1 (a); -- sub-partition index (creates index on
> part_a1_b1)
> 
> CREATE INDEX ON part (a); -- What do we do here?
> 
> Should we:
> 
> 1. Create another identical index on part_a1_b1; or
> 2. Allow the existing index on part_a1_b1 to have multiple parents; or
> 3. ERROR... (probably not)

4. It should adopt part_a1 and its subindexes into its hierarchy.  That
shouldn't be a problem under the current theory, should it?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-04 Thread David Rowley
On 5 January 2018 at 11:01, Alvaro Herrera  wrote:
> (The more I think of this, the more I believe that pg_inherits is a
> better answer.  Opinions?)

I admit to not having had a chance to look at any code with this yet,
but I'm just thinking about a case like the following.

CREATE TABLE part (a INT, b INT) PARTITION BY RANGE (a);
CREATE TABLE part_a1 PARTITION OF part FOR VALUES FROM (0) TO (10)
PARTITION BY RANGE (b);
CREATE TABLE part_a1_b1 PARTITION OF part_a1 FOR VALUES FROM (0) TO (10);

CREATE INDEX ON part_a1 (a); -- sub-partition index (creates index on
part_a1_b1)

CREATE INDEX ON part (a); -- What do we do here?

Should we:

1. Create another identical index on part_a1_b1; or
2. Allow the existing index on part_a1_b1 to have multiple parents; or
3. ERROR... (probably not)

I guess pg_index.indparentidx won't allow #2 to work. Some pg_inherits
arrangement should.

We don't really want to go creating indexes that we don't need to, so
I think we should probably make an effort to allow #2 to work.

Question is, how likely is the above scenario to take place?

Normally, I see customers requiring partitioning only when an existing
table grows too large for maintenance tasks to complete a reasonable
timeframe. Probably there might also come a time when each of the
partitions they've then gone and created also grows too large. So it
does not seem unrealistic to be attaching existing tables/partitioned
tables as partitions multiple times. Wanting to reuse leaf indexes for
some new higher level partition parent does seem reasonable.

This argument might be voided by if we allowed a DROP INDEX on part_a1
without dropping the leaf indexes. I've not checked what the patch
does here, but I'd imagine if the indexes are marked as parents of
that index, then they'll be dropped.

I'll go off and look at the code now...

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-04 Thread Alvaro Herrera
Peter Eisentraut wrote:
> On 1/4/18 12:00, Robert Haas wrote:
> >> The catalog representations of partitioned tables and partitioned
> >> indexes are completely different, which may or may not be desirable.
> > 
> > How so?
> 
> If someone wants to write a query, show me all the partitions of this
> table versus show me all the partitions of this index, intuitively,
> those could be the same, different only by some relkind references.

Well, this "indparentidx" stuff I came up with is a little bit strange.
Perhaps we could use pg_inherits instead, but I think the general view
is that pg_inherits is on its way out.

Tangentially: I didn't like very much that I added a new index to
pg_index to support this feature.  I thought maybe it'd be better to
change the index on indrelid to be on (indrelid,indparentidx) instead,
but that doesn't seem great either because it bloats that index which is
used to support common relcache operations ...


(The more I think of this, the more I believe that pg_inherits is a
better answer.  Opinions?)

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-04 Thread Peter Eisentraut
On 1/4/18 12:00, Robert Haas wrote:
>> The catalog representations of partitioned tables and partitioned
>> indexes are completely different, which may or may not be desirable.
> 
> How so?

If someone wants to write a query, show me all the partitions of this
table versus show me all the partitions of this index, intuitively,
those could be the same, different only by some relkind references.
Currently, you'd have to write two completely different queries.  It's
not a big deal, but it's a consideration.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-04 Thread Robert Haas
On Thu, Jan 4, 2018 at 11:44 AM, Peter Eisentraut
 wrote:
> I'm not sure why this feature of automatically picking up matching
> indexes even exists.  Is it for some specific workflows or upgrade
> scenarios?  It's kind of a surprising feature in a way.

It allows you to avoid building a new indexes unnecessarily when
attaching a partition.

> The catalog representations of partitioned tables and partitioned
> indexes are completely different, which may or may not be desirable.

How so?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-03 Thread Alvaro Herrera
Peter Eisentraut wrote:
> On 12/29/17 15:38, Alvaro Herrera wrote:
> > I just realized there's a further problem in the area: when a partition
> > is detached from its parent, its indexes are not made independent of the
> > indexes on parent.  So they can't be dropped on their own (booh!); and
> > dropping the index on the former parent partitioned table drops the
> > index on the former partition also (hiss!).  I think the fix for this is
> > to delete all dependencies, and re-register normal ones.  Should be
> > straightforward, but I'm not doing it this year.
> 
> We're waiting for an updated patch in this thread, aren't we?

Dunno.  The problems reported so far are rather very minor details that
don't prevent reviewing the rest of it.  I will of course send an
updated version at some point, but there's only one me so I can either
spend the whole commitfest focused on fixing tiny issues my own patches,
or help with other stuff.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-03 Thread Peter Eisentraut
On 12/29/17 15:38, Alvaro Herrera wrote:
> I just realized there's a further problem in the area: when a partition
> is detached from its parent, its indexes are not made independent of the
> indexes on parent.  So they can't be dropped on their own (booh!); and
> dropping the index on the former parent partitioned table drops the
> index on the former partition also (hiss!).  I think the fix for this is
> to delete all dependencies, and re-register normal ones.  Should be
> straightforward, but I'm not doing it this year.

We're waiting for an updated patch in this thread, aren't we?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-02 Thread Alvaro Herrera
Jesper Pedersen wrote:

> On 12/22/2017 10:10 AM, Alvaro Herrera wrote:

> > If you have wording suggestions for the doc changes, please send them
> > along.
> 
> Maybe we should make the default index name more explicit under the 'name'
> parameter as attached.

I'm -0.2 on documenting this.  In general, I'm hesitant to promise more
than strictly needed, mostly because it's more difficult to later change
our minds.  For example, what if soon after this patch we decide to
invent a new mechanism for creating names on partitions?  Also, in
reality there are additional considerations (how do we truncate very
long names, what do we do if the name is already taken) that I would
rather not ossify in user-facing documentation.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-02 Thread Jesper Pedersen

Hi Alvaro,

On 12/29/2017 12:59 PM, Alvaro Herrera wrote:

This seems to work pretty well, much to my surprise.  I was a bit scared
of adding a new deptype, but actually the only affected code is
findDependentObjects() and the semantics of the new type is a subset of
the existing DEPTYPE_INTERNAL, so I think it's okay.  I need to fill in
its description in docs and comments though -- I left it out because the
real difference between INTERNAL and INTERNAL_AUTO is not something that
is covered by the existing description of INTERNAL, so maybe I'll need
to expand that one.

This version includes the fixes I posted as "delta" to the problems
Jesper reported, as well as fixes to the ones reported by Amit.  It's
looking pretty good to me -- I'm seeing it as a candidate to commit
early in the upcoming commitfest.



indexing.sql contains a \di command which list the owner of the db, so 
make check-world fails.


Best regards,
 Jesper



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-02 Thread Jesper Pedersen

Hi Alvaro,

On 12/22/2017 10:10 AM, Alvaro Herrera wrote:

I believe these are all fixed by the attached delta patch.



Thanks.


If you have wording suggestions for the doc changes, please send them
along.



Maybe we should make the default index name more explicit under the 
'name' parameter as attached.


Best regards,
 Jesper
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 5137fe6383..7d32c5556d 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -146,7 +146,11 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] PostgreSQL chooses a
 suitable name based on the parent table's name and the indexed column
-name(s).
+name(s) using the following formular {tablename}_{columnname(s)}_{suffix}
+where {suffix} is pkey for a primary key constraint,
+key for a unique constraint, excl for an exclusion constraint,
+idx or any other kind of index, fkey for a foreign key and
+check for a check constraint.

   
  


Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-29 Thread Alvaro Herrera
I just realized there's a further problem in the area: when a partition
is detached from its parent, its indexes are not made independent of the
indexes on parent.  So they can't be dropped on their own (booh!); and
dropping the index on the former parent partitioned table drops the
index on the former partition also (hiss!).  I think the fix for this is
to delete all dependencies, and re-register normal ones.  Should be
straightforward, but I'm not doing it this year.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-28 Thread Alvaro Herrera
Amit Langote wrote:

> 2. Something strange about how dependencies are managed:
> 
> create table p (a char) partition by list (a);
> create table pa partition of p for values in ('a');;
> create table pb partition of p for values in ('b');
> create index on p (a);

> drop table pa;
> ERROR:  cannot drop table pa because other objects depend on it
> DETAIL:  index p_a_idx depends on table pa
> HINT:  Use DROP ... CASCADE to drop the dependent objects too.

Hmm, this is quite nasty and I'm not seeing any reasonable way to fix
it.  The problem is that we have an internal dep from the parent index
to the partition index, which prompts us to recurse the chase of
dependencies inverting the way the dependencies flow -- the "case 3" in
findDependentObjects:
/*
 * 3. Not all the owning objects have been visited, so
 * transform this deletion request into a delete of this
 * owning object.

The problem is that we don't want the partition index to go away unless
it's together with its parent index, except if the table which contains
it goes away first.  We have no way to specify this condition ...  Maybe
something like sticking a phony additional record in pg_depend that
causes the partition index to get added to the targetObjects list ahead
of time ... but that doesn't work either, because we still want to chase
the internal deps in that case.  Hm.

Maybe we need a new "auto internal" deptype with a mix of semantics of
the other two deptypes.  It seems a bit ugly and I'm not sure it'd work
either ... I'll try to code it tomorrow.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-25 Thread Amit Langote
On 2017/12/25 13:52, Amit Langote wrote:
> Hi Alvaro,
> 
> On 2017/12/23 0:10, Alvaro Herrera wrote:
>> I believe these are all fixed by the attached delta patch.
> 
> @@ -1676,7 +1694,12 @@ psql_completion(const char *text, int start, int end)
> "UNION SELECT 'ALL IN TABLESPACE'");
>  /* ALTER INDEX  */
>  else if (Matches3("ALTER", "INDEX", MatchAny))
> -COMPLETE_WITH_LIST5("ALTER COLUMN", "OWNER TO", "RENAME TO",
> "SET", "RESET");
> +COMPLETE_WITH_LIST7("ALTER COLUMN", "OWNER TO", "RENAME TO", "SET",
> +"RESET", "ATTACH PARTITION");
> 
> This should be COMPLETE_WITH_LIST6().

I noticed a few things I thought I'd report.  I was using the latest patch
(v8 + the delta patch).

1. The following crashes because of an Assert in ATExecCmd():

ALTER TABLE an_index DETACH PARTITION ...

It seems ATPrepCmd() should not pass ATT_INDEX to ATSimplePermissions()
for the ATDetachPartition sub-command type.

2. Something strange about how dependencies are managed:

create table p (a char) partition by list (a);
create table pa partition of p for values in ('a');;
create table pb partition of p for values in ('b');
create index on p (a);

\d pa
   Table "public.pa"
 Column | Type | Collation | Nullable | Default
|---+--+---+--+-
 a  | character(1) |   |  |
Partition of: p FOR VALUES IN ('a')
Indexes:
"pa_a_idx" btree (a)

drop table pa;
ERROR:  cannot drop table pa because other objects depend on it
DETAIL:  index p_a_idx depends on table pa
HINT:  Use DROP ... CASCADE to drop the dependent objects too.

set client_min_messages to debug2;

drop table pa;
DEBUG:  drop auto-cascades to index pa_a_idx
DEBUG:  drop auto-cascades to index pb_a_idx
DEBUG:  drop auto-cascades to type pa
DEBUG:  drop auto-cascades to type pa[]
ERROR:  cannot drop table pa because other objects depend on it
DETAIL:  index p_a_idx depends on table pa
HINT:  Use DROP ... CASCADE to drop the dependent objects too.

Thanks,
Amit




Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-22 Thread Alvaro Herrera
Hello Jesper,

Jesper Pedersen wrote:

> Passes check-world here too w/ TAP + cassert.

Great, thanks for checking.

> index.c:
>  [and a few other comments]

I believe these are all fixed by the attached delta patch.

If you have wording suggestions for the doc changes, please send them
along.

Thanks for reading,

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 24ec874da257cf617359cce253ff4f8ad33166d7 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Fri, 22 Dec 2017 11:41:15 -0300
Subject: [PATCH] some fixes per Jesper

---
 doc/src/sgml/ref/create_index.sgml | 28 +---
 doc/src/sgml/ref/reindex.sgml  |  5 +
 src/backend/catalog/index.c|  2 +-
 src/bin/psql/tab-complete.c|  7 ++-
 4 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/doc/src/sgml/ref/create_index.sgml 
b/doc/src/sgml/ref/create_index.sgml
index b7b0506d6d..5137fe6383 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -557,17 +557,23 @@ Indexes:
 
   
When CREATE INDEX is invoked on a partitioned
-   table, the default behavior is to recurse to all partitions to
-   ensure they all have a matching indexes.  Each partition is first
-   checked to determine whether equivalent index already exists,
-   and if so, that index will become attached as a partition index to
-   the index being created, which will be its parent index.  If no
-   matching index exists, a new index will be created and automatically
-   attached.  If ONLY is specified, no recursion
-   is done.  Note, however, that any partition that is created in the
-   future using CREATE TABLE .. PARTITION OF will
-   automatically contain the index regardless of whether this option
-   was specified.
+   table, the default behavior is to recurse to all partitions to ensure
+   they all have matching indexes.
+   Each partition is first checked to determine whether an equivalent
+   index already exists, and if so, that index will become attached as a
+   partition index to the index being created, which will become its
+   parent index.
+   If no matching index exists, a new index will be created and
+   automatically attached; the name of the new index in each partition
+   will be determined as if no index name had been specified in the
+   command.
+   If the ONLY option is specified, no recursion
+   is done, and the index is marked invalid
+   (ALTER INDEX ... ATTACH PARTITION turns the index
+   valid, once all partitions acquire the index.)  Note, however, that
+   any partition that is created in the future using
+   CREATE TABLE ... PARTITION OF will automatically
+   contain the index regardless of whether this option was specified.
   
 
   
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 79f6931c6a..1c21fafb80 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -231,6 +231,11 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | 
DATABASE | SYSTEM } 
 
+  
+   Reindexing partitioned tables or partitioned indexes is not supported.
+   Each individual partition can be reindexed separately instead.
+  
+
  
 
  
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index a62fe158ce..eee16dde25 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -629,7 +629,7 @@ UpdateIndexRelation(Oid indexoid,
 
values[Anum_pg_index_indexrelid - 1] = ObjectIdGetDatum(indexoid);
values[Anum_pg_index_indrelid - 1] = ObjectIdGetDatum(heapoid);
-   values[Anum_pg_index_indparentidx - 1 ] = 
ObjectIdGetDatum(parentIndexOid);
+   values[Anum_pg_index_indparentidx - 1] = 
ObjectIdGetDatum(parentIndexOid);
values[Anum_pg_index_indnatts - 1] = 
Int16GetDatum(indexInfo->ii_NumIndexAttrs);
values[Anum_pg_index_indisunique - 1] = 
BoolGetDatum(indexInfo->ii_Unique);
values[Anum_pg_index_indisprimary - 1] = BoolGetDatum(primary);
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 6af773f1bf..159ce4129e 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1695,14 +1695,11 @@ psql_completion(const char *text, int start, int end)
/* ALTER INDEX  */
else if (Matches3("ALTER", "INDEX", MatchAny))
COMPLETE_WITH_LIST7("ALTER COLUMN", "OWNER TO", "RENAME TO", 
"SET",
-   "RESET", "ATTACH 
PARTITION", "DETACH PARTITION");
-   else if (Matches4("ALTER", "INDEX", MatchAny, "ATTACH|DETACH"))
+   "RESET", "ATTACH 
PARTITION");
+   else if (Matches4("ALTER", "INDEX", MatchAny, "ATTACH"))
COMPLETE_WITH_CONST("PARTITION");
else if (Matches5("ALTER", "INDEX", MatchAny, "ATTACH", "PARTITION"))

Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-21 Thread Jesper Pedersen

Hi Alvaro,

On 12/20/2017 04:25 PM, Alvaro Herrera wrote:

I modified the regression test so that a partitioning hierarchy would be
left behind after the test is run, which is useful to test pg_upgrade
and pg_dump -- this caught one small bug.  That and some reading of the
diff resulted in v8, attach.

On my system, make check-world passes.  However, Thomas Munro's
automated patch tester seems to have a problem with the pg_upgrade test,
though I don't know what it is.



Passes check-world here too w/ TAP + cassert.

index.c:

+   values[Anum_pg_index_indparentidx - 1 ] = 
ObjectIdGetDatum(parentIndexOid);


Extra space.

tab-complete.c:

Contains references to DETACH.

create_index.sgml:

I think the new paragraph should mention the naming convention of the 
generated indexes, as they may differ from the index name on the 
partition table.


reindex.sgml:

Missing a note about REINDEX not being supported on the partition index.

Best regards,
 Jesper



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-20 Thread Alvaro Herrera
Just to show what I'm talking about, here are a few prototype patches.
Beware, these are all very lightly tested/reviewed.  Input is welcome,
particularly if it comes in the guise of patches to regression tests
showing cases that misbehave.

0001 is a fixup for the v6 patch I posted upthread; it's needed so the
0002 patch doesn't end up with indexes marked invalid after pg_dump.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From bd26504debb14567de780da10bf859b205326b46 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Wed, 20 Dec 2017 16:31:05 -0300
Subject: [PATCH v7 1/4] Fixup! Local partitioned indexes v6

Don't actually mark an index invalid if it's on a plain table,
rather than a partition, even with ONLY specified.
---
 src/backend/commands/indexcmds.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index a757f05dd5..e925351056 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -723,7 +723,7 @@ DefineIndex(Oid relationId,
flags |= INDEX_CREATE_PARTITIONED;
if (stmt->primary)
flags |= INDEX_CREATE_IS_PRIMARY;
-   if (stmt->relation && !stmt->relation->inh)
+   if (partitioned && stmt->relation && !stmt->relation->inh)
flags |= INDEX_CREATE_INVALID;
 
if (stmt->deferrable)
-- 
2.11.0

>From b10f9afbe4b1c46c7a2706c70a398307732ac9fc Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 6 Nov 2017 17:04:55 +0100
Subject: [PATCH v7 2/4] allow indexes on partitioned tables to be unique

---
 src/backend/commands/indexcmds.c   |  71 +--
 src/backend/parser/parse_utilcmd.c |  24 -
 src/test/regress/expected/alter_table.out  |   8 --
 src/test/regress/expected/create_table.out |  12 ---
 src/test/regress/expected/indexing.out | 142 -
 src/test/regress/sql/alter_table.sql   |   2 -
 src/test/regress/sql/create_table.sql  |   8 --
 src/test/regress/sql/indexing.sql  |  73 ++-
 8 files changed, 274 insertions(+), 66 deletions(-)

diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index e925351056..86dcab6136 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -426,20 +426,11 @@ DefineIndex(Oid relationId,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("cannot create index on 
partitioned table \"%s\" concurrently",

RelationGetRelationName(rel;
-   if (stmt->unique)
-   ereport(ERROR,
-   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-errmsg("cannot create unique index on 
partitioned table \"%s\"",
-   
RelationGetRelationName(rel;
if (stmt->excludeOpNames)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("cannot create exclusion 
constraints on partitioned table \"%s\"",

RelationGetRelationName(rel;
-   if (stmt->primary || stmt->isconstraint)
-   ereport(ERROR,
-   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-errmsg("cannot create constraints on 
partitioned tables")));
}
 
/*
@@ -637,6 +628,68 @@ DefineIndex(Oid relationId,
index_check_primary_key(rel, indexInfo, is_alter_table);
 
/*
+* If this table is partitioned and we're creating a unique index or a
+* primary key, make sure that the indexed columns are part of the
+* partition key.  Otherwise it would be possible to violate uniqueness 
by
+* putting values that ought to be unique in different partitions.
+*
+* We could lift this limitation if we had global indexes, but those 
have
+* their own problems, so this is a useful feature combination.
+*/
+   if (partitioned && (stmt->unique || stmt->primary))
+   {
+   PartitionKey key = rel->rd_partkey;
+   int i;
+
+   /*
+* A partitioned table can have unique indexes, as long as all 
the
+* columns in the partition key appear in the unique key.  A
+* partition-local index can enforce global uniqueness iff the 
PK
+* value completely determines the partition that a row is in.
+*
+   

Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-20 Thread Alvaro Herrera
Robert Haas wrote:

> Sounds great!  I made the comment during my talk at PGCONF.EU that
> partitioned tables in PostgreSQL are really just a bunch of tables
> glued together, but that over time "we'll make the glue better", and I
> think the improvements on which you are working will go a long way in
> that direction.

Thanks -- yes, that's my intention, to improve our partitioning story.
Many other projects related to improving the glue are already underway
by others, so I think we're making great progress on the overall
partitioning front.

> With regard to indexes, presumably we can only have a unique index if
> the index definition matches the partition key definition.  Otherwise,
> we can only guarantee per-partition uniqueness, unless (1) we
> implement "global" indexes, which sounds like a patch that would take
> a fair amount of work, or (2) every index insertion goes and checks
> for conflicting values in all of the "sibling" indexes.  This latter
> approach was suggested to me by Andres, and certainly solves a bunch
> of problems with vacuuming and dropping indexes, but seems like it
> would probably have lousy insert performance unless the number of
> partitions is very small.

Actually, we discussed unique constraints before.  What I proposed is to
implement a useful subset: only allow a unique constraint if the
partition columns are in it, which means that enforcing the constraint
on each partition independently is enough to satisfy the constraint
globally.  This doesn't solve all cases where you want UNIQUE on
partitioned tables, but the reason to implement this subset is precisely
that it doesn't have those drawbacks, so it's quick and efficient.

Another point is that global indexes are not only a lot of work, but
they also come with additional drawbacks such as the necessity to
clean up when partitions are dropped/detached, which is a huge pain.

> With regard to foreign keys, it seems like there are two directions to
> worry about.  An "outbound" foreign key needs to cascade, I think,
> more or less like an index does, with every child inheriting the
> foreign key from the partition root.  An "inbound" foreign key just
> needs a unique index to point at and depend on.  Given the work you
> are doing on unique indexes, I'm guessing that you are talking about
> supporting the latter, not the former, but the way you phrased it
> sounds like the opposite.

Actually, I intend to do both.  I have a prototype patch which seems to
work, though I haven't hammered it heavily yet.  Some of the cascading
options are not implemented -- as I recall I implemented it for the
CREATE TABLE PARTITION OF case but not yet the ALTER TABLE ATTACH
PARTITION one.

> I haven't thought much about the possibility of supporting FOR EACH
> ROW triggers, but it sounds like a good idea if we can do it.  I'm not
> sure how that will work, though, if users directly access the leaf
> partitions.

Yeah, I'm not sure yet either.  Needs more thought than I've given it so
far.  I have a prototype patch for this also, and the basic case seems
to give reasonable results, but I haven't tried to find any weird corner
cases either.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-20 Thread Robert Haas
On Wed, Dec 20, 2017 at 12:01 PM, Alvaro Herrera
 wrote:
> I have two patches to rebase on top of this, which I hope to post later
> today:
>
> 1) let these indexes be unique (i.e. allow unique and PK constraints)
> 2) allow foreign keys on partitioned tables
>
> I have a further patch to allow FOR EACH ROW triggers on partitioned
> tables, but I think it's largely unrelated to this (and, as I was
> surprised to discover, it's also unrelated to the FKs one).

Sounds great!  I made the comment during my talk at PGCONF.EU that
partitioned tables in PostgreSQL are really just a bunch of tables
glued together, but that over time "we'll make the glue better", and I
think the improvements on which you are working will go a long way in
that direction.

With regard to indexes, presumably we can only have a unique index if
the index definition matches the partition key definition.  Otherwise,
we can only guarantee per-partition uniqueness, unless (1) we
implement "global" indexes, which sounds like a patch that would take
a fair amount of work, or (2) every index insertion goes and checks
for conflicting values in all of the "sibling" indexes.  This latter
approach was suggested to me by Andres, and certainly solves a bunch
of problems with vacuuming and dropping indexes, but seems like it
would probably have lousy insert performance unless the number of
partitions is very small.

With regard to foreign keys, it seems like there are two directions to
worry about.  An "outbound" foreign key needs to cascade, I think,
more or less like an index does, with every child inheriting the
foreign key from the partition root.  An "inbound" foreign key just
needs a unique index to point at and depend on.  Given the work you
are doing on unique indexes, I'm guessing that you are talking about
supporting the latter, not the former, but the way you phrased it
sounds like the opposite.

I haven't thought much about the possibility of supporting FOR EACH
ROW triggers, but it sounds like a good idea if we can do it.  I'm not
sure how that will work, though, if users directly access the leaf
partitions.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-20 Thread Alvaro Herrera
Great, thanks for the input.

pg_dump behaves as described upthread -- thanks David and Robert for the
input.  I did this by injecting a fake "INDEX ATTACH" object in
pg_dump's model, which depends on the index-on-parent-table; which in
turn depends on the index-on-partition.  Because of the dependencies,
the attach is dumped last, and the index-on-parent is dumped after all
the indexes-on-partitions have been dumped.  I needed to apply a little
bit of surgery so that each table object would contain links to its
indexes, which previously wasn't there.  A followup step to obtaining
index info creates the INDEX ATTACH objects.

In the backend code: There is now a difference in initial setting of
indisvalid and indisready when creating an index.  Previously we always
set them the same (!concurrent) but now indisready depends only on
"concurrent" while indisvalid additionally depends on whether a a
non-inherited index on a partitioned table is being built.  Failing to
set indisready to true initially was causing the index to be invisible
to pg_dump.

There is no ALTER INDEX / DETACH, as discussed.  Also, trying to attach
an index for a partition that already contains an attached index raises
an error -- which means that in order to determine whether attaching an
index as partition makes the parent index valid, is a simple matter of
counting tuples.

All these behaviors are immortalized in the regression tests.

I added tab-complete support too (thanks Jesper for the reminder).  It
is probably not exhaustive, but should be good enough.

I have two patches to rebase on top of this, which I hope to post later
today:

1) let these indexes be unique (i.e. allow unique and PK constraints)
2) allow foreign keys on partitioned tables

I have a further patch to allow FOR EACH ROW triggers on partitioned
tables, but I think it's largely unrelated to this (and, as I was
surprised to discover, it's also unrelated to the FKs one).

Thanks

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/ref/alter_index.sgml 
b/doc/src/sgml/ref/alter_index.sgml
index e54237272c..3984686d67 100644
--- a/doc/src/sgml/ref/alter_index.sgml
+++ b/doc/src/sgml/ref/alter_index.sgml
@@ -23,6 +23,7 @@ PostgreSQL documentation
 
 ALTER INDEX [ IF EXISTS ] name 
RENAME TO new_name
 ALTER INDEX [ IF EXISTS ] name 
SET TABLESPACE tablespace_name
+ALTER INDEX name ATTACH PARTITION 
index_name
 ALTER INDEX name DEPENDS ON 
EXTENSION extension_name
 ALTER INDEX [ IF EXISTS ] name 
SET ( storage_parameter = 
value [, ... ] )
 ALTER INDEX [ IF EXISTS ] name 
RESET ( storage_parameter [, ... ] 
)
@@ -76,6 +77,19 @@ ALTER INDEX ALL IN TABLESPACE name

 

+ATTACH
+
+ 
+  Causes the named index to become attached to the altered index.
+  The named index must be on a partition of the table containing the
+  index being altered, and have an equivalent definition.  An attached
+  index cannot be dropped by itself, and will automatically be dropped
+  if its parent index is dropped.
+ 
+
+   
+
+   
 DEPENDS ON EXTENSION
 
  
diff --git a/doc/src/sgml/ref/alter_table.sgml 
b/doc/src/sgml/ref/alter_table.sgml
index 7bcf242846..0a2f3e3646 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -783,7 +783,10 @@ ALTER TABLE [ IF EXISTS ] name
   as a partition of the target table. The table can be attached
   as a partition for specific values using FOR VALUES
or as a default partition by using DEFAULT
-  .
+  .  For each index in the target table, a corresponding
+  one will be created in the attached table; or, if an equivalent
+  index already exists, will be attached to the target table's index,
+  as if ALTER INDEX ATTACH had been executed.
  
 
  
@@ -844,7 +847,8 @@ ALTER TABLE [ IF EXISTS ] name
  
   This form detaches specified partition of the target table.  The detached
   partition continues to exist as a standalone table, but no longer has any
-  ties to the table from which it was detached.
+  ties to the table from which it was detached.  Any indexes that were
+  attached to the target table's indexes are detached.
  
 

diff --git a/doc/src/sgml/ref/create_index.sgml 
b/doc/src/sgml/ref/create_index.sgml
index 025537575b..b7b0506d6d 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  
 
-CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] name ] ON table_name [ USING method ]
+CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] name ] ON [ ONLY ] table_name [ USING method ]
 ( { column_name | ( 
expression ) } [ COLLATE 
collation ] [ opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST 
} ] [, ...] )
 [ WITH ( storage_parameter = 
value [, ... ] ) ]
 [ TABLESPACE 

Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-18 Thread Robert Haas
On Mon, Dec 18, 2017 at 11:02 AM, Alvaro Herrera
 wrote:
> After this discussion, this is how I see things working:
>
> 1. pg_dump
>a) creates indexes on partitions normally
>b) once all existing indexes are done, index on parent is created,
>   with ONLY.  No cascading occurs, no indexes are attached.
>c) ATTACH is run for each existing partition index.  After each
>   ATTACH, we check that all indexes exist.  If so, the parent is
>   marked valid.
>d) if not all indexes existed in partitions, index on parent remains
>   invalid.  (It was invalid in the dumped database, so this is
>   correct.)
>
> 2. other uses
>Normal CREATE INDEX (without ONLY) recurses and attaches the first
>matching index it finds (no duplicate indexes are created);
>partitions without a matching index get one created.
>
> 3. ALTER INDEX DETACH is not provided.  Therefore: once index is valid,
>it remains so forever.
>
> I think this satisfies all concerns.

Sounds great to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-18 Thread Alvaro Herrera
After this discussion, this is how I see things working:

1. pg_dump
   a) creates indexes on partitions normally
   b) once all existing indexes are done, index on parent is created,
  with ONLY.  No cascading occurs, no indexes are attached.
   c) ATTACH is run for each existing partition index.  After each
  ATTACH, we check that all indexes exist.  If so, the parent is
  marked valid.
   d) if not all indexes existed in partitions, index on parent remains
  invalid.  (It was invalid in the dumped database, so this is
  correct.)

2. other uses
   Normal CREATE INDEX (without ONLY) recurses and attaches the first
   matching index it finds (no duplicate indexes are created);
   partitions without a matching index get one created.

3. ALTER INDEX DETACH is not provided.  Therefore: once index is valid,
   it remains so forever.

I think this satisfies all concerns.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-17 Thread Robert Haas
On Sun, Dec 17, 2017 at 10:05 PM, David Rowley
 wrote:
> I think you feel quite strongly about not having the code select a
> random matching index, so if we want to stick to that rule, then we'll
> need to create a set of new leaf indexes rather than select a random
> one.

I feel quite strongly about it *in the context of pg_dump*.  Outside
of that scenario, I'm happy to go with whatever behavior you and
others think will satisfy the POLA.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-17 Thread David Rowley
On 18 December 2017 at 15:59, Robert Haas  wrote:
> On Sun, Dec 17, 2017 at 9:38 PM, David Rowley
>  wrote:
>> On 18 December 2017 at 15:04, Robert Haas  wrote:
>>> On Sun, Dec 17, 2017 at 5:29 AM, David Rowley
>>>  wrote:
 I'm now not that clear on what the behaviour is if the ONLY keyword is
 not specified on the CREATE INDEX for the partitioned index. Does that
 go and create each leaf partition index regardless of if there is a
 suitable candidate to ATTACH?
>>>
>>> No, the other way around.  ONLY is being proposed as a way to create
>>> an initially-not-valid parent to which we can then ATTACH
>>> subsequently-created child indexes.  But because we will have REPLACE
>>> rather than DETACH, once you get the index valid it never goes back to
>>> not-valid.
>>
>> I understand what the ONLY is proposed to do. My question was in
>> regards to the behaviour without ONLY.
>
> Oh, sorry -- I was confused.  I'm not sure whether that should try to
> attach to something if it exists, or just create unconditionally...
> what do you think?

I think you feel quite strongly about not having the code select a
random matching index, so if we want to stick to that rule, then we'll
need to create a set of new leaf indexes rather than select a random
one.

If we don't do that, then we might as well go with my idea and ditch
the ONLY syntax and have the CREATE INDEX try to find a suitable leaf
index, or create a new leaf index if one cannot be found.

Would it be surprising to users if CREATE INDEX ON partitioned_table
created a bunch of duplicate new indexes on the leaf tables?

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-17 Thread Robert Haas
On Sun, Dec 17, 2017 at 9:38 PM, David Rowley
 wrote:
> On 18 December 2017 at 15:04, Robert Haas  wrote:
>> On Sun, Dec 17, 2017 at 5:29 AM, David Rowley
>>  wrote:
>>> I'm now not that clear on what the behaviour is if the ONLY keyword is
>>> not specified on the CREATE INDEX for the partitioned index. Does that
>>> go and create each leaf partition index regardless of if there is a
>>> suitable candidate to ATTACH?
>>
>> No, the other way around.  ONLY is being proposed as a way to create
>> an initially-not-valid parent to which we can then ATTACH
>> subsequently-created child indexes.  But because we will have REPLACE
>> rather than DETACH, once you get the index valid it never goes back to
>> not-valid.
>
> I understand what the ONLY is proposed to do. My question was in
> regards to the behaviour without ONLY.

Oh, sorry -- I was confused.  I'm not sure whether that should try to
attach to something if it exists, or just create unconditionally...
what do you think?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-17 Thread David Rowley
On 18 December 2017 at 15:04, Robert Haas  wrote:
> On Sun, Dec 17, 2017 at 5:29 AM, David Rowley
>  wrote:
>> I'm now not that clear on what the behaviour is if the ONLY keyword is
>> not specified on the CREATE INDEX for the partitioned index. Does that
>> go and create each leaf partition index regardless of if there is a
>> suitable candidate to ATTACH?
>
> No, the other way around.  ONLY is being proposed as a way to create
> an initially-not-valid parent to which we can then ATTACH
> subsequently-created child indexes.  But because we will have REPLACE
> rather than DETACH, once you get the index valid it never goes back to
> not-valid.

I understand what the ONLY is proposed to do. My question was in
regards to the behaviour without ONLY.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-17 Thread Robert Haas
On Sun, Dec 17, 2017 at 5:29 AM, David Rowley
 wrote:
>> If ALTER INDEX .. ATTACH is already taking AEL on the parent, then I
>> think it might as well try to validate while it's at it.  But if not
>> then we might want to go with #2.
>
> I'm now not that clear on what the behaviour is if the ONLY keyword is
> not specified on the CREATE INDEX for the partitioned index. Does that
> go and create each leaf partition index regardless of if there is a
> suitable candidate to ATTACH?

No, the other way around.  ONLY is being proposed as a way to create
an initially-not-valid parent to which we can then ATTACH
subsequently-created child indexes.  But because we will have REPLACE
rather than DETACH, once you get the index valid it never goes back to
not-valid.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-17 Thread Robert Haas
On Sun, Dec 17, 2017 at 10:50 AM, Alvaro Herrera
 wrote:
>> If ALTER INDEX .. ATTACH is already taking AEL on the parent, then I
>> think it might as well try to validate while it's at it.  But if not
>> then we might want to go with #2.
>
> The problem I have with it is that restoring a dump containing indexes
> on partitions becomes a O(N^2) deal as it has to do the full check once
> for every index we attach.

Sure.  If the constant factor is high enough to matter, then VALIDATE
makes sense.

IMHO, anyway.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-17 Thread Alvaro Herrera
Robert Haas wrote:
> On Fri, Dec 15, 2017 at 5:18 PM, Alvaro Herrera  
> wrote:
> > We have two options for marking valid:
> >
> > 1. after each ALTER INDEX ATTACH, verify whether the set of partitions
> > that contain the index is complete; if so, mark it valid, otherwise do
> > nothing.  This sucks because we have to check that over and over for
> > every index that we attach
> >
> > 2. We invent yet another command, say
> > ALTER INDEX  VALIDATE
> 
> If ALTER INDEX .. ATTACH is already taking AEL on the parent, then I
> think it might as well try to validate while it's at it.  But if not
> then we might want to go with #2.

The problem I have with it is that restoring a dump containing indexes
on partitions becomes a O(N^2) deal as it has to do the full check once
for every index we attach.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-17 Thread David Rowley
On 17 December 2017 at 16:22, Robert Haas  wrote:
> On Fri, Dec 15, 2017 at 5:18 PM, Alvaro Herrera  
> wrote:
>> We have two options for marking valid:
>>
>> 1. after each ALTER INDEX ATTACH, verify whether the set of partitions
>> that contain the index is complete; if so, mark it valid, otherwise do
>> nothing.  This sucks because we have to check that over and over for
>> every index that we attach
>>
>> 2. We invent yet another command, say
>> ALTER INDEX  VALIDATE

It's not perfect that we need to validate each time, but it might not
be that expensive to validate since we only really need to count the
pg_index rows that have indisvalid = true rows which have a parent
index listed as the index we're ATTACHing too, then ensure that
matches the number of leaf partitions.

> If ALTER INDEX .. ATTACH is already taking AEL on the parent, then I
> think it might as well try to validate while it's at it.  But if not
> then we might want to go with #2.

I'm now not that clear on what the behaviour is if the ONLY keyword is
not specified on the CREATE INDEX for the partitioned index. Does that
go and create each leaf partition index regardless of if there is a
suitable candidate to ATTACH?

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-16 Thread Robert Haas
On Fri, Dec 15, 2017 at 5:18 PM, Alvaro Herrera  wrote:
> We have two options for marking valid:
>
> 1. after each ALTER INDEX ATTACH, verify whether the set of partitions
> that contain the index is complete; if so, mark it valid, otherwise do
> nothing.  This sucks because we have to check that over and over for
> every index that we attach
>
> 2. We invent yet another command, say
> ALTER INDEX  VALIDATE

If ALTER INDEX .. ATTACH is already taking AEL on the parent, then I
think it might as well try to validate while it's at it.  But if not
then we might want to go with #2.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-15 Thread Alvaro Herrera
Robert Haas wrote:
> On Fri, Dec 15, 2017 at 4:02 PM, Alvaro Herrera  
> wrote:
> > 3. Robert's: use CREATE INDEX ON ONLY , which creates a shell
> >index on parent only (no recursion), followed by CREATE INDEX ON
> >.  DETACH is not provided.  If you ATTACH an index for a
> >partition that already has one index attached, then (1) the newly
> >attached one replaces the original (i.e. effectively REPLACE) or (2)
> >you get an error and we implement a separate ALTER INDEX REPLACE
> >command.  It's not clear to me how or when the shell index becomes a
> >real index.
> 
> With this proposal, I think the index can be invalid initially, but
> once you've attached an index for every child partition, it becomes
> irrevocably valid.  After that, the only supported operation is
> REPLACE, which preserves validity.

Sounds okay to me.  (I had already deleted ALTER INDEX DETACH from my
patch earlier today.)  I admit I had also deleted the ONLY clause from
CREATE INDEX, and I don't like having to put it back :-)  But on the
whole, having it sounds better than the alternatives.

We have two options for marking valid:

1. after each ALTER INDEX ATTACH, verify whether the set of partitions
that contain the index is complete; if so, mark it valid, otherwise do
nothing.  This sucks because we have to check that over and over for
every index that we attach

2. We invent yet another command, say
ALTER INDEX  VALIDATE

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-15 Thread Alvaro Herrera
Robert Haas wrote:
> On Thu, Nov 30, 2017 at 7:02 AM, Alvaro Herrera  
> wrote:
> > Great question.  So you're thinking that the planner might have an
> > interest in knowing what indexes are defined at the parent table level
> > for planning purposes; but for that to actually have any effect we would
> > need to change the planner and executor also.  And one more point, also
> > related to something you said before: we currently (I mean after my
> > patch) don't mark partitioned-table-level indexes as valid or not valid
> > depending on whether all its children exist, so trying to use that in
> > the planner without having a flag could cause invalid plans to be
> > generated (i.e. ones that would cause nonexistent indexes to be
> > referenced).
> 
> Did you do it this way due to locking concerns?

No -- just because since the index-on-parent is a different kind of
object (RELKIND_PARTITIONED_INDEX) it is not considered for anything in
the planner anyway, so there's no need for indisvalid to be "correct".
Changing the flag in the parent index only needs to examine state on the
children, not modify them, so I don't think there would be any serious
locking problem.


By the way, my implementation of ALTER INDEX ATTACH PARTITION was prone
to deadlocks because it needs locks on parent table, index-on-parent,
partition, and index-on-partition; and there was no consideration to the
ordering in which these were being acquired.  I wrote a callback for
RangeVarGetRelidExtended to be called in ATExecAttachPartitionIdx() that
tries to acquire locks in the right order -- but I recently realized
that even that is not sufficient, because (unless I misread it) ALTER
INDEX itself does not worry about locking the containing table before
the index.  I think some tweaking needs to be done in
RangeVarCallbackForAlterRelation to lock the table if an index is being
altered (conditionally, depending on the precise ALTER TABLE operation).
It surprises me that we don't need to do that yet, but I haven't looked
into it any further.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-15 Thread Robert Haas
On Fri, Dec 15, 2017 at 4:02 PM, Alvaro Herrera  wrote:
> 3. Robert's: use CREATE INDEX ON ONLY , which creates a shell
>index on parent only (no recursion), followed by CREATE INDEX ON
>.  DETACH is not provided.  If you ATTACH an index for a
>partition that already has one index attached, then (1) the newly
>attached one replaces the original (i.e. effectively REPLACE) or (2)
>you get an error and we implement a separate ALTER INDEX REPLACE
>command.  It's not clear to me how or when the shell index becomes a
>real index.

With this proposal, I think the index can be invalid initially, but
once you've attached an index for every child partition, it becomes
irrevocably valid.  After that, the only supported operation is
REPLACE, which preserves validity.

> Robert, can you please clarify the terms of your proposal?  How is it
> better than mine?  Is David's concern about a "partial" index (i.e. an
> index that doesn't exist in some partition) solved by it?

I think the perceived advantage is that, once valid, the index can't
subsequently become not-valid.  That seemed to be David's big concern
(which is not without foundation).

> I have code for proposals 1 and 2.  I don't like proposal 2, and David &
> Ashutosh don't like (1).  Maybe if we all understand (3) we can agree on
> using that one?

Yes, it would be nice to achieve some sort of consensus and I think
(3) gives everyone a little of what they want.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-15 Thread Alvaro Herrera
Hmm, so I'm now unsure what the actual proposals for handling pg_dump
are.  We seem to have the following three proposals:

1. Alvaro: use CREATE INDEX ON ONLY  (not recursive ), followed
   by CREATE INDEX ON , followed by ALTER INDEX 
   ATTACH PARTITION .  I provide an ALTER INDEX DETACH
   PARTITION for symmetry and because it can be used to replace the
   index.

   Pros: the database is always restored identically to what was in the
   original.
   Con:  The index hierarchy might be "partial", that is, lack a
   component index on some partition.

2. David's: use CREATE INDEX ON , followed by CREATE INDEX ON
   .  This will use the matching mechanism to automatically
   attach the index on partition to index on parent.  If some partition
   lacks a matching index, one is created automatically by the creation
   on parent.

   If you want to replace the index on a partition, use a new (as yet
   unimplemented) ALTER INDEX REPLACE.

   No need to add ONLY to the table name in CREATE INDEX, since the
   command always recurses.  (This seems good to me, because I 

   Pro: the index is never "partial" (missing a partition).
   Con: the matching mechanism might choose a different index on restore
   than what was selected in the database being dumped.

3. Robert's: use CREATE INDEX ON ONLY , which creates a shell
   index on parent only (no recursion), followed by CREATE INDEX ON
   .  DETACH is not provided.  If you ATTACH an index for a
   partition that already has one index attached, then (1) the newly
   attached one replaces the original (i.e. effectively REPLACE) or (2)
   you get an error and we implement a separate ALTER INDEX REPLACE
   command.  It's not clear to me how or when the shell index becomes a
   real index.


Robert, can you please clarify the terms of your proposal?  How is it
better than mine?  Is David's concern about a "partial" index (i.e. an
index that doesn't exist in some partition) solved by it?

I have code for proposals 1 and 2.  I don't like proposal 2, and David &
Ashutosh don't like (1).  Maybe if we all understand (3) we can agree on
using that one?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-12 Thread Robert Haas
On Mon, Dec 11, 2017 at 10:04 PM, Alvaro Herrera
 wrote:
> David Rowley wrote:
>
>> ATTACH/REPLACE sounds fine. My objection was more about the
>> DETACH/ATTACH method to replace an index.
>
> So what happens if you do ALTER INDEX .. ATTACH and you already have
> another index in that partition that is attached to the same parent in
> the index?

I think that should be an ERROR.  You can use REPLACE if you want to
switch which index is attached, but you shouldn't be able to attach
two indexes from the same partition at the same time.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-12 Thread Alvaro Herrera
David Rowley wrote:

> ATTACH/REPLACE sounds fine. My objection was more about the
> DETACH/ATTACH method to replace an index.

So what happens if you do ALTER INDEX .. ATTACH and you already have
another index in that partition that is attached to the same parent in
the index?  With my code, what happens is you have two indexes attached
to the same parent, and you can't drop any.  If we don't have DETACH,
how can you get out of that situation?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-07 Thread Robert Haas
On Thu, Dec 7, 2017 at 4:57 PM, David Rowley
 wrote:
>> Sure, that would fix the problem I'm concerned about, but creating the
>> parent index first, as a shell, and then creating each child and
>> attaching it to the parent *also* fixes the problem I'm concerned
>> about, and without dragging any bystander objects into the fray.
>
> That's true, but is it worth inventing/maintaining an ATTACH syntax
> for that? It's not a very common case that multiple matching indexes
> existing. If it is worth it, do we need DETACH at all?

I think it is totally worth it.  A little piece of extra DDL syntax
isn't that really costing us anything.  Your proposed approach
probably still has obscure failure cases - e.g. I bet the
deadlock-avoidance logic in parallel restore won't know about the
dependency on a seemingly-unrelated object.  Also, the use of a
seemingly-useless REPLACE syntax in every dump file will probably
confuse at least a few users, and maybe a few developers, too.  I
think there is considerable value, both for the system and for the
humans who use it, in having something that has *exactly* the
semantics we want rather than *almost* the semantics we want.

I suppose if we want to get cute, we could have ONLY the ATTACH
syntax; if you attach an index for a partition that already has an
index attached, that could mean attach to the new one instead of the
old one (i.e. replace).  But I would just add support for both ATTACH
and REPLACE and call it good.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-07 Thread Robert Haas
On Thu, Dec 7, 2017 at 4:43 PM, David Rowley
 wrote:
> And yeah, this does nothing for making sure we choose the correct
> index if more than one matching index exists on the leaf partition,
> but perhaps we can dump a series of
>
> ALTER INDEX p_a_idx REPLACE INDEX FOR p1 WITH p1_a_idx;
> ALTER INDEX p_a_idx REPLACE INDEX FOR p2 WITH p2_a_idx;
>
> ... which would be no-ops most of the time, but at least would ensure
> we use the correct index. (Likely we could fix the FOREIGN KEY
> constraint choosing the first matching index with some variation of
> this syntax)

Sure, that would fix the problem I'm concerned about, but creating the
parent index first, as a shell, and then creating each child and
attaching it to the parent *also* fixes the problem I'm concerned
about, and without dragging any bystander objects into the fray.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-07 Thread David Rowley
On 8 December 2017 at 10:07, Robert Haas  wrote:
> I do agree with you that an index which is currently enforcing a
> unique constraint has to remain continuously valid -- or if it
> unavoidably doesn't, for example if we allowed an unlogged unique
> index on a logged table, then we'd have to do something unpleasant
> like disallow inserts and updates to the key column until that gets
> fixed.  However, that doesn't seem to preclude gracefully swapping out
> indexes for individual partitions; instead of providing a DETACH
> operation, we could provide a REPLACE operation that effectively does
> DETACH + ATTACH.

Yeah, that's what I've been trying to push for. I've mentioned a handy
wavy REPLACE syntax a few times.

> It's possible that we are not that far apart here.  I don't like the
> ATTACH syntax because it's like what we do for partitions; I like it
> because it solves the pg_dump problem.  And it seems to me that your
> reservations are more about DETACH than ATTACH.  I have no issue with
> punting DETACH to the curb, recasting it as REPLACE, or whatever.

I just don't quite see the pg_dump problem with my proposed approach.

Let me pseudo write out what I imagine in a pg_dump file.

CREATE TABLE p (a INT NOT NULL, b INT NOT NULL) PARTITION BY RANGE (a);

CREATE TABLE p1 PARTITION OF p FOR VALUES FROM (1) TO (10);
CREATE TABLE p2 PARTITION OF p FOR VALUES FROM (10) TO (20);



-- create indexes on partitions
CREATE INDEX p1_a_idx ON p1 (a); -- normal create index
CREATE INDEX p2_a_idx ON p2 (a); -- normal create index.

COMMENT ON INDEX p1_a_idx IS 'the comment';

-- create partitioned indexes

-- checks indexes exist for each partition of p with matching columns,
-- AM and unique property then perform a metadata only change to
-- say there's an index on this table, also updates each index uses to
-- say its parent index is this index. If someone happens to have edited
-- the dump file to remove one of the indexes on a leaf partition then the
-- following would have to create that index again.
CREATE INDEX p_a_idx ON p (a);

--
-- PostgreSQL database dump complete
--

We'd just need to ensure the indexes of leafs are created in an order
that takes into account the dependency order.

And yeah, this does nothing for making sure we choose the correct
index if more than one matching index exists on the leaf partition,
but perhaps we can dump a series of

ALTER INDEX p_a_idx REPLACE INDEX FOR p1 WITH p1_a_idx;
ALTER INDEX p_a_idx REPLACE INDEX FOR p2 WITH p2_a_idx;

... which would be no-ops most of the time, but at least would ensure
we use the correct index. (Likely we could fix the FOREIGN KEY
constraint choosing the first matching index with some variation of
this syntax)

Also, DROP INDEX should be disallowed on a leaf partition index which
is in use for a partitioned index. I imagine pg_index just needs a new
column indparentid which would be InvalidOid if it's not used. I'm
just not that clear on if that column should be set to the leaf
partition's direct parent, or the parent that the CREATE INDEX was
done on.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-05 Thread David Rowley
On 6 December 2017 at 11:35, Robert Haas  wrote:
> What are we giving up by explicitly attaching
> the correct index?

The part I don't like about the ATTACH and DETACH of partitioned index
is that it seems to be trying to just follow the syntax we use to
remove a partition from a partitioned table, however, there's a huge
difference between the two, as DETACHing a partition from a
partitioned table leaves the partitioned table in a valid state, it
simply just no longer contains the detached partition. With the
partitioned index, we leave the index in an invalid state after a
DETACH. It can only be made valid again once another leaf index has
been ATTACHED again and that we've verified that all other indexes on
every leaf partition is also there and are valid. If we're going to
use these indexes to answer queries, then it seems like we should try
to keep them valid so that queries can actually use them for
something.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-05 Thread David Rowley
On 6 December 2017 at 09:58, Alvaro Herrera  wrote:
> David Rowley wrote:
>> On 6 December 2017 at 04:29, Robert Haas  wrote:
>> > How does that proposal keep pg_dump from latching onto the wrong
>> > index, if there's more than one index on the same columns?
>>
>> I'm not hugely concerned about that. It's not a new problem and it's
>> not a problem that I recall seeing anyone complain about, at least not
>> to the extent that we've ever bothered to fix it.
>
> Another reason to do things the proposed way is to let parallel restore
> create the indexes in parallel.  If we just have a single CREATE INDEX
> that cascades to the partitions, that can be run by a single backend
> only, which is a loser.

I'm not all that sure why parallel restore could not work with this.
The leaf partition indexes would all be created first, then the CREATE
INDEX for the partitioned index would just be a metadata change with
some checks to ensure all the leaf partition indexes exist.

> (There's also the fact that you'd lose any COMMENTs etc).

I can't see why that would be a problem with my proposal.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-05 Thread David Rowley
On 6 December 2017 at 04:29, Robert Haas  wrote:
> On Mon, Dec 4, 2017 at 6:57 PM, David Rowley
>> I proposed that this worked a different way in [1]. I think the way I
>> mention is cleaner as it means there's no extra reason for a
>> partitioned index to be indisvalid=false than there is for any other
>> normal index.
>
> How does that proposal keep pg_dump from latching onto the wrong
> index, if there's more than one index on the same columns?

I'm not hugely concerned about that. It's not a new problem and it's
not a problem that I recall seeing anyone complain about, at least not
to the extent that we've ever bothered to fix it.

The existing problem is with FOREIGN KEY constraints just choosing the
first matching index in transformFkeyCheckAttrs()

We can see the issue today with:

create table t1 (id int not null);
create unique index t1_idx_b on t1 (id);
create table t2 (id int references t1 (id));
create unique index t1_idx_a on t1 (id);




# drop index t1_idx_a;
ERROR:  cannot drop index t1_idx_a because other objects depend on it
DETAIL:  constraint t2_id_fkey on table t2 depends on index t1_idx_a
HINT:  Use DROP ... CASCADE to drop the dependent objects too.

>> My primary reason for not liking this way is around leaving indexes
>> around as indisvalid=false, however, I suppose that an index could be
>> replaced atomically with a DETACH and ATTACH within a single
>> transaction. I had previously not really liked the idea of
>> invalidating an index by DETACHing a leaf table's index from it. Of
>> course, this patch does nothing special with partitioned indexes, but
>> I believe one day we will want to do something with these and that the
>> DETACH/ATTACH is not the best design for replacing part of the index.
>
> What do you propose?

For this patch, I propose we do nothing about that, but for the
future, I already mentioned an idea to do this above.

> I'd have thought some sort of: ALTER INDEX name_of_partitioned_index
> REPLACE INDEX FOR partitioned_tablename WITH
> name_of_new_matching_bloat_free_index;

This way there is no period where we have to mark the index as
indisvalid=false. So, for when we invent something that uses this
feature, there's no window of time that concurrently running queries
are in danger of generating a poor plan. The index swap operation is
atomic.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-04 Thread David Rowley
On 2 December 2017 at 03:39, Robert Haas  wrote:
> On Thu, Nov 30, 2017 at 11:39 PM, David Rowley
>  wrote:
>> I feel like we could do better here with little extra effort. The
>> DETACH index feature does not really seem required for this patch.
>
> Because of the dump/restore considerations mentioned in
> http://postgr.es/m/ca+tgmobuhghg9v8saswkhbbfywg5a0qb+jgt0uovq5ycbdu...@mail.gmail.com
> I believe we need a way to create the index on the parent without
> immediately triggering index builds on the children, plus a way to
> create an index on a child after-the-fact and attach it to the parent.
> Detach isn't strictly required, but why support attach and not detach?

I proposed that this worked a different way in [1]. I think the way I
mention is cleaner as it means there's no extra reason for a
partitioned index to be indisvalid=false than there is for any other
normal index.

My primary reason for not liking this way is around leaving indexes
around as indisvalid=false, however, I suppose that an index could be
replaced atomically with a DETACH and ATTACH within a single
transaction. I had previously not really liked the idea of
invalidating an index by DETACHing a leaf table's index from it. Of
course, this patch does nothing special with partitioned indexes, but
I believe one day we will want to do something with these and that the
DETACH/ATTACH is not the best design for replacing part of the index.

[1] 
https://www.postgresql.org/message-id/CAKJS1f_o6v%2BXtT%2B3gfieUdCiU5Sn84humT-CVW5So_x_f%3DkLxQ%40mail.gmail.com

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-04 Thread David Rowley
On 2 December 2017 at 11:10, Alvaro Herrera  wrote:
> What you're saying is that I've written code for A+B, and you're
> "interested in C (which is incompatible with B), so can we have A+C and
> drop B".  But in reality, there exists (unwritten) D that solves the
> incompatiblity between B and C.  I'm just saying it's essentially the
> same to postpone C+D than to postpone B+D, and I already have B written;
> plus that way we don't have to come up with some novel way to handle
> pg_dump support.  So can we get A+B committed and discuss C+D later?
>
> A = partitioned indexes
> B = pg_dump support based on ATTACH
> C = your proposed planner stuff
> D = correct indisvalid setting for partitioned indexes (set to false
> when a partition does not contain the index)
>
> The patch in this thread is A+B.

Okay,  I wasn't insisting, just asking if you thought this was missing
from the patch.

However, I do still feel that if we're adding an index to an object
then it should be available in RelOptInfo->indexlist, but this patch
is still good progress even if we don't add it there.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-02 Thread Alvaro Herrera
David Rowley wrote:

> So, then this patch is only really intended as a syntax shortcut for
> mass adding of indexes to each partition?

This patch is intended to serve as a basis on which to construct further
features, just like every other patch we apply.

> I feel like we could do better here with little extra effort. The
> DETACH index feature does not really seem required for this patch. I
> think just ensuring a matching index exists on each leaf partition and
> creating any which don't exist before creating the index on the target
> partitioned table seems like the correct solution. That way we can
> make that index indisvalid flag have a valid meaning all the time.
> Some later patch can invent some way to replace a bloated index.

What you're saying is that I've written code for A+B, and you're
"interested in C (which is incompatible with B), so can we have A+C and
drop B".  But in reality, there exists (unwritten) D that solves the
incompatiblity between B and C.  I'm just saying it's essentially the
same to postpone C+D than to postpone B+D, and I already have B written;
plus that way we don't have to come up with some novel way to handle
pg_dump support.  So can we get A+B committed and discuss C+D later?

A = partitioned indexes
B = pg_dump support based on ATTACH
C = your proposed planner stuff
D = correct indisvalid setting for partitioned indexes (set to false
when a partition does not contain the index)

The patch in this thread is A+B.

> Perhaps later we can invent some generic way to replace a physical
> leaf index for a given partitioned index perhaps with the same patch
> that might allow us to replace an index which is used by a constraint,
> which to me seems like a feature we should have had years ago.

This is a hypothetical feature E which would be nice (for partitioned
indexes and for ordinary indexes too) but is not strictly necessary.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-01 Thread Robert Haas
On Thu, Nov 30, 2017 at 11:39 PM, David Rowley
 wrote:
> I feel like we could do better here with little extra effort. The
> DETACH index feature does not really seem required for this patch.

Because of the dump/restore considerations mentioned in
http://postgr.es/m/ca+tgmobuhghg9v8saswkhbbfywg5a0qb+jgt0uovq5ycbdu...@mail.gmail.com
I believe we need a way to create the index on the parent without
immediately triggering index builds on the children, plus a way to
create an index on a child after-the-fact and attach it to the parent.
Detach isn't strictly required, but why support attach and not detach?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-11-30 Thread David Rowley
On 1 December 2017 at 01:02, Alvaro Herrera  wrote:
> we currently (I mean after my
> patch) don't mark partitioned-table-level indexes as valid or not valid
> depending on whether all its children exist, so trying to use that in
> the planner without having a flag could cause invalid plans to be
> generated (i.e. ones that would cause nonexistent indexes to be
> referenced).

So, then this patch is only really intended as a syntax shortcut for
mass adding of indexes to each partition?

I feel like we could do better here with little extra effort. The
DETACH index feature does not really seem required for this patch. I
think just ensuring a matching index exists on each leaf partition and
creating any which don't exist before creating the index on the target
partitioned table seems like the correct solution. That way we can
make that index indisvalid flag have a valid meaning all the time.
Some later patch can invent some way to replace a bloated index.

Perhaps later we can invent some generic way to replace a physical
leaf index for a given partitioned index perhaps with the same patch
that might allow us to replace an index which is used by a constraint,
which to me seems like a feature we should have had years ago.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-11-30 Thread Alvaro Herrera
Michael Paquier wrote:
> On Wed, Nov 15, 2017 at 2:49 AM, Alvaro Herrera  
> wrote:
> > Here's the remaining bits, rebased.
> 
> At least patch 3 has conflicts with HEAD. I am moving this patch to
> next CF per a lack of reviews, switching status to waiting on author.

Thanks, will submit a new version before the start of the next
commitfest.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-11-30 Thread Alvaro Herrera
David Rowley wrote:

>  I wonder if it should be this patches job to alter the code in
> get_relation_info() which causes the indexes not to be loaded for
> partitioned tables:
> 
> /*
> * Make list of indexes.  Ignore indexes on system catalogs if told to.
> * Don't bother with indexes for an inheritance parent, either.
> */
> if (inhparent ||
> (IgnoreSystemIndexes && IsSystemRelation(relation)))
> hasindex = false;
> else
> hasindex = relation->rd_rel->relhasindex;
> 
> A partitioned table will always go into the hasindex = false code path.
> 
> I'm kind of thinking this patch should change that, even if the patch is
> not making use of the indexes, you could argue that something
> using set_rel_pathlist_hook might want to do something there, although,
> there's likely a bunch of counter arguments too.
> 
> What do you think?

Great question.  So you're thinking that the planner might have an
interest in knowing what indexes are defined at the parent table level
for planning purposes; but for that to actually have any effect we would
need to change the planner and executor also.  And one more point, also
related to something you said before: we currently (I mean after my
patch) don't mark partitioned-table-level indexes as valid or not valid
depending on whether all its children exist, so trying to use that in
the planner without having a flag could cause invalid plans to be
generated (i.e. ones that would cause nonexistent indexes to be
referenced).

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-11-17 Thread David Rowley
On 15 November 2017 at 06:49, Alvaro Herrera 
wrote:

> Here's the remaining bits, rebased.


 I wonder if it should be this patches job to alter the code in
get_relation_info() which causes the indexes not to be loaded for
partitioned tables:

/*
* Make list of indexes.  Ignore indexes on system catalogs if told to.
* Don't bother with indexes for an inheritance parent, either.
*/
if (inhparent ||
(IgnoreSystemIndexes && IsSystemRelation(relation)))
hasindex = false;
else
hasindex = relation->rd_rel->relhasindex;

A partitioned table will always go into the hasindex = false code path.

I'm kind of thinking this patch should change that, even if the patch is
not making use of the indexes, you could argue that something
using set_rel_pathlist_hook might want to do something there, although,
there's likely a bunch of counter arguments too.

What do you think?

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-11-17 Thread David Rowley
On 15 November 2017 at 06:49, Alvaro Herrera 
wrote:

> Here's the remaining bits, rebased.


Hi,

I've not had time for a thorough look at  this, but on a quick scan I
noticed that CompareIndexInfo() missed checking if the Index AM matches the
AM of the partitioned index.

Testing with:

create table p (a int not null) partition by range (a);
create table p1 partition of p for values from (1) to (10);
create table p2 partition of p for values from (10) to (20);
create index on p1 using btree (a);
create index on p2 using hash (a);
create index on p (a);

I see it ends up making use of the hash index on p2 to support the index
that's stored as a btree on the partitioned table. I think these should
match so that the operations we can perform on the index are all aligned.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-11-14 Thread David Rowley
On 15 November 2017 at 04:23, Alvaro Herrera  wrote:
> David Rowley wrote:
>> I'd have thought some sort of: ALTER INDEX name_of_partitioned_index
>> REPLACE INDEX FOR partitioned_tablename WITH
>> name_of_new_matching_bloat_free_index;
>>
>> ... or something along those lines, and just have an atomic swap out
>> of the index with some new one that's been created.
>
> (My intention here is to avoid scope creep.)

OK, I didn't really mean that this would be required for an initial
patch. It's something that could be invented at some later date.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-11-14 Thread Alvaro Herrera
Hi Jesper,

Thanks for reviewing.

Jesper Pedersen wrote:

> I have been looking at the "CREATE INDEX ... ONLY" syntax, and I think it
> could cause some confusion due to the "Note" described in create_index.sgml.
> 
> An alternative, maybe crazy, could be to treat a partitioned index as one;
> e.g. all operations are on the definition.

Well, honestly that's the way I wanted partitioning to be defined, right
from the start -- my first proposal was that partitions were not to be
standalone objects.  But I was outvoted once I stepped out of the
implementation, and partitioning is now the way we have it -- each
partition its own separate object.  Until we change that view (if we
ever do), I think the only sensible decision is that the whole feature
works that way.

> Some comments. [...]

Thanks, will fix.  Except for this one:

> A small thing, for
> 
> -- test.sql --
> CREATE TABLE test (a integer NOT NULL) PARTITION BY HASH(a);
> CREATE TABLE test_p00 PARTITION OF test FOR VALUES WITH (MODULUS 2,
> REMAINDER 0);
> CREATE TABLE test_p01 PARTITION OF test FOR VALUES WITH (MODULUS 2,
> REMAINDER 1);
> CREATE INDEX idx_test_a ON test (a);
> -- test.sql --
> 
> I would expect that the index names were 'test_p00_idx_test_a' and
> 'test_p01_idx_test_a'.

Hmm, the code in my patch just uses the standard naming routine for
indexes.  I'm disinclined to change it in the initial commit, but
perhaps we can change that later.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-11-14 Thread Jesper Pedersen

Hi,

On 11/14/2017 12:49 PM, Alvaro Herrera wrote:

Thanks, pushed.


Here's the remaining bits, rebased.



First of all, thanks for working on this.

I have been looking at the "CREATE INDEX ... ONLY" syntax, and I think 
it could cause some confusion due to the "Note" described in 
create_index.sgml.


An alternative, maybe crazy, could be to treat a partitioned index as 
one; e.g. all operations are on the definition. That way ONLY, ATTACH 
and DETACH could be eliminated. Once a partition is detached any 
partitioned indexes would be marked as a normal index, and a partition 
could only be attached if it had indexes satisfying all partition index 
definitions. Bloat could be handled by swapping the index as suggested 
by David [1]. It would be less flexible, but people always have the 
option to define indexes directly on the partitions.


Some comments.

0004's alter_index.sgml are missing the "PARTITION" keyword for both the 
ATTACH and DETACH commands.


A small thing, for

-- test.sql --
CREATE TABLE test (a integer NOT NULL) PARTITION BY HASH(a);
CREATE TABLE test_p00 PARTITION OF test FOR VALUES WITH (MODULUS 2, 
REMAINDER 0);
CREATE TABLE test_p01 PARTITION OF test FOR VALUES WITH (MODULUS 2, 
REMAINDER 1);

CREATE INDEX idx_test_a ON test (a);
-- test.sql --

I would expect that the index names were 'test_p00_idx_test_a' and 
'test_p01_idx_test_a'.


psql completion for "CREATE INDEX blah ON tes" only lists the child 
tables. Completion for the ALTER INDEX commands is missing.


[1] 
https://www.postgresql.org/message-id/CAKJS1f_Wf%3DM06o8iQi72SxN%2BZpLV4c0CwYoN5xYVY4aWWX-jBQ%40mail.gmail.com


Best regards,
 Jesper



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-11-14 Thread Alvaro Herrera
Alvaro Herrera wrote:
> Simon Riggs wrote:
> > On 13 November 2017 at 12:55, Alvaro Herrera  
> > wrote:
> > > Somehow I managed to include an unrelated patch as attachment.  Here's
> > > another attempt (on which I also lightly touched ddl.sgml with relevant
> > > changes).
> > 
> > Looks good. Some minor comments below.
> > 
> > 0001- Simplify
> > Seems useful as separate step; agree with everything, no further comments
> 
> Thanks, pushed.

Here's the remaining bits, rebased.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From c9bb7365be8cc542660379632369bfebe5e2d64b Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Thu, 26 Oct 2017 11:15:39 +0200
Subject: [PATCH v5 1/4] Get rid of copy_partition_key

Instead, allocate the objects in a temporary context right from the
start, and only make it permanent once no errors can occur anymore.

Reviewed-by: Robert Haas, Tom Lane
Discussion: https://postgr.es/m/20171027172730.eh2domlkpn4ja62m@alvherre.pgsql
---
 src/backend/utils/cache/relcache.c | 69 +-
 1 file changed, 8 insertions(+), 61 deletions(-)

diff --git a/src/backend/utils/cache/relcache.c 
b/src/backend/utils/cache/relcache.c
index 1908420d82..59280cd8bb 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -262,7 +262,6 @@ static Relation AllocateRelationDesc(Form_pg_class relp);
 static void RelationParseRelOptions(Relation relation, HeapTuple tuple);
 static void RelationBuildTupleDesc(Relation relation);
 static void RelationBuildPartitionKey(Relation relation);
-static PartitionKey copy_partition_key(PartitionKey fromkey);
 static Relation RelationBuildDesc(Oid targetRelId, bool insertIt);
 static void RelationInitPhysicalAddr(Relation relation);
 static void load_critical_index(Oid indexoid, Oid heapoid);
@@ -846,6 +845,11 @@ RelationBuildPartitionKey(Relation relation)
if (!HeapTupleIsValid(tuple))
return;
 
+   partkeycxt = AllocSetContextCreate(CurTransactionContext,
+  
RelationGetRelationName(relation),
+  
ALLOCSET_SMALL_SIZES);
+   oldcxt = MemoryContextSwitchTo(partkeycxt);
+
key = (PartitionKey) palloc0(sizeof(PartitionKeyData));
 
/* Fixed-length attributes */
@@ -983,71 +987,14 @@ RelationBuildPartitionKey(Relation relation)
 
ReleaseSysCache(tuple);
 
-   /* Success --- now copy to the cache memory */
-   partkeycxt = AllocSetContextCreate(CacheMemoryContext,
-  
RelationGetRelationName(relation),
-  
ALLOCSET_SMALL_SIZES);
+   /* Success --- make the relcache point to the newly constructed key */
+   MemoryContextSetParent(partkeycxt, CacheMemoryContext);
relation->rd_partkeycxt = partkeycxt;
-   oldcxt = MemoryContextSwitchTo(relation->rd_partkeycxt);
-   relation->rd_partkey = copy_partition_key(key);
+   relation->rd_partkey = key;
MemoryContextSwitchTo(oldcxt);
 }
 
 /*
- * copy_partition_key
- *
- * The copy is allocated in the current memory context.
- */
-static PartitionKey
-copy_partition_key(PartitionKey fromkey)
-{
-   PartitionKey newkey;
-   int n;
-
-   newkey = (PartitionKey) palloc(sizeof(PartitionKeyData));
-
-   newkey->strategy = fromkey->strategy;
-   newkey->partnatts = n = fromkey->partnatts;
-
-   newkey->partattrs = (AttrNumber *) palloc(n * sizeof(AttrNumber));
-   memcpy(newkey->partattrs, fromkey->partattrs, n * sizeof(AttrNumber));
-
-   newkey->partexprs = copyObject(fromkey->partexprs);
-
-   newkey->partopfamily = (Oid *) palloc(n * sizeof(Oid));
-   memcpy(newkey->partopfamily, fromkey->partopfamily, n * sizeof(Oid));
-
-   newkey->partopcintype = (Oid *) palloc(n * sizeof(Oid));
-   memcpy(newkey->partopcintype, fromkey->partopcintype, n * sizeof(Oid));
-
-   newkey->partsupfunc = (FmgrInfo *) palloc(n * sizeof(FmgrInfo));
-   memcpy(newkey->partsupfunc, fromkey->partsupfunc, n * sizeof(FmgrInfo));
-
-   newkey->partcollation = (Oid *) palloc(n * sizeof(Oid));
-   memcpy(newkey->partcollation, fromkey->partcollation, n * sizeof(Oid));
-
-   newkey->parttypid = (Oid *) palloc(n * sizeof(Oid));
-   memcpy(newkey->parttypid, fromkey->parttypid, n * sizeof(Oid));
-
-   newkey->parttypmod = (int32 *) palloc(n * sizeof(int32));
-   memcpy(newkey->parttypmod, fromkey->parttypmod, n * sizeof(int32));
-
-   newkey->parttyplen = (int16 *) palloc(n * sizeof(int16));
-   memcpy(newkey->parttyplen, fromkey->parttyplen, n * sizeof(int16));
-
-   

Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-11-14 Thread Alvaro Herrera
Simon Riggs wrote:
> On 13 November 2017 at 12:55, Alvaro Herrera  wrote:
> > Somehow I managed to include an unrelated patch as attachment.  Here's
> > another attempt (on which I also lightly touched ddl.sgml with relevant
> > changes).
> 
> Looks good. Some minor comments below.
> 
> 0001- Simplify
> Seems useful as separate step; agree with everything, no further comments

Thanks, pushed.

> Why uint16? Why not just uint?

*Shrug*.  16 bits seem plenty enough.  I changed it to bits16, which we
use in other places for bitmasks.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-11-14 Thread Alvaro Herrera
David Rowley wrote:
> On 15 November 2017 at 01:09, Alvaro Herrera  wrote:
> > if a
> > partition exists which *doesn't* have the index, restoring things this
> > way would create the index in that partition too, which is unwanted
> > because the end state is different to what was in the dumped database.
> 
> hmm, but surely the all those indexes must already exist if the
> partitioned index exists. Won't we be disallowing DROP INDEX of the
> leaf partition indexes if that index is marked as being part of the
> partitioned index?

In normal cases, sure -- but you can do ALTER INDEX DETACH on a
partition, then drop the index.  This is useful for example if you want
to replace some partition's index because of bloat: create a replacement
index, detach the old one, attach the new one, drop the old one.  Now
you probably don't *want* to take a pg_dump in the middle of this
process ...  but pg_dump's charter is not to produce the output we think
would be most convenient to users most of the time, but rather to
accurately describe what is in the database.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-11-14 Thread David Rowley
On 15 November 2017 at 01:09, Alvaro Herrera  wrote:
> if a
> partition exists which *doesn't* have the index, restoring things this
> way would create the index in that partition too, which is unwanted
> because the end state is different to what was in the dumped database.

hmm, but surely the all those indexes must already exist if the
partitioned index exists. Won't we be disallowing DROP INDEX of the
leaf partition indexes if that index is marked as being part of the
partitioned index?

If so, then surely this could only happen if someone manually edited
the pg_dump file to remove the CREATE INDEX statement for the leaf
partition, and if they do that, then maybe they won't be so surprised
that CREATE INDEX has to create some indexes.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-11-14 Thread David Rowley
On 8 October 2017 at 02:34, Robert Haas  wrote:
> However, when
> you dump-and-restore (including but not limited to the pg_upgrade
> case) you've got to preserve those names.  If you just generate a new
> name that may or may not be the same as the old one, then it may
> collide with a user-specified name that only occurs later in the dump.
> Also, you'll have trouble if the user has applied a COMMENT or a
> SECURITY LABEL to the index because that command works by name, or if
> the user has a reference to the index name inside a function or
> whatever.

I might be arriving late for the party here, but I see mention on here
of when we ATTACH a partition that we'd check to see if any of that
table's indexes match the index definition required for the
partitioned table's "partitioned index", and just use those indexes
and incorporate them into the partitioned index by setting the new OID
column in pg_index (or whatever way was decided to record the
dependency)... So, if that works for ATTACH, then can't pg_dump just
create each of the partition's indexes first, then create the
partitioned table's partition index, and that command would just go
through checking each partition similar to how ATTACH would work. That
way we get to keep all the names as pg_dump would just dump the
indexes belonging to each partition like it were any other index.
Those indexes would only become a bit more special once the
partitioned index is created on the parent.

I've not read the patch yet, but reading the thread it does not seem
to have gone in this direction. Maybe I'm overlooking something?

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-11-13 Thread Simon Riggs
On 13 November 2017 at 12:55, Alvaro Herrera  wrote:
> Somehow I managed to include an unrelated patch as attachment.  Here's
> another attempt (on which I also lightly touched ddl.sgml with relevant
> changes).

Looks good. Some minor comments below.

0001- Simplify
Seems useful as separate step; agree with everything, no further comments
Why uint16? Why not just uint?

0003-export
I don't like Assert((heapRel != NULL) ^ OidIsValid(heapRelid));
Standard boolean logic is clearer
I couldn't see why this was a separate patch

0004-Allow
If we do ATTACH PARTITION, do we also need to do a separate ATTACH
INDEX step to allow an index to join the main index? Hopefully not.
The tests seem to indicate to me that it does work that way, just no
docs in altertable.sgml to describe that
typo "they all have a matching indexes" - no plural needed
typo "whether equivalent" - insert "an"
unsure what "regardless of whether this option was specified" means,
probably just remove

0005-Allow
RelationCacheInitializePhase3() asserts that rd_partkey is not NULL
for a partitoned table after RelationBuildPartitionKey() runs

You removed the test to check whether "create unique index" works, not
sure if that was intentional
There is no test for attach index to a unique index

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services