Re: propagating replica identity to partitions

2019-09-11 Thread Alvaro Herrera from 2ndQuadrant
I've not had time to work on this issue, and there are a few remaining
problems in the patch; mostly that I will need to change the way pg_dump
represents the replica identity so that it is only defined when the
index is dumped in all partitions rather than immediately after creating
the index.  Also, the bug reported by Dmitry Dolgov.

So I'm marking this as Returned with Feedback, with the intention to
resubmit for November.

Thanks,

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: propagating replica identity to partitions

2019-04-04 Thread Alvaro Herrera
On 2019-Mar-29, Peter Eisentraut wrote:

> On 2019-03-28 18:16, Simon Riggs wrote:
> > SET TABLESPACE should not recurse because it copies the data, while
> > holding long locks. If that was ever fixed so it happened concurrently,
> > I would agree this could recurse by default.
> 
> Since a partitioned table has no storage, what is the meaning of moving
> a partitioned table to a different tablespace without recursing?

Changing the tablespace of a partitioned table currently means to set a
default tablespace for partitions created after the change.

Robert proposed to change it so that it means to move every partition to
that tablespace, unless ONLY is specified.  Simon objects to that change.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: propagating replica identity to partitions

2019-03-29 Thread Simon Riggs
On Fri, 29 Mar 2019 at 09:51, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 2019-03-28 18:16, Simon Riggs wrote:
> > SET TABLESPACE should not recurse because it copies the data, while
> > holding long locks. If that was ever fixed so it happened concurrently,
> > I would agree this could recurse by default.
>
> Since a partitioned table has no storage, what is the meaning of moving
> a partitioned table to a different tablespace without recursing?
>

My point was that people will misuse it and regret at length, but the data
copying behavior is clearly documented.

If you are saying there is no other possible interpretation of that
command, then I relent. It is important we have a way to do this, if people
wish it.

-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: propagating replica identity to partitions

2019-03-29 Thread Peter Eisentraut
On 2019-03-28 18:16, Simon Riggs wrote:
> SET TABLESPACE should not recurse because it copies the data, while
> holding long locks. If that was ever fixed so it happened concurrently,
> I would agree this could recurse by default.

Since a partitioned table has no storage, what is the meaning of moving
a partitioned table to a different tablespace without recursing?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: propagating replica identity to partitions

2019-03-29 Thread Peter Eisentraut
On 2019-03-28 17:46, Alvaro Herrera wrote:
> Thanks, Michael and Peter, for responding; however there is a second
> part to the question, which is "should I change the recursivity of
> REPLICA IDENTITY, while not simultaneously changing the recusivity of
> the TABLESPACE and OWNER TO forms of ALTER TABLE?"
> 
> I think everyone agrees that REPLICA IDENTITY should be changed.  The
> question is whether it's bad behavior from my part to change it
> separately from changing every other non-currently-recursive form.

If this were new functionality, I would tend to think that ALTER TABLE
should by default recurse for everything.  I don't know the history of
why it is not done for some things.  It might be a mistake, just like
the replica identity case apparently is.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: propagating replica identity to partitions

2019-03-28 Thread Simon Riggs
On Thu, 28 Mar 2019 at 16:46, Alvaro Herrera 
wrote:

> Thanks, Michael and Peter, for responding; however there is a second
> part to the question, which is "should I change the recursivity of
> REPLICA IDENTITY, while not simultaneously changing the recusivity of
> the TABLESPACE and OWNER TO forms of ALTER TABLE?"
>
> I think everyone agrees that REPLICA IDENTITY should be changed.


+1

This is a metadata-only operation, so no problem.


> The
> question is whether it's bad behavior from my part to change it
> separately from changing every other non-currently-recursive form.
>

Changing OWNER is also just a metadata-only operation and makes sense to me
to recurse, by default.

SET TABLESPACE should not recurse because it copies the data, while holding
long locks. If that was ever fixed so it happened concurrently, I would
agree this could recurse by default.
IMHO this should be renamed to ALTER TABLE ... MOVE TO TABLESPACE, so its
actual effect is clearer.

-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: propagating replica identity to partitions

2019-03-28 Thread Simon Riggs
On Sat, 23 Mar 2019 at 01:34, Michael Paquier  wrote:

> I'm slightly baffled that we would even allow having different owners on
> > different partitions, but that seems to be a separate discussion.
>
> Different owners can make sense for multiple layers of partitions
> where the children have less restrictions than the children.  Imagine
> for example a table listing the population of a country, with children
> partitioned by regions, and grand-children partitioned by cities.  The
> top-most parent could be owned by a minister, and lower levels apply
> to the region administrator, down to the city administrators.
>

That use case is possible using different privileges.

Having different owners makes it *very* difficult to administer.

-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: propagating replica identity to partitions

2019-03-28 Thread Alvaro Herrera
Thanks, Michael and Peter, for responding; however there is a second
part to the question, which is "should I change the recursivity of
REPLICA IDENTITY, while not simultaneously changing the recusivity of
the TABLESPACE and OWNER TO forms of ALTER TABLE?"

I think everyone agrees that REPLICA IDENTITY should be changed.  The
question is whether it's bad behavior from my part to change it
separately from changing every other non-currently-recursive form.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: propagating replica identity to partitions

2019-03-22 Thread Michael Paquier
On Fri, Mar 22, 2019 at 07:55:11PM +0100, Peter Eisentraut wrote:
> If you are operating on a partitioned table and set the replica identity
> to the primary key or a partitioned index of that partitioned table,
> then I think, by definition of what it means to be a partitioned index,
> that applies to the whole partition hierarchy.
> 
> Aside from that theoretical consideration, what would be the practical
> use of not doing that?

FWIW I agree about the part of inheriting the replica identity of the
parent when defining a partition on it.  That's also..  Instinctive.

> I'm slightly baffled that we would even allow having different owners on
> different partitions, but that seems to be a separate discussion.

Different owners can make sense for multiple layers of partitions
where the children have less restrictions than the children.  Imagine
for example a table listing the population of a country, with children
partitioned by regions, and grand-children partitioned by cities.  The
top-most parent could be owned by a minister, and lower levels apply
to the region administrator, down to the city administrators.
--
Michael


signature.asc
Description: PGP signature


Re: propagating replica identity to partitions

2019-03-22 Thread Peter Eisentraut
On 2019-03-22 17:52, Alvaro Herrera wrote:
> To recap: my proposed change is to make
> ALTER TABLE ... REPLICA IDENTITY
> when applied on a partitioned table affect all of its partitions instead
> of expecting the user to invoke the command for each partition.

If you are operating on a partitioned table and set the replica identity
to the primary key or a partitioned index of that partitioned table,
then I think, by definition of what it means to be a partitioned index,
that applies to the whole partition hierarchy.

Aside from that theoretical consideration, what would be the practical
use of not doing that?

> At the
> same time, I am proposing not to change to have recursive behavior other
> forms of ALTER TABLE in one commit, such as TABLESPACE and OWNER TO,
> which currently do not have recursive behavior.

I'm slightly baffled that we would even allow having different owners on
different partitions, but that seems to be a separate discussion.

In general, it seems sensible that if you operate on a partitioned
table, the whole partition hierarchy is affected unless told otherwise.
There may be sensible exceptions, but it seems useful as the default.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: propagating replica identity to partitions

2019-03-22 Thread Alvaro Herrera
On 2019-Mar-22, Robert Haas wrote:

> On Thu, Mar 21, 2019 at 5:01 PM Alvaro Herrera  
> wrote:
> > I already argued that TABLESPACE and OWNER TO are documented to work
> > that way, and have been for a long time, whereas REPLICA IDENTITY has
> > never been.  If you want to change long-standing behavior, be my guest,
> > but that's not my patch.  On the other hand, there's no consensus that
> > those should be changed, whereas there no opposition specifically
> > against changing this one, and in fact it was reported as a bug to me by
> > actual users.
> 
> Well, you have a commit bit, and I cannot prevent you from using it,
> and nobody else is backing me up here, but it doesn't change my
> opinion.

Can I ask for more opinions here?  Should I apply this behavior change
to pg12 or not?  If there's no consensus that it should be changed, I
wouldn't change it.

To recap: my proposed change is to make
ALTER TABLE ... REPLICA IDENTITY
when applied on a partitioned table affect all of its partitions instead
of expecting the user to invoke the command for each partition.  At the
same time, I am proposing not to change to have recursive behavior other
forms of ALTER TABLE in one commit, such as TABLESPACE and OWNER TO,
which currently do not have recursive behavior.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: propagating replica identity to partitions

2019-03-22 Thread Robert Haas
On Thu, Mar 21, 2019 at 5:01 PM Alvaro Herrera  wrote:
> I already argued that TABLESPACE and OWNER TO are documented to work
> that way, and have been for a long time, whereas REPLICA IDENTITY has
> never been.  If you want to change long-standing behavior, be my guest,
> but that's not my patch.  On the other hand, there's no consensus that
> those should be changed, whereas there no opposition specifically
> against changing this one, and in fact it was reported as a bug to me by
> actual users.

Well, you have a commit bit, and I cannot prevent you from using it,
and nobody else is backing me up here, but it doesn't change my
opinion.

I think it is HIGHLY irresponsible for you to try to characterize
clear behavior changes in this area as "bug fixes."  The fact that a
user say that something is a bug does not make it a bug -- and you
have been a committer long enough to know the difference between
repairing a defect and inventing entirely new behavior.  Yet you keep
doing the latter and calling the former.

Commit 33e6c34c32677a168bee4bc6c335aa8d73211a56 is a clear behavior
change for partitioned indexes and yet has the innocuous subject line
"Fix tablespace handling for partitioned indexes."  Unbelievably, it
was back-patched into 11.1.  Everyone except you agreed that it
created an inconsistency between tables and indexes, so commit
ca4103025dfe26eaaf6a500dec9170fbb176eebc repaired that by doing the
same thing for tables.  That one wasn't back-patched, but it was still
described as "Fix tablespace handling for partitioned tables", even
though I said repeatedly that it wasn't a fix, because the actual
behavior was the design behavior, and as the major reviewer and
committer of those patches I ought to know.  That latter patch also
broke stuff which it looks like you haven't fixed yet:

https://www.postgresql.org/message-id/cakjs1f_1c260not_vbj067az3jxptxvrohdvmlebmudx1ye...@mail.gmail.com

That email thread even includes clear definitional concerns about
whether this behavior is even properly designed:

https://www.postgresql.org/message-id/20190306161744.22jdkg37fyi2zyke%40alap3.anarazel.de

But I assume that's not going to stop you from propagating the same
kinds of problems into more places.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: propagating replica identity to partitions

2019-03-21 Thread Alvaro Herrera
On 2019-Mar-21, Robert Haas wrote:

> On Wed, Mar 20, 2019 at 4:42 PM Alvaro Herrera  
> wrote:
> > Unless there are any objections to fixing the REPLICA IDENTITY bug, I
> > intend to push that tomorrow.
> 
> I still think that this is an ill-considered, piecemeal approach to a
> problem that deserves a better solution.

I already argued that TABLESPACE and OWNER TO are documented to work
that way, and have been for a long time, whereas REPLICA IDENTITY has
never been.  If you want to change long-standing behavior, be my guest,
but that's not my patch.  On the other hand, there's no consensus that
those should be changed, whereas there no opposition specifically
against changing this one, and in fact it was reported as a bug to me by
actual users.

FWIW I refrained from pushing today because after posting I realized
that I've failed to investigate Dolgov's reported problem.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: propagating replica identity to partitions

2019-03-21 Thread Robert Haas
On Wed, Mar 20, 2019 at 4:42 PM Alvaro Herrera  wrote:
> Unless there are any objections to fixing the REPLICA IDENTITY bug, I
> intend to push that tomorrow.

I still think that this is an ill-considered, piecemeal approach to a
problem that deserves a better solution.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: propagating replica identity to partitions

2019-03-20 Thread Alvaro Herrera
Unless there are any objections to fixing the REPLICA IDENTITY bug, I
intend to push that tomorrow.

People can continue to discuss changing the behavior of other
subcommands where reasonable (OWNER TO) or even for the cases others
consider not reasonable (TABLESPACE), but there is no consensus of doing
that, and no patches either.  I suppose those changes (in the name of
uncompromising consistency) will have to wait till pg13.

Thanks

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: propagating replica identity to partitions

2019-03-14 Thread Peter Eisentraut
On 2019-02-28 23:41, Alvaro Herrera wrote:
> Maybe ADD GENERATED AS IDENTITY / DROP IDENTITY should recurse; not
> really sure about this one.  Peter?

No, the intention is that only the partition root has the identity
property.  If you wanted to make it recurse, then you'd need to arrange
it so that all partitions refer to the same sequence, but that's
currently not possible.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: propagating replica identity to partitions

2019-03-01 Thread Robert Haas
On Thu, Feb 28, 2019 at 6:13 PM Alvaro Herrera  wrote:
> Tablespaces already behave a little bit especially in another sense:
> it doesn't recurse to indexes.  I think it's not possible to implement a
> three-way flag, where you tell it to move only the table, or to recurse
> to child tables but leave local indexes alone, or just to move
> everything.  If we don't move the child indexes, are we really recursing
> in that sense?

I don't quite follow this.  If you want to change the default for new
partitions, you would use ONLY, which updates the value for the parent
(which is cheap) but doesn't touch the children.  If you want to move
it all, you would omit ONLY.  I'm not sureI quite understand the third
case - what do you mean by moving the child tables but leaving the
local indexes alone?

> I think changing the behavior of tablespaces is likely to cause pain for
> people with existing scripts that expect the command not to recurse.  We
> would have to add a switch of some kind to provide the old behavior in
> order not to break those, and that doesn't seem so good either.
>
> I agree we definitely want to treat a partitioned table as similarly as
> possible to a non-partitioned table, but in the particular case of
> tablespace it seems to me better not to mess with the behavior.

You may be right, but I'm not 100% convinced.  I don't understand why
changing the behavior for tablespaces would be a grant shocker and
changing the behavior for OWNER TO would be a big nothing.  If you
mess up the first one, you'll realize it's running for too long and go
"oh, oops" and fix it next time.  If you mess up the second one,
you'll have a security hole, be exploited by hackers, suffer civil and
criminal investigations due to a massive data security breach, and end
your life in penury and in prison, friendless and alone.  Or maybe
not, but the point is that BOTH of these things have an established
behavior such that changing it may surprise some people, and actually
I would argue that the surprise for the TABLESPACE will tend to be
more obvious and less likely to have subtle, unintended consequences.

I hate to open a can of worms here, but there's also the relationship
between OWNER TO and GRANT/REVOKE; that might also need a little
thought.

> CLUSTER: I'm not sure about this one.  For legacy inheritance there's
> no principled way to figure out the right index to recurse to.  For
> partitioning that seems a very simple problem, but I do wonder if
> there's any use case at all for ALTER TABLE CLUSTER there.  My
> inclination would be to reject ALTER TABLE CLUSTER on a partitioned
> table altogether, if it isn't already.

Hmm, I disagree.  I think this should work and set the value for the
whole hierarchy.  That seems well-defined and useful -- why is it any
less useful than for a standalone table?

> OWNER: let's just change this one to recurse always.  I think we have a
> consensus about this one.

We really don't have many votes either way, AFAICS.  I'm not direly
opposed, but I'd like to see a more vigorous attempt to make them ALL
recurse rather than changing one per release for the next 8 years.

> TRIGGER: I think it would be good to recurse, based on the trigger name.
> I'm not sure what to do about legacy inheritance; the trigger isn't
> likely to be there.  An option is to silently ignore each inheritance
> child (not partition) if the trigger isn't present on it.

Hmm, I think this one should maybe recurse to triggers that cascaded
downwards, which would of its nature apply to partitioning but not to
inheritance.

> We all seem to agree that REPLICA IDENTITY should recurse.

My feeling on this is the same as for OWNER.

> Maybe ADD GENERATED AS IDENTITY / DROP IDENTITY should recurse; not
> really sure about this one.  Peter?

I don't know enough about this to have an opinion.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: propagating replica identity to partitions

2019-02-28 Thread Michael Paquier
On Thu, Feb 28, 2019 at 07:41:11PM -0300, Alvaro Herrera wrote:
> We all seem to agree that REPLICA IDENTITY should recurse.

(entering the ring)
FWIW, I agree that having REPLICA IDENTITY recurse on partitions feels
more natural, as much as being able to use ALTER TABLE ONLY to only
update the definition on a given partition member.
--
Michael


signature.asc
Description: PGP signature


Re: propagating replica identity to partitions

2019-02-28 Thread Alvaro Herrera
Added Peter E to CC; question at the very end.

On 2019-Feb-20, Robert Haas wrote:

> Yeah, we could.  I wonder, though, whether we should just make
> everything recurse.  I think that's what people are commonly going to
> want, at least for partitioned tables, and it doesn't seem to me that
> it would hurt anything to make the inheritance case work that way,
> too.  Right now it looks like we have this list of exceptions:
> 
> - actions for identity columns (ADD GENERATED, SET etc., DROP IDENTITY)
> - TRIGGER
> - CLUSTER
> - OWNER
> - TABLESPACE never recurse to descendant tables
> - Adding a constraint recurses only for CHECK constraints that are not
> marked NO INHERIT.
> 
> I have a feeling we eventually want this list to be empty, right?  We
> want a partitioned table to work as much like a non-partitioned table
> as possible, unless the user asks for some other behavior.  Going from
> six exceptions to four and maybe having some of them depend on whether
> it's partitioning or inheritance doesn't seem like it's as good and
> clear as trying to adopt a really uniform policy.
> 
> I don't buy Simon's argument that we should treat TABLESPACE
> differently because the tables might be really big and take a long
> time to move.  I agree that this could well be true, but nobody is
> proposing to remove the ability to move tables individually or to use
> ONLY here.  Allowing TABLESPACE to recurse just gives people one
> additional choice that they don't have today: to move everything at
> once. We don't lose any functionality by enabling that.

Tablespaces already behave a little bit especially in another sense:
it doesn't recurse to indexes.  I think it's not possible to implement a
three-way flag, where you tell it to move only the table, or to recurse
to child tables but leave local indexes alone, or just to move
everything.  If we don't move the child indexes, are we really recursing
in that sense?

I think changing the behavior of tablespaces is likely to cause pain for
people with existing scripts that expect the command not to recurse.  We
would have to add a switch of some kind to provide the old behavior in
order not to break those, and that doesn't seem so good either.

I agree we definitely want to treat a partitioned table as similarly as
possible to a non-partitioned table, but in the particular case of
tablespace it seems to me better not to mess with the behavior.


CLUSTER: I'm not sure about this one.  For legacy inheritance there's
no principled way to figure out the right index to recurse to.  For
partitioning that seems a very simple problem, but I do wonder if
there's any use case at all for ALTER TABLE CLUSTER there.  My
inclination would be to reject ALTER TABLE CLUSTER on a partitioned
table altogether, if it isn't already.

OWNER: let's just change this one to recurse always.  I think we have a
consensus about this one.

TRIGGER: I think it would be good to recurse, based on the trigger name.
I'm not sure what to do about legacy inheritance; the trigger isn't
likely to be there.  An option is to silently ignore each inheritance
child (not partition) if the trigger isn't present on it.

We all seem to agree that REPLICA IDENTITY should recurse.

Maybe ADD GENERATED AS IDENTITY / DROP IDENTITY should recurse; not
really sure about this one.  Peter?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: propagating replica identity to partitions

2019-02-28 Thread Alvaro Herrera
On 2019-Feb-20, Robert Haas wrote:

> On Wed, Feb 20, 2019 at 12:38 PM Alvaro Herrera
>  wrote:

> > Maybe the ALL IN TABLESPACE and OWNED BY sub-forms should be split to a
> > separate para.  I suggest:
> >
> > :   This form changes the table's tablespace to the specified tablespace
> > :   and moves the data file(s) associated with the table to the new
> > :   tablespace. Indexes on the table, if any, are not moved; but they
> > :   can be moved separately with additional SET TABLESPACE commands.
> > :   When applied to a partitioned table, nothing is moved, but any
> > :   partitions created afterwards with CREATE TABLE PARTITION OF
> > :   will use that tablespace.
> > :
> > :   All
> > :   tables in the current database in a tablespace can be moved by using
> > :   the ALL IN TABLESPACE form, which will lock all tables to be moved
> > :   first and then move each one. This form also supports OWNED BY,
> > :   which will only move tables owned by the roles specified. If the
> > :   NOWAIT option is specified then the command will fail if it is
> > :   unable to acquire all of the locks required immediately. Note that
> > :   system catalogs are not moved by this command, use ALTER DATABASE or
> > :   explicit ALTER TABLE invocations instead if desired. The
> > :   information_schema relations are not considered part of the system
> > :   catalogs and will be moved. See also CREATE TABLESPACE.
> 
> Seems reasonable.

Pushed this bit, thanks.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: propagating replica identity to partitions

2019-02-20 Thread Simon Riggs
On Wed, 20 Feb 2019 at 18:51, Robert Haas  wrote:

I don't buy Simon's argument that we should treat TABLESPACE
> differently because the tables might be really big and take a long
> time to move.  I agree that this could well be true, but nobody is
> proposing to remove the ability to move tables individually or to use
> ONLY here.  Allowing TABLESPACE to recurse just gives people one
> additional choice that they don't have today: to move everything at
> once. We don't lose any functionality by enabling that.
>

Doing that would add the single largest footgun in Postgres, given that
command's current behavior and the size of partitioned tables.

If it moved partitions concurrently I'd feel differently.

-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: propagating replica identity to partitions

2019-02-20 Thread Robert Haas
On Wed, Feb 20, 2019 at 12:38 PM Alvaro Herrera
 wrote:
> > I don't see that in the NOTES section for ALTER TABLE.  Are you
> > looking at some patch, or at master?
>
> I was looking here:
> https://www.postgresql.org/docs/11/sql-altertable.html

OK, I was looking at the wrong thing.  Not enough caffeine, apparently.

> Maybe the ALL IN TABLESPACE and OWNED BY sub-forms should be split to a
> separate para.  I suggest:
>
> :   This form changes the table's tablespace to the specified tablespace
> :   and moves the data file(s) associated with the table to the new
> :   tablespace. Indexes on the table, if any, are not moved; but they
> :   can be moved separately with additional SET TABLESPACE commands.
> :   When applied to a partitioned table, nothing is moved, but any
> :   partitions created afterwards with CREATE TABLE PARTITION OF
> :   will use that tablespace.
> :
> :   All
> :   tables in the current database in a tablespace can be moved by using
> :   the ALL IN TABLESPACE form, which will lock all tables to be moved
> :   first and then move each one. This form also supports OWNED BY,
> :   which will only move tables owned by the roles specified. If the
> :   NOWAIT option is specified then the command will fail if it is
> :   unable to acquire all of the locks required immediately. Note that
> :   system catalogs are not moved by this command, use ALTER DATABASE or
> :   explicit ALTER TABLE invocations instead if desired. The
> :   information_schema relations are not considered part of the system
> :   catalogs and will be moved. See also CREATE TABLESPACE.

Seems reasonable.

> I think the reason tablespace was made not to recurse was to avoid
> acquiring access exclusive lock on too many tables at once, but I'm not
> sure.  This is very old behavior.
>
> > Obviously any property where the
> > parents and children have to match, like adding a column or changing a
> > data type, has to always recurse; nothing else is sensible.  For
> > others, it seems we have a choice.  It's unclear to me why we'd choose
> > to make REPLICA IDENTITY recurse by default and, say, OWNER not
> > recurse.
>
> Ah.  I did argue that OWNER should recurse:
> https://postgr.es/m/20171017163203.uw7hmlqonidlfeqj@alvherre.pgsql
> but it was too late then to change it for pg10.  We can change it now,
> surely.

Yeah, we could.  I wonder, though, whether we should just make
everything recurse.  I think that's what people are commonly going to
want, at least for partitioned tables, and it doesn't seem to me that
it would hurt anything to make the inheritance case work that way,
too.  Right now it looks like we have this list of exceptions:

- actions for identity columns (ADD GENERATED, SET etc., DROP IDENTITY)
- TRIGGER
- CLUSTER
- OWNER
- TABLESPACE never recurse to descendant tables
- Adding a constraint recurses only for CHECK constraints that are not
marked NO INHERIT.

I have a feeling we eventually want this list to be empty, right?  We
want a partitioned table to work as much like a non-partitioned table
as possible, unless the user asks for some other behavior.  Going from
six exceptions to four and maybe having some of them depend on whether
it's partitioning or inheritance doesn't seem like it's as good and
clear as trying to adopt a really uniform policy.

I don't buy Simon's argument that we should treat TABLESPACE
differently because the tables might be really big and take a long
time to move.  I agree that this could well be true, but nobody is
proposing to remove the ability to move tables individually or to use
ONLY here.  Allowing TABLESPACE to recurse just gives people one
additional choice that they don't have today: to move everything at
once. We don't lose any functionality by enabling that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: propagating replica identity to partitions

2019-02-20 Thread Simon Riggs
On Wed, 20 Feb 2019 at 17:38, Alvaro Herrera 
wrote:


> > Fair enough.  I don't think it's entirely useful to think about this
> > in terms of which operations do and don't recurse; the question in my
> > mind is WHY some things recurse and other things don't, and what will
> > be easiest for users to understand.
>
> I think the reason tablespace was made not to recurse was to avoid
> acquiring access exclusive lock on too many tables at once, but I'm not
> sure.  This is very old behavior.
>

-1 for changing that; here's why:

Partitioning is designed to support very large tables.

Since the table is very large there is a valid use case for moving older
partitions to other tablespaces, one at a time.

If we moved all partitions at once, while holding AccessExclusiveLock, it
would a) likely fail with out of space errors, b) if you are unlucky enough
for it to succeed it would lock the table for a ridiculously long time. The
main use case for changing the tablespace is to define where new partitions
should be created. So in this specific case only, recursion makes no sense.


> > Obviously any property where the
> > parents and children have to match, like adding a column or changing a
> > data type, has to always recurse; nothing else is sensible.  For
> > others, it seems we have a choice.  It's unclear to me why we'd choose
> > to make REPLICA IDENTITY recurse by default and, say, OWNER not
> > recurse.
>
> Ah.  I did argue that OWNER should recurse:
> https://postgr.es/m/20171017163203.uw7hmlqonidlfeqj@alvherre.pgsql
> but it was too late then to change it for pg10.  We can change it now,
> surely.
>
> > In both cases, the default assumption is presumably that the
> > whole partitioning hierarchy should match, but in a particular case a
> > user could want to do something else.  Consequently, I don't
> > understand why a user would want one of those operations to descend to
> > children by default and the other not.
>
> I agree that OWNER TO should recurse for partitioned tables.


+1

That was clearly just missed. It's a strange thought to have a partitioned
table with different pieces owned by different users. So strange that the
lack of complaints about the recursion are surely because nobody ever tried
it in a real situation. I would prefer to disallow differences in ownership
across partitions, since that almost certainly prevents running sensible
DDL and its just a huge footgun.

Recursion should be the default for partitioned tables.

I'm -0 on
> changing it for legacy inheritance, but if the majority is to change it
> for both, let's do that.
>

-1 for changing legacy inheritance. Two separate features. Inheritance has
been around for many years and its feature set is stable. No need to touch
it.

-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: propagating replica identity to partitions

2019-02-20 Thread Alvaro Herrera
On 2019-Feb-20, Robert Haas wrote:

> On Tue, Feb 19, 2019 at 5:41 PM Alvaro Herrera  
> wrote:
> > OK, let me concede that point -- it's not rewriting that makes things
> > act differently, but rather TABLESPACE (as well as some other things)
> > behave that way.  ALTER TABLE ... SET DATA TYPE is the obvious
> > counterexample.
> >
> > The Notes section of ALTER TABLE says:
> >
> > : The actions for identity columns (ADD GENERATED, SET etc., DROP 
> > IDENTITY), as
> > : well as the actions TRIGGER, CLUSTER, OWNER, and TABLESPACE never recurse 
> > to
> > : descendant tables; that is, they always act as though ONLY were specified.
> > : Adding a constraint recurses only for CHECK constraints that are not 
> > marked NO
> > : INHERIT.
> 
> I don't see that in the NOTES section for ALTER TABLE.  Are you
> looking at some patch, or at master?

I was looking here:
https://www.postgresql.org/docs/11/sql-altertable.html

> More generally, our documentation in this area seems to be somewhat
> lacking.  For instance, the TABLESPACE section of the ALTER TABLE
> documentation appears to make no mention of what exactly the behavior
> is when applied to a partition table.  It doesn't seem sufficient to
> simply say "well, it doesn't recurse," because that would imply that
> it doesn't do anything at all, which isn't the case.

That's true.  Maybe we can add a short blurb in the stanza for the SET
TABLESPACE form in the Description section.  It currently says:

:   This form changes the table's tablespace to the specified tablespace
:   and moves the data file(s) associated with the table to the new
:   tablespace. Indexes on the table, if any, are not moved; but they
:   can be moved separately with additional SET TABLESPACE commands. All
:   tables in the current database in a tablespace can be moved by using
:   the ALL IN TABLESPACE form, which will lock all tables to be moved
:   first and then move each one. This form also supports OWNED BY,
:   which will only move tables owned by the roles specified. If the
:   NOWAIT option is specified then the command will fail if it is
:   unable to acquire all of the locks required immediately. Note that
:   system catalogs are not moved by this command, use ALTER DATABASE or
:   explicit ALTER TABLE invocations instead if desired. The
:   information_schema relations are not considered part of the system
:   catalogs and will be moved. See also CREATE TABLESPACE.

Maybe the ALL IN TABLESPACE and OWNED BY sub-forms should be split to a
separate para.  I suggest:

:   This form changes the table's tablespace to the specified tablespace
:   and moves the data file(s) associated with the table to the new
:   tablespace. Indexes on the table, if any, are not moved; but they
:   can be moved separately with additional SET TABLESPACE commands.
:   When applied to a partitioned table, nothing is moved, but any
:   partitions created afterwards with CREATE TABLE PARTITION OF
:   will use that tablespace.
:
:   All
:   tables in the current database in a tablespace can be moved by using
:   the ALL IN TABLESPACE form, which will lock all tables to be moved
:   first and then move each one. This form also supports OWNED BY,
:   which will only move tables owned by the roles specified. If the
:   NOWAIT option is specified then the command will fail if it is
:   unable to acquire all of the locks required immediately. Note that
:   system catalogs are not moved by this command, use ALTER DATABASE or
:   explicit ALTER TABLE invocations instead if desired. The
:   information_schema relations are not considered part of the system
:   catalogs and will be moved. See also CREATE TABLESPACE.


> > Since REPLICA IDENTITY does not appear in this list, the documented
> > behavior is to recurse, per the description of the "name" parameter:
> >
> > : The name (optionally schema-qualified) of an existing table to
> > : alter. If ONLY is specified before the table name, only that table
> > : is altered. If ONLY is not specified, the table and all its
> > : descendant tables (if any) are altered. Optionally, * can be
> > : specified after the table name to explicitly indicate that
> > : descendant tables are included.
> >
> > I didn't come up with this on my own, as you imply.
> 
> Fair enough.  I don't think it's entirely useful to think about this
> in terms of which operations do and don't recurse; the question in my
> mind is WHY some things recurse and other things don't, and what will
> be easiest for users to understand.

I think the reason tablespace was made not to recurse was to avoid
acquiring access exclusive lock on too many tables at once, but I'm not
sure.  This is very old behavior.

> Obviously any property where the
> parents and children have to match, like adding a column or changing a
> data type, has to always recurse; nothing else is sensible.  For
> others, it seems we have a choice.  It's unclear to me why we'd choose
> to make REPLICA IDENTITY recurse by default and, 

Re: propagating replica identity to partitions

2019-02-20 Thread Robert Haas
On Tue, Feb 19, 2019 at 5:41 PM Alvaro Herrera  wrote:
> OK, let me concede that point -- it's not rewriting that makes things
> act differently, but rather TABLESPACE (as well as some other things)
> behave that way.  ALTER TABLE ... SET DATA TYPE is the obvious
> counterexample.
>
> The Notes section of ALTER TABLE says:
>
> : The actions for identity columns (ADD GENERATED, SET etc., DROP IDENTITY), 
> as
> : well as the actions TRIGGER, CLUSTER, OWNER, and TABLESPACE never recurse to
> : descendant tables; that is, they always act as though ONLY were specified.
> : Adding a constraint recurses only for CHECK constraints that are not marked 
> NO
> : INHERIT.

I don't see that in the NOTES section for ALTER TABLE.  Are you
looking at some patch, or at master?

More generally, our documentation in this area seems to be somewhat
lacking.  For instance, the TABLESPACE section of the ALTER TABLE
documentation appears to make no mention of what exactly the behavior
is when applied to a partition table.  It doesn't seem sufficient to
simply say "well, it doesn't recurse," because that would imply that
it doesn't do anything at all, which isn't the case.

> Since REPLICA IDENTITY does not appear in this list, the documented
> behavior is to recurse, per the description of the "name" parameter:
>
> : The name (optionally schema-qualified) of an existing table to
> : alter. If ONLY is specified before the table name, only that table
> : is altered. If ONLY is not specified, the table and all its
> : descendant tables (if any) are altered. Optionally, * can be
> : specified after the table name to explicitly indicate that
> : descendant tables are included.
>
> I didn't come up with this on my own, as you imply.

Fair enough.  I don't think it's entirely useful to think about this
in terms of which operations do and don't recurse; the question in my
mind is WHY some things recurse and other things don't, and what will
be easiest for users to understand.  Obviously any property where the
parents and children have to match, like adding a column or changing a
data type, has to always recurse; nothing else is sensible.  For
others, it seems we have a choice.  It's unclear to me why we'd choose
to make REPLICA IDENTITY recurse by default and, say, OWNER not
recurse.  In both cases, the default assumption is presumably that the
whole partitioning hierarchy should match, but in a particular case a
user could want to do something else.  Consequently, I don't
understand why a user would want one of those operations to descend to
children by default and the other not.  It feels like we're choosing
the behavior in individual cases, as Tom would say, with the aid of a
dartboard.  Maybe there's some coherent principle here that I'm just
missing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: propagating replica identity to partitions

2019-02-19 Thread Alvaro Herrera
On 2019-Feb-19, Robert Haas wrote:

> On Tue, Feb 19, 2019 at 3:40 PM Alvaro Herrera  
> wrote:
> > Maybe we should be using the inheritance marker in the command to note
> > whether to recurse to partitions?  That is, if you say
> >   ALTER TABLE ONLY parent SET REPLICA IDENTITY
> > then we don't recurse and just change the parent table and future
> > partitions, whereas if you leave out the ONLY then it does affect
> > existing partitions.
> >
> > However, I think TABLESPACE and any other property that involves
> > rewriting tables wholesale can reasonably be expected to behave
> > differently (w.r.t. recursing) from commands that just modify the
> > catalogs.  I think we already have such a distinction somewhere.
> 
> I don't really follow why that should be different, or why the user
> should be expected to know or care whether a particular property
> involves rewriting.

OK, let me concede that point -- it's not rewriting that makes things
act differently, but rather TABLESPACE (as well as some other things)
behave that way.  ALTER TABLE ... SET DATA TYPE is the obvious
counterexample.

The Notes section of ALTER TABLE says:

: The actions for identity columns (ADD GENERATED, SET etc., DROP IDENTITY), as
: well as the actions TRIGGER, CLUSTER, OWNER, and TABLESPACE never recurse to
: descendant tables; that is, they always act as though ONLY were specified.
: Adding a constraint recurses only for CHECK constraints that are not marked NO
: INHERIT.

Since REPLICA IDENTITY does not appear in this list, the documented
behavior is to recurse, per the description of the "name" parameter:

: The name (optionally schema-qualified) of an existing table to
: alter. If ONLY is specified before the table name, only that table
: is altered. If ONLY is not specified, the table and all its
: descendant tables (if any) are altered. Optionally, * can be
: specified after the table name to explicitly indicate that
: descendant tables are included.

I didn't come up with this on my own, as you imply.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: propagating replica identity to partitions

2019-02-19 Thread Robert Haas
On Tue, Feb 19, 2019 at 3:40 PM Alvaro Herrera  wrote:
> Maybe we should be using the inheritance marker in the command to note
> whether to recurse to partitions?  That is, if you say
>   ALTER TABLE ONLY parent SET REPLICA IDENTITY
> then we don't recurse and just change the parent table and future
> partitions, whereas if you leave out the ONLY then it does affect
> existing partitions.
>
> However, I think TABLESPACE and any other property that involves
> rewriting tables wholesale can reasonably be expected to behave
> differently (w.r.t. recursing) from commands that just modify the
> catalogs.  I think we already have such a distinction somewhere.

I don't really follow why that should be different, or why the user
should be expected to know or care whether a particular property
involves rewriting.

> I have no argument against somebody doing that, but I don't think I have
> the resources to do in in time for 12; on the other hand, leaving a
> known misfeature unfixed because nobody has looked for hypothetical
> similar bugs would be bad.

Oh, come on.  If you don't have time to study the issue and come up
with a plan that can at least in theory be applied to all similar
cases in a principled way, whether or not you have time to apply it to
all of those cases, then you have no business committing anything at
all.  You're basically saying that you don't have time to do this
right, so you're just going to do something at random and hope it
doesn't create too many problems later.  That's terrible.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: propagating replica identity to partitions

2019-02-19 Thread Alvaro Herrera
On 2019-Feb-19, Robert Haas wrote:

> It's not unreasonable to use the parent's REPLICA IDENTITY setting as
> the default for new partitions, much as we now do for the TABLESPACE,
> because the parent's replica identity is otherwise without meaning.
> But I'm less convinced that it's reasonable to have changing the
> parent's replica identity recurse to existing children, because:
> 
> 1. That's different from the TABLESPACE case.  We shouldn't just treat
> each new instance of this problem as a green field where we can pick
> any old behavior at random; it should be consistent as far as
> reasonable,

Maybe we should be using the inheritance marker in the command to note
whether to recurse to partitions?  That is, if you say
  ALTER TABLE ONLY parent SET REPLICA IDENTITY
then we don't recurse and just change the parent table and future
partitions, whereas if you leave out the ONLY then it does affect
existing partitions.

However, I think TABLESPACE and any other property that involves
rewriting tables wholesale can reasonably be expected to behave
differently (w.r.t. recursing) from commands that just modify the
catalogs.  I think we already have such a distinction somewhere.

EXPLAIN ALTER TABLE would sure be handy :-)

> 2. It confuses a change to the default for new partitions with a
> change to the value for existing partitions.  Say that you've got 5
> existing partitions that use one setting, but now you want new
> partitions to use a different setting.  You can't get that with your
> proposed semantics, because trying to change the default will change
> all the existing partitions to the new value also, even though that's
> not what you wanted.

If we make sure to heed ONLY (and I admit to not thinking about it in my
original patch) then I think this angle is covered.

> We should really, really try to figure out all of the things that are
> like this -- a property that is meaningless for the partitioned table
> itself but may serve as a useful default for its children -- and try
> to make them all work the same, ideally in one release, rather than
> changing them at different times, back-patching sometimes, and having
> no consistency in the details.

I have no argument against somebody doing that, but I don't think I have
the resources to do in in time for 12; on the other hand, leaving a
known misfeature unfixed because nobody has looked for hypothetical
similar bugs would be bad.

I'm not proposing to back-patch this, but I would love it if I can
borrow somebody's time machine to retroactively fix it for 11.0.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: propagating replica identity to partitions

2019-02-19 Thread Robert Haas
On Mon, Feb 4, 2019 at 11:30 AM Alvaro Herrera  wrote:
> If you do ALTER TABLE .. REPLICA IDENTITY to a partitioned table, the
> command operates on the parent table itself and does not propagate to
> partitions.  Why is this?  Maybe not recursing was the right call when
> we only had regular inheritance (back in 9.4), but since partitioned
> tables got introduced, I think it is completely the other way around:
> not recursing is an usability fail.

This sounds an awful like the TABLESPACE case.  I wonder how many more
similar things there are.

It's not unreasonable to use the parent's REPLICA IDENTITY setting as
the default for new partitions, much as we now do for the TABLESPACE,
because the parent's replica identity is otherwise without meaning.
But I'm less convinced that it's reasonable to have changing the
parent's replica identity recurse to existing children, because:

1. That's different from the TABLESPACE case.  We shouldn't just treat
each new instance of this problem as a green field where we can pick
any old behavior at random; it should be consistent as far as
reasonable, and

2. It confuses a change to the default for new partitions with a
change to the value for existing partitions.  Say that you've got 5
existing partitions that use one setting, but now you want new
partitions to use a different setting.  You can't get that with your
proposed semantics, because trying to change the default will change
all the existing partitions to the new value also, even though that's
not what you wanted.

We should really, really try to figure out all of the things that are
like this -- a property that is meaningless for the partitioned table
itself but may serve as a useful default for its children -- and try
to make them all work the same, ideally in one release, rather than
changing them at different times, back-patching sometimes, and having
no consistency in the details.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: propagating replica identity to partitions

2019-02-18 Thread Dmitry Dolgov
> On Fri, Feb 15, 2019 at 10:02 PM Alvaro Herrera  
> wrote:
>
> On 2019-Feb-07, Dmitry Dolgov wrote:
>
> > Could there be a race condition somewhere? The idea and the code looks
> > reasonable, but when I'm testing ALTER TABLE ... REPLICA IDENTITY with this
> > patch and concurrent partition creation, I've got the following error on 
> > ATTACH
> > PARTITION:
> >
> > ERROR:  42P17: replica index does not exist in partition "test373"
> > LOCATION:  MatchReplicaIdentity, tablecmds.c:15018
> >
> > This error seems unstable, other time I've got a deadlock. And I don't 
> > observe
> > this behaviour on the master.
>
> Can you share your reproducer?

Sure, I was running concurrently the attached script, that creates a chain of
nested partitions test1,test2,..., and in a separate session:

alter table test1 replica identity full;

Checking this time, I've got from the script:

ERROR:  40P01: deadlock detected
DETAIL:  Process 10547 waits for AccessShareLock on relation 30449
of database
29898; blocked by process 9714.
Process 9714 waits for AccessExclusiveLock on relation 30454 of
database 29898;
blocked by process 10547.
HINT:  See server log for query details.
LOCATION:  DeadLockReport, deadlock.c:1140
Time: 1001.917 ms (00:01.002)
set -ex

psql -c "create table test1(id int primary key) partition by range(id);";

for idx in $(seq 2 1000);
do
psql -c "create table test$idx(id int primary key) partition by range(id);"
psql -c "alter table test$((idx - 1)) attach partition test$idx for values from ($idx) to (10);"
done;


Re: propagating replica identity to partitions

2019-02-15 Thread Alvaro Herrera
On 2019-Feb-07, Dmitry Dolgov wrote:

> Could there be a race condition somewhere? The idea and the code looks
> reasonable, but when I'm testing ALTER TABLE ... REPLICA IDENTITY with this
> patch and concurrent partition creation, I've got the following error on 
> ATTACH
> PARTITION:
> 
> ERROR:  42P17: replica index does not exist in partition "test373"
> LOCATION:  MatchReplicaIdentity, tablecmds.c:15018
> 
> This error seems unstable, other time I've got a deadlock. And I don't observe
> this behaviour on the master.

Can you share your reproducer?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: propagating replica identity to partitions

2019-02-09 Thread Alvaro Herrera
On 2019-Feb-07, Dmitry Dolgov wrote:

> Could there be a race condition somewhere? The idea and the code looks
> reasonable, but when I'm testing ALTER TABLE ... REPLICA IDENTITY with this
> patch and concurrent partition creation, I've got the following error on 
> ATTACH
> PARTITION:
> 
> ERROR:  42P17: replica index does not exist in partition "test373"
> LOCATION:  MatchReplicaIdentity, tablecmds.c:15018
> 
> This error seems unstable, other time I've got a deadlock. And I don't observe
> this behaviour on the master.

Clearly there is a problem somewhere.  I'll investigate.  Thanks for
testing.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: propagating replica identity to partitions

2019-02-07 Thread Dmitry Dolgov
> On Tue, Feb 5, 2019 at 12:54 AM Alvaro Herrera  
> wrote:
>
> Actually, index-based replica identities failed in pg_dump: we first
> create the index ONLY on the partitioned table (marking it as invalid),
> so when we immediately do the ALTER TABLE/REPLICA IDENTITY, it rejects
> the invalid index.  If we change the code to allow invalid indexes on
> partitioned tables to become replica identities, we hit the problem that
> the index didn't exist when processing the partition list.  In order to
> fix that, I added a flag so that partitions are allowed not to have the
> index, in hopes that the missing index are created in subsequent
> commands; those indexes should become valid & identity afterwards.
>
> There's a small emerging problem, which is that if you have an invalid
> index marked as replica identity, you cannot create any more partitions;
> the reason is that we want to propagate the replica identity to the
> partition, but the index is not there (since invalid indexes are not
> propagated).  I don't think this case is worth supporting; it can be
> fixed but requires some work[1].
>
> New patch attached.

Could there be a race condition somewhere? The idea and the code looks
reasonable, but when I'm testing ALTER TABLE ... REPLICA IDENTITY with this
patch and concurrent partition creation, I've got the following error on ATTACH
PARTITION:

ERROR:  42P17: replica index does not exist in partition "test373"
LOCATION:  MatchReplicaIdentity, tablecmds.c:15018

This error seems unstable, other time I've got a deadlock. And I don't observe
this behaviour on the master.



Re: propagating replica identity to partitions

2019-02-04 Thread Alvaro Herrera
Actually, index-based replica identities failed in pg_dump: we first
create the index ONLY on the partitioned table (marking it as invalid),
so when we immediately do the ALTER TABLE/REPLICA IDENTITY, it rejects
the invalid index.  If we change the code to allow invalid indexes on
partitioned tables to become replica identities, we hit the problem that
the index didn't exist when processing the partition list.  In order to
fix that, I added a flag so that partitions are allowed not to have the
index, in hopes that the missing index are created in subsequent
commands; those indexes should become valid & identity afterwards.

There's a small emerging problem, which is that if you have an invalid
index marked as replica identity, you cannot create any more partitions;
the reason is that we want to propagate the replica identity to the
partition, but the index is not there (since invalid indexes are not
propagated).  I don't think this case is worth supporting; it can be
fixed but requires some work[1].

New patch attached.

[1] In DefineRelation, we obtain the list of indexes to clone by
RelationGetIndexList, which ignores invalid indexes.  In order to
implement this we'd need to include invalid indexes in that list
(possibly by just not using RelationGetIndexList.)

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From f205a06c9aa6061dd1bac01ef63017aa6811b5f9 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 22 Jan 2019 18:00:31 -0300
Subject: [PATCH v3 1/2] index_get_partition

---
 src/backend/catalog/partition.c  | 35 +++
 src/backend/commands/tablecmds.c | 40 +++-
 src/include/catalog/partition.h  |  1 +
 3 files changed, 47 insertions(+), 29 deletions(-)

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 0d3bc3a2c7..66012bb28a 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -146,6 +146,41 @@ get_partition_ancestors_worker(Relation inhRel, Oid relid, List **ancestors)
 }
 
 /*
+ * Return the OID of the index, in the given partition, that is a child of the
+ * given index or InvalidOid if there isn't one.
+ */
+Oid
+index_get_partition(Relation partition, Oid indexId)
+{
+	List	   *idxlist = RelationGetIndexList(partition);
+	ListCell   *l;
+
+	foreach(l, idxlist)
+	{
+		Oid			partIdx = lfirst_oid(l);
+		HeapTuple	tup;
+		Form_pg_class classForm;
+		bool		ispartition;
+
+		tup = SearchSysCache1(RELOID, ObjectIdGetDatum(partIdx));
+		if (!tup)
+			elog(ERROR, "cache lookup failed for relation %u", partIdx);
+		classForm = (Form_pg_class) GETSTRUCT(tup);
+		ispartition = classForm->relispartition;
+		ReleaseSysCache(tup);
+		if (!ispartition)
+			continue;
+		if (get_partition_parent(lfirst_oid(l)) == indexId)
+		{
+			list_free(idxlist);
+			return partIdx;
+		}
+	}
+
+	return InvalidOid;
+}
+
+/*
  * map_partition_varattnos - maps varattno of any Vars in expr from the
  * attno's of 'from_rel' to the attno's of 'to_rel' partition, each of which
  * may be either a leaf partition or a partitioned table, but both of which
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 35a9ade059..877bac506f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -15427,36 +15427,18 @@ ATExecAttachPartitionIdx(List **wqueue, Relation parentIdx, RangeVar *name)
 static void
 refuseDupeIndexAttach(Relation parentIdx, Relation partIdx, Relation partitionTbl)
 {
-	Relation	pg_inherits;
-	ScanKeyData key;
-	HeapTuple	tuple;
-	SysScanDesc scan;
+	Oid			existingIdx;
 
-	pg_inherits = table_open(InheritsRelationId, AccessShareLock);
-	ScanKeyInit(, Anum_pg_inherits_inhparent,
-BTEqualStrategyNumber, F_OIDEQ,
-ObjectIdGetDatum(RelationGetRelid(parentIdx)));
-	scan = systable_beginscan(pg_inherits, InheritsParentIndexId, true,
-			  NULL, 1, );
-	while (HeapTupleIsValid(tuple = systable_getnext(scan)))
-	{
-		Form_pg_inherits inhForm;
-		Oid			tab;
-
-		inhForm = (Form_pg_inherits) GETSTRUCT(tuple);
-		tab = IndexGetRelation(inhForm->inhrelid, false);
-		if (tab == RelationGetRelid(partitionTbl))
-			ereport(ERROR,
-	(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-	 errmsg("cannot attach index \"%s\" as a partition of index \"%s\"",
-			RelationGetRelationName(partIdx),
-			RelationGetRelationName(parentIdx)),
-	 errdetail("Another index is already attached for partition \"%s\".",
-			   RelationGetRelationName(partitionTbl;
-	}
-
-	systable_endscan(scan);
-	table_close(pg_inherits, AccessShareLock);
+	existingIdx = index_get_partition(partitionTbl,
+	  RelationGetRelid(parentIdx));
+	if (OidIsValid(existingIdx))
+		ereport(ERROR,
+(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot attach index \"%s\" as a partition of index \"%s\"",
+		

Re: propagating replica identity to partitions

2019-02-04 Thread Alvaro Herrera
On 2019-Feb-04, Alvaro Herrera wrote:

> I'll now look more carefully at the cases involving indexes; thus far I
> was looking at the ones using FULL.  Those seem to work as intended.

Yeah, that didn't work so well -- it was trying to propagate the command
verbatim to each partition, and obviously the index names did not match.
So this subcommand has to reimplement the recursion internally, as in
the attached.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From f205a06c9aa6061dd1bac01ef63017aa6811b5f9 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 22 Jan 2019 18:00:31 -0300
Subject: [PATCH v2 1/2] index_get_partition

---
 src/backend/catalog/partition.c  | 35 +++
 src/backend/commands/tablecmds.c | 40 +++-
 src/include/catalog/partition.h  |  1 +
 3 files changed, 47 insertions(+), 29 deletions(-)

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 0d3bc3a2c7..66012bb28a 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -146,6 +146,41 @@ get_partition_ancestors_worker(Relation inhRel, Oid relid, List **ancestors)
 }
 
 /*
+ * Return the OID of the index, in the given partition, that is a child of the
+ * given index or InvalidOid if there isn't one.
+ */
+Oid
+index_get_partition(Relation partition, Oid indexId)
+{
+	List	   *idxlist = RelationGetIndexList(partition);
+	ListCell   *l;
+
+	foreach(l, idxlist)
+	{
+		Oid			partIdx = lfirst_oid(l);
+		HeapTuple	tup;
+		Form_pg_class classForm;
+		bool		ispartition;
+
+		tup = SearchSysCache1(RELOID, ObjectIdGetDatum(partIdx));
+		if (!tup)
+			elog(ERROR, "cache lookup failed for relation %u", partIdx);
+		classForm = (Form_pg_class) GETSTRUCT(tup);
+		ispartition = classForm->relispartition;
+		ReleaseSysCache(tup);
+		if (!ispartition)
+			continue;
+		if (get_partition_parent(lfirst_oid(l)) == indexId)
+		{
+			list_free(idxlist);
+			return partIdx;
+		}
+	}
+
+	return InvalidOid;
+}
+
+/*
  * map_partition_varattnos - maps varattno of any Vars in expr from the
  * attno's of 'from_rel' to the attno's of 'to_rel' partition, each of which
  * may be either a leaf partition or a partitioned table, but both of which
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 35a9ade059..877bac506f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -15427,36 +15427,18 @@ ATExecAttachPartitionIdx(List **wqueue, Relation parentIdx, RangeVar *name)
 static void
 refuseDupeIndexAttach(Relation parentIdx, Relation partIdx, Relation partitionTbl)
 {
-	Relation	pg_inherits;
-	ScanKeyData key;
-	HeapTuple	tuple;
-	SysScanDesc scan;
+	Oid			existingIdx;
 
-	pg_inherits = table_open(InheritsRelationId, AccessShareLock);
-	ScanKeyInit(, Anum_pg_inherits_inhparent,
-BTEqualStrategyNumber, F_OIDEQ,
-ObjectIdGetDatum(RelationGetRelid(parentIdx)));
-	scan = systable_beginscan(pg_inherits, InheritsParentIndexId, true,
-			  NULL, 1, );
-	while (HeapTupleIsValid(tuple = systable_getnext(scan)))
-	{
-		Form_pg_inherits inhForm;
-		Oid			tab;
-
-		inhForm = (Form_pg_inherits) GETSTRUCT(tuple);
-		tab = IndexGetRelation(inhForm->inhrelid, false);
-		if (tab == RelationGetRelid(partitionTbl))
-			ereport(ERROR,
-	(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-	 errmsg("cannot attach index \"%s\" as a partition of index \"%s\"",
-			RelationGetRelationName(partIdx),
-			RelationGetRelationName(parentIdx)),
-	 errdetail("Another index is already attached for partition \"%s\".",
-			   RelationGetRelationName(partitionTbl;
-	}
-
-	systable_endscan(scan);
-	table_close(pg_inherits, AccessShareLock);
+	existingIdx = index_get_partition(partitionTbl,
+	  RelationGetRelid(parentIdx));
+	if (OidIsValid(existingIdx))
+		ereport(ERROR,
+(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot attach index \"%s\" as a partition of index \"%s\"",
+		RelationGetRelationName(partIdx),
+		RelationGetRelationName(parentIdx)),
+ errdetail("Another index is already attached for partition \"%s\".",
+		   RelationGetRelationName(partitionTbl;
 }
 
 /*
diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h
index 5685d2fd57..7f0b198bcb 100644
--- a/src/include/catalog/partition.h
+++ b/src/include/catalog/partition.h
@@ -35,6 +35,7 @@ typedef struct PartitionDescData
 
 extern Oid	get_partition_parent(Oid relid);
 extern List *get_partition_ancestors(Oid relid);
+extern Oid	index_get_partition(Relation partition, Oid indexId);
 extern List *map_partition_varattnos(List *expr, int fromrel_varno,
 		Relation to_rel, Relation from_rel,
 		bool *found_whole_row);
-- 
2.11.0

>From 688b801b8799ecd9827ee203bfb690536ac84ff4 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 4 Feb